I want to parse an entire table from Yahoo Finance. As I understand it, the 'tbody' and 'thead' tags are not registered by lxml as such but rather show up as additional TR elements, so I switched the XPath from:
/html/body/div[4]/div[4]/table[2]/tbody/tr[2]/td/table[2]/tbody/tr/td/table/tbody
to the one used in the code below:
from lxml import html

url = 'http://finance.yahoo.com/q/is?s=MMM+Income+Statement&annual'
tree = html.parse(url)
tick_content = [td.text_content() for td in tree.xpath('/html/body/div[4]/div[4]/table[2]/tr[3]/td/table[2]/tr[1]/td/table/td[1]')]
print(tick_content)
I am getting blank output. Is there a special way to parse a table, or am I missing something?
Rather than using a very long absolute XPath as generated by Chrome, you can just search for the table with the yfnc_tabledata1 class; there is only one. From that table, you can then get to your <td> cells with a relative XPath.
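As a rough sketch of the two steps described here (not necessarily the exact code originally posted), the example below runs against a small inline HTML string rather than the live page. The SAMPLE markup and its cell values are invented for illustration; it only assumes that the real page nests a data table inside the one table carrying the yfnc_tabledata1 class. For the live page you would build the tree with html.parse(url) as in the question.

```python
from lxml import html

# Hypothetical stand-in for the Yahoo Finance page; the structure and the
# cell values are made up for this example.
SAMPLE = """
<html><body>
<table><tr><td>
  <table class="yfnc_tabledata1"><tr><td>
    <table>
      <tr><td>Label A</td><td>Value A</td></tr>
      <tr><td>Label B</td><td>Value B</td></tr>
    </table>
  </td></tr></table>
</td></tr></table>
</body></html>
"""

tree = html.fromstring(SAMPLE)

# Step 1: locate the table by its class instead of by absolute position.
tables = tree.xpath("//table[@class='yfnc_tabledata1']")
table = tables[0]

# Step 2: use relative paths from that table to reach the rows and cells
# of the nested data table. '//' matches whether or not a tbody is present.
rows = [
    [td.text_content().strip() for td in tr.xpath("./td")]
    for tr in table.xpath(".//table//tr")
]
print(rows)
```

Because the search is anchored on the class name and uses // for the descent, it does not depend on how many layout tables or div elements come before the data, nor on whether the parser kept any tbody elements.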