Python Beautiful Soup Table Data Scraping Specific TD Tags

Question

Asked by jcmcdonald on 06 January 2025 at 15:50

This webpage has multiple tables on it: http://www.nfl.com/player/tombrady/2504211/gamelogs

Within the HTML, all of the tables share exactly the same opening tag:

<table class="data-table1" width="100%" border="0" summary="Game Logs For Tom Brady In 2014">

I can scrape data from the first table (Preseason), but I do not know how to skip it and scrape data from the second and third tables (Regular Season and Postseason) instead.

I'm trying to pull specific numbers out of those tables, such as the QB rating for a given week.

My code:

import pickle
import math
import urllib2
from lxml import etree
from bs4 import BeautifulSoup
from urllib import urlopen

year = '2014'
lastWeek = '2'
favQB1 = "Tom Brady"

favQBurl2 = 'http://www.nfl.com/player/tombrady/2504211/gamelogs'
favQBhtml2 = urlopen(favQBurl2).read()
favQBsoup2 = BeautifulSoup(favQBhtml2)
favQBpass2 = favQBsoup2.find("table", { "summary" : "Game Logs For %s In %s" % (favQB1, year)})
favQBrows2 = []

for row in favQBpass2.findAll("tr"):
    if lastWeek in row.findNext('td'):  
        for item in row.findAll("td"):
            favQBrows2.append(item.text)
print ("Enter: Starting Quarterback QB Rating of Favored Team for the last game played (regular season): "),
print favQBrows2[15]

There are 2 answers

alecxe On 09 June 2015 at 14:31

Rely on the table title, which is located in the td element in the first table row:

def find_table(soup, label):
    return soup.find("td", text=label).find_parent("table", summary=True)

Usage:

find_table(soup, "Preseason")
find_table(soup, "Regular Season")
find_table(soup, "Postseason")

For details, see the find_parent() section of the Beautiful Soup documentation.
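As a rough, self-contained illustration of the approach above, the sketch below runs find_table() against a small inline HTML string instead of the live page. The markup, the opponent codes and the rating values are all invented placeholders, and the code assumes Python 3 with a recent bs4, where string= is the current name for the text= argument.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup: both tables share one summary, and the
# table title sits in a td in the first row. All cell values are made up.
SAMPLE_HTML = """
<table class="data-table1" summary="Game Logs For Tom Brady In 2014">
  <tr><td colspan="3">Preseason</td></tr>
  <tr><td>1</td><td>AAA</td><td>80.0</td></tr>
</table>
<table class="data-table1" summary="Game Logs For Tom Brady In 2014">
  <tr><td colspan="3">Regular Season</td></tr>
  <tr><td>1</td><td>AAA</td><td>90.0</td></tr>
  <tr><td>2</td><td>BBB</td><td>100.5</td></tr>
</table>
"""


def find_table(soup, label):
    """Return the table whose title cell text equals label, or None."""
    title_cell = soup.find("td", string=label)
    if title_cell is None:
        return None
    # Walk up from the title cell to the enclosing table that has a summary.
    return title_cell.find_parent("table", summary=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
regular = find_table(soup, "Regular Season")

# Collect the text of every cell, one list per table row.
rows = [
    [td.get_text(strip=True) for td in tr.find_all("td")]
    for tr in regular.find_all("tr")
]
print(rows)
```

Returning None when the label is missing is a choice made for this sketch; the one-line version above raises AttributeError in that case.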

**Vikas Ojha** · Accepted Answer · 2015-06-09T14:33:07+00:00

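The following should work as well. It uses find_all() to collect every table with the matching summary and takes the second one (index 1), which is the Regular Season table: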

from bs4 import BeautifulSoup
from urllib import urlopen  # Python 2; on Python 3 use urllib.request.urlopen

year = '2014'
lastWeek = '2'
favQB1 = "Tom Brady"

favQBurl2 = 'http://www.nfl.com/player/tombrady/2504211/gamelogs'
favQBhtml2 = urlopen(favQBurl2).read()
favQBsoup2 = BeautifulSoup(favQBhtml2)
# All tables share the same summary, so collect them all and pick by position:
# 0 = Preseason, 1 = Regular Season, 2 = Postseason.
favQBpass2 = favQBsoup2.find_all("table", { "summary" : "Game Logs For %s In %s" % (favQB1, year)})[1]
favQBrows2 = []

# Keep the cells of the row whose first td holds the requested week.
for row in favQBpass2.find_all("tr"):
    if lastWeek in row.find_next('td'):
        for item in row.find_all("td"):
            favQBrows2.append(item.text)
print ("Enter: Starting Quarterback QB Rating of Favored Team for the last game played (regular season): "),
print favQBrows2[15]
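The indexing idea can be tried without network access. The sketch below is one possible Python 3 variant, not the answerer's code: it runs against invented inline markup with placeholder values, and week_row() is a hypothetical helper that compares the text of the first cell explicitly instead of testing membership in the tag returned by find_next().

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page: three tables with an identical summary.
# Opponent codes and ratings are invented placeholder values.
SUMMARY = "Game Logs For Tom Brady In 2014"
SAMPLE_HTML = """
<table summary="Game Logs For Tom Brady In 2014">
  <tr><td>Preseason</td></tr>
  <tr><td>1</td><td>AAA</td><td>80.0</td></tr>
</table>
<table summary="Game Logs For Tom Brady In 2014">
  <tr><td>Regular Season</td></tr>
  <tr><td>1</td><td>AAA</td><td>90.0</td></tr>
  <tr><td>2</td><td>BBB</td><td>100.5</td></tr>
</table>
<table summary="Game Logs For Tom Brady In 2014">
  <tr><td>Postseason</td></tr>
  <tr><td>18</td><td>CCC</td><td>95.0</td></tr>
</table>
"""


def week_row(table, week):
    """Return the cell texts of the first row whose first td equals week."""
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells and cells[0] == week:
            return cells
    return None  # no row for that week in this table


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
tables = soup.find_all("table", {"summary": SUMMARY})

# find_all() returns tables in document order, so index 1 is Regular Season.
regular_season = tables[1]
cells = week_row(regular_season, "2")
print(cells)
```

Selecting by position depends on the page always listing the tables in the same order. If a player has no Preseason table, index 1 would point at a different table, which is where the title-based lookup from the other answer is more robust.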
