Extracting values from an HTML table using BeautifulSoup4 (2nd row onwards, 1st and 6th column)

I am new to Python and need some guidance on extracting values from specific cells of an HTML table.

The URL that I am working on can be found here

I want to get only the first 5 values in the Month and Settlement columns and then display them as:

"MAR 14:426'6"

The problems I am facing are:

  1. How do I get the loop to start from the 3rd "TR" in the table?
  2. How do I get only the values for td[0] and td[6]?
  3. How do I restrict the loop to retrieve values for only 5 rows?

This is the code that I am working on:

tableData = soup1.find("table", id="DailySettlementTable")
for rows in tableData.findAll('tr'):
    month = rows.find('td')
    print(month)

Thank you and appreciate any form of guidance!

There is 1 answer

Chris (best answer)

You probably want to use slicing.
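
As a quick illustration of how the slice bounds behave, here is a minimal sketch on a plain list. The list contents are hypothetical stand-ins for the rows that find_all('tr') would return; the point is only that the start index is included and the end index is not.

```python
# Hypothetical stand-in for the list of rows returned by find_all('tr'):
# two header rows followed by six data rows.
rows = ['header 1', 'header 2', 'row A', 'row B', 'row C', 'row D', 'row E', 'row F']

# Start at index 2 (the third item) and stop before index 7,
# which leaves exactly five items.
first_five = rows[2:7]
print(first_five)
```

Because the end index is excluded, the number of items in a slice is simply end minus start, here 7 - 2 = 5.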

Here's a modified snippet for your code:

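# `soup` is the BeautifulSoup object you already have (soup1 in your code).
table = soup.find('table', id='DailySettlementTable')

# The slice notation below, [2:7], says to take the third row (index 2)
# through the seventh row (index 6). The end index 7 is excluded,
# so you get exactly five rows.
for row in table.find_all('tr')[2:7]:
    cells = row.find_all('td')
    month = cells[0]   # first column
    settle = cells[6]  # seventh column (index 6)

    # get_text() also works when a cell contains nested tags,
    # where .string would be None.
    print(month.get_text(strip=True) + ':' + settle.get_text(strip=True))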
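
To try the idea without fetching the page, here is a self-contained sketch that runs against a small made-up table. Only the table id and the "MAR 14:426'6" format come from the question; the other months and prices, the two-header-row layout, and the helper names build_sample_html and first_settlements are assumptions for illustration. It also skips any row that has fewer than seven cells, which the real page may or may not need.

```python
from bs4 import BeautifulSoup

# Invented sample values; only the first pair appears in the question.
MONTHS = ["MAR 14", "MAY 14", "JUL 14", "SEP 14", "DEC 14", "MAR 15"]
SETTLES = ["426'6", "432'0", "437'2", "441'4", "447'0", "455'2"]


def build_sample_html():
    """Build a hypothetical table: two header rows, then seven-cell data rows."""
    header_rows = ("<tr><th colspan='7'>Daily Settlements</th></tr>"
                   "<tr>" + "<th>col</th>" * 7 + "</tr>")
    data_rows = ""
    for month, settle in zip(MONTHS, SETTLES):
        cells = [month, "-", "-", "-", "-", "-", settle]
        data_rows += "<tr>" + "".join("<td>%s</td>" % c for c in cells) + "</tr>"
    return ("<table id='DailySettlementTable'>"
            + header_rows + data_rows + "</table>")


def first_settlements(html, skip=2, count=5):
    """Return 'month:settlement' strings for `count` rows after `skip` rows."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="DailySettlementTable")
    results = []
    for row in table.find_all("tr")[skip:skip + count]:
        cells = row.find_all("td")
        if len(cells) < 7:
            continue  # no seventh column in this row, so skip it
        month = cells[0].get_text(strip=True)
        settle = cells[6].get_text(strip=True)
        results.append(month + ":" + settle)
    return results


lines = first_settlements(build_sample_html())
for line in lines:
    print(line)
```

Returning a list instead of printing inside the loop keeps the extraction separate from the display, so the same function can feed a file or another format later.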