Why is the Selenium XPath for scraping a table NOT matching, although the attribute given is unique?

Question

64 views Asked by mad max At 09 December 2024 at 16:51

I am trying to scrape the NASDAQ values from the www.n-tv.de website, crawling through the pages with Selenium. The stock values are presented in tables on the site.

The source code of such a table looks like this, for example:

<div class="tableholder">
  <table class="cnttable zebra to le">
    <thead>
      <tr>
        <th>Name</th><th class="ri">Kurs</th><th class="ri">%</th><th class="ri">Absolut</th><th class="ri hidden-xs-down">Relation</th><th class="ri hidden-xs-down">Zeit</th><th class="ri hidden-xs-down hidden-sm-down">Handelsvolumen</th><th class="hidden-xs-down hidden-sm-down">ISIN</th>
      </tr>
    </thead>
    <tbody>
      
      <tr class="linked" onclick="document.location='https://www.n-tv.de/boersenkurse/aktien/activision-blizzard-295693';">
        <td>Activision Blizzard</td>
        <td class="ri"><span class="icon_neg">66,53$</span></td>
        <td class="ri"><span class="neg">-1,42%</span></td>
        <td class="ri"><span class="neg">-0,96</span></td>
        <td class="relation hidden-xs-down"><span class="neg">&nbsp;<span><span></span></span><span style="border-width: 24px;"></span></span></td>
        <td class="ri hidden-xs-down">31.12.</td>
        <td class="ri hidden-xs-down hidden-sm-down">8 Tsd.</td>
        <td class="hidden-xs-down hidden-sm-down">US00507V1098</td>
      </tr>
  
      
      ...
  
    </tbody>
  </table>
</div>

So I do not understand the following problem:

I look up the web element of the NASDAQ table by XPath:

nasdaq = driver.find_element_by_xpath('//table[@class="cnttable zebra to le"]')
       
rows_nasdaq = nasdaq.find_elements_by_class_name('linked')

I have another solution that works correctly: it searches for the tableholder elements (there are 3 on this page), lists them, and takes only the third one. But I really want to understand why the XPath selector above does not match the element, although the class name I use as the attribute of the table tag is unique on this page.

I am not using CSS selectors or anything else. Does anyone have an idea why the XPath is not matching in this case?

There is 1 answer

**HedgeHog** · Accepted Answer · 2022-01-03T14:29:36+00:00

Assuming you want to scrape this URL: https://www.n-tv.de/boersenkurse/suche/?suchbegriff=to%20le

You have to wait until the element you are trying to find is present in the DOM; Selenium's waits can be used for this:

nasdaq = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//table[@class="cnttable zebra to le"]')))

The following imports are needed:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

Example:

....
driver.get('https://www.n-tv.de/boersenkurse/suche/?suchbegriff=to%20le')
nasdaq = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//table[@class="cnttable zebra to le"]')))

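# find_elements_by_class_name was removed in Selenium 4.3;
# find_elements(By.CLASS_NAME, ...) works in both Selenium 3 and 4.
for i in nasdaq.find_elements(By.CLASS_NAME, 'linked'):
    print(i.get_attribute('onclick'))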

Output

document.location='https://www.n-tv.de/boersenkurse/indizes/swx-sp-tra-leis-tr-303397';
document.location='https://www.n-tv.de/boersenkurse/aktien/apollo-tourism-+-leisure-1562996';
document.location='https://www.n-tv.de/boersenkurse/aktien/toqublanmonde--eo-047-11904326';
document.location='https://www.n-tv.de/boersenkurse/indizes/cb-p2p-onl-lend---digbanking-12533785';
document.location='https://www.n-tv.de/boersenkurse/indizes/concinngenddivwomin-leader-3254557';
document.location='https://www.n-tv.de/boersenkurse/indizes/concinnity-msos-leaders-39076931';
...
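
If the goal is the target URL rather than the raw onclick string, it can be pulled out with a regular expression. This is a minimal sketch, assuming every onclick value has the document.location='...'; shape shown in the output above. The helper name url_from_onclick is made up for illustration and is not part of Selenium:

```python
import re

# Matches the URL inside document.location='...'. The pattern assumes the
# single-quoted form seen in the output above.
_ONCLICK_URL = re.compile(r"document\.location\s*=\s*'([^']+)'")


def url_from_onclick(onclick):
    """Return the URL embedded in an onclick handler, or None if absent."""
    if not onclick:
        return None
    match = _ONCLICK_URL.search(onclick)
    return match.group(1) if match else None


sample = "document.location='https://www.n-tv.de/boersenkurse/aktien/activision-blizzard-295693';"
print(url_from_onclick(sample))
```

Inside the loop above, this would be called as url_from_onclick(i.get_attribute('onclick')).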

EDIT

Based on your comment I found the "link" issue: there is no such table under the URL https://www.n-tv.de/, but the NASDAQ page is linked at https://www.n-tv.de/boersenkurse/indizes/nasdaq-849974, and there I found your table.

So waiting is not strictly necessary here, but it can't hurt either. I imported the table directly into a DataFrame with pandas:

import pandas as pd
...
driver.get('https://www.n-tv.de/boersenkurse/indizes/nasdaq-849974')
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//table[@class="cnttable zebra to le"]')))

pd.read_html(driver.page_source)[3]

Output

Note: The Relation column is empty because no text is stored in it; you can simply drop it if you like.

Name	Kurs	%	Absolut	Relation	Zeit	Handelsvolumen	ISIN
Activision Blizzard	67,12$	-0,44%	-30	nan	18:05	4 Mio.	US00507V1098
Adobe	545,25$	-3,39%	-1912	nan	18:05	2 Mio.	US00724F1012
Advanced Micro Devices	141,89$	-5,55%	-834	nan	18:05	44 Mio.	US0079031078
Airbnb	167,86$	-2,79%	-481	nan	18:05	2 Mio.	US0090661010
Align Technology	629,44$	-2,87%	-1861	nan	18:02	178 Tsd.	US0162551016
...	...	...	...	...	...	...	...
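
The Absolut values above look off (-30, where the page markup in the question shows values like -0,96). A likely cause, stated here as an assumption rather than something verified against the page, is that the German decimal comma was treated as a thousands separator during parsing. Below is a small standard-library sketch for converting such strings by hand; parse_german_number is a hypothetical helper, not part of pandas or Selenium:

```python
def parse_german_number(text):
    """Convert strings such as '66,53$', '-1,42%' or '1.234,5' to float.

    Assumes German formatting: '.' groups thousands, ',' marks decimals.
    Trailing '$' or '%' signs and surrounding whitespace are discarded.
    """
    cleaned = text.strip().rstrip("$%").strip()
    cleaned = cleaned.replace(".", "").replace(",", ".")
    return float(cleaned)


for raw in ("66,53$", "-1,42%", "-0,96", "1.234,5"):
    print(raw, "->", parse_german_number(raw))
```

pandas' read_html also accepts decimal and thousands arguments, which may be the simpler route when the whole table is read at once.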
