Selenium element locator strategy to scrape text from table that updates data every 10 min

I am trying to scrape the text value of power load in Taiwan from a table that updates every 10 minutes.

Webpage: https://www.taipower.com.tw/tc/page.aspx?mid=206

I have tried many approaches, and all of them end in a "no such element: Unable to locate element" error, or in an equivalent failure when I use an explicit wait. Unfortunately, no API is available for recording the data.

Here is the HTML surrounding the desired value, "latest_load":

<tbody><tr>
        <td>        
              <div class="col-xs-5 labelx">目前用電量</div>
              <div class="col-md-5 col-xs-5"><h5><span id="latest_load">2,937.9</span></h5></div><p style="margin-top:4%; color:#A9B1EA;text-align: center;">萬瓩</p>

My end goal is to scrape the "latest_load" value, which is '2,937.9' in the snippet above.
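Because the value is displayed with a thousands separator, the scraped text needs one more step before it can be stored as a number. A minimal sketch, assuming the page always uses a comma as the thousands separator and a dot as the decimal mark; `parse_load` is a hypothetical helper name, not something from the page or from Selenium:

```python
def parse_load(text: str) -> float:
    """Convert a displayed load such as '2,937.9' to a float."""
    # Assumption: ',' is only ever a thousands separator and '.' the decimal mark.
    cleaned = text.strip().replace(",", "")
    if not cleaned:
        # An empty string usually means the span existed but had not been filled in yet.
        raise ValueError("empty load text; the value may not have been rendered yet")
    return float(cleaned)
```

For the snippet above, `parse_load("2,937.9")` gives `2937.9`.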

I am not certain whether:

a.) my overall approach and choice of libraries is wrong for this particular webpage, especially given that the page updates every 10 minutes;
b.) my particular strategy for selecting an element, or the way I execute it, is the source of the error; or
c.) a combination of a and b.
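On point (a), the 10-minute refresh probably matters more for how often the value is re-read than for whether the element can be located at all. A rough sketch of the recording side, assuming some `fetch` callable that returns the current text of "latest_load"; `record_changes`, `fetch`, and the 600-second default are hypothetical names and values chosen for illustration:

```python
import time
from typing import Callable, List, Optional


def record_changes(
    fetch: Callable[[], str],
    polls: int,
    interval_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call fetch() `polls` times and keep only readings that differ from the previous one."""
    readings: List[str] = []
    last: Optional[str] = None
    for i in range(polls):
        value = fetch()
        if value != last:
            readings.append(value)
            last = value
        # Sleep between polls, but not after the final one.
        if i < polls - 1:
            sleep(interval_s)
    return readings
```

Passing `sleep` as a parameter keeps the loop easy to try out without waiting 10 minutes between readings.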

Here is my code, including a number of different approaches I tried to select an element:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


driver = webdriver.Chrome()
driver.get("https://www.taipower.com.tw/tc/page.aspx?mid=206")
driver.maximize_window()
driver.execute_script("window.scrollBy(0,300)","")

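wait = WebDriverWait(driver, 30)
# Attempt 1: explicit wait using By.ID.
# Note: 'col-md-5 col-xs-5' is the div's pair of class names, not an id.
load = wait.until(EC.presence_of_element_located((By.ID, 'col-md-5 col-xs-5')))
# Attempt 2: explicit wait on the span's id via XPath
#load = wait.until(EC.presence_of_element_located((By.XPATH, "//span[@id='latest_load']")))
# Attempt 3: direct lookup by CSS selector
#load = driver.find_element(By.CSS_SELECTOR, '#latest_load')
# Attempt 4: direct lookup by XPath
#load = driver.find_element(By.XPATH, "//span[@id='latest_load']")
print(load.text)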

driver.close()
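One possibility worth ruling out, which has not been verified against the live page: when an id that is visible in the browser's inspector cannot be found by any locator, the element sometimes lives inside an iframe, which Selenium does not search unless the driver is switched into it. The sketch below looks for the id in the top-level document first and then in each first-level iframe. `find_in_page_or_frames` and `scrape_latest_load` are hypothetical helper names, and the presence of an iframe on this page is an assumption.

```python
def find_in_page_or_frames(driver, by, value):
    """Return the first match in the top-level document or in any first-level iframe, else None.

    When a match is found inside an iframe, the driver is left switched into that iframe
    so the returned element stays usable.
    """
    driver.switch_to.default_content()
    matches = driver.find_elements(by, value)
    if matches:
        return matches[0]
    # "tag name" is the same string as selenium's By.TAG_NAME constant.
    for frame in driver.find_elements("tag name", "iframe"):
        driver.switch_to.default_content()
        driver.switch_to.frame(frame)
        matches = driver.find_elements(by, value)
        if matches:
            return matches[0]
    driver.switch_to.default_content()
    return None


def scrape_latest_load(url="https://www.taipower.com.tw/tc/page.aspx?mid=206", timeout=30):
    """Open the page and return the non-empty text of #latest_load, or raise on timeout."""
    # Selenium is imported here so the helper above can be exercised without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    def load_text(d):
        element = find_in_page_or_frames(d, By.ID, "latest_load")
        # An empty string is falsy, so the wait keeps polling until the value is filled in.
        return element.text.strip() if element is not None else ""

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return WebDriverWait(driver, timeout).until(load_text)
    finally:
        driver.quit()
```

Calling `scrape_latest_load()` would open Chrome and wait up to 30 seconds for the value. If the helper still finds nothing, the iframe assumption is probably wrong for this page and the cause lies elsewhere.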