Getting text value of a HTML tag through Selenium Web Automation in Python?

1.4k views Asked by At

I am making a reddit bot that will look for certain attributes in comments, use selenium to visit the information website, and use driver.find_element_by... to get the value inside that tag, but it is not working.

When I use driver.find_element_by_class_name(), this is the data returned:

<selenium.webdriver.remote.webelement.WebElement (session="f454dcf92728b9db4de080a27a844bf7", element="514bd57d-99d7-4fce-a05d-3fa92f66ad49")>

when I use driver.find_elements_by_css_selector(".style-scope.ytd-video-renderer"), this is returned:

[
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2", element="6b4ee3e2-5e6b-48e2-8ec8-9083bf15baea")>, 
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2", ...
]

when I use driver.find_elements_by_css_selector(".style-scope.ytd-video-renderer").

Suppose that this is what I'm trying to locate (The above code returned the above Selenium data for this tag):

<yt-formatted-string class="style-scope ytd-video-renderer" aria-label="Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 by Melodic Star 2 months ago 4 minutes, 18 seconds 837,676 views">Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』</yt-formatted-string>

What I want

I want Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 returned.

What could I do?

2

There are 2 answers

7
undetected Selenium On BEST ANSWER

Seems you were pretty close enough. When you use driver.find_element_by_class_name() the first matching WebElement is returned. On printing the same, the output is:

<selenium.webdriver.remote.webelement.WebElement (session="f454dcf92728b9db4de080a27a844bf7", element="514bd57d-99d7-4fce-a05d-3fa92f66ad49")>

which represents the WebElement itself, which possibly contains the desired text.

On similar lines driver.find_elements_by_css_selector(".style-scope.ytd-video-renderer") returns a list of matching WebElements and on printing those, the output is:

[
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2", element="6b4ee3e2-5e6b-48e2-8ec8-9083bf15baea")>, 
  <selenium.webdriver.remote.webelement.WebElement (session="43cb953cde81df270260bf769fe081a2",
  ...
]

Solution

To extract the text Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 from the following HTML:

<yt-formatted-string class="style-scope ytd-video-renderer" aria-label="Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』 by Melodic Star 2 months ago 4 minutes, 18 seconds 837,676 views">Sword Art Online: Alicization Lycoris Opening Full『ReoNa - Scar/let』</yt-formatted-string>

You can use either of the following Locator Strategies:

  • Using css_selector and get_attribute():

    print(driver.find_element_by_css_selector("yt-formatted-string.style-scope.ytd-video-renderer").get_attribute("innerHTML"))
    
  • Using xpath and text attribute:

    print(driver.find_element_by_xpath("//yt-formatted-string[@class='style-scope ytd-video-renderer']").text)
    

Ideally, to print the text 3,862.76 you have to induce WebDriverWait for the visibility_of_element_located() and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and get_attribute():

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "yt-formatted-string.style-scope.ytd-video-renderer"))).get_attribute("innerHTML"))
    
  • Using XPATH and text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//yt-formatted-string[@class='style-scope ytd-video-renderer']"))).text)
    
  • Note : You have to add the following imports :

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python


Outro

Link to useful documentation:

0
Stroe Andrei On

Use .text:

element = driver.find_element_by_xpath('//*[@id="container"]/h1/yt-formatted-string')
print(element.text)