PYTHON - Unable To Find Xpath Using Selenium


I have been struggling with this for a while now. I have tried various ways of finding the XPath for the highlighted HTML below. I am trying to grab the dollar value listed under the highlighted <strong> tag.

[image: screenshot of the highlighted HTML]

Here is my last attempt:

try:
     price = browser.find_element_by_xpath(".//table[@role='presentation']")
     price.find_element_by_xpath(".//tbody")
     price.find_element_by_xpath(".//tr")
     price.find_element_by_xpath(".//td[@align='right']")
     price.find_element_by_xpath(".//strong")
     print(price.get_attribute("text"))
except:
     print("Unable to find element text")

I attempted to access the table and all nested elements but I am still unable to access the highlighted portion. Using .text and get_attribute('text') also does not work.

Is there another way of accessing the nested element? Or am I perhaps not using XPath properly? I have also tried the below:

 price = browser.find_element_by_xpath("/html/body/div[4]")

UPDATE: Here is the full code of the site. The site I am using is www.concursolutions.com, and I am attempting to automate booking a flight using Selenium. When I reach the end of the booking process and receive the price, I am unable to print it out based on the HTML. It may have something to do with the HTML being generated by JavaScript that executes as you proceed.

[image: screenshot of the site's full HTML]


There are 2 answers

Answer by The fourth bird (best answer)

Looking at the structure of the HTML, you could use this XPath expression:

//div[@id="gdsfarequote"]/center/table/tbody/tr[14]/td[2]/strong

Answer by Ian Lesperance

Making it work

There are a few things keeping your code from working.

  1. price.find_element_by_xpath(...) returns a new element.

    Each time, you're not saving it to use with your next query. Thus, when you finally ask it for its text, you're still asking the <table> element—not the <strong> element.

    Instead, you'll need to save each found element in order to use it as the scope for the next query:

    table = browser.find_element_by_xpath(".//table[@role='presentation']")
    tbody = table.find_element_by_xpath(".//tbody")
    tr = tbody.find_element_by_xpath(".//tr")
    td = tr.find_element_by_xpath(".//td[@align='right']")
    strong = td.find_element_by_xpath(".//strong")
    
  2. find_element_by_* returns the first matching element.

    This means your call to tbody.find_element_by_xpath(".//tr") will return the first <tr> element in the <tbody>.

    Instead, it looks like you want the third:

    tr = tbody.find_element_by_xpath(".//tr[3]")
    

    Note: XPath is 1-indexed.

  3. get_attribute(...) returns HTML element attributes.

    Therefore, get_attribute("text") will return the value of the text attribute on the element. Neither a <table> nor a <strong> normally has such an attribute, so the call returns None.

    To return the text content of the element, use element.text:

    strong.text
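
The three points above can be tried without a browser. The following is only a sketch: it uses the standard library's xml.etree.ElementTree instead of Selenium, and the markup, labels, and prices are made up for illustration (the real page's HTML is not shown in the question). It shows that each lookup returns a new element that has to be kept, that an unindexed ".//tr" yields only the first row, and that an attribute lookup is not the same as reading the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the actual page structure is an assumption.
HTML = """
<div id="gdsfarequote">
  <table role="presentation">
    <tbody>
      <tr><td>Flight</td><td align="right">AA 100</td></tr>
      <tr><td>Taxes</td><td align="right">$45.00</td></tr>
      <tr><td>Base Fare</td><td align="right"><strong>$312.00</strong></td></tr>
    </tbody>
  </table>
</div>
""".strip()

root = ET.fromstring(HTML)

# 1. Keep every result and use it as the scope of the next query.
table = root.find(".//table[@role='presentation']")
tbody = table.find(".//tbody")

# 2. Without an index only the first match comes back; XPath counts from 1.
first_row = tbody.find(".//tr")
third_row = tbody.find(".//tr[3]")

td = third_row.find(".//td[@align='right']")
strong = td.find(".//strong")

# 3. An attribute lookup is not the text content.
text_attribute = strong.get("text")  # no such attribute on <strong>
price = strong.text                  # the text inside <strong>

print(first_row[0].text, third_row[0].text, text_attribute, price)
```

One difference to keep in mind: ElementTree's .text holds only the text directly inside an element, whereas Selenium's .text returns the rendered text of the element together with its descendants.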
    

Cleaning it up

But even with the code working, there’s more that can be done to improve it.

  • You often don't need to specify every intermediate element.

    Unless there is some ambiguity that needs to be resolved, you can ignore the <tbody> and <td> elements entirely:

    table = browser.find_element_by_xpath(".//table[@role='presentation']")
    tr = table.find_element_by_xpath(".//tr[3]")
    strong = tr.find_element_by_xpath(".//strong")
    
  • XPath can be overkill.

    If you're just looking for an element by its tag name, you can avoid XPath entirely:

    strong = tr.find_element_by_tag_name("strong")
    
  • The fare row may change.

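    Instead of relying on a specific position, you can scope using a text search. Note the use of . rather than text(): the label sits inside a <td>, and text() would only consider text nodes directly under the <tr>:

    tr = table.find_element_by_xpath(".//tr[contains(., 'Base Fare')]")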
    
  • Other <table> elements may be added to the page.

    If the table had some header text, you could use the same text search approach as with the <tr>.

    In this case, it would probably be more meaningful to scope to the #gdsfarequote <div> rather than something as ambiguous as a <table>:

    farequote = browser.find_element_by_id("gdsfarequote")
    tr = farequote.find_element_by_xpath(".//tr[contains(., 'Base Fare')]")
    

But even better, capybara-py provides a nice wrapper on top of Selenium, helping to make this even simpler and clearer:

fare_quote = page.find("#gdsfarequote")
base_fare_row = fare_quote.find("tr", text="Base Fare")
base_fare = base_fare_row.find("strong").text
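
If you stay with plain Selenium, the cleaned-up steps above can be collapsed into one small helper. Recent Selenium 4 releases no longer provide the find_element_by_* helpers and expect find_element(by, value) instead, so this sketch uses that form. The function name read_base_fare is hypothetical, and the "Base Fare" label and the gdsfarequote id are taken from the answers above rather than verified against the live page. To keep the sketch importable without Selenium installed, it passes the locator strategies as the plain strings that By.ID, By.XPATH, and By.TAG_NAME stand for; in real code you would normally import By from selenium.webdriver.common.by.

```python
# Locator strategy strings; selenium.webdriver.common.by.By defines
# By.ID, By.XPATH and By.TAG_NAME with these same values.
BY_ID = "id"
BY_XPATH = "xpath"
BY_TAG_NAME = "tag name"


def read_base_fare(browser):
    """Return the text of the <strong> inside the 'Base Fare' row.

    `browser` is assumed to be a Selenium WebDriver (or any object with a
    compatible find_element(by, value) method).
    """
    # Scope to the fare quote container first, so other tables are ignored.
    farequote = browser.find_element(BY_ID, "gdsfarequote")
    # contains(., ...) matches against all text inside the row, including
    # text nested in its <td> cells.
    row = farequote.find_element(BY_XPATH, ".//tr[contains(., 'Base Fare')]")
    return row.find_element(BY_TAG_NAME, "strong").text
```

With a live session this would be called as print(read_base_fare(browser)) once the price page has finished loading.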