XPath expression fails to return correct elements in Selenium (Java)

I am trying to collect a list of product offers on Amazon. Specifically, I am going to the following URL: https://www.amazon.com/dp/[ASIN]/ref=olp-opf-redir?aod=1&ie=UTF8&condition=ALL where [ASIN] is the unique Amazon Standard Identification Number for the item in question. For this issue, assume the URL to be for these AirPods: https://www.amazon.com/dp/B09JQMJHXY/ref=olp-opf-redir?aod=1&ie=UTF8&condition=ALL

Notice that this URL opens a side panel with a listing of different vendors selling the item in different conditions (e.g. new, used, used like new, etc.).

I created an XPath expression to obtain some of these items. The basic XPath for this is //div[@id='aod-offer-list']/div[@id='aod-offer']. I further refined this to return a list of items that are shipped only from Amazon:

//div[@id='aod-offer-list']/div[@id='aod-offer' and div[@id='aod-offer-shipsFrom']/div/div/div/span[text()='Amazon']]

When I evaluate this expression in Chrome, I get the list of offers I am interested in. However, when I run it from Eclipse, I get a list of offers that consists of multiple copies of the pinned offer at the very top of the side panel. The bizarre thing is that the pinned offer (//div[@id='aod-pinned-offer']) is not even a child of the offer list (//div[@id='aod-offer-list']). In fact, the pinned offer and the offer list are siblings. Given these facts, how is it that I am getting a different WebElement list when executing in Java than when evaluating the same XPath directly in Chrome?

The relevant code:

public static void main(String[] args) {

    System.setProperty("webdriver.chrome.driver", "C:/Program Files/WebDrivers/chromedriver.exe");
    WebDriver driver = new ChromeDriver();
    driver.get("https://www.amazon.com/dp/B09JQMJHXY/ref=olp-opf-redir?aod=1&ie=UTF8&condition=ALL");
    
    List<WebElement> offers = new ArrayList<>();
    try {
//      merchants = driver.findElements(By.xpath(xpath));
        new WebDriverWait(driver, Duration.ofSeconds(10)).until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//div[@id='aod-offer-list']")));
        String xpath = "//div[@id='aod-offer-list']/div[@id='aod-offer' and div[@id='aod-offer-shipsFrom']/div/div/div/span[text()='Amazon']]";
        offers = new WebDriverWait(driver, Duration.ofSeconds(10)).until(ExpectedConditions.presenceOfAllElementsLocatedBy(By.xpath(xpath)));
        System.out.println("Found " + offers.size() + " offers.");
        Iterator<WebElement> iter = offers.iterator();
        while (iter.hasNext()) {
            String script = "return arguments[0].innerHTML";
            WebElement offer = iter.next();

            WebElement soldByElement = offer.findElement(By.xpath("//a[@aria-label='Opens a new page']"));
            String soldByText = (String) ((JavascriptExecutor) driver).executeScript(script, soldByElement);
            System.out.println("Sold by: " + soldByText);

            WebElement priceElement = offer.findElement(By.xpath("//span[@class='a-offscreen']"));
            String priceString = (String) ((JavascriptExecutor) driver).executeScript(script, priceElement);
            System.out.println("Price for item " + priceString);
        }
    } catch (TimeoutException toe) {
        System.err.println(toe);
    }
    driver.quit();
}

The output:

Found 4 offers.
Sold by:  Adorama
Price for item $174.00
Sold by:  Adorama
Price for item $174.00
Sold by:  Adorama
Price for item $174.00
Sold by:  Adorama
Price for item $174.00

The output should've been:

Found 2 offers.
Sold by:  Amazon Warehouse
Price for item $160.08
Sold by:  Amazon Warehouse
Price for item $165.30

The incorrect output pulls the price from the pinned offer and the "Sold by" value from one of the vendors that does not ship from Amazon. My unproven theory is that the paths to the "Sold by" and "Price" elements are resolved not relative to the offer element, but from the root of the DOM. I tried adding a dot (.) to the XPath string, but that is apparently not correct notation. I need to force Selenium to resolve the path starting from the obtained offer element.

UPDATE:

If I add the following snippet

String script = "return arguments[0].innerHTML";
WebElement offer = iter.next();
String offerElement = (String) ((JavascriptExecutor) driver).executeScript(script, offer);
System.out.println(offerElement);

it prints out the correct "innerHTML" for the list of offers. In other words, I can see all the correct elements if I use this XPath:

String xpath = "//div[@id='aod-offer-list']/div[@id='aod-offer']";

Trying with https://www.amazon.com/dp/B09R5VYRVN

If you click the "New & Used..." element, you will see the slide-in panel on the right. The elements listed in that panel are the ones that produce the issue mentioned in the original post.

There are 1 answers

Ahamed Abdul Rahman On

Sorry, your question is not entirely clear to me. Below is the requirement as I understood it, and I wrote my program based on that. If I have it wrong, please let me know and I will try to correct it:

You want to show the "Price" and "Sold by" values of those offers in the offer modal whose "Ships from" value is "Amazon".

Based on that, I wrote the following program, and it worked.

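import static org.openqa.selenium.support.ui.ExpectedConditions.elementToBeClickable;

import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.TimeoutException;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class AmazonStackOverflow {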
    static WebDriver driver;

    public static void main(String[] args) {

        driver = new ChromeDriver();
        driver.get("https://www.amazon.com/dp/B09R5VYRVN?th=1");

        //These lines change my delivery ZIP code to 10001 (a US code). The site works only for the US.
        findElement(By.id("nav-global-location-popover-link")).click();
        findElement(By.id("GLUXZipUpdateInput")).sendKeys("10001");
        findElement(By.xpath("//input[@aria-labelledby='GLUXZipUpdate-announce']")).click();
        findElement(By.cssSelector("div.a-popover-footer input#GLUXConfirmClose")).click();


        List<WebElement> offers = new ArrayList<>();
        try {
            //After the ZIP code is updated, the UI refreshes. So, wait for the element to disappear and then appear again.
            By offerDisplay = By.xpath("//span[@data-action='show-all-offers-display']");
            waitFor(ExpectedConditions.invisibilityOfElementLocated(offerDisplay));

            findElement(offerDisplay).click();
            offers = findElements(By.xpath("//div[@id='aod-offer-list']/div[@id='aod-offer'][.//div[@id='aod-offer-shipsFrom']//span[normalize-space()='Amazon.com']]"));
            System.out.println("Found " + offers.size() + " offers containing Ship From as Amazon");
            for (WebElement offer : offers) {
                String script = "return arguments[0].innerHTML";

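                //Note: this XPath selects the value next to the "Ships from" label, scoped to the current offer by the leading dot.
                WebElement soldByElement = offer.findElement(By.xpath(".//div[./span[text()='Ships from']]/following-sibling::div[1]/span"));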
                String soldByText = jsExecute(soldByElement,script);
                System.out.println("Sold by: " + soldByText);

                WebElement priceElement = offer.findElement(By.xpath(".//span[@class='a-offscreen']"));
                String priceString = jsExecute(priceElement,script);
                System.out.println("Price for item " + priceString);
            }
        } catch (TimeoutException toe) {
            System.err.println(toe);
        }
        driver.quit();
    }

    public static String jsExecute(WebElement element, String script) {
        return (String) ((JavascriptExecutor) driver).executeScript(script, element);
    }

    public static WebElement findElement(By by) {
        return waitFor(elementToBeClickable(by));
    }

    public static List<WebElement> findElements(By by) {
        return waitFor(ExpectedConditions.visibilityOfAllElementsLocatedBy(by));
    }

    public static<T> T waitFor(Function<WebDriver, T> customFunction) {
        try {
            return new WebDriverWait(driver, Duration.ofSeconds(10))
                    .until(customFunction);
        } catch (TimeoutException te) {
            return null;
        }
    }
}

Output:

Found 1 offers containing Ship From as Amazon
Sold by:  Amazon.com
Price for item $200.42

If we want to show all the offers, we need to scroll to the last element, wait for the list to reload, and then collect the elements. This is a time-consuming process. I will update the program if you want that.

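I added the dot (.) at the start of the inner XPath expressions so that they search only inside the offer element. Without the dot, an expression starting with // searches the whole page, even when it is called on an element.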
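The effect of the leading dot can be reproduced without a browser. The following is only a minimal sketch, not the Selenium code itself: it uses the JDK's built-in javax.xml.xpath engine as a stand-in (an assumption made so it runs anywhere), and the markup, the class name ScopedXPathDemo, and the `price` class attribute are hypothetical simplifications of the real page. The XPath rule is the same one that applies to offer.findElement(By.xpath(...)): an expression starting with // is resolved from the document root regardless of the context node, while .// is resolved from the context node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ScopedXPathDemo {

    // Hypothetical, simplified markup: a pinned offer that is a sibling of the offer list.
    static final String PAGE =
            "<body>"
          + "<div id='aod-pinned-offer'><span class='price'>$174.00</span></div>"
          + "<div id='aod-offer-list'>"
          + "<div id='aod-offer'><span class='price'>$160.08</span></div>"
          + "<div id='aod-offer'><span class='price'>$165.30</span></div>"
          + "</div>"
          + "</body>";

    // For each offer in the list, evaluate priceXPath with that offer as the
    // context node and collect the text of the first match in document order.
    static List<String> firstPricePerOffer(String priceXPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(PAGE)));
            XPath xp = XPathFactory.newInstance().newXPath();
            NodeList offers = (NodeList) xp.evaluate(
                    "//div[@id='aod-offer-list']/div[@id='aod-offer']",
                    doc, XPathConstants.NODESET);
            List<String> prices = new ArrayList<>();
            for (int i = 0; i < offers.getLength(); i++) {
                Node first = (Node) xp.evaluate(
                        priceXPath, offers.item(i), XPathConstants.NODE);
                prices.add(first.getTextContent());
            }
            return prices;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Without the dot: every offer yields the first price on the whole page.
        System.out.println(firstPricePerOffer("//span[@class='price']"));
        // prints [$174.00, $174.00]

        // With the dot: the search is limited to the current offer.
        System.out.println(firstPricePerOffer(".//span[@class='price']"));
        // prints [$160.08, $165.30]
    }
}
```

In this toy page the unscoped expression returns the pinned offer's price for every offer, which mirrors the repeated values in the question's output.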

//div[@id='aod-offer-list']/div[@id='aod-offer' and div[@id='aod-offer-shipsFrom']/div/div/div/span[text()='Amazon']]

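This XPath did not match anything for me (on the page I tested, the "Ships from" text is 'Amazon.com', not 'Amazon'). Maybe you missed a character while copying and pasting?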
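The program above matches the "Ships from" text with normalize-space()='Amazon.com' rather than text()='Amazon'. The sketch below shows how these predicates can behave differently. It again uses the JDK's javax.xml.xpath engine instead of Selenium, and the markup is hypothetical: it assumes the span text is 'Amazon.com' padded with whitespace, which may not be what the real page contains. The class name TextMatchDemo and the helper matches are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TextMatchDemo {

    // Hypothetical markup: the visible text is padded with whitespace.
    static final String OFFER =
            "<div id='aod-offer-shipsFrom'><span> Amazon.com </span></div>";

    // Returns true if at least one span inside the ships-from div
    // satisfies the given XPath predicate.
    static boolean matches(String predicate) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(OFFER)));
            XPath xp = XPathFactory.newInstance().newXPath();
            String expr = "boolean(//div[@id='aod-offer-shipsFrom']//span["
                    + predicate + "])";
            return (Boolean) xp.evaluate(expr, doc, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Exact comparison fails: the text is not literally 'Amazon'.
        System.out.println(matches("text()='Amazon'"));                       // false
        // Exact comparison also fails because of the surrounding whitespace.
        System.out.println(matches("text()='Amazon.com'"));                   // false
        // normalize-space() trims and collapses whitespace before comparing.
        System.out.println(matches("normalize-space()='Amazon.com'"));        // true
        // contains() tolerates both 'Amazon' and 'Amazon.com'.
        System.out.println(matches("contains(normalize-space(),'Amazon')"));  // true
    }
}
```

If the text really differs between sessions or regions, the contains() form is the more forgiving choice, at the cost of also matching other vendor names that include "Amazon".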