Webpage parsing in C# .Net with AngleSharp Results in Null


I'm trying to scrape some pages on walmart.com using AngleSharp, but my query returns nothing. I've used AngleSharp to scrape many sites in the past with no issues, but it doesn't work here.

For simplicity, here's one page: https://www.walmart.com/ip/50908276. I'm trying to get the price of the item (currently $9.99). In Chrome's console, when I type document.getElementsByClassName("Price-characteristic"), I get a list of 60 [span.Price-characteristic] results. Perfect. But when I try the same with AngleSharp, it returns none.

Here's my code:

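using System.Threading.Tasks;
using AngleSharp;
using AngleSharp.Dom;

// Returns Task rather than void so callers can await it and observe exceptions.
public async Task GetPrice()
{
    var config = Configuration.Default.WithDefaultLoader();
    string address = "https://www.walmart.com/ip/50908276";

    // Download and parse the page.
    IDocument document = await BrowsingContext.New(config).OpenAsync(address);

    // This collection comes back empty for this page.
    var priceDollar = document.GetElementsByClassName("Price-characteristic");
}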

I'm not too familiar with HTML so I apologize for any stark ignorance.

There is 1 answer

Answered by L.B

You can do this using HtmlAgilityPack and XPath:

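using System.Net.Http;
using System.Threading.Tasks;

public static async Task<string> GetPriceAsync()
{
    using (var client = new HttpClient())
    {
        // Send a browser-like User-Agent header with the request.
        client.DefaultRequestHeaders.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36");
        var html = await client.GetStringAsync("https://www.walmart.com/ip/50908276");
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);
        // Read the price from the first element that carries a data-product-price attribute.
        var price = doc.DocumentNode
                        .SelectSingleNode("//*[@data-product-price]")
                        .Attributes["data-product-price"]
                        .Value;
        return price;
    }
}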

This code returns the price as 9.99.
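
If you would rather stay with AngleSharp, the same attribute lookup can probably be done with a CSS attribute selector instead of XPath. The following is only a rough sketch: PriceSketch, ExtractPriceAsync and SampleHtml are hypothetical names, and the sample markup is made up, modeled only on the data-product-price attribute queried above. It parses an HTML string that has already been downloaded, so it runs without touching the network.

```csharp
using System;
using System.Threading.Tasks;
using AngleSharp;

public static class PriceSketch
{
    // Hypothetical markup: only the data-product-price attribute is taken
    // from the answer above; the real page is much larger.
    public const string SampleHtml =
        "<html><body><div data-product-price=\"9.99\">" +
        "<span class=\"Price-characteristic\">9</span></div></body></html>";

    // Parses an already-downloaded HTML string and returns the value of the
    // first data-product-price attribute, or null if no element has one.
    public static async Task<string> ExtractPriceAsync(string html)
    {
        var context = BrowsingContext.New(Configuration.Default);
        var document = await context.OpenAsync(req => req.Content(html));
        var node = document.QuerySelector("[data-product-price]");
        return node?.GetAttribute("data-product-price");
    }

    public static async Task Main()
    {
        // With the sample markup this prints 9.99.
        Console.WriteLine(await ExtractPriceAsync(SampleHtml));
    }
}
```

For the real page you would pass in the html string fetched by the HttpClient call, with the same User-Agent header, in place of SampleHtml. Whether the live markup still carries that attribute is something to check for yourself.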