document.DocumentNode.QuerySelectorAll does not return anything

I am trying to select the "div.prod_inner" elements on the following page (see image) for web scraping. I use HtmlAgilityPack, and the same code works on another site and also with "div.lay_main", which returns one element.

Screenshot of the page I am trying to scrape

This is the code I use (C#):

var productHTMLElements = document.QuerySelectorAll("div.prod_inner");

I also tried

var productHTMLElements = document.DocumentNode.QuerySelectorAll("div.prod_inner");

I also tried ".prod_inner" and "prod_inner", but neither works.

I expect the code to fill an array with all the products on the page that have the class "prod_inner", but the array remains empty.

I am completely new to web scraping. What am I doing wrong, and how do I do it right?

There is 1 answer

baynezy

You are not actually using HtmlAgilityPack (HAP) on its own; you are using a library called Hazz that extends HAP.

I am not familiar with that library, but using XPath should work.

doc.DocumentNode.SelectNodes(@"//div[contains(@class, ""prod_inner"")]");
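
For completeness, here is a minimal sketch of how the XPath call above might be wrapped so that "no match" is handled safely. The class name ProductScraper, the helper SelectDivsByClass, and the sample HTML are hypothetical names made up for illustration; only HtmlDocument, LoadHtml, DocumentNode and SelectNodes come from HtmlAgilityPack. Note that SelectNodes returns null rather than an empty collection when nothing matches, so the result has to be checked before it is used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ProductScraper
{
    // Hypothetical helper: returns every <div> whose class attribute
    // contains the given class name, or an empty list if none match.
    public static List<HtmlNode> SelectDivsByClass(string html, string className)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(
            $"//div[contains(@class, '{className}')]");

        return nodes?.ToList() ?? new List<HtmlNode>();
    }

    public static void Main()
    {
        // Made-up sample markup, not taken from the page in the question.
        const string html =
            "<div class='lay_main'>" +
            "<div class='prod_inner'>Product A</div>" +
            "<div class='prod_inner'>Product B</div>" +
            "</div>";

        foreach (var node in SelectDivsByClass(html, "prod_inner"))
        {
            Console.WriteLine(node.InnerText);
        }
    }
}
```

If this also returns nothing for the real page, one possible cause (an assumption, since the page source is not shown in the question) is that the product markup is added by JavaScript after the page loads, in which case it is not present in the HTML that gets downloaded. Checking whether the raw HTML string contains "prod_inner" at all would confirm or rule that out.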