Parsing through innerHTML with HtmlAgilityPack


I'm trying to figure out how to extract information from a node I have already selected.

foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div [@class=\"result-link\"]"))
{
    if (node == null)
        Console.WriteLine("debug");
    else
    {
        //string h_url = node.Attributes["a"].Value;
        Console.WriteLine(node.InnerHtml);
    }
}

You can kind of see what I am trying to do with the commented-out 'string h_url' declaration. Inside each "result-link" div there is an a element, and I am trying to grab the value of its href attribute, i.e. the link itself.

Can't seem to figure it out. I have tried using the Attributes array:

string h_url = node.Attributes["//a[@href]"].Value;

With no luck.

There is 1 answer

JLRishe (best answer)

You can use XPath to select elements relative to the current node:

HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class='result-link']");
if (nodes != null)
{
    foreach (HtmlNode node in nodes)
    {
        HtmlNode a = node.SelectSingleNode("a[@href]");
        if (a != null)
        {
            // The href attribute holds the link itself.
            string h_url = a.Attributes["href"].Value;
            Console.WriteLine(h_url);
        }

        // etc...
    }
}
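
As a fuller sketch of the same idea, the snippet below wraps the lookup in a small helper and runs it against some sample markup. The class name ResultLinks, the method ExtractLinks, and the sample HTML are all made up for illustration; only the HtmlAgilityPack calls come from the library. It also uses ".//a[@href]" rather than "a[@href]", on the assumption that the link might be nested deeper than a direct child of the div.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class ResultLinks
{
    // Hypothetical helper: collects the href of the first <a> inside each
    // <div class="result-link"> in the given HTML string.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var links = new List<string>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection divs =
            doc.DocumentNode.SelectNodes("//div[@class='result-link']");
        if (divs == null)
            return links;

        foreach (HtmlNode div in divs)
        {
            // ".//a[@href]" searches every descendant of this div;
            // "a[@href]" would match direct children only.
            HtmlNode a = div.SelectSingleNode(".//a[@href]");
            if (a != null)
                links.Add(a.GetAttributeValue("href", string.Empty));
        }

        return links;
    }

    public static void Main()
    {
        // Sample markup invented for this example.
        string html =
            "<div class='result-link'><a href='https://example.com/one'>One</a></div>" +
            "<div class='result-link'><span><a href='https://example.com/two'>Two</a></span></div>" +
            "<div class='other'><a href='https://example.com/skip'>Skip</a></div>";

        foreach (string url in ExtractLinks(html))
            Console.WriteLine(url);
    }
}
```

With this sample input, the helper should pick up the links in the two result-link divs and ignore the third. Note the leading dot in the relative XPath: a bare "//a[@href]" passed to node.SelectSingleNode would search from the document root instead of from the current node.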