How to remove nodes above and below somewhere in the document

Assuming I have an instance of HtmlNode pointing to a table, how can I remove all nodes above and below it? We can assume the table is at the same level as the html and body tags.

<html>
<body>
<p>please remove me</p>

<table>
....
</table>

<p>please remove me</p>
<a> ... </a>
.
<img>...</img>
</body>
</html>

Answer by har07 (best answer)

According to your HTML sample (and as is commonly the case), <table> is a child of <body>; they are not at the same level. Assuming that table is a variable of type HtmlNode pointing to the <table> element, you can do it this way:

// SelectNodes returns null (not an empty collection) when nothing matches.
var nodes = table.SelectNodes("following-sibling::*[1] | preceding-sibling::*[1]");
if (nodes != null)
{
    foreach (HtmlNode node in nodes)
    {
        node.Remove();
    }
}

Brief explanation of the XPath used:

  • following-sibling::*[1] : selects the nearest following sibling element, regardless of the element name.
  • preceding-sibling::*[1] : selects the nearest preceding sibling element, regardless of the element name.
  • | : the XPath union operator, which combines the results of the two expressions.
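
Note that the [1] predicate limits each axis to the single nearest sibling element. Since the question asks for all nodes above and below the table, a possible variation is to drop the predicate so that every sibling element is selected. The following is only a minimal sketch under that assumption: the helper name RemoveSiblingElements and the sample markup in Main are hypothetical and exist purely for illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class SiblingRemoval
{
    // Hypothetical helper: removes every sibling *element* of the target node.
    public static void RemoveSiblingElements(HtmlNode target)
    {
        // Without the [1] predicate, each axis matches all sibling elements.
        HtmlNodeCollection siblings =
            target.SelectNodes("following-sibling::* | preceding-sibling::*");

        // SelectNodes returns null when nothing matches, so guard against it.
        if (siblings == null)
        {
            return;
        }

        // The collection is a separate snapshot, so removing while iterating is safe.
        foreach (HtmlNode node in siblings)
        {
            node.Remove();
        }
    }

    public static void Main()
    {
        // Illustrative markup modeled on the question's sample.
        var doc = new HtmlDocument();
        doc.LoadHtml(
            "<html><body><p>please remove me</p>" +
            "<table><tr><td>keep</td></tr></table>" +
            "<p>please remove me</p><a href='#'>link</a></body></html>");

        HtmlNode table = doc.DocumentNode.SelectSingleNode("//table");
        RemoveSiblingElements(table);

        Console.WriteLine(doc.DocumentNode.OuterHtml);
    }
}
```

The * node test matches elements only, so text nodes (including whitespace between tags) and comments around the table are left in place. If those should go too, one option is to copy table.ParentNode.ChildNodes into a list first and call Remove() on every entry other than table.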