XPath expression to Select all labels not nested within a UL tag


I have a simple question about an XPath expression. My HTML looks like this, and I want to select ONLY the labels that are direct children of a DT.

<div class="product-options" id="product-options-wrapper">
<dl class="last">
<dt>
<ul>
<label class="required">Available Grip Sizes<em>*</em></label>
</ul>
<div><label>other label</label></div>
</dt> 

<dt>
<label>Would you like this racket restrung?</label>
</dt>

<dt>
<label>String Tension</label>
</dt>
</dl>
</div>

My XPath expression: .//div[@id='product-options-wrapper']//dt/label

I did try a [not(@class)] predicate, which would work in this particular scenario; however, I cannot use it in my project because I am using the same XPath on multiple documents.

So I want my query to be.. SELECT ALL LABELS EXCEPT NOT A CHILD OF UL / NESTED WITHIN A UL

Thank you ever so much

Also, can anybody recommend a good site for learning XPath queries/expressions in more depth?


There is 1 answer

Sergey Berezovskiy On

If 'all except not a child of ul' means 'all which are children of ul', then the XPath looks like

//div[@id='product-options-wrapper']//dt/ul//label

Getting these labels with HtmlAgilityPack will look like

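using HtmlAgilityPack;

// Load the HTML file and query it with XPath.
HtmlDocument doc = new HtmlDocument();
doc.Load(path_to_html_file);
string xpath = "//div[@id='product-options-wrapper']//dt/ul//label";
// By default, SelectNodes returns null (not an empty collection) when nothing matches.
var labels = doc.DocumentNode.SelectNodes(xpath);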

Result is

<label class="required">Available Grip Sizes<em>*</em></label>

UPDATE: After you completely changed the question, the solution is as follows (you actually already have the correct XPath for this requirement)

//div[@id='product-options-wrapper']//dt/label

Result is

<label>Would you like this racket restrung?</label>
<label>String Tension</label>
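
The question can be read two ways: "labels that are direct children of DT" or "labels anywhere under DT that are not nested inside a UL". A minimal sketch comparing both readings is below. It is an illustration rather than the approach used above: it assumes the sample markup is well-formed XML, so it parses it with the standard System.Xml.Linq classes instead of HtmlAgilityPack, and the names LabelXPathDemo, SampleHtml, DirectChildren, NotInsideUl and Select are hypothetical. The second expression uses the XPath 1.0 ancestor axis with not().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class LabelXPathDemo
{
    // Sample markup from the question. Attributes use single quotes only to
    // avoid escaping inside the C# string (an assumption for this sketch).
    public const string SampleHtml = @"<div class='product-options' id='product-options-wrapper'>
<dl class='last'>
<dt>
<ul>
<label class='required'>Available Grip Sizes<em>*</em></label>
</ul>
<div><label>other label</label></div>
</dt>
<dt>
<label>Would you like this racket restrung?</label>
</dt>
<dt>
<label>String Tension</label>
</dt>
</dl>
</div>";

    // Reading 1: labels that are direct children of a dt.
    public const string DirectChildren =
        "//div[@id='product-options-wrapper']//dt/label";

    // Reading 2: labels at any depth under a dt, excluding any with a ul ancestor.
    public const string NotInsideUl =
        "//div[@id='product-options-wrapper']//dt//label[not(ancestor::ul)]";

    // Parses the markup as XML and returns the text of each matching element.
    public static List<string> Select(string markup, string xpath)
    {
        XDocument doc = XDocument.Parse(markup);
        return doc.XPathSelectElements(xpath).Select(e => e.Value).ToList();
    }

    public static void Main()
    {
        foreach (string xpath in new[] { DirectChildren, NotInsideUl })
        {
            Console.WriteLine(xpath);
            foreach (string text in Select(SampleHtml, xpath))
                Console.WriteLine("  " + text);
        }
    }
}
```

With the sample markup, the first expression should match only the two direct-child labels, while the second should also match "other label", because that label sits inside a div rather than a ul. Neither depends on the class attribute, so both should carry over to documents where the class differs. The same expressions can be passed to SelectNodes in HtmlAgilityPack when the input is not well-formed XML.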