How to get tags that don't have a specific ancestor or descendant using REXML


I'd like to select the A tags and the B tag from the following XML, but exclude the second A tag (the one that wraps B):

......many other tags.
<A>abc</A>
<A>   <<==== I want to remove this A tag from the result.
  <B>def
    <A>foo</A>
    <A>hoge</A>
    <A>bar</A>
  </B>
 </A>
 .......

I'm using this XPath:

//*[self::A[not(descendant::B) or self::B]]

However, this XPath returns the A tags inside the B tag twice:

 <A>abc</A>
   <B>def
      <A>foo</A>
      <A>hoge</A>
      <A>bar</A>
   </B>
   <A>foo</A>
   <A>hoge</A>
   <A>bar</A>

Then I wrote this XPath, but it doesn't work:

//*[self::A[not(descendant::B or ancestor::B) or self::B]]

I want to get this result:

 <A>abc</A>
   <B>def
      <A>foo</A>
      <A>hoge</A>
      <A>bar</A>
   </B>

 .......

How can I solve this?

There is 1 answer

Answered by Andersson

Try the following XPath expression:

//*[self::A[not(./B) and not(./parent::B)] or self::B]

Output:

'<A>abc</A>'
'<B>def
    <A>foo</A>
    <A>hoge</A>
    <A>bar</A>
  </B>'

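self::A[not(./B) and not(./parent::B)] means an A element that has neither a B child nor a B parent. Note that this only checks direct children and the direct parent; if B can sit deeper in the tree, use not(descendant::B) and not(ancestor::B) instead.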
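
If the XPath engine in use does not evaluate the nested predicate as expected, the same rule can be applied in plain Ruby on top of REXML. The following is only a minimal sketch: the `<root>` wrapper and the helper names `b_ancestor?`, `b_descendant?` and `wanted?` are assumptions made for illustration, and it checks ancestors and descendants at any depth rather than only the direct parent and children.

```ruby
require "rexml/document"

# Sample input. The <root> wrapper is hypothetical; the question elides
# the surrounding tags.
xml = <<~XML
  <root>
    <A>abc</A>
    <A>
      <B>def
        <A>foo</A>
        <A>hoge</A>
        <A>bar</A>
      </B>
    </A>
  </root>
XML

# True if any ancestor of +el+ is a <B> element.
def b_ancestor?(el)
  node = el.parent
  while node
    return true if node.is_a?(REXML::Element) && node.name == "B"
    node = node.parent
  end
  false
end

# True if +el+ contains a <B> element at any depth.
def b_descendant?(el)
  !el.find_first_recursive { |d| d.name == "B" }.nil?
end

# Keep every <B>, and every <A> that is neither inside nor around a <B>.
def wanted?(el)
  case el.name
  when "B" then true
  when "A" then !b_ancestor?(el) && !b_descendant?(el)
  else false
  end
end

doc = REXML::Document.new(xml)

# Walk all elements below the root in document order and filter them.
selected = []
doc.root.each_recursive { |el| selected << el if wanted?(el) }

selected.each { |el| puts el }
```

Because the inner A elements are skipped as separate matches, they appear only once, as children of the selected B element.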