REXML fails to select from attribute. Bug or incorrect XPath?


I am trying to select an element from an SVG document by a particular attribute. I set up a simple example:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg">
  <g id='1'>
    <path id='2' type='A'/>
    <rect id='3' type='B'/>
  </g>
</svg>

Now I use the following syntax to retrieve the path element by its attribute "type":

require 'rexml/document'
include REXML
xmlfile = File.new "xml_as_specified_above.svg"
xmldoc = Document.new(xmlfile)
XPath.match( xmldoc.root, "//path[@type]" )

The syntax is taken directly from http://www.w3schools.com/xpath/xpath_syntax.asp. I would expect this expression to select the path element, but this is the result:

>> XPath.match( xmldoc.root, "//path[@type]" )
=> []

So, what is the correct XPath syntax to address the path element by its attribute? Or is there a bug in REXML (I am using 3.1.7.3)? Plus points for also retrieving the "rect" element.


There are 4 answers

1
mikej On BEST ANSWER

It looks like an older version of rexml is being picked up that doesn't support the full XPath spec.

Try checking the output of puts REXML::VERSION to ensure that 3.1.7.3 is displayed.

0
the Tin Man On

Many of us use Nokogiri these days instead of REXML or Hpricot, another early Ruby XML parser.

Nokogiri supports both XPath and CSS accessors, so you can use familiar HTML-style paths to get at nodes:

require 'nokogiri'

svg = %q{<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg">
  <g id='1'>
    <path id='2' type='A'/>
    <rect id='3' type='B'/>
  </g>
</svg>
}

doc = Nokogiri::XML(svg)
puts doc.search('//svg:path[@type]')
puts doc.search('svg|path[@type]')
puts doc.search('path[@type]')

puts doc.search('//svg:rect')
puts doc.search('//svg:rect[@type]')
puts doc.search('//svg:rect[@type="B"]')
puts doc.search('svg|rect')
puts doc.search('rect')

# >> <path id="2" type="A"/>
# >> <path id="2" type="A"/>
# >> <path id="2" type="A"/>

# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>
# >> <rect id="3" type="B"/>

The first path query is XPath with the namespace prefix, the second is CSS with a namespace, and the third is CSS without namespaces. Nokogiri, being friendly to humans, lets us deal with or dispense with namespaces in a couple of ways, assuming we are aware of why namespaces are good.

3
Martin Honnen On

You need to take the default namespace into account. With XPath 1.0 you need to bind a prefix (e.g. svg) to the namespace URI http://www.w3.org/2000/svg and then use a path like //svg:path[@type]. How you bind a prefix to a URI for XPath evaluation depends on the XPath API you use. I am afraid I don't know how that is done with your Ruby API; if you can't find a method or property for it in the API documentation, maybe someone else will come along later to tell us.
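
For REXML specifically, the binding step might look like the sketch below. It assumes the namespaces hash that REXML::XPath.match accepts as its third argument. The prefix "s", the constants, and the helper find_typed_paths are made up for illustration and are not part of REXML.

```ruby
require 'rexml/document'

# Namespace URI declared as the default namespace in the question's document.
SVG_NS = 'http://www.w3.org/2000/svg'

# The sample document from the question, inlined so the sketch is self-contained.
SVG_SOURCE = <<~XML
  <?xml version="1.0" encoding="UTF-8" standalone="no"?>
  <svg xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg">
    <g id='1'>
      <path id='2' type='A'/>
      <rect id='3' type='B'/>
    </g>
  </svg>
XML

# Hypothetical helper: bind the prefix "s" to the SVG namespace URI and
# select every path element that carries a type attribute.
def find_typed_paths(doc)
  REXML::XPath.match(doc, '//s:path[@type]', { 's' => SVG_NS })
end

doc = REXML::Document.new(SVG_SOURCE)
find_typed_paths(doc).each { |el| puts el.attributes['id'] }
```

The prefix used in the expression only has to match a key in the hash; it does not need to be the same prefix the document itself declares.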

0
Dimitre Novatchev On

This is the most frequently asked XPath question: the default namespace issue.

Solution:

Instead of:

//path[@type]

use

//svg:path[@type]
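
To cover the "rect" part of the question as well, the same prefixed form can be applied to each element name. Here is a rough sketch with REXML. It passes an explicit prefix binding so the result does not depend on how a given REXML version resolves prefixes. The constants and the helper typed_elements are illustrative names, not part of REXML.

```ruby
require 'rexml/document'

SVG_URI = 'http://www.w3.org/2000/svg'

# Trimmed copy of the question's document.
SAMPLE_SVG = <<~XML
  <svg xmlns:svg="http://www.w3.org/2000/svg" xmlns="http://www.w3.org/2000/svg">
    <g id='1'>
      <path id='2' type='A'/>
      <rect id='3' type='B'/>
    </g>
  </svg>
XML

# Hypothetical helper: for each element name, run //svg:NAME[@type] with the
# svg prefix bound to the SVG namespace, and collect the matches in order.
def typed_elements(doc, names)
  bindings = { 'svg' => SVG_URI }
  names.flat_map do |name|
    REXML::XPath.match(doc, "//svg:#{name}[@type]", bindings)
  end
end

doc = REXML::Document.new(SAMPLE_SVG)
typed_elements(doc, %w[path rect]).each do |el|
  puts "#{el.name} id=#{el.attributes['id']} type=#{el.attributes['type']}"
end
```

A predicate on the attribute value, such as //svg:rect[@type='B'], narrows the match to a single element in the same way.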