How to add spaces between nodes when using string() on a tree in XPath


I have an HTML tree, and I call the XPath string() function on the root to get all the text from its nodes.

However, I'd like to add a space between the text of each node.

For example:

string() on '<root><div>abc</div><div>def</div></root>' currently returns 'abcdef'

string() on '<root><div>abc</div><div>def</div></root>' should become 'abc def '

There are 3 answers

Birei (best answer)

You can try lxml's itertext() method, which iterates over all the text content of an element and its descendants:

from lxml import etree

root = etree.XML('<root><div>abc</div><div>def</div></root>')
# itertext() yields every text node in document order; join them with spaces
print(' '.join(root.itertext()))

It yields:

abc def
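
The one-liner above joins every text node, including the whitespace-only ones that appear when the markup is indented. Below is a minimal sketch that drops those blanks. It assumes the standard library's xml.etree.ElementTree (whose elements also provide itertext()) in place of lxml, and spaced_text is a hypothetical helper name, not part of either library.

```python
import xml.etree.ElementTree as ET

def spaced_text(element, sep=" "):
    """Join the non-blank text nodes under `element` with `sep` (hypothetical helper)."""
    # Strip each text node, then discard the ones that were whitespace only.
    parts = (chunk.strip() for chunk in element.itertext())
    return sep.join(part for part in parts if part)

# Indented input with mixed content in the second div.
root = ET.fromstring("<root>\n  <div>abc</div>\n  <div>d<b>e</b>f</div>\n</root>")
print(spaced_text(root))
```

Note that this works at the text-node level, so mixed content such as d<b>e</b>f is split into separate words. Whether that is desirable depends on the output you want.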

Michael Kay

It's not clear what output you want when the XML is more complex than shown, or when it involves mixed content. In XSLT 1.0 you'll have to do a recursive descent of the tree, involving something like:

<xsl:template match="div">
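  <!-- whitespace-only text in a stylesheet is stripped, so the space must be wrapped in xsl:text -->
  <xsl:if test="position() != 1"><xsl:text> </xsl:text></xsl:if>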
  <xsl:value-of select="."/>
</xsl:template>

michael.hor257k

'<root><div>abc</div><div>def</div></root>' should become 'abc def '

In XSLT 1.0, this would be done as:

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/root">
    <xsl:for-each select="div">
        <xsl:value-of select="."/>
        <xsl:text> </xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
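
For readers working in Python rather than XSLT, the same per-div logic can be sketched as follows. This is an illustration only: it assumes the standard library's xml.etree.ElementTree, and join_children is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def join_children(parent, tag="div", sep=" "):
    """Concatenate the string value of each direct `tag` child, each followed by `sep`."""
    # "".join(child.itertext()) plays the role of <xsl:value-of select="."/>.
    return "".join("".join(child.itertext()) + sep for child in parent.findall(tag))

root = ET.fromstring("<root><div>abc</div><div>d<b>e</b>f</div></root>")
print(repr(join_children(root)))
```

Because the separator is added per div rather than per text node, mixed content inside a div stays together, and the result keeps the trailing space requested in the question.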

Or perhaps you meant to retrieve all text nodes, regardless of the document structure. This could be done by:

<xsl:template match="/">
    <xsl:for-each select="//text()">
        <xsl:value-of select="."/>
        <xsl:text> </xsl:text>
    </xsl:for-each>
</xsl:template>