How can I write a whitespace-only text node in XML output of XSLT


We have a messaging pipeline which includes XML-to-XML transforms.

For a source document like this (which may also arrive on a single line, without formatting):

<doc>
  <a>Foo</a>
  <b>Bar1</b>
  <b>Bar2</b>
  <b>Bar3</b>
  <c>Baz</c>
</doc>

I need the XML output of the transform to be (note the line breaks):

<x>Bar1
Bar2
Bar3</x>

But the output I'm getting is:

<x>Bar1Bar2Bar3</x>

The stylesheet looks like this:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" method="xml" version="1.0" />

  <xsl:template match="/">
    <x>
      <xsl:for-each select="//b">
        <xsl:value-of select="." />
        <xsl:if test="position() != last()">
          <xsl:text>&#xD;&#xA;</xsl:text>  <!-- something wrong here? -->
        </xsl:if>
      </xsl:for-each>
    </x>
  </xsl:template>
</xsl:stylesheet>

If I add a non-whitespace character to the text node, the line break is preserved. So, if I change the xsl:text node to this (note the added hyphen):

<xsl:text>-&#xD;&#xA;</xsl:text>

then I get the output:

<x>Bar1-
Bar2-
Bar3</x>

How can I generate the desired output?

Note that we're limited to XSLT 1.0.

Update

I've done some more testing. Below is the full code to reproduce the issue. Interestingly, this code reproduces the issue when run under .NET Framework 4.5 and .NET Core 2.1, but it gives the desired output when run under Mono.

using System;
using System.IO;
using System.Reflection;
using System.Text;
using System.Xml;
using System.Xml.Xsl;

namespace xslt
{
    class Program
    {
        static void Main(string[] args)
        {
            var doc = new XmlDocument();
            doc.LoadXml(@"<doc><a>Foo</a><b>Bar1</b><b>Bar2</b><b>Bar3</b><c>Baz</c></doc>");

            var xsl = new XmlDocument();
            xsl.LoadXml(@"<?xml version='1.0' encoding='utf-8'?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:output omit-xml-declaration='yes' method='xml' version='1.0' />

    <xsl:template match='/'>
        <x>
        <xsl:for-each select='//b'>
            <xsl:value-of select='.' />
            <xsl:if test='position() != last()'>
                <xsl:text>&#xD;&#xA;</xsl:text>  <!-- something wrong here? -->
            </xsl:if>
        </xsl:for-each>
        </x>
    </xsl:template>
</xsl:stylesheet>");

            var xslt = new XslCompiledTransform();
            xslt.Load(xsl);
            
            using (var stream = new MemoryStream())
            {
                xslt.Transform(doc, null, stream);
                Console.WriteLine(Encoding.UTF8.GetString(stream.ToArray()));
            }
        }
    }
}

There are 2 answers

zx485

How can I preserve whitespace-only text node in XML output of XSLT

If you really want to preserve the text() nodes between the b elements, you can match them with the XPath expression

text()[preceding::*[1][self::b]][following::*[1][self::b]]

and copy their whole content with an xsl:copy-of. The whole set of templates could look like this:

<xsl:template match="/doc">
    <x>
      <xsl:apply-templates select="node()|@*" />
    </x>
</xsl:template>

<xsl:template match="b">
      <xsl:value-of select="." />
</xsl:template>  

<xsl:template match="text()" />

<xsl:template match="text()[preceding::*[1][self::b]][following::*[1][self::b]]">
    <xsl:copy-of select="." />
</xsl:template>

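This also copies the indentation between the b elements, not only the newlines, and it only helps when the source document actually contains whitespace between them (it has nothing to copy if the source is on a single line). For the formatted source above, the output looks like

<x>Bar1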
  Bar2
  Bar3</x>

Andrew Cooper

I was able to get this working by adding a script block to the stylesheet to build the newline-separated value.

I'm still interested to know if it's possible with pure XSL.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                xmlns:userCSharp="http://schemas.microsoft.com/BizTalk/2003/userCSharp">
  <xsl:output omit-xml-declaration="yes" method="xml" version="1.0" />

  <xsl:template match="/">
    <x>
      <xsl:value-of select='userCSharp:JoinLines(//b)' />
    </x>
  </xsl:template>

  <msxsl:script language="C#" implements-prefix="userCSharp">
    <![CDATA[

public string JoinLines(XPathNodeIterator nodes)
{
  var builder = new StringBuilder();
  while (nodes.MoveNext())
  {
    builder.AppendLine(nodes.Current.Value);
  }
  return builder.ToString().Trim();
}

    ]]>
  </msxsl:script>
</xsl:stylesheet>
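
On the pure-XSL question, here is a minimal sketch that may be worth trying. It rests on an assumption that is not confirmed anywhere in this thread: that the whitespace-only xsl:text node is being discarded before the transform runs, which is plausible here because XmlDocument has PreserveWhitespace set to false by default. The sketch keeps the original stylesheet and only changes how the line break is produced, passing it as an XPath string literal so that it is never a whitespace-only text node in the stylesheet.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output omit-xml-declaration="yes" method="xml" version="1.0" />

  <xsl:template match="/">
    <x>
      <xsl:for-each select="//b">
        <xsl:value-of select="." />
        <xsl:if test="position() != last()">
          <!-- Assumption: whitespace-only text nodes in the stylesheet are being
               stripped. Here the CR/LF pair is a string literal inside the select
               attribute (character references are kept as-is in attribute values),
               so there is no whitespace-only text node to strip. -->
          <xsl:value-of select="'&#xD;&#xA;'" />
        </xsl:if>
      </xsl:for-each>
    </x>
  </xsl:template>
</xsl:stylesheet>
```

This stays within XSLT 1.0 and needs no script block. If the stripping does turn out to happen while the stylesheet is loaded into the XmlDocument, setting PreserveWhitespace to true on that document before calling LoadXml would be another thing to test; neither variant has been verified against the .NET versions mentioned in the question.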