I have an XML file structured like this:
<Chem>
  <Formula> CO{2} </Formula>
  <Name> Carbon Dioxide </Name>
</Chem>
How would I use XSLT to format the number (or all numbers in curly brackets) to subscript?
In XSLT 2.0 or higher, you could convert all digit characters within curly braces to subscript using:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Formula">
    <xsl:copy>
      <xsl:analyze-string select="." regex="\{{(\d+)\}}">
        <xsl:matching-substring>
          <xsl:value-of select="translate(regex-group(1), '0123456789', '₀₁₂₃₄₅₆₇₈₉')"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:value-of select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
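Note the double escaping of the curly braces: first, they are doubled because the regex attribute is an attribute value template (AVT), in which single braces would delimit an expression; next, each is preceded by \ so that the regex treats it as a literal character. The regex actually applied is therefore \{(\d+)\}.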
The result using your example input:
<?xml version="1.0" encoding="UTF-8"?>
<Chem>
  <Formula> CO₂ </Formula>
  <Name> Carbon Dioxide </Name>
</Chem>
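If you want to sanity-check the pattern outside an XSLT processor, the same logic can be sketched in Python. This is only an illustration, not part of the stylesheet: it assumes the effective regex after AVT unescaping is \{(\d+)\}, and the function name to_subscript is made up for the example.

```python
import re

# Maps ASCII digits to Unicode subscript digits, mirroring the XSLT translate() call.
SUBSCRIPTS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

# Effective regex once the doubled braces of the AVT collapse to single ones.
BRACED_DIGITS = re.compile(r"\{(\d+)\}")


def to_subscript(formula: str) -> str:
    """Replace every {digits} group with subscript digits; leave other text as-is."""
    return BRACED_DIGITS.sub(lambda m: m.group(1).translate(SUBSCRIPTS), formula)


if __name__ == "__main__":
    print(to_subscript(" CO{2} "))
    print(to_subscript("C{6}H{12}O{6}"))
```

As in the stylesheet, digits outside curly braces are left alone, so an input such as H2O passes through unchanged.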