I've been working on CSV to XML conversions using this style-sheet: XSLT 2.0 to convert CSV to XML format
I needed to account for both comma and pipe delimiters so I changed the style sheet to this:
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>
    <xsl:param name="csv-uri" as="xs:string" select="'file:///c:/test.csv'"/>
    <xsl:template match="/" name="csv2xml">
        <Entity>
            <Rows>
                <xsl:choose>
                    <xsl:when test="unparsed-text-available($csv-uri)">
                        <xsl:variable name="csv" select="unparsed-text($csv-uri)" />
                        <xsl:variable name="pipe" select="'\|'"/>
                        <xsl:analyze-string  select="replace($csv,$pipe,',')" regex='\r\n?|\n' >
                            <xsl:non-matching-substring>
                                <xsl:if test="not(position()=0)" >
                                    <Row>
                                        <xsl:for-each select="tokenize(.,',')" >
                                            <xsl:element name="Column_{position()}">
                                                <xsl:value-of select="."/>
                                            </xsl:element>
                                        </xsl:for-each>
                                    </Row>
                                </xsl:if>
                            </xsl:non-matching-substring>
                        </xsl:analyze-string>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:variable name="error">
                            <xsl:text>Error reading file: "</xsl:text>
                            <xsl:value-of select="$csv-uri"/>
                        </xsl:variable>
                        <xsl:message>
                            <xsl:value-of select="$error"/>
                        </xsl:message>
                        <xsl:value-of select="$error"/>
                    </xsl:otherwise>
                </xsl:choose>
            </Rows>
        </Entity>
    </xsl:template>
</xsl:stylesheet>
However, I've been having some difficulty accounting for quoted values from the input.
Currently if a row in the csv is:
   1,2,3,4,5,"testing,1,1,1"
   red,white,blue,green
   dogs|cats|rabbits
the "testing,1,1,1" gets split into 4 columns into the CSV instead of one column.
output:
    <?xml version="1.0" encoding="UTF-8"?>
    <Entity>
       <Rows>
          <Row>
             <Column_1>1</Column_1>
             <Column_2>2</Column_2>
             <Column_3>3</Column_3>
             <Column_4>4</Column_4>
             <Column_5>5</Column_5>
             <Column_6>"testing</Column_6>
             <Column_7>1</Column_7>
             <Column_8>1</Column_8>
             <Column_9>1"</Column_9>
          </Row>
          <Row>
             <Column_1>red</Column_1>
             <Column_2>white</Column_2>
             <Column_3>blue</Column_3>
             <Column_4>green</Column_4>
          </Row>
          <Row>
             <Column_1>dogs</Column_1>
             <Column_2>cats</Column_2>
             <Column_3>rabbits</Column_3>
          </Row>
       </Rows>
    </Entity>
I've done some research and using regex='("[^"]*")+' can accomplish this. But I'm not entirely sure how to implement this without removing something I have that I need (maybe probably in the Analyze-string block?). I need help please! Its probably something simple, so please school me or point me in the right direction. Any advice would be helpful.
 
                        
You can just add another
xsl:analyze-stringto process thexsl:non-matching-substringfrom the firstxsl:analyze-string...CSV Input
Modified XSLT 2.0
XML Output
The regex from the second
xsl:analyze-stringis using captured substrings to ignore the quotes and the comma. Here's an easier to read version:If you want to keep the quotes, move them inside the parens: