How to transform and filter MARC21 XML into CSV?


I have an XML file in MARC21 format that looks like this:

(It is reduced to the necessary tags. The real file has many records, of which I show just one.)

<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
<record>
  <controlfield tag="001">EntryID01</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="0">PubID01</subfield>
    <subfield code="a">Lastname01, Firstname01</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
    <subfield code="0">PubID02</subfield>
    <subfield code="a">Lastname02, Firstname02</subfield>
  </datafield>
  <datafield tag="700" ind1="1" ind2=" ">
    <subfield code="a">Lastname03, Firstname03</subfield>
  </datafield>
</record>
</collection>

I would like to get a CSV with the following column order and output:

Lastname01, Firstname01 | PubID01 | EntryID01
Lastname02, Firstname02 | PubID02 | EntryID01
Lastname03, Firstname03 | NOPUBID | EntryID01

Not all datafields carry a PubID (subfield code="0"). If no PubID is given, "NOPUBID" should be written instead.

My attempt used xsltproc with the following XSL. (I was trying to get the data from tag=100 first, but I also need it from tag=700.)

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:marc="http://www.loc.gov/MARC21/slim"
xmlns="http://www.loc.gov/MARC21/slim"
exclude-result-prefixes="marc">
<xsl:output method="text" encoding="UTF-8" />
<xsl:strip-space elements="*"/>
    <xsl:template match="marc:controlfield[@tag=001]">
            <xsl:value-of select="controlfield"/>
    </xsl:template>
    <xsl:template match="marc:datafield[@tag=100]">
        <xsl:for-each select="record/datafield">
            <xsl:value-of select="controlfield"/>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

But I failed desperately. Any help is appreciated.

michael.hor257k On BEST ANSWER

I would do:

XSLT 1.0

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
xmlns:marc="http://www.loc.gov/MARC21/slim">
<xsl:output method="text" encoding="UTF-8"/>

<xsl:template match="/marc:collection">
    <xsl:for-each select="marc:record">
        <xsl:variable name="control" select="marc:controlfield[@tag='001']" />
        <xsl:for-each select="marc:datafield[@tag='100' or @tag='700']">
            <xsl:value-of select="marc:subfield[@code='a']"/>
            <xsl:text> | </xsl:text>
            <xsl:variable name="pub" select="marc:subfield[@code='0']"/>
            <xsl:choose>
                <xsl:when test="$pub">
                    <xsl:value-of select="$pub"/>
                </xsl:when>
                <xsl:otherwise>NOPUBID</xsl:otherwise>
            </xsl:choose>
            <xsl:text> | </xsl:text>
            <xsl:value-of select="$control"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
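
For comparison with the template-based attempt in the question, the same output could probably also be produced in "push" style, with one template per author field. This is only a sketch under the assumption that the input looks like the sample above; it is not part of the original answer. Two things differ from the question's stylesheet: every element name in a select or match expression carries the marc: prefix (in XPath 1.0 an unprefixed name such as controlfield refers to an element in no namespace, so it selects nothing here, even when the stylesheet declares a default xmlns), and the root template applies templates only to the wanted datafields, so the built-in templates never copy stray text to the output.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:marc="http://www.loc.gov/MARC21/slim">
<xsl:output method="text" encoding="UTF-8"/>

<!-- Visit only the author fields (100 and 700) of every record, in document order -->
<xsl:template match="/">
    <xsl:apply-templates select="marc:collection/marc:record/marc:datafield[@tag='100' or @tag='700']"/>
</xsl:template>

<!-- Write one output line per selected datafield -->
<xsl:template match="marc:datafield">
    <xsl:variable name="pub" select="marc:subfield[@code='0']"/>
    <xsl:value-of select="marc:subfield[@code='a']"/>
    <xsl:text> | </xsl:text>
    <xsl:choose>
        <!-- An empty node-set tests as false, so a missing subfield 0 falls through -->
        <xsl:when test="$pub">
            <xsl:value-of select="$pub"/>
        </xsl:when>
        <xsl:otherwise>NOPUBID</xsl:otherwise>
    </xsl:choose>
    <xsl:text> | </xsl:text>
    <!-- Step up to the enclosing record to read its control number -->
    <xsl:value-of select="../marc:controlfield[@tag='001']"/>
    <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

Whether to prefer this over the nested xsl:for-each version is mostly a matter of taste. The template form may be easier to extend if other fields later need their own output format.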