XQuery - replace empty element with a text value (on output to HTML)


I have nodes like these across a collection of xml:tei documents:

[...]
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" ana="#pAdo #pAud" role="par">Willelmum de Canast-Brus</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" ana="#pAdo #pAud #pPax" role="par">Willelmum de Canast-Brus</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" role="own">Willelmi de Canast-Brus</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" ana="#pAdo" role="par">W<supplied reason="expname">illelmum</supplied> de Canast</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" role="own">Willelmi<lb break="y" n="20"/>de Canast Brus</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" ana="#pAdo #pAud #pPax" role="par">Willelmum de<lb break="y" n="22"/>Canast</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" ana="#pAdo" role="par">Willelmum de Canast Brus</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" role="own">Willelmi de Canast-Brus</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" ana="#pAud #pAdo" role="par">W<supplied reason="expname">illelmum</supplied> de Canast</persName>
<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" ana="#nAdo" role="par">W<supplied reason="expname">illelmum</supplied> de Canast Bru</persName>
[...]

The following query in XQuery 3.1:

let $a :=
  <div>{
    let $x := functx:remove-elements-deep(
                collection($coll)//tei:persName[@nymRef="#Guilhem_Canast-Brus_MSP-AU"][text()],
                ("supplied","corr","del"))
    for $y in $x
    let $z := normalize-space(string-join(replace($y,",","")))
    group by $z
    order by $z ascending
    return <span>{$z}</span>
  }</div>
return $a

It returns the following HTML, with a number of descendant elements (e.g. supplied, corr) removed using functx:remove-elements-deep:

<div>
  <span>R de Canast</span>
  <span>W</span>
  <span>W Bru</span>
  <span>W Bru de Canast</span>
  <span>W Canast Bru</span>
  <span>W de Canast</span>
  <span>W de Canast Bru</span>
  <span>W de Canast Brus</span>
  <span>W de Canast qui dicitur Lo Brus</span>
  <span>W de Canast- Bru</span>
  <span>W de Canast-Bru</span>
  <span>W de Canast-Brus</span>
  <span>W de CanastBru</span>
  <span>W de CanastBrus</span>
  <span>Willelmi</span>
  <span>Willelmi Canast-Bru</span>
  <span>Willelmi de Canast</span>
  <span>Willelmi de Canast Bru</span>
  <span>Willelmi de Canast Brus</span>
  <span>Willelmi de Canast iunioris</span>
  <span>Willelmi de Canast qui dicitur Brus</span>
  <span>Willelmi de Canast-Brus</span>
  <span>Willelmi de CanastBru</span>
  <span>Willelmi de Canastle Bru</span>
  <span>Willelmide Canast Brus</span>
  <span>Willelmide Canast-Brus</span>
  <span>Willelmo de Canast</span>
  <span>Willelmum de Canast</span>
  <span>Willelmum de Canast Brus</span>
  <span>Willelmum de Canast-Brus</span>
  <span>Willelmum deCanast</span>
  <span>Willelmus de Canast</span>
</div>

However, there are several (empty) elements that I would like to replace with a string: for example, lb[@break="y"] with " " and gap with "[ ]", as in this case:

<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" role="own">Willelmi<lb break="y" n="20"/>de Canast Brus</persName>

I was looking at functx:replace-element-values, but I could not work out how to integrate it.

Many thanks for any assistance.

1 answer

jbrehr

Based on the approach in "replace value of element by xquery", I managed to come up with a hack:

declare namespace local = "http://example.org";

(: Recursively copy an element, swapping selected empty elements for strings.
   The *: wildcard matches lb and gap whether or not they are in the TEI namespace. :)
declare function local:copy-replace($element as element()) as item()* {
  if ($element/self::*:lb[@break eq "y"])
  then " "
  else if ($element/self::*:gap)
  then "[  ]"
  else element {node-name($element)}
           {$element/@*,
            for $child in $element/node()
            return if ($child instance of element())
                   then local:copy-replace($child)
                   else $child
           }
};

local:copy-replace(<persName  nymRef="#Guilhem_Canast-Brus_MSP-AU" role="own">Willelmi<lb break="y" n="20"/>de Can<gap/>t Brus</persName>)

Returns:

<persName nymRef="#Guilhem_Canast-Brus_MSP-AU" role="own">Willelmi de Can[  ]t Brus</persName>

I can then continue processing the result as above.
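
To tie this back to the original query, here is a minimal, untested sketch of how the replacement step might feed the grouping. It is not the same code as above: it swaps both functx:remove-elements-deep and the recursive copy for a single typeswitch that returns strings directly. The function name local:name-text is hypothetical, the namespace URI is the standard TEI one, the inline sample elements stand in for collection($coll)//tei:persName[...], and the gap placeholder "[ ]" follows the question rather than the answer.

```xquery
xquery version "3.1";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical helper: flatten a name to a sequence of strings.
   Omitted elements contribute nothing; lb[@break="y"] and gap become placeholders. :)
declare function local:name-text($node as node()) as xs:string* {
  typeswitch ($node)
    case text() return string($node)
    case element(tei:lb) return
      if ($node/@break eq "y") then " " else ()
    case element(tei:gap) return "[ ]"
    case element(tei:supplied) | element(tei:corr) | element(tei:del) return ()
    case element() return $node/node() ! local:name-text(.)
    default return ()
};

(: Assumption: sample data in place of the real collection lookup. :)
let $names := (
  <persName xmlns="http://www.tei-c.org/ns/1.0" role="own">Willelmi<lb break="y" n="20"/>de Canast Brus</persName>,
  <persName xmlns="http://www.tei-c.org/ns/1.0" ana="#pAdo" role="par">W<supplied reason="expname">illelmum</supplied> de Canast</persName>
)
return
  <div>{
    for $name in $names
    (: Join the pieces, drop commas, then collapse whitespace, as in the question. :)
    let $z := normalize-space(replace(string-join(local:name-text($name)), ",", ""))
    group by $z
    order by $z ascending
    return <span>{$z}</span>
  }</div>
```

With this sample data, the first name should come out as "Willelmi de Canast Brus" rather than "Willelmide Canast Brus", because the line break now contributes a space before the strings are joined.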