Return multiple results using xmllint


I am trying to extract all sections of an XML file that contain a specified tag. I searched the web and found this, which works:

xmllint --xpath "/all/string(//title)"

But it only returns the first result. How can I make it return all results? Thanks!

Sample XML

<programme start="20170913125500 +0100" stop="20170913144500 +0100" channel="3b6963d34ba31ea21db5c3aee8e3b26f">
  <title lang="eng">Yangtse Incident</title>
  <sub-title lang="eng">(1957) Michael Anderson&apos;s drama, starring Richard Todd and William Hartnell, tells the true story of HMS Amethyst, a British frigate captured by Chinese communists during Mao&apos;s revolution.  [AD,S]</sub-title>
</programme>
<programme start="20170913144500 +0100" stop="20170913165500 +0100" channel="3b6963d34ba31ea21db5c3aee8e3b26f">
  <title lang="eng">The Comancheros</title>
  <sub-title lang="eng">(1961) Western starring John Wayne and Stuart Whitman. A Texas Ranger is forced to team up with his prisoner while he&apos;s on a covert mission to take on a band of thieves and gunrunners.  [S]</sub-title>
</programme>
<programme start="20170913165500 +0100" stop="20170913185500 +0100" channel="3b6963d34ba31ea21db5c3aee8e3b26f">
  <title lang="eng">The Cockleshell Heroes</title>
  <sub-title lang="eng">(1955) World War II drama. In a true-life tale of incredible bravery, ten marines try to break the blockade of Bordeaux. With José Ferrer, Trevor Howard, Victor Maddern and Anthony Newley.  [S]</sub-title>
</programme>
<programme start="20170913185500 +0100" stop="20170913190500 +0100" channel="3b6963d34ba31ea21db5c3aee8e3b26f">
  <title lang="eng">Dunkirk Interview Special</title>
  <sub-title lang="eng">Stars Harry Styles, Mark Rylance, Jack Lowden, Fionn Whitehead and Tom Glynn-Carney talk about making director Christopher Nolan&apos;s intense Second World War dramatic thriller.  [S]</sub-title>
</programme>

Result should be

Yangtse Incident
The Comancheros
The Cockleshell Heroes
Dunkirk Interview Special

There are 2 answers

cptPH

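I couldn't get the output you want using xmllint alone. In XPath 1.0, string() converts a node-set to the string value of its first node only, which is why your command returns a single title. The closest I got was:

xmllint --xpath "//something/programme/title/text()" test.xml

Here something stands for whatever root element wraps the programme elements. This does select every title, but depending on the xmllint version the results may be printed run together on one line, with no separator.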

The best solution for me was this:

xmllint --xpath "//something/programme/title" test.xml | sed 's/<\/title>/\n/g' | sed 's/<title lang="eng">//g'

Of course you can use any other tool to clean up the output.
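If you would rather stay with xmllint and avoid the sed clean-up, one option is to count the matches and then ask for them one at a time. The following is only a sketch under some assumptions: print_titles is a made-up helper name, the tv root element and the temporary sample file are invented for the demonstration, and the XPath assumes every title sits directly inside a programme element.

```shell
#!/bin/sh
# Sketch only: print each matching title on its own line using xmllint alone.
# print_titles is a hypothetical helper name, not part of xmllint.
print_titles() {
    file=$1
    # count() yields a number, which xmllint prints as plain text.
    count=$(xmllint --xpath 'count(//programme/title)' "$file")
    i=1
    while [ "$i" -le "${count:-0}" ]; do
        # string() of a single indexed node; printf supplies the line break.
        title=$(xmllint --xpath "string((//programme/title)[$i])" "$file")
        printf '%s\n' "$title"
        i=$((i + 1))
    done
}

# Assumed sample: two programme elements wrapped in a made-up <tv> root.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<tv>
  <programme channel="a"><title lang="eng">Yangtse Incident</title></programme>
  <programme channel="a"><title lang="eng">The Comancheros</title></programme>
</tv>
EOF

titles=$(print_titles "$sample")
printf '%s\n' "$titles"
rm -f "$sample"
```

This runs xmllint once per title, so it is likely to be slow on a large file; for big inputs the sed pipeline above or a different tool is probably the better choice.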

Daniel Haley

If you're able to use xmlstarlet instead of xmllint, you could use the sel (select) command. Depending on how it was installed, the executable may be named xml or xmlstarlet:

==> xml sel -t -m "//programme/title" -v . -n input.xml
Yangtse Incident
The Comancheros
The Cockleshell Heroes
Dunkirk Interview Special