I am trying to parse RSS feeds with Groovy. I only want to extract the values of the title and description tags. I used the following code snippet to achieve this:
rss = new XmlSlurper().parse(url)
rss.channel.item.each {
    titleList.add(it.title)
    descriptionList.add(it.description)
}
After this, I access these values in my JSP page. The problem is that the description value I get contains not only the text of <description> (child of <channel>) but also the text of <media:description> (another optional child of <channel>). What can I change to extract only the value of <description> and omit the value of <media:description>?
Edit: To reproduce this behavior, you can run the following code on this website: http://www.tutorialspoint.com/execute_groovy_online.php
def url = "http://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml"
rss = new XmlSlurper().parse(url)
rss.channel.item.each {
    println "${it.title}"
    println "${it.description}"
}
You will see that the text of the <media:description> tag is also printed to the console.
You can tell XmlSlurper and XmlParser not to try to handle namespaces in the constructor. I believe this does what you are after:
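A minimal sketch of that idea, using the two-argument constructor XmlSlurper(validating, namespaceAware) with both flags set to false. The inline XML is a made-up stand-in for the feed so the snippet runs without network access; titleList and descriptionList are the names from the question. With namespace handling off, the element keeps its full name media:description, so it.description should no longer match it.

```groovy
// On Groovy 4+, add: import groovy.xml.XmlSlurper
// (older versions pick up groovy.util.XmlSlurper automatically)

// Hypothetical sample feed standing in for the real RSS document
def xml = '''<rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
  <channel>
    <item>
      <title>Sample title</title>
      <description>Plain description</description>
      <media:description>Media description</media:description>
    </item>
  </channel>
</rss>'''

// Arguments are (validating, namespaceAware); both are false here
def rss = new XmlSlurper(false, false).parseText(xml)

def titleList = []
def descriptionList = []
rss.channel.item.each {
    titleList << it.title.text()
    descriptionList << it.description.text()
}

println titleList
println descriptionList
```

For the real feed, the same constructor call should work with parse(url) in place of parseText(xml). If the media text is needed after all, it should then be reachable by its qualified name, e.g. it.'media:description'.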