Parse channel node using XmlPullParser


I'm trying to parse the channel node from an RSS feed but I keep getting this error thrown at parser.nextText():

org.xmlpull.v1.XmlPullParserException: precondition: START_TAG (position:END_TAG </link>@3:449 in java.io.InputStreamReader@7988a7d) 

The problem seems to be that parser.getEventType() returns 3 (END_TAG) at that point, whereas nextText() requires the parser to be positioned on a START_TAG (2).
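The precondition itself can be reproduced in isolation. The following is only a sketch, and it makes one loud assumption: it uses the JDK's StAX XMLStreamReader as a stand-in for XmlPullParser so that it runs on a desktop JVM without Android or kXML. getElementText() has the same documented precondition as nextText() (the reader must be on a start tag) and, like nextText(), leaves the reader on the matching end tag. The class name and the one-line feed are made up for the demonstration, and a second read on the same element is just one way to trip the check, not necessarily what the code in this question does.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class; StAX stands in for XmlPullParser (assumption).
class TextReadPrecondition {
    static final String FEED =
            "<channel><link>http://www.link.com</link></channel>";

    // Reads <link> once (legal), then tries again while still on </link>.
    static String readLinkTwice() {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(FEED));
            reader.nextTag();                      // now on <channel>
            reader.nextTag();                      // now on <link>
            String link = reader.getElementText(); // legal: reader was on a start tag
            boolean onEndTag =
                    reader.getEventType() == XMLStreamConstants.END_ELEMENT;
            try {
                reader.getElementText();           // illegal: reader is on </link>
                return link + " | onEndTag=" + onEndTag + " | second read accepted";
            } catch (XMLStreamException e) {
                return link + " | onEndTag=" + onEndTag + " | second read rejected";
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readLinkTwice());
    }
}
```

The point of the sketch is that a successful text read already consumes the element's content and moves the parser to the end tag, so any further text read before advancing to the next start tag fails the START_TAG check.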

Feed:

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0">
    <channel>
        <title>Podcast Title</title>
        <link>http://www.link.com</link>
        <description>A description</description>

          <item>
          </item>

          <item>
          </item>

          <item>
          </item>

    </channel>
</rss>

Code:

    XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
    factory.setNamespaceAware(false);
    XmlPullParser parser = factory.newPullParser();
    InputStream stream = new URL(url).openConnection().getInputStream();
    parser.setInput(stream, "UTF-8");
    Boolean inChannel = false;

    int eventType = parser.getEventType();
    while (eventType != XmlPullParser.END_DOCUMENT) {
        String name;
        switch (eventType) {
            case XmlPullParser.START_DOCUMENT:
                break;
            case XmlPullParser.START_TAG:
                name = parser.getName();
                if (name.equalsIgnoreCase("channel"))
                    inChannel = true;
                else if (inChannel)
                {
                    channel = new ChannelItem();
                    if (name.equalsIgnoreCase("description"))
                        channel.setDescription(parser.nextText().trim());
                    else if (name.equalsIgnoreCase("media:thumbnail"))
                        channel.setThumnailUrl(parser.getAttributeValue(null, "url"));
                }
                break;
            case XmlPullParser.END_TAG:
                name = parser.getName();
                if (name.equalsIgnoreCase("channel"))
                    inChannel = false;
                break;
        }
        eventType = parser.next();
    }

There are 2 answers

0
Kris B On

I gave up and ended up using the android.sax API (RootElement and element listeners) instead:

    final ChannelItem channel = new ChannelItem();

    final RootElement root = new RootElement("rss");
    final Element channelNode = root.getChild("channel");


    channelNode.getChild("title").setEndTextElementListener(new EndTextElementListener()
    {
        public void end(final String body) {
            channel.setTitle(body);
        }
    });

    channelNode.getChild("link").setEndTextElementListener(new EndTextElementListener()
    {
        public void end(final String body) {
            channel.setSiteUrl(body);
        }
    });


    channelNode.getChild("description").setEndTextElementListener(new EndTextElementListener()
    {
        public void end(final String body) {
            channel.setDescription(body);
        }
    });

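    try
    {
        InputStream stream = new URL(url).openConnection().getInputStream();
        try
        {
            Xml.parse(stream, Xml.Encoding.UTF_8, root.getContentHandler());
        }
        finally
        {
            // Close the stream even if parsing fails.
            stream.close();
        }
    }
    catch (Exception e)
    {
        // Don't swallow failures silently; a network or parse error would
        // otherwise just yield an empty ChannelItem.
        e.printStackTrace();
    }
    return channel;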
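For comparison, the same three fields can also be collected with a pull-style loop. This is a hedged sketch rather than a drop-in fix: it again uses the JDK's StAX XMLStreamReader as a stand-in for XmlPullParser (an assumption, so that it runs off-device), the class and method names are hypothetical, and it assumes the rss > channel layout from the feed in the question, so that direct children of channel sit at depth 3. Tracking depth keeps a title inside an item from overwriting the channel's title.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch; StAX stands in for XmlPullParser (assumption).
class ChannelPullSketch {
    // Collects title, link and description of <channel>, ignoring <item> children.
    static Map<String, String> readChannel(String xml) {
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int depth = 0; // 1 = <rss>, 2 = <channel>, 3 = children of <channel>
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    depth++;
                    String name = reader.getLocalName();
                    boolean wanted = name.equals("title")
                            || name.equals("link")
                            || name.equals("description");
                    if (depth == 3 && wanted) {
                        // Text is read exactly once, straight from the start tag.
                        fields.put(name, reader.getElementText().trim());
                        // getElementText() stopped on the end tag, so the loop
                        // never sees that END_ELEMENT; account for it here.
                        depth--;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT) {
                    depth--;
                }
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        String feed = "<rss version=\"2.0\"><channel>"
                + "<title>Podcast Title</title>"
                + "<link>http://www.link.com</link>"
                + "<description>A description</description>"
                + "<item><title>Episode title</title></item>"
                + "</channel></rss>";
        System.out.println(readChannel(feed));
    }
}
```

The result object is created once, before the loop, and each text read happens only while the reader is on the element's start tag. A translation back to XmlPullParser would presumably swap getElementText() for nextText() and getLocalName() for getName(), but that version is untested here.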
0
Yusuf Çakal On

You can parse the XML with Jsoup.

XML sample

<?xml version="1.0" encoding="UTF-8"?>
<tests>
    <test>
        <id>xxx</id>
        <status>xxx</status>
    </test>
    <test>
        <id>xxx</id>
        <status>xxx</status>
    </test>
    ....
</tests>

Code Sample

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><tests><test><id>xxx</id><status>xxx</status></test><test><id>xxx</id><status>xxx</status></test></tests>";
Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
for (Element e : doc.select("test")) {
    System.out.println(e);
}