Parsing a feed with rss version="2"


I am trying to parse an RSS feed with Java ROME, but the feed's root element declares a nonstandard version:

<rss version="2">

When I change it to "2.0" it parses correctly. How can I work around this using Java ROME?

I could subclass RSS20Parser and override isMyType, but where and how do I register this new parser?


1 answer

Michiel Borkent (best answer)

I solved this by subclassing RSS20Parser and overriding isMyType. I then copied rome.properties, added the subclass to the parser list under the WireFeedParser.classes key, and placed the copied file on the classpath. I happened to be programming in Clojure, so here is the code:

(ns feeds.rss20-parser
  (:import (com.rometools.rome.io.impl RSS20Parser)
           (org.jdom2 Document))
  (:gen-class
   :name feeds.RSS20Parser
   :extends com.rometools.rome.io.impl.RSS20Parser
   :exposes-methods {isMyType parentIsMyType}))

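;; Returns the trimmed value of the root element's "version" attribute,
;; or nil if the document is nil or the attribute is absent.
(defn version [^Document doc]
  (some-> doc
          .getRootElement
          (.getAttribute "version")
          .getValue
          .trim))

;; Accepts everything the stock RSS20Parser accepts (via the exposed
;; parent method), plus feeds that declare version="2".
(defn -isMyType [^feeds.RSS20Parser this ^Document doc]
  (or (.parentIsMyType this doc)
      (= "2" (version doc))))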
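
The registration step described above might look roughly like the following in the copied rome.properties. This is only a sketch: the placeholder stands for whatever parser entries the copied file already lists (they are not reproduced here), and the exact layout of the file may differ between ROME versions. The only names taken from the answer are the WireFeedParser.classes key and the generated class feeds.RSS20Parser.

```properties
# rome.properties: a copy of the file shipped with ROME, placed on the classpath.
# Leave the parser entries already present in the copied file as they are
# (shown here only as a placeholder) and append the subclass to the same list.
WireFeedParser.classes=<existing entries from the copied file> \
                       feeds.RSS20Parser
```

Because the class is produced by :gen-class, the namespace presumably has to be AOT-compiled so that feeds.RSS20Parser exists as a real class when ROME loads the parser list.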