What is the best library to parse feeds (RSS, Atom, ...) in Haskell?
I'm especially interested in the following points:
- Performance/memory
- Encoding issues for non-English characters?
- Correctness, detection of feed-type (RSS 1, RSS 2, Atom...), handling of non-valid feeds, etc.
I already stumbled upon the feed package, but it uses Strings. How does this affect performance and memory use, especially if ByteString.Lazy or Text is used elsewhere throughout the app?
Your intuition is right about trying to avoid String. The general rule of thumb in modern Haskell is to avoid String whenever you can and use Text or ByteString instead. However, in this case, I'm not aware of any direct drop-in replacement for the feed package.
In practice, because parsing feeds is usually network-bound, you shouldn't have any performance issues under normal circumstances.
However, if you really need high throughput and tight control of resources, it shouldn't be too difficult to write your own RSS parser using xml-conduit, which I'd say is the most mature iteratee-based XML parsing library out there. You can have a look at how it's being used by these packages.
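To give a rough idea of what a hand-rolled parser might look like, here is a minimal sketch that pulls item titles out of an RSS 2.0 document as Text. It uses xml-conduit's DOM-style Text.XML and Text.XML.Cursor modules rather than the streaming interface, so the whole document is held in memory; for tight memory control you would probably want Text.XML.Stream.Parse instead. The names FeedKind, feedKind, rssItemTitles and sampleFeed are made up for this example, and guessing the feed type from the root element name alone is a simplifying assumption, not a robust detection scheme.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import qualified Data.ByteString.Lazy    as BL
import           Data.Text               (Text)
import qualified Data.Text.IO            as TIO
import qualified Data.Text.Lazy.Encoding as TLE
import           Text.XML                (Document, def, documentRoot,
                                          elementName, nameLocalName, parseLBS)
import           Text.XML.Cursor         (content, element, fromDocument,
                                          ($//), (&/))

-- Hypothetical classification, based only on the root element's local name.
data FeedKind = Rss | Rss1 | Atom | UnknownFeed
  deriving (Eq, Show)

feedKind :: Document -> FeedKind
feedKind doc =
  case nameLocalName (elementName (documentRoot doc)) of
    "rss"  -> Rss          -- RSS 0.9x / 2.0
    "RDF"  -> Rss1         -- RSS 1.0 is RDF-based
    "feed" -> Atom
    _      -> UnknownFeed

-- Text content of every <title> that is a direct child of an <item>.
-- RSS 2.0 elements carry no namespace, so plain names are enough here;
-- Atom would need names qualified with the Atom namespace.
rssItemTitles :: Document -> [Text]
rssItemTitles doc =
  fromDocument doc $// element "item" &/ element "title" &/ content

-- Small in-memory feed, encoded as UTF-8 bytes like a network response.
sampleFeed :: BL.ByteString
sampleFeed = TLE.encodeUtf8 $ mconcat
  [ "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
  , "<rss version=\"2.0\"><channel><title>Example</title>"
  , "<item><title>First post</title></item>"
  , "<item><title>Caf\233 culture</title></item>"
  , "</channel></rss>"
  ]

main :: IO ()
main =
  case parseLBS def sampleFeed of
    Left err  -> putStrLn ("not well-formed XML: " ++ show err)
    Right doc -> do
      print (feedKind doc)
      mapM_ TIO.putStrLn (rssItemTitles doc)
```

Since parseLBS takes a lazy ByteString and the cursor returns Text, no String is involved in the parsing path, which also keeps non-English characters intact as long as the input declares its encoding correctly.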