I want to extract data from this URL: http://rss.cnn.com/rss/edition.rss
Each item in the feed looks like this:
<item>
<title><![CDATA[Ireland stuns England at home of cricket]]></title>
<description><![CDATA[From World Cup glory to utter humiliation in the space of 10 days.]]></description>
<link>https://www.cnn.com/2019/07/24/sport/england-ireland-cricket-spt-intl/index.html</link>
<guid isPermaLink="true">https://www.cnn.com/2019/07/24/sport/england-ireland-cricket-spt-intl/index.html</guid>
<pubDate>Wed, 24 Jul 2019 13:17:56 GMT</pubDate>
<media:group>
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-super-169.jpg" height="619" width="1100" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-large-11.jpg" height="300" width="300" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-vertical-large-gallery.jpg" height="552" width="414" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-video-synd-2.jpg" height="480" width="640" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-live-video.jpg" height="324" width="576" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-t1-main.jpg" height="250" width="250" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-vertical-gallery.jpg" height="360" width="270" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-story-body.jpg" height="169" width="300" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-t1-main.jpg" height="250" width="250" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-assign.jpg" height="186" width="248" />
<media:content medium="image" url="https://cdn.cnn.com/cnnnext/dam/assets/190724131447-england-ireland-tease-01-hp-video.jpg" height="144" width="256" />
</media:group>
</item>
I found some classes online that should handle cases like this, such as SyndicationFeed or XDocument.Parse.
So I tried this one out:
string url = "http://rss.cnn.com/rss/edition.rss";
XmlReader reader = XmlReader.Create(url);
SyndicationFeed feeds = SyndicationFeed.Load(reader); // References -> Right Click -> Add Reference -> System.ServiceModel
reader.Close();
foreach (SyndicationItem item in feeds.Items)
{
    string subject = item.Title.Text;
    Console.WriteLine("subject: " + subject);
    // Summary maps to the <description> element and may be missing.
    if (item.Summary != null)
    {
        string summary = item.Summary.Text;
        Console.WriteLine("desc: " + summary);
    }
}
This works well for Title and Summary, but each item also carries images (the media:content elements). How can I read those with SyndicationFeed?
The <media:group> element and its content are treated as extension elements. The syndication classes (SyndicationFeed, and each SyndicationItem) have an ElementExtensions property to expose these, with a ReadElementExtensions method to read and parse them.

Create a class that matches the <media:group> XML element. Also create a class definition for the <media:content> item. Then read and parse them using an XmlSerializer.
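The steps above could be sketched roughly as follows. This is a minimal sketch, not the original answer's code: the class names MediaRss, MediaGroup and MediaContent are made up for illustration, and the namespace URI is assumed to be the standard Media RSS one (check the xmlns:media declaration on the feed's root element). Only the element and attribute names come from the item shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.ServiceModel.Syndication; // reference System.ServiceModel
using System.Xml;
using System.Xml.Serialization;

// Hypothetical helper holding the namespace bound to the "media" prefix.
// Assumption: the feed uses the standard Media RSS namespace.
public static class MediaRss
{
    public const string Namespace = "http://search.yahoo.com/mrss/";
}

// Matches <media:group> ... </media:group>.
[XmlRoot("group", Namespace = MediaRss.Namespace)]
public class MediaGroup
{
    // Each <media:content /> child becomes one list entry.
    [XmlElement("content", Namespace = MediaRss.Namespace)]
    public List<MediaContent> Contents { get; set; } = new List<MediaContent>();
}

// Matches a single <media:content medium=".." url=".." height=".." width=".." />.
public class MediaContent
{
    [XmlAttribute("medium")] public string Medium { get; set; }
    [XmlAttribute("url")] public string Url { get; set; }
    [XmlAttribute("height")] public int Height { get; set; }
    [XmlAttribute("width")] public int Width { get; set; }
}

public static class Program
{
    public static void Main()
    {
        string url = "http://rss.cnn.com/rss/edition.rss";
        var serializer = new XmlSerializer(typeof(MediaGroup));

        using (XmlReader reader = XmlReader.Create(url))
        {
            SyndicationFeed feed = SyndicationFeed.Load(reader);

            foreach (SyndicationItem item in feed.Items)
            {
                Console.WriteLine("subject: " + item.Title.Text);

                // Pull every <media:group> extension of this item and
                // deserialize it into a MediaGroup instance.
                var groups = item.ElementExtensions.ReadElementExtensions<MediaGroup>(
                    "group", MediaRss.Namespace, serializer);

                foreach (MediaGroup mediaGroup in groups)
                {
                    foreach (MediaContent content in mediaGroup.Contents)
                    {
                        Console.WriteLine("  image: " + content.Url +
                            " (" + content.Width + "x" + content.Height + ")");
                    }
                }
            }
        }
    }
}
```

The extensions are read from item.ElementExtensions rather than from the feed object, because <media:group> sits inside each <item>. Items without a <media:group> simply yield an empty collection, so no null check is needed there.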