I am trying to parse an event list using Enlive.
Normally, each event's data is isolated in its own div (here with class "result"):
<div class="result">
<h3>Event 1 title</h3>
<a href="http://the_site.com/event1">Event 1 page</a>
<p>Event 1 location</p>
</div>
<div class="result">
<h3>Event 2 title</h3>
<a href="http://the_site.com/event2">Event 2 page</a>
<p>Event 2 location</p>
</div>
So I created a variable that holds the parsing logic for each event site:
(def parsing-config
  [{:source "The Site"
    :results-url ["http://the_site.com"]
    :parsing {:title    {:selector [[:div.result] [:h3]]
                         :trim-fn (comp first :content)}
              :url      {:selector [[:div.result] [:a]]
                         ;; anonymous fn: read :href from the node's attributes
                         :trim-fn #(:href (:attrs %))}
              :location {:selector [[:div.result] [:p]]
                         :trim-fn (comp first :content)}}}
   {:source "Other event site"
    ...}])
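To make the setup concrete, here is a minimal sketch of how one config entry like the above might be consumed. It is an assumption, not the actual parsing function: it supposes that each field's selector is run over the whole page and that the resulting lists are zipped together by position. The names `parse-events`, `site-config` and `sample-html` are hypothetical, and the selectors are written in Enlive's flat descendant form.

```clojure
(ns example.events
  (:require [net.cgrand.enlive-html :as html]))

;; Hypothetical sample page mirroring the "result" markup above.
(def sample-html
  (str "<div class=\"result\">"
       "<h3>Event 1 title</h3>"
       "<a href=\"http://the_site.com/event1\">Event 1 page</a>"
       "<p>Event 1 location</p>"
       "</div>"
       "<div class=\"result\">"
       "<h3>Event 2 title</h3>"
       "<a href=\"http://the_site.com/event2\">Event 2 page</a>"
       "<p>Event 2 location</p>"
       "</div>"))

;; A single config entry (hypothetical name), one selector per field.
(def site-config
  {:source  "The Site"
   :parsing {:title    {:selector [:div.result :h3]
                        :trim-fn  (comp first :content)}
             :url      {:selector [:div.result :a]
                        :trim-fn  (comp :href :attrs)}
             :location {:selector [:div.result :p]
                        :trim-fn  (comp first :content)}}})

(defn parse-events
  "Assumed behaviour: select every field over the whole page, apply its
  :trim-fn, then zip the per-field lists by position into one map per event."
  [nodes {:keys [source parsing]}]
  (let [fields  (keys parsing)
        columns (for [field fields
                      :let [{:keys [selector trim-fn]} (get parsing field)]]
                  (map trim-fn (html/select nodes selector)))]
    (apply map
           (fn [& values]
             (assoc (zipmap fields values) :source source))
           columns)))

;; Usage: returns a sequence of maps with :title, :url, :location and :source.
(parse-events (html/html-snippet sample-html) site-config)
```

Under this assumption an event is identified only by its position in each selected list, not by the div that encloses it.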
But for a specific site, I have divs that contain more than one event, like this:
<div class="September">
<h3>Event 1 title</h3>
<a href="http://other_site.com/event1">Event 1 page</a>
<p>Event 1 location</p>
<h3>Event 2 title</h3>
<a href="http://other_site.com/event2">Event 2 page</a>
<p>Event 2 location</p>
</div>
<div class="October">
<h3>Event 3 title</h3>
<a href="http://other_site.com/event3">Event 3 page</a>
<p>Event 3 location</p>
<h3>Event 4 title</h3>
<a href="http://other_site.com/event4">Event 4 page</a>
<p>Event 4 location</p>
</div>
How can I parse each event for this last site, while only changing the parsing-config variable and not the function that I use to parse (not shown here...)?
Thanks.
Note: The :trim-fn functions may not be accurate.