Parsing Open Graph tags with Nutch (into Elasticsearch)

I have a running Nutch 2.3.1/HBase installation that parses and indexes web pages just fine. Now I need to parse Open Graph tags (namely og:image and og:description). From several fragments found on the web, I learned that Tika basically supports parsing Open Graph tags, but I am lost trying to figure out how to integrate this into Nutch.

Can someone point me in the right direction? Maybe with an example?

Thanks
