Nutch REST API Results (limited)


I've just figured out how to complete a Nutch crawl via the REST API in Nutch 2.3. You can see my post here. After running the crawl, I checked the results in MongoVue and several fields are missing, including "status" and "baseUrl". If I run a normal crawl through Cygwin, I get all the fields. Is there a parameter I'm missing from the POST request for the UPDATEDB call?

Here is the last call I make, the one for UPDATEDB:

{
  "args":{
    "crawlId":"crawl-01",
    "batch":"1428526896161-4430"
  },
  "confId":"default",
  "crawlId":"crawl-01",
  "type":"UPDATEDB"
}
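For anyone reproducing this, here is a minimal sketch of how a request body like the one above could be built and sent from Python. The helper names `build_job_request` and `post_job` are made up for illustration, and the base URL `http://localhost:8081` and the `/job/create` path are assumptions about a default Nutch 2.x server setup, so check them against your own installation.

```python
import json
import urllib.request


def build_job_request(job_type, crawl_id, batch_id, conf_id="default"):
    """Build the JSON body for one job, mirroring the UPDATEDB request above."""
    return {
        "args": {"crawlId": crawl_id, "batch": batch_id},
        "confId": conf_id,
        "crawlId": crawl_id,
        "type": job_type,
    }


def post_job(payload, base_url="http://localhost:8081"):
    """POST the payload to the server (host, port, and path are assumptions)."""
    req = urllib.request.Request(
        base_url + "/job/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


# Build (but do not send) the same body as in the question.
payload = build_job_request("UPDATEDB", "crawl-01", "1428526896161-4430")
print(json.dumps(payload, indent=2))
```

Calling `post_job(payload)` would send it; it is left out of the sketch so the snippet runs without a server.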

There is 1 answer

itsNino91 On BEST ANSWER

I figured it out. The timestamp used in the GenerateJob step was wrong. It needed to be in a particular format, and my code didn't support it. I found a workaround.
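The answer does not say what the required format is. As a rough sketch only, the snippet below builds a batch ID as `<epoch seconds>-<random integer>`, which is how I recall Nutch 2.x generating batch IDs internally; treat that format as an assumption and verify it against your Nutch version. The function name `make_batch_id` is hypothetical.

```python
import random
import time


def make_batch_id(now=None):
    """Return '<epoch seconds>-<random int>' (assumed format, not confirmed above)."""
    seconds = int(time.time() if now is None else now)
    return f"{seconds}-{random.randint(0, 2**31 - 1)}"


# Generate the ID once and pass the same string as "batch" in later requests.
batch_id = make_batch_id()
print(batch_id)
```

For comparison, the batch ID in the question starts with a 13-digit number, which looks like a millisecond timestamp rather than seconds. Whether that was the actual problem is not stated in the answer.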