I am trying to import JSON data into Pig using the code:
Data = LOAD '/log/2015/06/07-TAG-AD.json.bz2'
USING JsonLoader('user: (ui:long, date:datetime, ua:chararray, ip:chararray, id:long, cntry:chararray, cty:chararray, x:float, y:float, gender:int, age:int), inv: (w:int, h:int, url:chararray, do:chararray, pos:int, adx:int, net: chararray, adv:int, dea:int), resp: (adv:long, oi:long, c:long, cr:long, p:double, b:double)');
DUMP Data;
However, I keep getting the following error:
ERROR 2997: Unable to recreate exception from backed error: AttemptID:attempt_1433718762047_0074_m_000000_3 Info:Error: org.codehaus.jackson.JsonParseException: Unexpected character ('M' (code 77)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
I imagine that it is coming from the user.ua field, since this is what the JSON records look like:
({ "_id" : ObjectId("5573fcdfba0947360b8f0144"), "user" : { "ui" : NumberLong("3559044716429019182"), "date" : ISODate("2015-06-07T08:12:15.047Z"), "ua" : "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.81 Safari/537.36", "ip" : null, "id" : null, "cntry" : "FR", "cty" : "Toulouse", "x" : null, "y" : null, "gender" : null, "age" : null }, "inv" : { "w" : 300, "h" : 250, "url" : "http://www.ladepeche.fr/", "do" : "ladepeche.fr", "pos" : null, "adx" : 1, "net" : null, "adv" : 1, "dea" : null }, "resp" : { "adv" : NumberLong(449290), "oi" : NumberLong(1862027), "c" : NumberLong(7772668), "cr" : NumberLong(28041668), "p" : 2.518448, "b" : 2.55584 } })
Shouldn't the chararray data type be able to recognise the letter 'M'?
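
One way to test that guess outside Pig would be to run a record through a strict JSON parser and see where it stops. Below is a minimal sketch in Python, assuming one record per line. The function name first_json_error and the two sample strings are hypothetical, shortened stand-ins modelled on the record above, not the real data. The first sample contains only a quoted "ua" string; the second keeps a NumberLong(...) wrapper like the ones in the record.

```python
import json


def first_json_error(line):
    """Return None if `line` is strict JSON, else a short description
    of the first parse error (hypothetical helper, for illustration)."""
    try:
        json.loads(line)
        return None
    except json.JSONDecodeError as err:
        # err.pos is the 0-based character offset where parsing failed
        snippet = line[err.pos:err.pos + 20]
        return "%s at offset %d: %r" % (err.msg, err.pos, snippet)


# Hypothetical, shortened records (assumptions, not the real data)
quoted_only = '{"user": {"ua": "Mozilla/5.0 (Windows NT 5.1)", "ip": null}}'
with_wrappers = '{"user": {"ui": NumberLong("3559044716429019182")}}'

print(first_json_error(quoted_only))    # None if the record parses
print(first_json_error(with_wrappers))  # error description otherwise
```

Running the same function over a few decompressed lines of the real file should show which token the parser rejects first, which may or may not be the one inside user.ua.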