Warning: I'm new to MongoDB and JSON.
I have a log file that contains JSON records. Because it captures clickstream data, a single file holds records with several different structures. Here is an example of one log file:
[
{
"username":"",
"event_source":"server",
"name":"course.activated",
"accept_language":"",
"time":"2016-10-12T01:02:07.443767+00:00",
"agent":"python-requests/2.9.1",
"page":null,
"host":"courses.org",
"session":"",
"referer":"",
"context":{
"user_id":null,
"org_id":"X",
"course_id":"3T2016",
"path":"/api/enrollment"
},
"ip":"160.0.0.1",
"event":{
"course_id":"3T2016",
"user_id":11,
"mode":"audit"
},
"event_type":"activated"
},
{
"username":"VTG",
"event_type":"/api/courses/3T2016/",
"ip":"161.0.0.1",
"agent":"Mozilla/5.0",
"host":"courses.org",
"referer":"http://courses.org/16773734",
"accept_language":"en-AU,en;q=0.8,en-US;q=0.6,en;q=0.4",
"event":"{\"POST\": {}, \"GET\": {}}",
"event_source":"server",
"context":{
"course_user_tags":{
},
"user_id":122,
"org_id":"X",
"course_id":"3T2016",
"path":"/api/courses/3T2016/"
},
"time":"2016-10-12T00:51:57.756468+00:00",
"page":null
}
]
Now I want to store this data in MongoDB. Here are my novice questions:
- Do I need to parse the file and split it into 2 datasets before storing it in MongoDB? If so, is there a simple program to do this, given that my file has multiple dataset formats?
- Is there some magic in MongoDB that can split the various datasets when we upload the file?
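First of all, your JSON is not valid. Make sure it is formatted as I have shown below. Once you have valid JSON, you can use MongoDB's command-line tools to load the data into the database. Note that mongorestore reads BSON dumps created by mongodump; for a plain JSON file with a top-level array, such as this one, mongoimport with the --jsonArray option is the tool that accepts it.
For more information, refer to https://docs.mongodb.com/manual/reference/program/mongorestore/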
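On the question of splitting: MongoDB collections do not enforce a schema by default, so records with different fields can be stored in the same collection without being separated first. If you would rather load the file from a program, the following is a minimal Python sketch, not a definitive implementation. The helper names `load_events` and `insert_events` are made up for illustration, as are the file, database, and collection names in the usage comment. It assumes the file holds one top-level JSON array of objects and that the PyMongo driver is available. It also decodes the `event` field when it arrives as a JSON-encoded string (as in the second record of the sample), so that the field is stored as a nested document.

```python
import json


def load_events(text):
    """Parse a JSON array of log records into a list of dicts."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    for record in records:
        event = record.get("event")
        # Some records carry `event` as a JSON-encoded string;
        # decode it so it becomes a nested document.
        if isinstance(event, str):
            try:
                decoded = json.loads(event)
            except ValueError:
                continue  # not JSON (e.g. empty string): leave as-is
            if isinstance(decoded, dict):
                record["event"] = decoded
    return records


def insert_events(collection, records):
    """Insert all records into one collection, whatever their shape."""
    if not records:
        return []  # insert_many rejects an empty list
    result = collection.insert_many(records)
    return result.inserted_ids


# Example usage (assumes `pip install pymongo` and a local mongod;
# "clickstream.json", "logs" and "events" are placeholder names):
#
#   from pymongo import MongoClient
#   with open("clickstream.json") as fh:
#       events = load_events(fh.read())
#   client = MongoClient("mongodb://localhost:27017")
#   insert_events(client["logs"]["events"], events)
```

Both record shapes end up in the same collection; you can still tell them apart later by querying on a field such as `event_type` or `name`.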
Formatted JSON: