We've discovered some domain names tied to infections. We now have a list of DNS lookups in a .json file, and I'd like to produce a summarized output showing a list of users, the unique domains each one visited, and the total count. Bonus points if I can also get a count per domain name.
Here is a sample of the file:
{"machine": "possible_victim01", "domain": "evil.com", "timestamp":1435071870}
{"machine": "possible_victim01", "domain": "evil.com", "timestamp":1435071875}
{"machine": "possible_victim01", "domain": "soevil.com", "timestamp":1435071877}
{"machine": "possible_victim02", "domain": "bad.com", "timestamp":1435071877}
{"machine": "possible_victim03", "domain": "soevil.com", "timestamp":1435071879}
Ideally, I would like the output to be something like:
{"possible_victim01": "total": 3, {"evil.com": 2, "soevil.com": 1}}
{"possible_victim02": "total": 1, {"bad.com": 1}}
{"possible_victim03": "total": 1, {"soevil.com": 1}}
I would gladly settle for:
{"possible_victim01": "total": 3, ["evil.com", "soevil.com"]}
{"possible_victim02": "total": 1, ["bad.com"]}
{"possible_victim03": "total": 1, ["soevil.com"]}
I can get a total count of records per user, but I lose the list of domains:
cat sample.json | jq -s 'group_by(.machine) | map({machine:.[0].machine,domain:.[0].domain, count:length}) '
[{"machine": "possible_victim01", "domain": "evil.com", "count": 3},
{"machine": "possible_victim02", "domain": "bad.com", "count": 1},
{"machine": "possible_victim03", "domain": "soevil.com", "count": 1}]
The post "JQ Aggregations and Crosstabs" describes how to solve the second half of the problem. I haven't found anything yet that describes the first half, getting to:
{"machine": "possible_victim01", "domain": "evil.com", "count":2}
{"machine": "possible_victim01", "domain": "soevil.com", "count":1}
{"machine": "possible_victim02", "domain": "bad.com", "count":1}
{"machine": "possible_victim03", "domain": "soevil.com", "count":1}
You need to use group_by twice: once to group by the machine name, and then a sub-grouping to get the sub-counts for each domain.
jq query:
Example output:
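A minimal sketch of the double group_by described above, assuming the records live in sample.json as newline-delimited JSON (as in the question) and jq 1.5 or later is installed. The output key names total and domains, and the shell variables summary and flat, are illustrative choices rather than anything taken from the original answer.

```shell
# Recreate the sample records from the question (one JSON object per line).
cat > sample.json <<'EOF'
{"machine": "possible_victim01", "domain": "evil.com", "timestamp":1435071870}
{"machine": "possible_victim01", "domain": "evil.com", "timestamp":1435071875}
{"machine": "possible_victim01", "domain": "soevil.com", "timestamp":1435071877}
{"machine": "possible_victim02", "domain": "bad.com", "timestamp":1435071877}
{"machine": "possible_victim03", "domain": "soevil.com", "timestamp":1435071879}
EOF

# -s slurps every record into one array; -c prints one compact object per line.
# Outer group_by: one group per machine. Inner group_by: one group per domain
# within that machine, turned into a {domain: count} object by from_entries.
summary=$(jq -c -s '
  group_by(.machine)[]
  | {
      machine: .[0].machine,
      total: length,
      domains: (group_by(.domain)
                | map({key: .[0].domain, value: length})
                | from_entries)
    }
' sample.json)
printf '%s\n' "$summary"

# Flat variant (one row per machine/domain pair): group on both fields at once.
flat=$(jq -c -s '
  group_by([.machine, .domain])[]
  | {machine: .[0].machine, domain: .[0].domain, count: length}
' sample.json)
printf '%s\n' "$flat"
```

With the sample data, the first command should print:

```
{"machine":"possible_victim01","total":3,"domains":{"evil.com":2,"soevil.com":1}}
{"machine":"possible_victim02","total":1,"domains":{"bad.com":1}}
{"machine":"possible_victim03","total":1,"domains":{"soevil.com":1}}
```

The second command gives the flat machine/domain/count rows mentioned in the question. If only the list of unique domains is needed, replacing the inner expression with (map(.domain) | unique) should be enough.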