How to write a JSON object from an R data frame with grouping

I often need to build JSON objects by folding multiple columns of a data frame into nested structures. As far as I know there is no direct way to do this; please point it out if there is.

I have data of this form:

A B C
1 a x
1 a y
1 c z
2 d p
2 f q
2 f r

How do I write JSON that looks like this:

{"query":"1", "type":[{"name":"a", "values":[{"value":"x"}, {"value":"y"}]}, {"name":"c", "values":[{"value":"z"}]}]}

and similarly for "query":"2"?

I want to write them out in the one-JSON-document-per-line format used by mongoimport/mongoexport. Any pointers are appreciated.

Best answer, by r2evans:

Your structure is a little unusual in that "values" is an array of single-key objects, each keyed "value". That is legal JSON, since each "value" key sits in its own object, and jsonlite parses it as follows:

(js <- jsonlite::fromJSON('{"query":"1", "type":[{"name":"a", "values":[{"value":"x"}, {"value":"y"}]}, {"name":"c", "values":[{"value":"z"}]}]}'))
## $query
## [1] "1"
## 
## $type
##   name values
## 1    a   x, y
## 2    c      z

... where the "values" column of the data.frame is a list, so each cell holds its own data.frame:

js$type$values[[1]]
##   value
## 1     x
## 2     y
class(js$type$values[[1]])
## [1] "data.frame"
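
As a quick check, the parsed object can be serialised again to see whether the nesting survives. This is a minimal sketch, assuming jsonlite is installed; `txt` and `out` are names introduced here for illustration only:

```r
library(jsonlite)

# The JSON from the question, with double quotes so that it parses.
txt <- '{"query":"1", "type":[{"name":"a", "values":[{"value":"x"}, {"value":"y"}]}, {"name":"c", "values":[{"value":"z"}]}]}'
js <- fromJSON(txt)

# Serialise the parsed object again; auto_unbox writes length-1
# atomic vectors (such as query) as scalars instead of arrays.
out <- toJSON(js, auto_unbox = TRUE)
print(out)

# Parse the result once more and compare it with the first parse.
identical(fromJSON(out), js)
```

If the comparison holds, a list column of one-column data.frames is the R shape that corresponds to an array of {"value": ...} objects.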

If you can accept "values" being a plain array of strings instead of an array of {"value": ...} objects, then perhaps the following code will suffice:

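# `dat` is the question's data frame, with columns A, B and C.
# Outer lapply: one list per distinct A; inner lapply: one list per distinct B within that A.
# Without auto_unbox = TRUE, length-1 vectors are written as one-element arrays.
jsonlite::toJSON(lapply(unique(dat[, 'A']), function(a1) {
    list(query = a1, 
         type = lapply(unique(dat[dat$A == a1, 'B']),  function(b2) {
             list(name = b2,
                  # all C values for this (A, B) pair, as a plain vector
                  values = dat[(dat$A == a1) & (dat$B == b2), 'C'])
         }))
}))
## [{"query":[1],"type":[{"name":["a"],"values":["x","y"]},{"name":["c"],"values":["z"]}]},{"query":[2],"type":[{"name":["d"],"values":["p"]},{"name":["f"],"values":["q","r"]}]}]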
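
To get the exact nesting from the question, with each value wrapped as {"value": ...} and one document per line for mongoimport, one option is to make `values` a one-column data.frame and pass `auto_unbox = TRUE`. The following is a sketch under those assumptions, not a tested import pipeline; `docs`, `sub_a` and `json_lines` are illustrative names, and `dat` is rebuilt here from the question's table:

```r
library(jsonlite)

# The question's data, rebuilt so the sketch is self-contained.
dat <- data.frame(
  A = c(1, 1, 1, 2, 2, 2),
  B = c("a", "a", "c", "d", "f", "f"),
  C = c("x", "y", "z", "p", "q", "r"),
  stringsAsFactors = FALSE
)

# Build one nested list per distinct value of A.
docs <- lapply(unique(dat$A), function(a1) {
  sub_a <- dat[dat$A == a1, ]
  list(
    query = as.character(a1),
    type = lapply(unique(sub_a$B), function(b2) {
      list(
        name = b2,
        # A one-column data.frame is written as an array of {"value": ...} objects.
        values = data.frame(value = sub_a$C[sub_a$B == b2],
                            stringsAsFactors = FALSE)
      )
    })
  )
})

# Serialise each document separately so that every element is one JSON line.
json_lines <- vapply(
  docs,
  function(d) as.character(toJSON(d, auto_unbox = TRUE)),
  character(1)
)

writeLines(json_lines)
## {"query":"1","type":[{"name":"a","values":[{"value":"x"},{"value":"y"}]},{"name":"c","values":[{"value":"z"}]}]}
## {"query":"2","type":[{"name":"d","values":[{"value":"p"}]},{"name":"f","values":[{"value":"q"},{"value":"r"}]}]}
```

Passing a file name as the second argument, for example writeLines(json_lines, "out.json") with a hypothetical file name, writes the same lines to disk.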