How to handle arrays with QueryRecord?

1.7k views Asked by At

I'm working in Apache NiFi and I've a question: how to handle nested arrays in JSON with QueryRecord processor? For example I've a JSON:

{
   "offerName":"Viatti Strada Asimmetrico V-130 205/55 R16 91V",
   "detailedStats":[
      {
         "type":"mobile",
         "clicks":4,
         "spending":"2.95"
      }
   ]
}

How can I extract array to get the following result:

{
  "offerName": "Viatti Strada Asimmetrico V-130 205/55 R16 91V",
  "type": "mobile",
  "clicks": 4,
  "spending": "2.95"
}

I read about RPATH, but didnt find good examples.

Tried with:

SELECT RPATH(detailedStats, '/detailedStats[1]')
  FROM flowfile

But it throws error. How can i get expected result with RPATH?

2

There are 2 answers

1
Barbaros Özhan On

An alternative method might be adding a JoltTransformJSON processor with (shift type) specification, which's reached from the Advanced button of Settings tab, with the following code

[
  {
    "operation": "shift",
    "spec": {
      "detailedStats": {
        "*": {
          "@(2,offerName)": "offerName",
          "*": "&"          
        }
      }
    }
  }
]

in order to extract your desired result.

0
yaprak On

You can select like below via QueryRecord . However it seems you are having an issue while writing. I used JsonRecordSetWriter with Inherent Record Schema. this is a good tutorial If you prefer avro schema

SELECT offerName,
       RPATH_STRING(detailedStats, '/type') type,
       RPATH_INT(detailedStats, '/clicks') clicks,
       RPATH_STRING(detailedStats, '/spending') spending
  FROM FLOWFILE

result is an array, so you should split it with $.* at the downstream