How to extract exact values from a json file with xidel?

Excuse my English, I am not a native speaker

I'm new to this so I don't know much

I am trying to extract some values from a JSON file with xidel, using the following command in Windows cmd, but it's not working:

xidel MyFile.json -e '$json//options/option/*[@option_id="D-ES"]/content_id'

The JSON file generally has three options: English, Spanish and Portuguese. I only want the values related to Spanish.

I want to extract the following values

"group_id": "******",                                       
"content_id": "******",                                     
"current_content": "*****",                                     
"option_id": "D-ES",                                                                        
"subtitle": *****,                                                                              
"id": "ES",                                     
"desc": "Español",

And put the extracted values as follows

"group_id"-"*****","content_id"-"*****","current_content"-"*****","option_id"-"D-ES"-"subtitle"- *****,"id"- "ES""desc"- "Español",

This is part of my json file

{
  "original": {
    "id": "ING",
    "desc": "Inglés"
  },
  "dubbed": "true",
  "subbed": "false",
  "options": {
    "option": [
      {
        "group_id": "922450",
        "content_id": "284951",
        "current_content": "false",
        "option_id": "D-ES",
        "audio": "ES",
        "subtitle": null,
        "option_name": "dubbed",
        "id": "ES",
        "desc": "Español",
        "label_short": "Dob. Español",
        "label_large": "Doblada al Español",
        "intro_start_time": null,
        "intro_finish_time": null,
      },
      {
        "group_id": "275495",
        "content_id": "243856",
        "current_content": "false",
        "option_id": "D-PT",
        "audio": "PT",
        "subtitle": null,
        "option_name": "dubbed",
        "id": "PT",
        "desc": "Portugués",
        "label_short": "Dob. Portugués",
        "label_large": "Doblada al Portugués",
        "intro_start_time": null,
        "intro_finish_time": null,
      },
      {
        "group_id": "248954",
        "content_id": "245238",
        "current_content": "false",
        "option_id": "O-EN",
        "audio": "ORIGINAL",
        "subtitle": null,
        "option_name": "original",
        "id": "EN",
        "desc": "Inglés",
        "label_short": "Id. Inglés",
        "label_large": "Idioma Original Inglés",
        "intro_start_time": null,
        "intro_finish_time": null,
      }
    ]
  }
}

What command should I use to extract the values related to Spanish?

There is 1 answer

Reino On BEST ANSWER
xidel MyFile.json -e '$json//options/option/*[@option_id="D-ES"]/content_id'
  • If you're using the Windows binary (cmd), you should swap the quotes: double quotes around the query, single quotes inside it. The quoting style above is for Linux.
  • To navigate the "option"-array use (option)() (or alternatively the XQuery 3.1 syntax option?*).
  • The @ is only necessary if your input is HTML/XML.

So the correct query would be -e "$json//options/(option)()[option_id='D-ES']/content_id".
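
As a cross-check of what that query selects, here is a minimal Python sketch of the same steps: go into "options", unbox the "option" array, keep the members whose option_id matches, and return their content_id. It is not part of the xidel solution. The function name content_ids and the trimmed SAMPLE data are made up for this illustration.

```python
import json

# Trimmed copy of the sample data from the question (only the fields used here).
SAMPLE = """
{
  "options": {
    "option": [
      {"group_id": "922450", "content_id": "284951", "option_id": "D-ES"},
      {"group_id": "275495", "content_id": "243856", "option_id": "D-PT"},
      {"group_id": "248954", "content_id": "245238", "option_id": "O-EN"}
    ]
  }
}
"""


def content_ids(doc, option_id):
    """Return the content_id of every option whose option_id matches."""
    # Iterating over the list plays the role of (option)() in the query;
    # the "if" plays the role of the [option_id='D-ES'] predicate.
    return [
        opt["content_id"]
        for opt in doc["options"]["option"]
        if opt.get("option_id") == option_id
    ]


doc = json.loads(SAMPLE)
print(content_ids(doc, "D-ES"))  # ['284951']
```

The result, 284951, is the content_id of the Spanish entry in the sample file.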

I want to extract the following values [...]

xidel -s MyFile.json -e "$json//options/(option)()[option_id='D-ES']/(group_id,content_id,current_content,option_id,subtitle,id,desc)"
922450
284951
false
D-ES
ES
Español

To include the attribute names I would do:

xidel -s MyFile.json -e "for $x in ('group_id','content_id','current_content','option_id','subtitle','id','desc') return $json//options/concat($x,' - ',(option)()[option_id='D-ES']($x))"
group_id - 922450
content_id - 284951
current_content - false
option_id - D-ES
subtitle - 
id - ES
desc - Español

If you really want the surrounding double quotes, you can just add them (escaped with a backslash) in the concat()-function...

-e ".../concat('\"',$x,'\"-\"',(option)()[option_id='D-ES']($x),'\"')"

...or you can use xidel's "Extended Strings" syntax...

-e ".../x'\"{$x}\"-\"{(option)()[option_id='D-ES']($x)}\"'"

...or use the XQuery entity reference for a double quote (&quot;)...

--xquery ".../x'&quot;{$x}&quot;-&quot;{(option)()[option_id='D-ES']($x)}&quot;'"

The output:

"group_id"-"922450"
"content_id"-"284951"
"current_content"-"false"
"option_id"-"D-ES"
"subtitle"-""
"id"-"ES"
"desc"-"Español"

And finally, to turn this sequence into a single line with the items separated by a comma:

xidel -s MyFile.json -e "join(for $x in ('group_id','content_id','current_content','option_id','subtitle','id','desc') return $json//options/x'\"{$x}\"-\"{(option)()[option_id='D-ES']($x)}\"',',')"
"group_id"-"922450","content_id"-"284951","current_content"-"false","option_id"-"D-ES","subtitle"-"","id"-"ES","desc"-"Español"

The final command/query, prettified:

-e "
  join(
    for $x in (
      'group_id','content_id','current_content',
      'option_id','subtitle','id','desc'
    ) return
    $json//options/x'\"{$x}\"-\"{(option)()[option_id='D-ES']($x)}\"',
    ','
  )
"