Cannot read a JSON file encoded in UCS-2 Little Endian with Pandas

with open(filename + '.json') as json_file:
    data = pd.io.json.read_json(json_file, encoding='utf_16_be')

I tried multiple encoding options, but they all fail: the call returns an empty object. The conversion only works after I save the file in Notepad++ as UTF-8 without BOM. I can then open it normally with the default encoding:

with open(filename + '.json') as json_file:
    data = pd.io.json.read_json(json_file)

The file's original encoding is UCS-2 Little Endian. How can I read JSON with this encoding?

There is 1 answer

JosefZ

Read and follow the built-in documentation, which you can display with import pandas as pd; help(pd.io.json.read_json). The following (partially commented) code snippet could help:

filename = r"D:\PShell\DataFiles\61571258" # my test case

import pandas as pd

filepath = filename + ".json"

# define encoding while opening a file 
with open(filepath, encoding='utf-16') as f:
    data = pd.io.json.read_json(f)

# or open file in binary mode and decode while converting to pandas object
with open(filepath, mode='rb') as f:
    atad = pd.io.json.read_json(f, encoding='utf-16')

# check that both methods above give the same result
print((data == atad).values)

Output: .\SO\69537408.py

[[ True True True True True True True]]
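
If it is unclear which encoding a file really uses, the byte order mark (BOM) at its start can be inspected before choosing the encoding argument. The following is only a minimal sketch using the standard library: the helper name sniff_bom_encoding, the sample records, and the temporary file are assumptions made for illustration, and a file saved without a BOM falls back to the given default.

```python
import codecs
import json
import os
import tempfile


def sniff_bom_encoding(path, default="utf-8"):
    """Hypothetical helper: guess a codec name from a leading BOM, if any."""
    with open(path, "rb") as f:
        head = f.read(4)
    # Test UTF-32 first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"  # this codec reads the BOM and picks the byte order
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return default


# Build a small test file: UTF-16 little endian with a BOM (made-up sample data).
sample = [{"id": 1, "name": "Zoë"}, {"id": 2, "name": "Łukasz"}]
payload = json.dumps(sample, ensure_ascii=False).encode("utf-16-le")
with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF16_LE + payload)
    sample_path = tmp.name

# Detect the encoding, then decode the file with it while parsing the JSON.
detected = sniff_bom_encoding(sample_path)
with open(sample_path, encoding=detected) as f:
    records = json.load(f)

os.remove(sample_path)
print(detected, records)
```

The detected name can then be used in either of the two methods above, for example as open(filepath, encoding=detected).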