Convert NDJSON to JSON in Python


I need to convert NDJSON objects to JSON in Python. I see there is a library on pypi.org (ndjson 0.3.1), but I'm not able to use it. I want to convert this:

{"license":"mit","count":"1551711"}
{"license":"apache-2.0","count":"455316"}
{"license":"gpl-2.0","count":"376453"}

into this JSON:

[{
    "license": "mit",
    "count": "1551711"
},
{
    "license": "apache-2.0",
    "count": "455316"
},
{
    "license": "gpl-2.0",
    "count": "376453"
}]

Any help? Thank you.


There are 2 answers

Answer by Lenormju (best answer)

There is no need for a third-party library; the json module from Python's standard library suffices:

import json

# the content here could be read from a file instead
ndjson_content = """\
{"license":"mit","count":"1551711"}\n\
{"license":"apache-2.0","count":"455316"}\n\
{"license":"gpl-2.0","count":"376453"}\n\
"""

result = []

for ndjson_line in ndjson_content.splitlines():
    if not ndjson_line.strip():
        continue  # ignore empty lines
    json_line = json.loads(ndjson_line)
    result.append(json_line)

json_expected_content = [
    {"license": "mit", "count": "1551711"},
    {"license": "apache-2.0", "count": "455316"},
    {"license": "gpl-2.0", "count": "376453"}
]

print(result == json_expected_content)  # True
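
The comment in the snippet notes that the content could be read from a file instead. A minimal sketch of that variant follows. The helper name `ndjson_to_json` and its two path parameters are assumptions for illustration and are not part of the original answer:

```python
import json


def ndjson_to_json(src_path, dst_path):
    """Read an NDJSON file and write its records as one JSON array.

    The function name and both parameters are illustrative only.
    """
    records = []
    with open(src_path, "r", encoding="utf-8") as f_in:
        for line in f_in:
            if not line.strip():
                continue  # ignore empty lines
            records.append(json.loads(line))

    with open(dst_path, "w", encoding="utf-8") as f_out:
        # indent=4 pretty-prints the array; drop it for compact output
        json.dump(records, f_out, indent=4)
    return records
```

Calling `ndjson_to_json("data.ndjson", "data.json")` would write the converted file. Like the snippet above, this keeps every record in memory, so it only suits files that fit in RAM.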

Answer by Francois BAPTISTE

If you have to process data that is too large to fit in memory, you can use this script, which copies one line at a time:

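with open('data.ndjson', 'r') as f_in, open('data.json', 'w') as f_out:
    f_out.write('[')
    first = True
    for line in f_in:  # reads one line at a time, never the whole file
        line = line.strip()
        if not line:
            continue  # skip blank lines, which would produce invalid JSON
        if not first:
            f_out.write(',\n')  # separator goes before every record but the first
        f_out.write(line)
        first = False
    f_out.write(']')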
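
The script above copies lines verbatim, so a malformed line would end up in the output unchecked. Below is a sketch of a streaming variant that parses each line before writing it, while still holding only one record in memory at a time. The function name `stream_ndjson_to_json` and its parameters are assumptions for illustration:

```python
import json


def stream_ndjson_to_json(f_in, f_out):
    """Copy NDJSON records from f_in to f_out as a single JSON array.

    Both arguments are open text file objects (illustrative interface).
    Returns the number of records written.
    """
    count = 0
    f_out.write("[")
    for line_number, line in enumerate(f_in, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)  # validates the line
        except json.JSONDecodeError as exc:
            raise ValueError(f"invalid JSON on line {line_number}: {exc}") from exc
        if count:
            f_out.write(",\n")  # separator before every record but the first
        f_out.write(json.dumps(record))
        count += 1
    f_out.write("]")
    return count
```

It can be called with two open files, for example `with open('data.ndjson') as f_in, open('data.json', 'w') as f_out: stream_ndjson_to_json(f_in, f_out)`. Parsing and re-serializing each record costs some speed compared with copying raw lines, in exchange for failing early on bad input.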