How to JSONL serialize sets in YAML style?


There are three well-known formats:

It is well-known, that

I want to serialize (and deserialize) a Python set (and possibly other objects) onto a single line (JSON or YAML, it does not matter), just as JSONL with a custom encoder/decoder would, but in a human-readable form (like repr() in Python; see the example below) that is, if possible, YAML-compliant. I would also like to keep all other functionality and avoid workarounds.

Do I have to write my own custom encoder, or is there a better existing solution (e.g. a parameter to yaml.dump() that produces one line)? What would be the most robust way to implement the example below?

E.g.:

data = [1, 2, 3, {'say', 'knights', 'ni', 'who'}, {'key': 'value'}, 3.14]
data_dumped = dump(data, ...)  # Magic dump function
print(data_dumped)
# Desired output: [1, 2, 3, !!set {knights: null, ni: null, say: null, who: null}, {"key": "value"}, 3.14]
data_loaded = load(data_dumped, ...)  # Magic load function
assert data == data_loaded

UPDATE: I have linked answers that showcase a monkey-patched JSONEncoder making set() (and other types) serializable using pickle. That output is not human-readable, so they do not answer this question. If such answers were acceptable without modification, this question would be a duplicate of the cited ones.

1 answer

dlazesz (best answer):

The "YAML printed on one line" format, analogous to JSONL, is YAML's flow style (also called inline syntax), and it can be produced with default_flow_style=True; however, this is poorly documented. The result is not JSON, but it is still standard-compliant YAML and does not require a custom encoder/decoder.

See the example:

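from yaml import dump, load, SafeLoader  # PyYAML
data = [1, 2, 3, {'say', 'knights', 'ni', 'who'}, {'key': 'value'}, 3.14]
# default_flow_style=True selects the inline (flow) syntax
data_dumped = dump(data, default_flow_style=True)
print(data_dumped)
# [1, 2, 3, !!set {knights: null, ni: null, say: null, who: null}, {key: value}, 3.14]
# Recent PyYAML versions require an explicit Loader; SafeLoader understands !!set
data_loaded = load(data_dumped, Loader=SafeLoader)
assert data == data_loaded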
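
To get closer to JSONL, where every record occupies exactly one line, the same call can be wrapped in a pair of small helpers. The following is only a sketch under a few assumptions: PyYAML is installed, dump_line and load_line are hypothetical helper names (not part of PyYAML), and the width argument is raised because PyYAML's default line width would otherwise wrap longer records across several lines. Some values (for example multi-line strings or a bare top-level scalar) may still be emitted over more than one line, so the helper refuses them rather than silently corrupting the file.

```python
import io

import yaml  # PyYAML (third-party)


def dump_line(record):
    """Serialize one record as a single line of flow-style YAML.

    Hypothetical helper, not part of PyYAML.
    """
    # width=float("inf") disables line wrapping of long flow collections.
    text = yaml.safe_dump(record, default_flow_style=True, width=float("inf"))
    line = text.rstrip("\n")
    if "\n" in line:
        # The emitter still needed several lines for this record.
        raise ValueError("record does not fit on one line: %r" % (record,))
    return line


def load_line(line):
    """Parse one line written by dump_line (hypothetical helper)."""
    return yaml.safe_load(line)


records = [
    [1, 2, 3, {"say", "knights", "ni", "who"}, {"key": "value"}, 3.14],
    {"tags": {"a", "b"}, "count": 2},
]

# Write one record per line, as JSONL does, into an in-memory "file".
buffer = io.StringIO()
for record in records:
    buffer.write(dump_line(record) + "\n")

# Read the records back line by line.
restored = [load_line(line) for line in buffer.getvalue().splitlines()]
assert restored == records
```

safe_dump and safe_load are used here instead of dump and load because sets are covered by the standard !!set tag, so the safe variants are enough for this data and avoid constructing arbitrary Python objects when reading untrusted files.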