YAML 1.2 directive with multiple documents doesn't work in unsafe mode


I'm trying to load a multi-document YAML config file like the following:

file:

%YAML 1.2
---
num_epochs: 1
---
num_epochs: 1

and the Python script is:

from pathlib import Path

from ruamel.yaml import YAML

yaml = YAML(typ='unsafe')
configs = yaml.load_all(Path(Experiment.config_file))
for config in configs:
    print(config)

when executed, it gives the following error:

ruamel.yaml.parser.ParserError: found incompatible YAML document
in "../MAML_tensorflow/experiment.yml", line 1, column 1

The file loads fine if I use the load_all function imported directly from the module. Is this expected behavior?

This is likely a bug, because setting the implementation flag to pure gives the correct parse result.

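from pathlib import Path

from ruamel.yaml import YAML

# pure=True selects the pure-Python implementation instead of the C-based one
yaml = YAML(typ='unsafe', pure=True)
configs = yaml.load_all(Path(Experiment.config_file))
for config in configs:
    print(config)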

while this does not, and gives the error above:

from pathlib import Path

from ruamel.yaml import YAML

yaml = YAML(typ='unsafe')
configs = yaml.load_all(Path(Experiment.config_file))
for config in configs:
    print(config)

Answer by Anthon (best answer)

This is, alas, the expected behaviour. ruamel.yaml is derived from PyYAML, which in turn relies on libyaml for fast C-based loading and dumping. PyYAML and libyaml were both developed by Kirill Simonov. Although PyYAML and libyaml support a few YAML 1.2 features (e.g. floats without a decimal point in their mantissa), the two "only" implement most of YAML 1.1.

Initially ruamel.yaml linked against libyaml to provide the fast loading and dumping, but for some time now it has shipped its own copy of the source, which makes it easier to create wheels (.whl), especially for Windows versions of Python.

The C source included in current (0.15.33) versions of ruamel.yaml is mostly unchanged from the code in libyaml. This is the reason why the round-trip parser has no equivalent C/non-pure version, and also why the pure versions of the safe/unsafe/base loaders accept YAML 1.2 while the non-pure (C-based) versions do not.
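As a workaround on versions where the C-based loader rejects the directive, one could inspect the stream first and only fall back to the pure implementation when it is needed. The following is a minimal sketch using only the standard library; the helper names yaml_directive_version and needs_pure_loader are hypothetical and not part of ruamel.yaml, and only the directive of the first document is checked.

```python
import re

# Matches a "%YAML <major>.<minor>" directive line, optionally followed by a comment.
_DIRECTIVE = re.compile(r"^%YAML[ \t]+(\d+)\.(\d+)(?:[ \t]+#.*)?$")


def yaml_directive_version(text):
    """Return (major, minor) from the first %YAML directive, or None.

    Hypothetical helper: only the lines before the first '---' marker are
    inspected, so directives of later documents are ignored.
    """
    for line in text.splitlines():
        if line.startswith("---"):
            break
        match = _DIRECTIVE.match(line.rstrip())
        if match:
            return int(match.group(1)), int(match.group(2))
    return None


def needs_pure_loader(text):
    """True when the stream explicitly declares YAML 1.2 or later."""
    version = yaml_directive_version(text)
    return version is not None and version >= (1, 2)


sample = "%YAML 1.2\n---\nnum_epochs: 1\n---\nnum_epochs: 1\n"
print(yaml_directive_version(sample))
print(needs_pure_loader(sample))
```

The boolean could then be passed on as the pure argument, e.g. YAML(typ='unsafe', pure=needs_pure_loader(text)), so that files without the directive still get the faster C-based loader.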

Of course this should at least be documented appropriately, but preferably the C code should be adapted to support round-tripping. While the C code is being overhauled for round-tripping, it will also be adapted to support YAML 1.2 for safe/unsafe/base loading.


In ruamel.yaml 0.15.62 the C reader/parser accepts %YAML 1.2 and the emitter allows dumping with that directive.

No actual code was changed, so the reader still handles Unicode newlines, octals, etc. the YAML 1.1 way. Dumping should be less problematic (e.g. the C-based dumper never dumped octals).
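To illustrate why the octal handling matters: in YAML 1.1 a plain scalar such as 010 is an octal integer (8), whereas in YAML 1.2 it is the decimal 10 and octals are written 0o10. The sketch below mimics that difference for unsigned decimal and octal scalars only. It is a simplified illustration with a hypothetical resolve_int function, not the resolver that ruamel.yaml or libyaml actually use.

```python
import re


def resolve_int(scalar, version):
    """Interpret a plain scalar as an int under simplified YAML rules.

    Hypothetical helper covering only unsigned decimal and octal forms.
    Returns None when the scalar is not an int under the given version.
    """
    if version == (1, 1):
        # YAML 1.1: a leading zero followed by octal digits means octal.
        if re.fullmatch(r"0[0-7]+", scalar):
            return int(scalar, 8)
        if re.fullmatch(r"0|[1-9][0-9]*", scalar):
            return int(scalar, 10)
        return None
    if version == (1, 2):
        # YAML 1.2: octal needs the explicit 0o prefix; leading zeros are decimal.
        if re.fullmatch(r"0o[0-7]+", scalar):
            return int(scalar[2:], 8)
        if re.fullmatch(r"[0-9]+", scalar):
            return int(scalar, 10)
        return None
    raise ValueError("unsupported YAML version: %r" % (version,))


for text in ("010", "0o10"):
    print(text, resolve_int(text, (1, 1)), resolve_int(text, (1, 2)))
```

So a file that starts with %YAML 1.2 but is read by a 1.1-style reader may still yield 8 for 010, which is worth keeping in mind when a config contains zero-padded numbers.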

(Ref. the test_load_cyaml_1_2() and test_dump_cyaml_1_2() tests in test_cyaml.py)