Details¶
- support for simple lists as mapping keys by transforming these to tuples
!!omap
generates ordereddict (C) on Python 2, collections.OrderedDict on Python 3, and!!omap
is generated for these types.- Tests whether the C yaml library is installed as well as the header
files. That library doesn\'t generate CommentTokens, so it cannot be
used to do round trip editing on comments. It can be used to speed
up normal processing (so you don\'t need to install
ruamel.yaml
andPyYaml
). See the section Optional requirements. - Basic support for multiline strings with preserved newlines and
chomping ( \'
|
\', \'|+
\', \'|-
\' ). As this subclasses the string type the information is lost on reassignment. (This might be changed in the future so that the preservation/folding/chomping is part of the parent container, like comments). - anchors names that are hand-crafted (not of the form
idNNN
) are preserved - merges in dictionaries are preserved
- adding/replacing comments on block-style sequences and mappings with smart column positioning
- collection objects (when read in via RoundTripParser) have an
lc
property that contains line and column infolc.line
andlc.col
. Individual positions for mappings and sequences can also be retrieved (lc.key('a')
,lc.value('a')
resp.lc.item(3)
) - preservation of whitelines after block scalars. Contributed by Sam Thursfield.
In the following examples it is assumed you have done something like::
from ruamel.yaml import YAML
yaml = YAML()
if not explicitly specified.
Indentation of block sequences¶
Although ruamel.yaml doesn\'t preserve individual indentations of block sequence items, it does properly dump:
x:
- b: 1
- 2
back to:
x:
- b: 1
- 2
if you specify yaml.indent(sequence=4)
(indentation is counted to the
beginning of the sequence element).
PyYAML (and older versions of ruamel.yaml) gives you non-indented scalars (when specifying default_flow_style=False):
x:
- b: 1
- 2
You can use mapping=4
to also have the mappings values indented. The
dump also observes an additional offset=2
setting that can be used to
push the dash inwards, within the space defined by sequence
.
The above example with the often seen
yaml.indent(mapping=2, sequence=4, offset=2)
indentation:
x:
y:
- b: 1
- 2
The defaults are as if you specified
yaml.indent(mapping=2, sequence=2, offset=0)
.
If the offset
equals sequence
, there is not enough room for the dash
and the space that has to follow it. In that case the element itself
would normally be pushed to the next line (and older versions of
ruamel.yaml did so). But this is prevented from happening. However the
indent
level is what is used for calculating the cumulative indent for
deeper levels and specifying sequence=3
resp. offset=2
, might give
correct, but counter intuitive results.
It is best to always have sequence >= offset + 2
but this is not
enforced. Depending on your structure, not following this advice
might lead to invalid output.
Inconsistently indented YAML¶
If your input is inconsistently indented, such indentation cannot be preserved. The first round-trip will make it consistent/normalize it. Here are some inconsistently indented YAML examples.
b
indented 3, c
indented 4 positions:
a:
b:
c: 1
Top level sequence is indented 2 without offset, the other sequence 4 (with offset 2):
- key:
- foo
- bar
Positioning \':\' in top level mappings, prefixing \':\'¶
If you want your toplevel mappings to look like:
library version: 1
comment : |
this is just a first try
then set yaml.top_level_colon_align = True
(and yaml.indent = 4
).
True
causes calculation based on the longest key, but you can also
explicitly set a number.
If you want an extra space between a mapping key and the colon specify
yaml.prefix_colon = ' '
:
- https://myurl/abc.tar.xz : 23445
# ^ extra space here
- https://myurl/def.tar.xz : 944
If you combine prefix_colon
with top_level_colon_align
, the top
level mapping doesn\'t get the extra prefix. If you want that anyway,
specify yaml.top_level_colon_align = 12
where 12
has to be an
integer that is one more than length of the widest key.
Document version support¶
In YAML a document version can be explicitly set by using:
%YAML 1.x
before the document start (at the top or before a ---
). For
ruamel.yaml
x has to be 1 or 2. If no explicit version is set version
1.2 is assumed (which has been
released in 2009).
The 1.2 version does not support:
- sexagesimals like
12:34:56
- octals that start with 0 only: like
012
for number 10 (0o12
is supported by YAML 1.2) - Unquoted Yes and On as alternatives for True and No and Off for False.
If you cannot change your YAML files and you need them to load as 1.1
you can load with yaml.version = (1, 1)
, or the equivalent (version
can be a tuple, list or string) yaml.version = "1.1"
If you cannot change your code, stick with ruamel.yaml==0.10.23 and let me know if it would help to be able to set an environment variable.
This does not affect dump as ruamel.yaml never emitted sexagesimals, nor octal numbers, and emitted booleans always as true resp. false
Round trip including comments¶
The major motivation for this fork is the round-trip capability for comments. The integration of the sources was just an initial step to make this easier.
adding/replacing comments¶
Starting with version 0.8, you can add/replace comments on block style collections (mappings/sequences resuting in Python dict/list). The basic for for this is:
from __future__ import print_function
import sys
import ruamel.yaml
yaml = ruamel.yaml.YAML() # defaults to round-trip
inp = """\
abc:
- a # comment 1
xyz:
a: 1 # comment 2
b: 2
c: 3
d: 4
e: 5
f: 6 # comment 3
"""
data = yaml.load(inp)
data['abc'].append('b')
data['abc'].yaml_add_eol_comment('comment 4', 1) # takes column of comment 1
data['xyz'].yaml_add_eol_comment('comment 5', 'c') # takes column of comment 2
data['xyz'].yaml_add_eol_comment('comment 6', 'e') # takes column of comment 3
data['xyz'].yaml_add_eol_comment('comment 7\n\n# that\'s all folks', 'd', column=20)
yaml.dump(data, sys.stdout)
Resulting in::
abc:
- a # comment 1
- b # comment 4
xyz:
a: 1 # comment 2
b: 2
c: 3 # comment 5
d: 4 # comment 7
# that's all folks
e: 5 # comment 6
f: 6 # comment 3
If the comment doesn\'t start with \'#\', this will be added. The key is the element index for list, the actual key for dictionaries. As can be seen from the example, the column to choose for a comment is derived from the previous, next or preceding comment column (picking the first one found).
Make sure that the added comment is correct, in the sense that when it
contains newlines, the following is either an empty line or a line with
only spaces, or the first non-space is a #
.
Config file formats¶
There are only a few configuration file formats that are easily readable and editable: JSON, INI/ConfigParser, YAML (XML is to cluttered to be called easily readable).
Unfortunately JSON doesn\'t support comments, and although there are some solutions with pre-processed filtering of comments, there are no libraries that support round trip updating of such commented files.
INI files support comments, and the excellent ConfigObj library by Foord and Larosa even supports round trip editing with comment preservation, nesting of sections and limited lists (within a value). Retrieval of particular value format is explicit (and extensible).
YAML has basic mapping and sequence structures as well as support for ordered mappings and sets. It supports scalars various types including dates and datetimes (missing in JSON). YAML has comments, but these are normally thrown away.
Block structured YAML is a clean and very human readable format. By extending the Python YAML parser to support round trip preservation of comments, it makes YAML a very good choice for configuration files that are human readable and editable while at the same time interpretable and modifiable by a program.
Extending¶
There are normally six files involved when extending the roundtrip capabilities: the reader, parser, composer and constructor to go from YAML to Python and the resolver, representer, serializer and emitter to go the other way.
Extending involves keeping extra data around for the next process step, eventuallly resulting in a different Python object (subclass or alternative), that should behave like the original, but on the way from Python to YAML generates the original (or at least something much closer).
Smartening¶
When you use round-tripping, then the complex data you get are already subclasses of the built-in types. So you can patch in extra methods or override existing ones. Some methods are already included and you can do:
yaml_str = """\
a:
- b:
c: 42
- d:
f: 196
e:
g: 3.14
"""
data = yaml.load(yaml_str)
assert data.mlget(['a', 1, 'd', 'f'], list_ok=True) == 196