Introducing programmatic editing of Hiera YAML files
by Alex Harvey
- Introduction
- Bulk updating of Hiera data
- Python and ruamel.yaml
- hiera-bulk-edit.py
- Installing the script
- What it does
- Recipes
Introduction
If you have ever maintained a complicated, multi-team deployment of Hiera, you have probably seen data keys repeated in flagrant violation of the Don’t Repeat Yourself principle.
To an extent, this is avoidable. It is possible to declare variables in Hiera and look them up from elsewhere in Hiera by calling the hiera function from within Hiera. It is also possible to define aliases in order to look up complex data from elsewhere within Hiera.
Meanwhile, the hiera_hash function can eliminate the need to repeat Hash keys at multiple levels of the hierarchy, although Puppet 3’s automatic parameter lookup will not return merged hash lookups.
On the other hand, many Puppet users don’t know about these features, and even when they do, tight project deadlines tempt the best of us to take shortcuts.
Bulk updating of Hiera data
The problem that arises can be stated as follows: Given many Hiera files, possibly in separate Git repos and maintained in separate teams, how would you update a similar block of Hiera data in all of these files?
I spent several hours on a Friday afternoon writing a simple Ruby script to double-check that I’d manually updated about 10 YAML files with changes to what were essentially the same data keys, and I wondered whether there was a better way.
Python and ruamel.yaml
To my surprise, I discovered that it is simply impossible to programmatically update human-edited YAML files in Ruby because its parser cannot preserve commenting and formatting.
Mike Pastore states in his comment at Ruby-Forums.com:
Most YAML libraries I’ve worked with don’t preserve formatting or comments. Some quick research turns up only one that does—and it’s for Python (ruamel.yaml). In my experience, YAML is great for human-friendly, machine-readable configuration files and not much else. It loses its allure the second you bring machine-writeability into the picture.
So to the Ruby community: someone needs to write a YAML parser that preserves commenting and formatting!
In the meantime, all power to Anthon van der Neut, who has forked the PyYAML project and solved a good 80% of the problem of preserving the commenting and formatting. He also proved to be incredibly helpful in answering questions about the parser on Stack Overflow, and in responding to bug reports.
hiera-bulk-edit.py
I realised that a script that could execute snippets of arbitrary Python code on the YAML files in memory would provide a powerful and flexible interface for bulk editing of Hiera files. In the remainder of the post, I’ll show how various data editing – and viewing – problems can be solved using my new tool.
Installing the script
To install the script, just clone my Git repository and install the Python dependencies with pip:
$ git clone https://github.com/alexharv074/hiera-bulk-edit
$ cd hiera-bulk-edit
$ pip install -r requirements.txt
And if you wish, copy it to some place like /usr/local/bin.
What it does
Usage
$ hiera-bulk-edit.py <paths> <code_file>.py
The script loops through the files specified in paths, and for each of these, loads the contents into a Python ruamel.ordereddict structure, which the user may regard as a normal dictionary (which is Python’s equivalent of a Ruby Hash). The Python code in code_file.py is then executed on that structure, and the modified structure is written back to disk.
Note that Bash Globbing and Brace Expansion are supported in paths.
The reader will also note some variables of importance:
hiera
The YAML data is stored in a dictionary called hiera.
f
The file name of the file that is currently being edited is stored in f.
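The mechanism described above can be sketched roughly as follows. This is not the script's actual source: apply_snippet is a hypothetical name, and a plain dictionary stands in for the structure that ruamel.yaml would load from disk.

```python
import sys


def apply_snippet(code, hiera, f):
    """Run a snippet of Python against one file's data and return the result.

    Hypothetical helper for illustration only.
    """
    # Expose the same names the recipes rely on: hiera, f and sys
    namespace = {'hiera': hiera, 'f': f, 'sys': sys}
    exec(code, namespace)
    # Read hiera back from the namespace in case the snippet rebound it
    return namespace['hiera']


snippet = """
if 'foo' not in hiera:
    hiera['foo'] = {}
hiera['foo']['bar'] = 'baz'
"""

# A plain dict stands in for the data loaded from a file called common.yaml
result = apply_snippet(snippet, {'existing': 1}, 'common.yaml')
print(result)
```

Reading hiera back out of the namespace matters for recipes that assign a new object to hiera rather than modifying it in place.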
Recipes
Adding a key
Here we add (or overwrite) the key ['foo']['bar'] in all files specified in paths.
import sys

try:
    # Create the parent key if this file does not have it yet
    if 'foo' not in hiera:
        hiera['foo'] = {}
    hiera['foo']['bar'] = {
        'key1': 'val1',
        'key2': 'val2',
    }
except:
    e = sys.exc_info()[0]
    print("Got %s when updating %s" % (e, f))
Deleting the key again
del hiera['foo']['bar']
del hiera['foo']
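When the same snippet runs over many files, some of them may not contain the key at all, and a bare del would then raise a KeyError. A more defensive variant might look like the sketch below, assuming the loaded structure behaves like a normal dictionary. The sample hiera value is only there so the example runs on its own; in the script it is provided for you.

```python
# Sample data so the sketch runs on its own; in the script, hiera is provided
hiera = {'foo': {'bar': {'key1': 'val1'}}, 'baz': 'qux'}

if 'foo' in hiera:
    # Remove ['foo']['bar'] if present, without failing when it is absent
    hiera['foo'].pop('bar', None)
    # Drop the parent as well, but only once nothing else is left under it
    if not hiera['foo']:
        del hiera['foo']

print(hiera)
```

Unlike the two del statements, this version keeps ['foo'] when it still holds other keys.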
Viewing a key
It is also possible to view keys as they appear in all files. In this example I also use the clint project to colour the output green, to make it easier to see:
import sys
from clint.textui import puts, colored, indent

# Only report files that define ssh_keys for ec2-user
if ('profile::base::users' in hiera
        and 'ec2-user' in hiera['profile::base::users']
        and 'ssh_keys' in hiera['profile::base::users']['ec2-user']):
    try:
        print("In %s:" % f)
        puts(colored.green("hiera['profile::base::users']['ec2-user']:"))
        with indent(4):
            puts(
                colored.green(
                    ruamel.yaml.round_trip_dump(hiera['profile::base::users']['ec2-user'])
                )
            )
    except:
        e = sys.exc_info()[0]
        print("Got %s when viewing %s" % (e, f))
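The chained membership tests above get long quickly for deeply nested keys. One way to shorten them is a small lookup helper, sketched here under the assumption that the loaded structure supports normal subscripting. The name dig and the sample data are hypothetical and are not part of the script.

```python
def dig(data, *keys):
    """Return the value at the nested key path, or None if any step is missing."""
    for key in keys:
        try:
            data = data[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, or an intermediate value that cannot be subscripted this way
            return None
    return data


# Sample data so the sketch runs on its own; in the script, hiera is provided
hiera = {'profile::base::users': {'ec2-user': {'ssh_keys': {'alex': 'ssh-rsa EXAMPLE'}}}}

ssh_keys = dig(hiera, 'profile::base::users', 'ec2-user', 'ssh_keys')
missing = dig(hiera, 'profile::base::users', 'root', 'ssh_keys')
print(ssh_keys)
print(missing)
```

A recipe could then start with a single test such as `if dig(hiera, 'profile::base::users', 'ec2-user', 'ssh_keys') is not None:`.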
Sorting all keys alphabetically
# http://stackoverflow.com/questions/39307956/insert-a-key-using-ruamel/39308307#39308307
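# Save the comments attached to the original mapping, if there are any
yaml_comment = getattr(hiera, '_yaml_comment', None)
# Rebuild the mapping with its keys in alphabetical order
hiera = ruamel.yaml.comments.CommentedMap(
    sorted(
        hiera.items(), key=lambda t: t[0]
    )
)
# Reattach the saved comments to the new mapping
if yaml_comment is not None:
    hiera._yaml_comment = yaml_comment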
Add a key with formatting
# http://stackoverflow.com/questions/39262556/preserve-quotes-and-also-add-data-with-quotes-in-ruamel
from ruamel.yaml.scalarstring import SingleQuotedScalarString, DoubleQuotedScalarString
hiera['foo'] = SingleQuotedScalarString('bar')
hiera['bar'] = DoubleQuotedScalarString('baz')