<h1>Data consistency testing in Puppet, Part III: Direct data assertions</h1>
<p>Alex Harvey, 2020-04-25, <a href="https://alexharv074.github.io//puppet/2020/04/25/data-consistency-testing-in-puppet-part-iii-direct-data-assertions">permalink</a></p>
<p>In this third and probably last part of this series, I look at using Rspec to make direct assertions about Hiera data. Usually, the purpose of these assertions is to work around design flaws in a code base that cannot easily be corrected.</p>
<ul id="markdown-toc">
<li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
<li><a href="#code-example" id="markdown-toc-code-example">Code example</a></li>
<li><a href="#code-on-github" id="markdown-toc-code-on-github">Code on GitHub</a></li>
<li><a href="#what-are-we-testing-and-why" id="markdown-toc-what-are-we-testing-and-why">What are we testing and why</a></li>
<li><a href="#is-there-a-better-way" id="markdown-toc-is-there-a-better-way">Is there a better way</a></li>
<li><a href="#tests" id="markdown-toc-tests">Tests</a> <ul>
<li><a href="#yamllint" id="markdown-toc-yamllint">Yamllint</a> <ul>
<li><a href="#overview" id="markdown-toc-overview">Overview</a></li>
<li><a href="#rakefile" id="markdown-toc-rakefile">Rakefile</a></li>
<li><a href="#venvsh" id="markdown-toc-venvsh">venv.sh</a></li>
<li><a href="#yamllintyml" id="markdown-toc-yamllintyml">yamllint.yml</a></li>
<li><a href="#running-the-test" id="markdown-toc-running-the-test">Running the test</a></li>
</ul>
</li>
<li><a href="#rspec-assertions-about-the-data" id="markdown-toc-rspec-assertions-about-the-data">Rspec assertions about the data</a></li>
<li><a href="#assertions-against-nginx-docs" id="markdown-toc-assertions-against-nginx-docs">Assertions against Nginx docs</a></li>
<li><a href="#run-the-tests" id="markdown-toc-run-the-tests">Run the tests</a></li>
</ul>
</li>
<li><a href="#discussion" id="markdown-toc-discussion">Discussion</a></li>
<li><a href="#see-also" id="markdown-toc-see-also">See also</a></li>
</ul>
<h2 id="introduction">Introduction</h2>
<p>In my experience of infrastructure-as-code solutions, whether written in Puppet or anything else, operational usability issues remain no matter how clean the code, how many unit and integration tests there are, or how good the documentation is. Writing an infrastructure-as-code solution is not easy, and design flaws find their way in. In this post, I assume that your Hiera design is not perfect, and in particular, I assume that the single-source-of-truth (SSoT) principle has been violated.</p>
<h2 id="code-example">Code example</h2>
<p>The code example I use in this post comes from a modification of a code example I found online <a href="https://blog.serverdensity.com/deploying-nginx-with-puppet/">here</a> for deploying Nginx:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># common.yaml</span>
<span class="nn">---</span>
<span class="s">nginx::config::vhost_purge: </span><span class="no">true</span>
<span class="s">nginx::config::confd_purge: </span><span class="no">true</span>
<span class="s">nginx::nginx_vhosts:</span>
  <span class="s">'example.com'</span><span class="pi">:</span>
    <span class="na">ensure</span><span class="pi">:</span> <span class="s">present</span>
    <span class="na">rewrite_www_to_non_www</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">www_root</span><span class="pi">:</span> <span class="s">/srv/www/example.com/</span>
    <span class="na">try_files</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">$uri'</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">$uri/'</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">/index.php$is_args$args'</span>
<span class="s">nginx::nginx_locations:</span>
  <span class="s">'php'</span><span class="pi">:</span>
    <span class="na">ensure</span><span class="pi">:</span> <span class="s">present</span>
    <span class="na">vhost</span><span class="pi">:</span> <span class="s">example.com</span>
    <span class="na">location</span><span class="pi">:</span> <span class="s1">'</span><span class="s">~</span><span class="nv"> </span><span class="s">.php$'</span>
    <span class="na">www_root</span><span class="pi">:</span> <span class="s">/srv/www/example.com/</span>
    <span class="na">try_files</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">$uri'</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">/index.php</span><span class="nv"> </span><span class="s">=404'</span>
    <span class="na">location_cfg_append</span><span class="pi">:</span>
      <span class="na">fastcgi_split_path_info</span><span class="pi">:</span> <span class="s1">'</span><span class="s">^(.+\.php)(.*)$'</span>
      <span class="na">fastcgi_pass</span><span class="pi">:</span> <span class="s1">'</span><span class="s">php'</span>
      <span class="na">fastcgi_index</span><span class="pi">:</span> <span class="s1">'</span><span class="s">index.php'</span>
      <span class="na">fastcgi_param SCRIPT_FILENAME</span><span class="pi">:</span> <span class="s2">"</span><span class="s">/srv/www/example.com$fastcgi_script_name"</span>
      <span class="na">include</span><span class="pi">:</span> <span class="s1">'</span><span class="s">fastcgi_params'</span>
      <span class="na">fastcgi_param QUERY_STRING</span><span class="pi">:</span> <span class="s1">'</span><span class="s">$query_string'</span>
      <span class="na">fastcgi_param REQUEST_METHOD</span><span class="pi">:</span> <span class="s1">'</span><span class="s">$request_method'</span>
      <span class="na">fastcgi_param CONTENT_TYPE</span><span class="pi">:</span> <span class="s1">'</span><span class="s">$content_type'</span>
      <span class="na">fastcgi_param CONTENT_LENGTH</span><span class="pi">:</span> <span class="s1">'</span><span class="s">$content_length'</span>
      <span class="na">fastcgi_intercept_errors</span><span class="pi">:</span> <span class="s1">'</span><span class="s">on'</span>
      <span class="na">fastcgi_ignore_client_abort</span><span class="pi">:</span> <span class="s1">'</span><span class="s">off'</span>
      <span class="na">fastcgi_connect_timeout</span><span class="pi">:</span> <span class="s1">'</span><span class="s">60'</span>
      <span class="na">fastcgi_send_timeout</span><span class="pi">:</span> <span class="s1">'</span><span class="s">180'</span>
      <span class="na">fastcgi_read_timeout</span><span class="pi">:</span> <span class="s1">'</span><span class="s">180'</span>
      <span class="na">fastcgi_buffer_size</span><span class="pi">:</span> <span class="s1">'</span><span class="s">128k'</span>
      <span class="na">fastcgi_buffers</span><span class="pi">:</span> <span class="s1">'</span><span class="s">4</span><span class="nv"> </span><span class="s">256k'</span>
      <span class="na">fastcgi_busy_buffers_size</span><span class="pi">:</span> <span class="s1">'</span><span class="s">256k'</span>
      <span class="na">fastcgi_temp_file_write_size</span><span class="pi">:</span> <span class="s1">'</span><span class="s">256k'</span>
  <span class="s1">'</span><span class="s">server-status'</span><span class="pi">:</span>
    <span class="na">ensure</span><span class="pi">:</span> <span class="s">present</span>
    <span class="na">vhost</span><span class="pi">:</span> <span class="s">/srv/www/example.com/</span>
    <span class="na">location</span><span class="pi">:</span> <span class="s">/server-status</span>
    <span class="na">stub_status</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">location_cfg_append</span><span class="pi">:</span>
      <span class="na">access_log</span><span class="pi">:</span> <span class="s">off</span>
      <span class="na">allow</span><span class="pi">:</span> <span class="s">127.0.0.1</span>
      <span class="na">deny</span><span class="pi">:</span> <span class="s">all</span>
<span class="s">serverdensity_agent::plugin::nginx::nginx_status_url: "http://example.com/server-status"</span>
<span class="s">nginx::nginx_upstreams:</span>
  <span class="s">'php'</span><span class="pi">:</span>
    <span class="na">ensure</span><span class="pi">:</span> <span class="s">present</span>
    <span class="na">members</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s">unix:/var/run/php5-fpm.sock</span>
<span class="s">php::fpm: </span><span class="no">true</span>
<span class="s">php::fpm::settings:</span>
  <span class="s">PHP/short_open_tag</span><span class="pi">:</span> <span class="s1">'</span><span class="s">On'</span>
<span class="s">php::extensions:</span>
  <span class="s">json</span><span class="pi">:</span> <span class="pi">{}</span>
  <span class="na">curl</span><span class="pi">:</span> <span class="pi">{}</span>
  <span class="na">mcrypt</span><span class="pi">:</span> <span class="pi">{}</span>
<span class="s">php::fpm::pools:</span>
  <span class="s">'www'</span><span class="pi">:</span>
    <span class="na">listen</span><span class="pi">:</span> <span class="s">unix:/var/run/php5-fpm.sock</span>
    <span class="na">pm_status_path</span><span class="pi">:</span> <span class="s">/php-status</span>
</code></pre></div></div>
<h2 id="code-on-github">Code on GitHub</h2>
<p>The source code for this blog post is available online at GitHub <a href="https://github.com/alexharv074/data_consistency_part_iii">here</a>.</p>
<h2 id="what-are-we-testing-and-why">What are we testing and why</h2>
<p>The code above shows how to configure an Nginx vhost using Puppet. And as it stands, this code is fine and doesn’t really need to be tested any further if all the usual tests (e.g. end-to-end tests in Beaker) pass.</p>
<p>But what if this were to be the first of many Nginx vhosts, and the operational procedure were to copy this code and use it as the basis of new vhosts in the future? In this case, I can see this code being quite problematic. Here is what I think is going to happen:</p>
<ol>
<li>People are going to make YAML errors such as indentation errors, duplicate keys, and so on.</li>
<li>The vhost domain <code class="language-plaintext highlighter-rouge">example.com</code> appears in 7 different places in the code. People are going to forget to update some of these.</li>
<li>By exposing so many of Nginx’s configuration options, I expect that over time, a lot of invalid Nginx configurations will be accidentally set.</li>
</ol>
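<p>To get a feel for the second problem, the occurrences of the domain can be listed mechanically. The following is only a rough sketch: the helper name <code class="language-plaintext highlighter-rouge">paths_containing</code> is hypothetical, and the data is a trimmed-down copy of the Hiera file above, inlined so that the snippet runs on its own.</p>

```ruby
require "yaml"

# Hypothetical helper: walk parsed YAML and record every key or value
# that mentions the given string, together with the path leading to it.
def paths_containing(node, needle, path = [], found = [])
  case node
  when Hash
    node.each do |key, value|
      found << "#{(path + [key]).join(' -> ')} (key)" if key.to_s.include?(needle)
      paths_containing(value, needle, path + [key], found)
    end
  when Array
    node.each_with_index do |value, index|
      paths_containing(value, needle, path + [index], found)
    end
  when String
    found << path.join(" -> ") if node.include?(needle)
  end
  found
end

# Trimmed-down sample of the data above (an assumption for illustration).
SAMPLE = <<~YAML
  ---
  nginx::nginx_vhosts:
    'example.com':
      www_root: /srv/www/example.com/
  nginx::nginx_locations:
    'php':
      vhost: example.com
      www_root: /srv/www/example.com/
YAML

hits = paths_containing(YAML.load(SAMPLE), "example.com")
puts hits
```

<p>Every path printed is a place that someone copying the data for a new vhost has to remember to edit.</p>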
<h2 id="is-there-a-better-way">Is there a better way</h2>
<p>As for the duplication of the vhost domain in 7 places, there is almost always a better way to handle duplication than the method I am proposing in this post. In this case, we could refactor to add a key <code class="language-plaintext highlighter-rouge">vdomain</code> and replace each occurrence of the string <code class="language-plaintext highlighter-rouge">example.com</code> like this:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">vdomain</span><span class="pi">:</span> <span class="s">example.com</span>
<span class="s">nginx::nginx_vhosts:</span>
  <span class="s">"%{lookup('vdomain')}"</span><span class="pi">:</span>
    <span class="na">ensure</span><span class="pi">:</span> <span class="s">present</span>
    <span class="na">rewrite_www_to_non_www</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">www_root</span><span class="pi">:</span> <span class="s2">"</span><span class="s">/srv/www/%{lookup('vdomain')}/"</span>
    <span class="na">try_files</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">$uri'</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">$uri/'</span>
      <span class="pi">-</span> <span class="s1">'</span><span class="s">/index.php$is_args$args'</span>
</code></pre></div></div>
<p>And so on.</p>
<p>But this could be harder if you already have 5,000 vhosts! Then what?</p>
<p>The key point here is that the method I am proposing is often used as a workaround for design flaws. Fix those design flaws if you can. If you can’t, consider this method as way better than nothing.</p>
<h2 id="tests">Tests</h2>
<h3 id="yamllint">Yamllint</h3>
<h4 id="overview">Overview</h4>
<p>In my opinion, Yamllint should be run on any YAML files used in configuration management. Why? One reason alone makes it worthwhile: the dreaded duplicate key issue. A duplicate key is almost impossible to detect otherwise, and it can lead users to believe their configuration is A when it is in fact B! If this happens, you can easily lose days, or end up with bugs that no one can find.</p>
<p>At this time, I am unaware of any other Yamllint utility than the <a href="https://github.com/adrienverge/yamllint">Python-based version</a> by Adrien Vergé.</p>
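<p>To see why the duplicate key issue is so hard to spot, here is a minimal sketch of what Ruby’s YAML parser does with a duplicated key. The two-line document is made up for illustration. The parser raises no error and prints no warning, and the later value silently replaces the earlier one.</p>

```ruby
require "yaml"

# A made-up Hiera fragment in which the same key appears twice.
doc = <<~YAML
  ---
  php::fpm: true
  php::fpm: false
YAML

# No exception is raised here; the second value wins.
data = YAML.load(doc)
puts data.inspect
```

<p>Anyone who reads only the first occurrence in the file will believe that <code class="language-plaintext highlighter-rouge">php::fpm</code> is true.</p>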
<h4 id="rakefile">Rakefile</h4>
<p>To ensure that the installation of Yamllint itself is automatic and the whole thing is easy to use, I begin with two Rake tasks in Rakefile:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">desc</span> <span class="s1">'Install Yamllint'</span>
<span class="n">task</span> <span class="ss">:install_yamllint</span> <span class="k">do</span>
  <span class="n">sh</span> <span class="s1">'yamllint --version || bash venv.sh'</span>
<span class="k">end</span>

<span class="n">desc</span> <span class="s1">'Yamllint Hiera files'</span>
<span class="n">task</span> <span class="ss">:yamllint</span> <span class="o">=></span> <span class="ss">:install_yamllint</span> <span class="k">do</span>
  <span class="n">sh</span> <span class="s1">'yamllint -c yamllint.yml hieradata/*.yaml'</span>
<span class="k">end</span>
</code></pre></div></div>
<p>This refers to two other files that are also expected to exist: venv.sh, which installs Yamllint in a virtualenv, and yamllint.yml, Yamllint’s configuration file.</p>
<h4 id="venvsh">venv.sh</h4>
<p>This is a very simple shell script:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
virtualenv venv
<span class="nb">.</span> venv/bin/activate
pip <span class="nb">install </span>yamllint
</code></pre></div></div>
<h4 id="yamllintyml">yamllint.yml</h4>
<p>Yamllint’s configuration. Customise to your liking!</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">rules</span><span class="pi">:</span>
  <span class="na">braces</span><span class="pi">:</span>
    <span class="na">min-spaces-inside</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">max-spaces-inside</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">min-spaces-inside-empty</span><span class="pi">:</span> <span class="s">-1</span>
    <span class="na">max-spaces-inside-empty</span><span class="pi">:</span> <span class="s">-1</span>
  <span class="na">brackets</span><span class="pi">:</span>
    <span class="na">min-spaces-inside</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">max-spaces-inside</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">min-spaces-inside-empty</span><span class="pi">:</span> <span class="s">-1</span>
    <span class="na">max-spaces-inside-empty</span><span class="pi">:</span> <span class="s">-1</span>
  <span class="na">colons</span><span class="pi">:</span>
    <span class="na">max-spaces-before</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">max-spaces-after</span><span class="pi">:</span> <span class="m">1</span>
  <span class="na">commas</span><span class="pi">:</span>
    <span class="na">max-spaces-before</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">min-spaces-after</span><span class="pi">:</span> <span class="m">1</span>
    <span class="na">max-spaces-after</span><span class="pi">:</span> <span class="m">1</span>
  <span class="na">document-end</span><span class="pi">:</span> <span class="s">disable</span>
  <span class="na">document-start</span><span class="pi">:</span>
    <span class="na">level</span><span class="pi">:</span> <span class="s">error</span>
    <span class="na">present</span><span class="pi">:</span> <span class="no">true</span>
  <span class="na">empty-lines</span><span class="pi">:</span>
    <span class="na">max</span><span class="pi">:</span> <span class="m">1</span>
    <span class="na">max-start</span><span class="pi">:</span> <span class="m">0</span>
    <span class="na">max-end</span><span class="pi">:</span> <span class="m">0</span>
  <span class="na">empty-values</span><span class="pi">:</span>
    <span class="na">forbid-in-block-mappings</span><span class="pi">:</span> <span class="no">false</span>
    <span class="na">forbid-in-flow-mappings</span><span class="pi">:</span> <span class="no">false</span>
  <span class="na">hyphens</span><span class="pi">:</span>
    <span class="na">max-spaces-after</span><span class="pi">:</span> <span class="m">1</span>
  <span class="na">indentation</span><span class="pi">:</span>
    <span class="na">spaces</span><span class="pi">:</span> <span class="s">consistent</span>
    <span class="na">indent-sequences</span><span class="pi">:</span> <span class="no">true</span>
    <span class="na">check-multi-line-strings</span><span class="pi">:</span> <span class="no">false</span>
  <span class="na">key-duplicates</span><span class="pi">:</span> <span class="s">enable</span>
  <span class="na">key-ordering</span><span class="pi">:</span> <span class="s">disable</span>
  <span class="na">new-line-at-end-of-file</span><span class="pi">:</span> <span class="s">enable</span>
  <span class="na">new-lines</span><span class="pi">:</span>
    <span class="na">type</span><span class="pi">:</span> <span class="s">unix</span>
  <span class="na">octal-values</span><span class="pi">:</span>
    <span class="na">forbid-implicit-octal</span><span class="pi">:</span> <span class="no">false</span>
    <span class="na">forbid-explicit-octal</span><span class="pi">:</span> <span class="no">false</span>
  <span class="na">trailing-spaces</span><span class="pi">:</span> <span class="s">enable</span>
  <span class="na">truthy</span><span class="pi">:</span> <span class="s">disable</span>
</code></pre></div></div>
<h4 id="running-the-test">Running the test</h4>
<p>To run the Yamllint tests:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ bundle exec rake yamllint
yamllint --version || bash venv.sh
yamllint 1.11.1
yamllint -c yamllint.yml hieradata/*.yaml
</code></pre></div></div>
<p>What if I deliberately insert a duplicate key:</p>
<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">--- a/hieradata/common.yaml
</span><span class="gi">+++ b/hieradata/common.yaml
</span><span class="p">@@ -73,3 +73,13 @@</span> php::fpm::pools:
   'www':
     listen: unix:/var/run/php5-fpm.sock
     pm_status_path: /php-status
<span class="gi">+
+nginx::nginx_vhosts:
+  'example.com':
+    ensure: present
+    rewrite_www_to_non_www: true
+    www_root: /srv/www/example.com/
+    try_files:
+      - '$uri'
+      - '$uri/'
+      - '/index.php$is_args$args'
</span></code></pre></div></div>
<p>Run it again:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ bundle exec rake yamllint
yamllint --version || bash venv.sh
yamllint 1.11.1
yamllint -c yamllint.yml hieradata/*.yaml
hieradata/common.yaml
77:1 error duplication of key "nginx::nginx_vhosts" in mapping (key-duplicates)
rake aborted!
Command failed with status (1): [yamllint -c yamllint.yml hieradata/*.yaml...]
/Users/alexharvey/git/home/data_consistency_part_iii/Rakefile:10:in `block in <top (required)>'
/Users/alexharvey/.rvm/gems/ruby-2.4.1/gems/rake-13.0.1/exe/rake:27:in `<top (required)>'
/Users/alexharvey/.rvm/gems/ruby-2.4.1/bin/ruby_executable_hooks:24:in `eval'
/Users/alexharvey/.rvm/gems/ruby-2.4.1/bin/ruby_executable_hooks:24:in `<main>'
Tasks: TOP => yamllint
(See full trace by running task with --trace)
</code></pre></div></div>
<p>Never underestimate the usefulness of this test!</p>
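<p>If installing the Python tool is not an option, the duplicate key check on its own can be roughly approximated in Ruby by walking Psych’s parse tree instead of loading the data. This is a sketch under that assumption, not a description of how Yamllint works internally, and the function name <code class="language-plaintext highlighter-rouge">duplicate_keys</code> is hypothetical.</p>

```ruby
require "yaml"

# Hypothetical helper: collect keys that occur more than once within any
# single mapping. It works on the parse tree (YAML.parse), because the
# duplicates are already gone by the time YAML.load returns a Hash.
def duplicate_keys(node, found = [])
  case node
  when Psych::Nodes::Mapping
    seen = {}
    # A mapping's children are a flat list: key, value, key, value, ...
    node.children.each_slice(2) do |key, value|
      if key.is_a?(Psych::Nodes::Scalar)
        found << key.value if seen.key?(key.value)
        seen[key.value] = true
      end
      duplicate_keys(value, found)
    end
  when Psych::Nodes::Sequence, Psych::Nodes::Document
    node.children.each { |child| duplicate_keys(child, found) }
  end
  found
end

# Made-up sample with the same top-level key defined twice.
doc = <<~YAML
  ---
  nginx::nginx_vhosts:
    'example.com':
      ensure: present
  php::fpm: true
  nginx::nginx_vhosts:
    'example.com':
      ensure: absent
YAML

dupes = duplicate_keys(YAML.parse(doc))
puts dupes
```

<p>Yamllint remains the better choice where it is available, since it also reports line numbers and checks much more than duplicate keys.</p>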
<h3 id="rspec-assertions-about-the-data">Rspec assertions about the data</h3>
<p>But the real point of this post is making direct assertions about the data using Rspec. Here is the (hopefully easy to understand) Rspec code:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env ruby</span>
<span class="nb">require</span> <span class="s2">"spec_helper"</span>
<span class="nb">require</span> <span class="s2">"yaml"</span>
<span class="n">data</span> <span class="o">=</span> <span class="no">YAML</span><span class="p">.</span><span class="nf">load_file</span><span class="p">(</span><span class="s2">"hieradata/common.yaml"</span><span class="p">)</span>
<span class="n">describe</span> <span class="s2">"Nginx data"</span> <span class="k">do</span>
  <span class="n">data</span><span class="p">[</span><span class="s2">"nginx::nginx_vhosts"</span><span class="p">].</span><span class="nf">keys</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">vhost</span><span class="o">|</span>
    <span class="n">context</span> <span class="s2">"nginx::nginx_vhosts.</span><span class="si">#{</span><span class="n">vhost</span><span class="si">}</span><span class="s2">"</span> <span class="k">do</span>
      <span class="n">ref</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"nginx::nginx_vhosts"</span><span class="p">][</span><span class="n">vhost</span><span class="p">]</span>
      <span class="n">it</span> <span class="s2">"www_root"</span> <span class="k">do</span>
        <span class="n">expect</span><span class="p">(</span><span class="n">ref</span><span class="p">[</span><span class="s2">"www_root"</span><span class="p">]).</span><span class="nf">to</span> <span class="n">eq</span> <span class="s2">"/srv/www/</span><span class="si">#{</span><span class="n">vhost</span><span class="si">}</span><span class="s2">/"</span>
      <span class="k">end</span>
    <span class="k">end</span>
    <span class="n">context</span> <span class="s2">"nginx::nginx_locations.'php'"</span> <span class="k">do</span>
      <span class="n">ref</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"nginx::nginx_locations"</span><span class="p">][</span><span class="s2">"php"</span><span class="p">]</span>
      <span class="n">it</span> <span class="s2">"vhost"</span> <span class="k">do</span>
        <span class="n">expect</span><span class="p">(</span><span class="n">ref</span><span class="p">[</span><span class="s2">"vhost"</span><span class="p">]).</span><span class="nf">to</span> <span class="n">eq</span> <span class="n">vhost</span>
      <span class="k">end</span>
      <span class="n">it</span> <span class="s2">"www_root"</span> <span class="k">do</span>
        <span class="n">expect</span><span class="p">(</span><span class="n">ref</span><span class="p">[</span><span class="s2">"www_root"</span><span class="p">]).</span><span class="nf">to</span> <span class="n">eq</span> <span class="s2">"/srv/www/</span><span class="si">#{</span><span class="n">vhost</span><span class="si">}</span><span class="s2">/"</span>
      <span class="k">end</span>
      <span class="n">context</span> <span class="s2">"location_cfg_append"</span> <span class="k">do</span>
        <span class="n">inner_ref</span> <span class="o">=</span> <span class="n">ref</span><span class="p">[</span><span class="s2">"location_cfg_append"</span><span class="p">]</span>
        <span class="n">it</span> <span class="s2">"fastcgi_param SCRIPT_FILENAME"</span> <span class="k">do</span>
          <span class="n">expect</span><span class="p">(</span>
            <span class="n">inner_ref</span><span class="p">[</span><span class="s2">"fastcgi_param SCRIPT_FILENAME"</span><span class="p">]</span>
          <span class="p">).</span><span class="nf">to</span> <span class="n">eq</span> <span class="s2">"/srv/www/</span><span class="si">#{</span><span class="n">vhost</span><span class="si">}</span><span class="s2">$fastcgi_script_name"</span>
        <span class="k">end</span>
      <span class="k">end</span>
    <span class="k">end</span>
    <span class="n">context</span> <span class="s2">"nginx::nginx_locations.'server-status'"</span> <span class="k">do</span>
      <span class="n">ref</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"nginx::nginx_locations"</span><span class="p">][</span><span class="s2">"server-status"</span><span class="p">]</span>
      <span class="n">it</span> <span class="s2">"vhost"</span> <span class="k">do</span>
        <span class="n">expect</span><span class="p">(</span><span class="n">ref</span><span class="p">[</span><span class="s2">"vhost"</span><span class="p">]).</span><span class="nf">to</span> <span class="n">eq</span> <span class="s2">"/srv/www/</span><span class="si">#{</span><span class="n">vhost</span><span class="si">}</span><span class="s2">/"</span>
      <span class="k">end</span>
    <span class="k">end</span>
    <span class="n">context</span> <span class="s2">"serverdensity_agent::plugin::nginx::nginx_status_url"</span> <span class="k">do</span>
      <span class="n">it</span> <span class="k">do</span>
        <span class="n">expect</span><span class="p">(</span>
          <span class="n">data</span><span class="p">[</span><span class="s2">"serverdensity_agent::plugin::nginx::nginx_status_url"</span><span class="p">]</span>
        <span class="p">).</span><span class="nf">to</span> <span class="n">eq</span> <span class="s2">"http://</span><span class="si">#{</span><span class="n">vhost</span><span class="si">}</span><span class="s2">/server-status"</span>
      <span class="k">end</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
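<p>To illustrate the kind of mistake these assertions are meant to catch, here is a small sketch in plain Ruby, without Rspec. The data is hypothetical: a second vhost, <code class="language-plaintext highlighter-rouge">example.org</code>, has been cut by copying <code class="language-plaintext highlighter-rouge">example.com</code>, and the editor forgot to update <code class="language-plaintext highlighter-rouge">www_root</code>.</p>

```ruby
require "yaml"

# Hypothetical data: 'example.org' was copied from 'example.com' and its
# www_root was not updated.
data = YAML.load(<<~YAML)
  ---
  nginx::nginx_vhosts:
    'example.com':
      www_root: /srv/www/example.com/
    'example.org':
      www_root: /srv/www/example.com/
YAML

# Apply the same rule as the www_root assertion above to every vhost and
# keep the names of the vhosts that break it.
mismatches = data["nginx::nginx_vhosts"].reject do |vhost, config|
  config["www_root"] == "/srv/www/#{vhost}/"
end.keys

puts "Inconsistent www_root for: #{mismatches.join(', ')}"
```

<p>In the Rspec version, the same mistake would show up as a failing <code class="language-plaintext highlighter-rouge">www_root</code> example under the context for the new vhost.</p>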
<h3 id="assertions-against-nginx-docs">Assertions against Nginx docs</h3>
<p>What if I want to take this even further, and make assertions about Nginx <a href="http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#directives">directives</a> based on documentation? Let’s do that too:</p>
<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">--- a/spec/data_spec.rb
</span><span class="gi">+++ b/spec/data_spec.rb
</span><span class="p">@@ -33,6 +33,18 @@</span> describe "Nginx data" do
             inner_ref["fastcgi_param SCRIPT_FILENAME"]
           ).to eq "/srv/www/#{vhost}$fastcgi_script_name"
         end
<span class="gi">+
+        it "fastcgi_intercept_errors" do
+          expect(
+            ["on","off"].include?(inner_ref["fastcgi_intercept_errors"])
+          ).to be true
+        end
+
+        it "fastcgi_ignore_client_abort" do
+          expect(
+            ["on","off"].include?(inner_ref["fastcgi_ignore_client_abort"])
+          ).to be true
+        end
</span>       end
     end
</code></pre></div></div>
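<p>If many directives need checking, one option is to drive the checks from a table rather than writing one example per directive. The following is only a sketch: the names <code class="language-plaintext highlighter-rouge">RULES</code>, <code class="language-plaintext highlighter-rouge">ON_OFF</code>, <code class="language-plaintext highlighter-rouge">SIZE</code> and <code class="language-plaintext highlighter-rouge">invalid_directives</code> are hypothetical, and the size pattern is my own simplification rather than Nginx’s full syntax.</p>

```ruby
# Hypothetical rule table: directive name => predicate on the configured value.
ON_OFF = ->(value) { %w[on off].include?(value) }
# Simplified size syntax (an assumption): digits with an optional k or m suffix.
SIZE = ->(value) { value.to_s.match?(/\A\d+[kKmM]?\z/) }

RULES = {
  "fastcgi_intercept_errors"    => ON_OFF,
  "fastcgi_ignore_client_abort" => ON_OFF,
  "fastcgi_buffer_size"         => SIZE,
}

# Return the names of the directives that are present but fail their rule.
def invalid_directives(config, rules)
  rules.select { |name, valid| config.key?(name) && !valid.call(config[name]) }.keys
end

# Made-up location_cfg_append data with one deliberate mistake.
config = {
  "fastcgi_intercept_errors"    => "on",
  "fastcgi_ignore_client_abort" => "false", # deliberately wrong
  "fastcgi_buffer_size"         => "128k",
}

bad = invalid_directives(config, RULES)
puts bad
```

<p>Inside a spec, each entry in the table could generate its own <code class="language-plaintext highlighter-rouge">it</code> block. Note that Ruby’s YAML parser loads an unquoted <code class="language-plaintext highlighter-rouge">on</code> or <code class="language-plaintext highlighter-rouge">off</code> as a boolean, so a check like this would also flag values that are missing their quotes.</p>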
<h3 id="run-the-tests">Run the tests</h3>
<p>Running the tests is shown in the following screenshot:</p>
<p><img src="https://alexharv074.github.io//assets/run_tests.png" alt="Run the tests" /></p>
<p>Notice one of the cool things about the Rspec framework for testing nested YAML data is the way I can also easily nest the tests using contexts to create a nice, readable output like this.</p>
<h2 id="discussion">Discussion</h2>
<p>This post has introduced three layers of direct Hiera data testing that can be used in Puppet. In all cases, the tests take a bit of work to write in the first place but, after that, should be quite maintainable. The cost-benefit ratio will differ in each case. I daresay that the benefit of having the Yamllint layer of testing will always outweigh the cost of writing and maintaining it. It would only need to catch a single duplicate key to pay off the cost of setting it up. And because Yamllint is highly configurable, the tests can be made as pedantic or as forgiving as suits the personality of a team.</p>
<p>The direct assertions about YAML data keys are likely to be more contentious. Some will say this is an anti-pattern and that you should not directly test your data. I am not sure where that idea originated, but I have heard it said before. Obviously, I disagree. I expect that anyone who has an operational procedure for cutting new Nginx or similar configurations by copying and editing data will immediately find tests of the sort I have written here useful. That has been my experience where I have set these up for clients in the past: this layer of testing proved to be both useful and popular.</p>
<p>The third layer of making assertions against Nginx configuration documentation is probably taking things further than I would tend to myself, but I simply show what is possible. No one should test for the sake of it but it is good to know what is possible.</p>
<p>Finally, note well that while this post is ostensibly about Puppet, the methods shown here can be extended to any configuration management tool that uses YAML data files. I may at some point write a separate post showing how I have applied these methods in CloudFormation, Ansible and so on.</p>
<p>As always, I welcome feedback and discussion, so send me an email if you have comments.</p>
<h2 id="see-also">See also</h2>
<ul>
<li>Paul Hammond and Samantha Stoller, Jul 28 2016, <a href="https://slack.engineering/data-consistency-checks-e73261318f96">Data Consistency Checks</a> (Slack Engineering).</li>
</ul>
<h1>Why ERB should be preferred to Jinja2 for DevOps templating</h1>
<p>Alex Harvey, 2020-03-06, <a href="https://alexharv074.github.io//puppet/2020/03/06/why-erb-should-be-preferred-to-jinja2-for-devops-templating">permalink</a></p>
<p>The use of Jinja2 templating in DevOps has become a de facto standard after the popularisation of Ansible and Salt as configuration management tools and Python as a programming language. Jinja2 has largely displaced the earlier Ruby-based equivalent, ERB (Embedded Ruby), that was previously popular in Puppet and Chef.</p>
<p>In this post, I argue that Jinja2 has a number of flaws that make it not well-suited as a general purpose templating language.</p>
<ul id="markdown-toc">
<li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
<li><a href="#devops-tools-using-jinja2" id="markdown-toc-devops-tools-using-jinja2">DevOps tools using Jinja2</a></li>
<li><a href="#jinja-language-compared-to-erb" id="markdown-toc-jinja-language-compared-to-erb">Jinja language compared to ERB</a></li>
<li><a href="#a-very-incomplete-jinja-feature-comparison" id="markdown-toc-a-very-incomplete-jinja-feature-comparison">A (very incomplete) Jinja feature comparison</a></li>
<li><a href="#jinja2s-built-in-filters" id="markdown-toc-jinja2s-built-in-filters">Jinja2’s built-in filters</a></li>
<li><a href="#custom-jinja2-filters" id="markdown-toc-custom-jinja2-filters">Custom Jinja2 filters</a> <ul>
<li><a href="#custom-filters-in-ansible-and-salt" id="markdown-toc-custom-filters-in-ansible-and-salt">Custom filters in Ansible and Salt</a></li>
<li><a href="#comparing-regex_replace-in-ansible-and-salt" id="markdown-toc-comparing-regex_replace-in-ansible-and-salt">Comparing regex_replace in Ansible and Salt</a> <ul>
<li><a href="#ansible-version" id="markdown-toc-ansible-version">Ansible version</a></li>
<li><a href="#salt-version" id="markdown-toc-salt-version">Salt version</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#calling-the-shell" id="markdown-toc-calling-the-shell">Calling the shell</a></li>
<li><a href="#defining-functions-inline" id="markdown-toc-defining-functions-inline">Defining functions inline</a></li>
<li><a href="#multiline-code-blocks" id="markdown-toc-multiline-code-blocks">Multiline code blocks</a></li>
<li><a href="#white-space-control" id="markdown-toc-white-space-control">White space control</a></li>
<li><a href="#discussion" id="markdown-toc-discussion">Discussion</a></li>
<li><a href="#see-also" id="markdown-toc-see-also">See also</a></li>
</ul>
<h2 id="introduction">Introduction</h2>
<p>The Jinja2 template engine was inspired by Django and provides a Python-like language for securely generating HTML, XML, and other markup. Its benefits are said to be:</p>
<ul>
<li>sandboxed execution and optional automatic escaping for applications where security is important.</li>
<li>portability among Python versions.</li>
<li>elegance. “Jinja is beautiful”.</li>
</ul>
<div class="language-jinja highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">{%</span> <span class="k">extends</span> <span class="s2">"layout.html"</span> <span class="cp">%}</span>
<span class="cp">{%</span> <span class="k">block</span> <span class="nv">body</span> <span class="cp">%}</span>
<span class="nt"><ul></span>
<span class="cp">{%</span> <span class="k">for</span> <span class="nv">user</span> <span class="ow">in</span> <span class="nv">users</span> <span class="cp">%}</span>
<span class="nt"><li><a</span> <span class="na">href=</span><span class="s">"</span><span class="cp">{{</span> <span class="nv">user.url</span> <span class="cp">}}</span><span class="s">"</span><span class="nt">></span><span class="cp">{{</span> <span class="nv">user.username</span> <span class="cp">}}</span><span class="nt"></a></li></span>
<span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span>
<span class="nt"></ul></span>
<span class="cp">{%</span> <span class="k">endblock</span> <span class="cp">%}</span>
</code></pre></div></div>
<p>It is. And used as intended, as the templating layer of a web application, I have no doubt that it is a powerful, elegant tool, as advertised.</p>
<p>But is it good for code generation in general? Because in DevOps, Jinja2 is not used for generating HTML web pages, but for configuration files, YAML documents, human readable text, Markdown source code, and so on.</p>
<p>In this post I compare some of Jinja2’s features with ERB, and I argue that the community could do well to return to ERB.</p>
<h2 id="devops-tools-using-jinja2">DevOps tools using Jinja2</h2>
<p>Of the DevOps tools I am aware of, all of the following use Jinja2 as their templating language:</p>
<table>
<thead>
<tr>
<th>Tool</th>
<th>Year</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://docs.getpelican.com/en/stable/#">Pelican</a></td>
<td>2010</td>
<td>Static site generator</td>
</tr>
<tr>
<td><a href="https://www.saltstack.com">Salt</a></td>
<td>2011</td>
<td>Configuration management</td>
</tr>
<tr>
<td><a href="https://www.ansible.com">Ansible</a></td>
<td>2012</td>
<td>Configuration management</td>
</tr>
<tr>
<td><a href="https://github.com/cookiecutter/cookiecutter">Cookiecutter</a></td>
<td>2013</td>
<td>Project templating</td>
</tr>
<tr>
<td><a href="http://www.mkdocs.org/">MkDocs</a></td>
<td>2014</td>
<td>Static site generator</td>
</tr>
<tr>
<td><a href="https://github.com/Sceptre/sceptre">Sceptre</a></td>
<td>2017</td>
<td>Configuration management of CloudFormation</td>
</tr>
</tbody>
</table>
<p>This is a short list, and I am sure there are many more, but it shows how widely Jinja2 is used.</p>
<h2 id="jinja-language-compared-to-erb">Jinja language compared to ERB</h2>
<table>
<thead>
<tr>
<th>Feature</th>
<th>Jinja2</th>
<th>ERB</th>
</tr>
</thead>
<tbody>
<tr>
<td>Basic language</td>
<td>Small, Python-like DSL</td>
<td>Ruby</td>
</tr>
</tbody>
</table>
<p>Jinja2 is a basic, Python-like DSL, as mentioned, whereas Ruby in ERB is the real Ruby, a featureful, high-level programming language optimised for data and text processing.</p>
<p>Now, if your problem is securely generating web content, I have no opinion on Flask versus Ruby-on-Rails. I assume that Jinja2’s design is a good thing. Security is good and I am totally okay with fewer features in the interest of secure content.</p>
<p>But DevOps engineers are generally not using Jinja2 to generate secure web content. As already mentioned, it is used in configuration management to code generate human-readable text, Markdown documents, configuration files, YAML documents, and so on. This is true in tools like Cookiecutter and Sceptre and also Ansible and Salt. Here, a small, Python-like DSL appears to be a limitation rather than an advantage. Actually, a fairly accidental, arbitrary limitation.</p>
<h2 id="a-very-incomplete-jinja-feature-comparison">A (very incomplete) Jinja feature comparison</h2>
<p>If we take a step back, we might consider the history of other programming languages designed with text and code generation in mind. Some of the best known ones are sed (1974), AWK (1977), Perl (1987), and Ruby (1993).</p>
<p>The following table shows a list of basic text manipulation features that are missing in Jinja2:</p>
<table>
<thead>
<tr>
<th>Feature</th>
<th>Sed</th>
<th>AWK</th>
<th>Perl</th>
<th>Ruby/ERB</th>
<th>Jinja2</th>
</tr>
</thead>
<tbody>
<tr>
<td>Regex</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<td>Split function</td>
<td>No</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<td>Read files from disk</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<td>Define functions inline</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>No</td>
</tr>
<tr>
<td>Call external programs</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes</td>
<td>No</td>
</tr>
</tbody>
</table>
<p>It could be argued that the most basic feature of a tool for editing text and data is a regular expression engine. And yet Jinja2 does not have one. The lack of a split function is surprising.</p>
<p>It is obvious that Jinja2 is not designed to edit and manipulate text. Its design appears to assume that the caller has already prepared the text before the template is rendered.</p>
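<p>For contrast, here is a minimal sketch of what a split function and regular expressions look like inside an ERB template. The <code class="language-plaintext highlighter-rouge">render_host</code> helper, the template and the sample hostname are all hypothetical, and the sketch assumes Ruby 2.6 or later for the <code class="language-plaintext highlighter-rouge">trim_mode</code> keyword.</p>

```ruby
require 'erb'

# Hypothetical template: everything here is plain Ruby inside ERB tags.
HOST_TEMPLATE = <<~'EOT'
  <%- host, domain = fqdn.split('.', 2) -%>
  host: <%= host %>
  domain: <%= domain %>
  env: <%= fqdn[/-(dev|prod)\d*\./, 1] %>
  slug: <%= fqdn.gsub(/[^a-z0-9]+/, '_') %>
EOT

# Render the template; `fqdn` is visible to the template through `binding`.
def render_host(fqdn)
  ERB.new(HOST_TEMPLATE, trim_mode: '-').result(binding)
end

puts render_host('web-prod01.example.com')
```

<p>The split, the regex capture and the regex substitution are ordinary Ruby <code class="language-plaintext highlighter-rouge">String</code> methods, so nothing has to be added to the templating engine to get them.</p>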
<h2 id="jinja2s-built-in-filters">Jinja2’s built-in filters</h2>
<p>Aside from basic language features, Jinja2 has (at the time of writing) 50 built-in “filters”. These are documented <a href="https://jinja.palletsprojects.com/en/2.11.x/templates/#list-of-builtin-filters">here</a>. Some of them are useful for text manipulation, such as the center and wordwrap filters. But the list is short and leaves many gaps in functionality. As already mentioned, there is no split filter, for example.</p>
<h2 id="custom-jinja2-filters">Custom Jinja2 filters</h2>
<h3 id="custom-filters-in-ansible-and-salt">Custom filters in Ansible and Salt</h3>
<p>Users of Ansible and Salt may or may not realise that many of the filters they rely on are custom filters provided by Ansible and Salt respectively, rather than actual features of Jinja2.</p>
<p>Ansible’s filters are documented <a href="https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html">here</a> and, as can be seen, the list of custom filters is long. There are filters for text manipulation, data transformation, set theory, regular expressions, and so on. The length of the list really speaks to how limited Jinja2 itself is.</p>
<p>Salt’s similarly-long list of custom filters meanwhile is documented <a href="https://docs.saltstack.com/en/latest/topics/jinja/index.html#filters">here</a>.</p>
<h3 id="comparing-regex_replace-in-ansible-and-salt">Comparing regex_replace in Ansible and Salt</h3>
<p>Often, Ansible and Salt have chosen to implement similar filters with similar usage. Thus, both provide a <code class="language-plaintext highlighter-rouge">regex_replace</code> filter.</p>
<p>Let’s have a look at the source code for these filters respectively.</p>
<h4 id="ansible-version">Ansible version</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">regex_replace</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="s">''</span><span class="p">,</span> <span class="n">pattern</span><span class="o">=</span><span class="s">''</span><span class="p">,</span> <span class="n">replacement</span><span class="o">=</span><span class="s">''</span><span class="p">,</span> <span class="n">ignorecase</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">multiline</span><span class="o">=</span><span class="bp">False</span><span class="p">):</span>
<span class="n">value</span> <span class="o">=</span> <span class="n">to_text</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="n">errors</span><span class="o">=</span><span class="s">'surrogate_or_strict'</span><span class="p">,</span> <span class="n">nonstring</span><span class="o">=</span><span class="s">'simplerepr'</span><span class="p">)</span>
<span class="n">flags</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">ignorecase</span><span class="p">:</span>
<span class="n">flags</span> <span class="o">|=</span> <span class="n">re</span><span class="p">.</span><span class="n">I</span>
<span class="k">if</span> <span class="n">multiline</span><span class="p">:</span>
<span class="n">flags</span> <span class="o">|=</span> <span class="n">re</span><span class="p">.</span><span class="n">M</span>
<span class="n">_re</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="nb">compile</span><span class="p">(</span><span class="n">pattern</span><span class="p">,</span> <span class="n">flags</span><span class="o">=</span><span class="n">flags</span><span class="p">)</span>
<span class="k">return</span> <span class="n">_re</span><span class="p">.</span><span class="n">sub</span><span class="p">(</span><span class="n">replacement</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span>
</code></pre></div></div>
<h4 id="salt-version">Salt version</h4>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">regex_replace</span><span class="p">(</span><span class="n">txt</span><span class="p">,</span> <span class="n">rgx</span><span class="p">,</span> <span class="n">val</span><span class="p">,</span> <span class="n">ignorecase</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">multiline</span><span class="o">=</span><span class="bp">False</span><span class="p">):</span>
<span class="n">flag</span> <span class="o">=</span> <span class="mi">0</span>
<span class="k">if</span> <span class="n">ignorecase</span><span class="p">:</span>
<span class="n">flag</span> <span class="o">|=</span> <span class="n">re</span><span class="p">.</span><span class="n">I</span>
<span class="k">if</span> <span class="n">multiline</span><span class="p">:</span>
<span class="n">flag</span> <span class="o">|=</span> <span class="n">re</span><span class="p">.</span><span class="n">M</span>
<span class="n">compiled_rgx</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="nb">compile</span><span class="p">(</span><span class="n">rgx</span><span class="p">,</span> <span class="n">flag</span><span class="p">)</span>
<span class="k">return</span> <span class="n">compiled_rgx</span><span class="p">.</span><span class="n">sub</span><span class="p">(</span><span class="n">val</span><span class="p">,</span> <span class="n">txt</span><span class="p">)</span>
</code></pre></div></div>
<p>This code appears to have been copy/pasted from one tool to the other at some point, and both filters are thin wrappers around the Python <a href="https://docs.python.org/3/library/re.html#re.compile"><code class="language-plaintext highlighter-rouge">re.compile</code></a> function and the <code class="language-plaintext highlighter-rouge">sub</code> method of the pattern it returns.</p>
<p>It goes without saying that this situation is far from ideal. As a user of Sceptre and Cookiecutter, it is frustrating, to say the least, to search on Stack Overflow and find a solution to a problem that only works in Ansible. It must be frustrating when migrating from Ansible to Salt and vice versa too.</p>
<p>None of this is the fault of Jinja2, but it is a concern that Ansible and Salt appear to have invested in parallel development efforts directed to “fixing” the same limitations in Jinja2.</p>
<h2 id="calling-the-shell">Calling the shell</h2>
<p>Sometimes when doing code generation, it simply makes sense to call the shell, or sed, AWK or some other external program. This probably won’t make sense if you are generating HTML for a web site, but it might make sense if you are generating documentation from source code, for instance.</p>
<p>In this ERB example, I call an external Ruby script to auto-generate a Markdown table of contents:</p>
<div class="language-erb highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp"><%=</span> <span class="sx">%x{ruby erb/toc.rb erb/README.erb}</span> <span class="cp">-%></span>
</code></pre></div></div>
<p>But without writing a custom filter, this would be impossible in Jinja2.</p>
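<p>To show the mechanism end to end, here is a minimal, self-contained sketch that renders an ERB template containing a shell call. The <code class="language-plaintext highlighter-rouge">render_with_shell</code> helper and the <code class="language-plaintext highlighter-rouge">echo</code> command are illustrative assumptions, not part of the project above, and a POSIX shell is assumed to be available.</p>

```ruby
require 'erb'

# Hypothetical helper: the template shells out with %x{} while rendering.
def render_with_shell
  template = "Generated by: <%= %x{echo hello from the shell}.chomp %>\n"
  ERB.new(template).result(binding)
end

puts render_with_shell
```

<p>Anything the command writes to standard output ends up in the rendered text, which is all the table-of-contents example relies on.</p>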
<h2 id="defining-functions-inline">Defining functions inline</h2>
<p>Another feature of ERB that I use often is the ability to define a Ruby function inline to deal with repeated code. In this example, I define a function named <code class="language-plaintext highlighter-rouge">filter</code>. Note that my Ruby function then calls sed.</p>
<div class="language-erb highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp"><%</span>
<span class="c1"># A method to reformat the examples source</span>
<span class="c1"># code suitable for the public doc version.</span>
<span class="c1">#</span>
<span class="k">def</span> <span class="nf">filter</span><span class="p">(</span><span class="n">remote</span><span class="p">,</span> <span class="n">file_name</span><span class="p">)</span>
<span class="sx">%x[sed -E '
s!source( +)=.*!source</span><span class="se">\\</span><span class="sx">1= "</span><span class="si">#{</span><span class="n">remote</span><span class="si">}</span><span class="sx">"!
/variable "bucket_name"/ {
N
N
s/{.*}/{}/
}
' </span><span class="si">#{</span><span class="n">file_name</span><span class="si">}</span><span class="sx">]</span>
<span class="k">end</span>
<span class="n">remote</span> <span class="o">=</span> <span class="sx">%x{git remote -v}</span><span class="p">.</span><span class="nf">split</span><span class="p">(</span><span class="s2">"</span><span class="se">\n</span><span class="s2">"</span><span class="p">)[</span><span class="mi">0</span><span class="p">].</span><span class="nf">split</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="cp">-%></span>
</code></pre></div></div>
<p>Some might have concerns about using ERB to call Ruby to call sed. If so, I could rewrite that in pure Ruby in 10 minutes or so. In code generation, it is good to have options.</p>
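<p>As a rough idea of what such a rewrite could look like, here is a sketch of a pure-Ruby approximation of the sed script. The name <code class="language-plaintext highlighter-rouge">filter_text</code>, the sample input and the remote URL are hypothetical, and the function takes the file contents as a string rather than a file name. It is not guaranteed to match sed in every edge case.</p>

```ruby
# Pure-Ruby approximation of the sed script (hypothetical helper).
def filter_text(remote, text)
  # Mirrors: s!source( +)=.*!source\1= "<remote>"!
  rewritten = text.gsub(/source( +)=.*/) do
    "source#{Regexp.last_match(1)}= \"#{remote}\""
  end

  # Mirrors: /variable "bucket_name"/ { N; N; s/{.*}/{}/ }
  # Take the matching line plus the next two lines and collapse the braces.
  rewritten.gsub(/^.*variable "bucket_name".*\n.*\n.*$/) do |block|
    block.sub(/\{.*\}/m, '{}')
  end
end

sample = <<~'EOT'
  module "example" {
    source  = "../"
  }
  variable "bucket_name" {
    default = "my-bucket"
  }
EOT

puts filter_text('git@github.com:example/repo.git', sample)
```

<p>Whether this is better than calling sed is a matter of taste. The point is that both options are available from inside the template.</p>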
<h2 id="multiline-code-blocks">Multiline code blocks</h2>
<p>Notice in the above example how I have defined a function within a multi-line ERB tag. This is not possible in Jinja2. Consider this Jinja2 example:</p>
<div class="language-jinja highlighter-rouge"><div class="highlight"><pre class="highlight"><code>do_bootstrap() {
<span class="cp">{%</span> <span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="s2">"--kubelet-extra-args '--node-labels=nodegroup="</span> <span class="o">+</span> <span class="nv">node_group_name</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">if</span> <span class="nv">node_labels</span> <span class="o">!=</span> <span class="s2">"None"</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">","</span> <span class="o">+</span> <span class="nv">node_labels</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">endif</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">if</span> <span class="nv">cni_custom_network</span> <span class="o">==</span> <span class="s2">"Yes"</span> <span class="cp">%}</span>
zone=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
<span class="cp">{%</span><span class="o">-</span> <span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">",k8s.amazonaws.com/eniConfig=pod-netconfig-$zone"</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">endif</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">if</span> <span class="nv">taints</span> <span class="o">!=</span> <span class="s2">"None"</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">" --register-with-taints="</span> <span class="o">+</span> <span class="nv">taints</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">endif</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">"'"</span> <span class="o">-</span><span class="cp">%}</span>
eval "/etc/eks/bootstrap.sh ${EKSClusterName} <span class="cp">{{</span> <span class="nv">args</span> <span class="cp">}}</span>"
}
</code></pre></div></div>
<p>That code is quite unreadable, and it would be nice if Jinja2 allowed me to define multiline code inside its tags. Something like this hypothetical syntax, which is not valid Jinja2:</p>
<div class="language-jinja highlighter-rouge"><div class="highlight"><pre class="highlight"><code>do_bootstrap() {
<span class="cp">{%</span><span class="o">-</span>
<span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="s2">"--kubelet-extra-args '--node-labels=nodegroup="</span> <span class="o">+</span> <span class="nv">node_group_name</span>
<span class="k">if</span> <span class="nv">node_labels</span> <span class="o">!=</span> <span class="s2">"None"</span>
<span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">","</span> <span class="o">+</span> <span class="nv">node_labels</span>
<span class="k">endif</span>
<span class="k">if</span> <span class="nv">cni_custom_network</span> <span class="o">==</span> <span class="s2">"Yes"</span> <span class="cp">%}</span>
zone=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
<span class="cp">{%</span><span class="o">-</span> <span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">",k8s.amazonaws.com/eniConfig=pod-netconfig-$zone"</span>
<span class="k">endif</span>
<span class="k">if</span> <span class="nv">taints</span> <span class="o">!=</span> <span class="s2">"None"</span>
<span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">" --register-with-taints="</span> <span class="o">+</span> <span class="nv">taints</span>
<span class="k">endif</span>
<span class="k">set</span> <span class="nv">args</span> <span class="o">=</span> <span class="nv">args</span> <span class="o">+</span> <span class="s2">"'"</span>
<span class="cp">%}</span>
eval "/etc/eks/bootstrap.sh ${EKSClusterName} <span class="cp">{{</span> <span class="nv">args</span> <span class="cp">}}</span>"
}
</code></pre></div></div>
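<p>For comparison, here is a sketch of how the same argument-building logic might look in a single multi-line ERB tag. The <code class="language-plaintext highlighter-rouge">render_bootstrap</code> helper is hypothetical, missing values are assumed to be passed as <code class="language-plaintext highlighter-rouge">nil</code> rather than the string "None", and the <code class="language-plaintext highlighter-rouge">cni_custom_network</code> branch is left out to keep the sketch short.</p>

```ruby
require 'erb'

# Hypothetical ERB version: all of the logic sits in one multi-line tag.
BOOTSTRAP_TEMPLATE = <<~'EOT'
  do_bootstrap() {
  <%-
    labels = ["nodegroup=#{node_group_name}"]
    labels << node_labels unless node_labels.nil?
    args = "--kubelet-extra-args '--node-labels=#{labels.join(',')}"
    args += " --register-with-taints=#{taints}" unless taints.nil?
    args += "'"
  -%>
    eval "/etc/eks/bootstrap.sh ${EKSClusterName} <%= args %>"
  }
EOT

# Render with keyword arguments exposed to the template as local variables.
def render_bootstrap(node_group_name:, node_labels: nil, taints: nil)
  ERB.new(BOOTSTRAP_TEMPLATE, trim_mode: '-').result_with_hash(
    node_group_name: node_group_name,
    node_labels: node_labels,
    taints: taints
  )
end

puts render_bootstrap(node_group_name: 'workers', node_labels: 'role=db')
```

<p>The template body stays readable because the Ruby code is written as ordinary Ruby, one statement per line, without a pair of delimiters around each statement.</p>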
<h2 id="white-space-control">White space control</h2>
<p>In the default configuration, Jinja2’s white space control features are problematic, especially if you are code generating text to be read by humans, such as Markdown documentation, and you need full control of white space.</p>
<p>Consider the following block of code:</p>
<div class="language-jinja highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo:
bar: baz
<span class="cp">{%</span> <span class="k">if</span> <span class="nv">qux</span> <span class="ow">is</span> <span class="nb">defined</span> <span class="cp">%}</span>
qux:
<span class="cp">{%</span> <span class="k">for</span> <span class="nv">el</span> <span class="ow">in</span> <span class="nv">qux</span> <span class="cp">%}</span>
- <span class="cp">{{</span> <span class="nv">el</span> <span class="cp">}}</span>
<span class="cp">{%</span> <span class="k">endfor</span> <span class="cp">%}</span>
<span class="cp">{%</span> <span class="k">endif</span> <span class="cp">%}</span>
</code></pre></div></div>
<p>If qux contains quux and quuz, this code generates the following YAML. I reveal the white space using <code class="language-plaintext highlighter-rouge">sed -n l</code>.</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ sed -n l text
foo:$
bar: baz$
$
$
qux:$
$
- quux$
- quuz$
$
$
</code></pre></div></div>
<p>Of course, what I wanted is:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ sed -n l text
foo:$
bar: baz$
$
qux:$
- quux$
- quuz$
</code></pre></div></div>
<p>I could try this:</p>
<div class="language-jinja highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo:
bar: baz
<span class="cp">{%</span><span class="o">-</span> <span class="k">if</span> <span class="nv">qux</span> <span class="ow">is</span> <span class="nb">defined</span> <span class="cp">%}</span>
qux:
<span class="cp">{%</span><span class="o">-</span> <span class="k">for</span> <span class="nv">el</span> <span class="ow">in</span> <span class="nv">qux</span> <span class="cp">%}</span>
- <span class="cp">{{</span> <span class="nv">el</span> <span class="cp">}}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">endfor</span> <span class="cp">%}</span>
<span class="cp">{%</span><span class="o">-</span> <span class="k">endif</span> <span class="cp">%}</span>
</code></pre></div></div>
<p>And now I get this:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ sed -n l text
foo:$
bar: baz$
qux:$
- quux$
- quuz$
</code></pre></div></div>
<p>Notice that the newline between bar and qux has been gobbled up by the Jinja2 white space trim mode.</p>
<p>In ERB, this would not be a problem. This does what I want:</p>
<div class="language-erb highlighter-rouge"><div class="highlight"><pre class="highlight"><code>foo:
bar: baz
<span class="cp"><%-</span> <span class="k">unless</span> <span class="n">qux</span><span class="p">.</span><span class="nf">nil?</span> <span class="cp">%></span>
qux:
<span class="cp"><%-</span> <span class="n">qux</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">el</span><span class="o">|</span> <span class="cp">%></span>
- <span class="cp"><%=</span> <span class="n">el</span> <span class="cp">%></span>
<span class="cp"><%-</span> <span class="k">end</span> <span class="cp">%></span>
<span class="cp"><%-</span> <span class="k">end</span> <span class="cp">%></span>
</code></pre></div></div>
<p>In fairness, this behaviour in Jinja2 can be configured, although I have only seen Jinja2 used in its default configuration.</p>
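<p>The ERB behaviour can be checked outside of Puppet with a few lines of plain Ruby. This is a sketch under stated assumptions: the <code class="language-plaintext highlighter-rouge">render_yaml</code> helper is hypothetical, ERB is run with <code class="language-plaintext highlighter-rouge">trim_mode: '-'</code>, the closing tags are written as <code class="language-plaintext highlighter-rouge">-%&gt;</code>, and the blank line is placed inside the conditional. Those last two choices belong to this sketch and are not taken from the template above.</p>

```ruby
require 'erb'

# Hypothetical standalone template; `-%>` removes the newline after each tag.
YAML_TEMPLATE = <<~'EOT'
  foo:
    bar: baz
  <%- unless qux.nil? -%>

  qux:
  <%- qux.each do |el| -%>
    - <%= el %>
  <%- end -%>
  <%- end -%>
EOT

# Render with `qux` exposed to the template as a local variable.
def render_yaml(qux)
  ERB.new(YAML_TEMPLATE, trim_mode: '-').result_with_hash(qux: qux)
end

puts render_yaml(%w[quux quuz])
```

<p>Only the tags marked with a hyphen trim anything, so the blank line between bar and qux survives because it is written literally in the template.</p>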
<h2 id="discussion">Discussion</h2>
<p>This is really the tip of the iceberg. With full Ruby inside the templating engine, there is no limit on what can be done inside that template. Whereas in Jinja2, there is a quite severe and arbitrary limit.</p>
<p>I have summarised the main problems with Jinja2 that I personally encounter frequently:</p>
<ul>
<li>No ability to call the shell or other languages</li>
<li>No way to define functions</li>
<li>No multiline Jinja2 code</li>
<li>Inferior white space control</li>
<li>A small number of built-in functions</li>
<li>Confusion in forums like Stack Overflow as a result of Ansible’s and Salt’s custom set of filters.</li>
</ul>
<p>For all of the reasons given above, I do not believe that Jinja2 is a good choice for DevOps templating. If used as originally intended, as a tool for code generating HTML and other web front end markup, I regard Jinja2 as an elegant solution. But when used to code generate configuration files, human readable text, Markdown, YAML documents, and so on, ERB leads to far more productive templating.</p>
<p>This is not a small effect I am pointing to either. With Ruby in the template language, it is easy and fast to do most things. Without it, hours are frequently lost researching problems that simply have no solution.</p>
<p>Is it too late to go back?</p>
<h2 id="see-also">See also</h2>
<ul>
<li><a href="https://palletsprojects.com/p/jinja/">Jinja2</a> home page.</li>
</ul>Alex HarveyThe use of Jinja2 templating in DevOps has become a de facto standard after the popularisation of Ansible and Salt as configuration management tools and Python as a programming language. Jinja2 has largely displaced the earlier Ruby-based equivalent, ERB (Embedded Ruby), that was previously popular in Puppet and Chef.Adventures in the Terraform DSL, Part VIII: The Puppet provisioner2019-10-12T00:00:00+00:002019-10-12T00:00:00+00:00https://alexharv074.github.io//puppet/2019/10/12/adventures-in-the-terraform-dsl-part-viii-the-puppet-provisioner<p>This post discusses a proof of concept of the Terraform 0.12.2 Puppet provisioner.</p>
<ul id="markdown-toc">
<li><a href="#introduction" id="markdown-toc-introduction">Introduction</a></li>
<li><a href="#target-audience" id="markdown-toc-target-audience">Target audience</a></li>
<li><a href="#the-code" id="markdown-toc-the-code">The code</a></li>
<li><a href="#architecture" id="markdown-toc-architecture">Architecture</a></li>
<li><a href="#overview" id="markdown-toc-overview">Overview</a></li>
<li><a href="#usage" id="markdown-toc-usage">Usage</a> <ul>
<li><a href="#setting-up-puppet-bolt" id="markdown-toc-setting-up-puppet-bolt">Setting up Puppet Bolt</a></li>
<li><a href="#bolt-config" id="markdown-toc-bolt-config">Bolt config</a> <ul>
<li><a href="#boltyaml" id="markdown-toc-boltyaml">bolt.yaml</a></li>
</ul>
</li>
<li><a href="#puppetfile-contents" id="markdown-toc-puppetfile-contents">Puppetfile contents</a></li>
<li><a href="#the-terraform-code" id="markdown-toc-the-terraform-code">The Terraform code</a> <ul>
<li><a href="#maintf" id="markdown-toc-maintf">main.tf</a></li>
<li><a href="#about-the-puppet-master" id="markdown-toc-about-the-puppet-master">About the Puppet Master</a> <ul>
<li><a href="#user_data" id="markdown-toc-user_data">user_data</a></li>
<li><a href="#remote-exec-provisioner" id="markdown-toc-remote-exec-provisioner">Remote exec provisioner</a></li>
<li><a href="#default-ec2-key-pair" id="markdown-toc-default-ec2-key-pair">Default EC2 key pair</a></li>
</ul>
</li>
<li><a href="#the-amazon-linux-2-agent" id="markdown-toc-the-amazon-linux-2-agent">The Amazon Linux 2 agent</a></li>
<li><a href="#connection-type-ssh" id="markdown-toc-connection-type-ssh">Connection type SSH</a></li>
<li><a href="#the-windows-2012-node" id="markdown-toc-the-windows-2012-node">The Windows 2012 node</a></li>
</ul>
</li>
<li><a href="#the-puppet-code" id="markdown-toc-the-puppet-code">The Puppet code</a></li>
<li><a href="#running-it" id="markdown-toc-running-it">Running it</a></li>
<li><a href="#expected-output" id="markdown-toc-expected-output">Expected output</a></li>
</ul>
</li>
<li><a href="#discussion" id="markdown-toc-discussion">Discussion</a></li>
<li><a href="#see-also" id="markdown-toc-see-also">See also</a></li>
</ul>
<h2 id="introduction">Introduction</h2>
<p>In Terraform 0.12.2 a “basic Puppet provisioner” was added in pull request <a href="https://github.com/hashicorp/terraform/pull/18851">#18851</a>. The motivation for the provisioner is apparently to simplify installing, configuring and running Puppet Agents. And, since I am interested in both Terraform and Puppet, I decided to have a go at setting it up and doing a simple “hello world” with it. Also, I am fairly stubborn, and I even got it to work. This is the story of how I did it!</p>
<h2 id="target-audience">Target audience</h2>
<p>The post should help Puppet users who want to use the Terraform Puppet provisioner but it probably won’t help Terraform users much with Puppet. I assume the reader has a good understanding of Puppet, Puppet Bolt and Terraform.</p>
<h2 id="the-code">The code</h2>
<p>For readers who prefer to just go straight to the code, I have that all on GitHub <a href="https://github.com/alexharv074/terraform-puppet-provisioner-test">here</a>.</p>
<h2 id="architecture">Architecture</h2>
<p>The following diagram shows the main moving parts of the solution:</p>
<p><img src="https://alexharv074.github.io//assets/arch.jpg" alt="Puppet Terraform architecture" /></p>
<p>Here is a bit about some of these:</p>
<table>
<thead>
<tr>
<th>component</th>
<th>notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>Puppet Master<sup>1</sup></td>
<td>The Puppet Master a.k.a. Puppet Server in today’s Puppet. This is an open source Puppet 6 Puppet Server with the Autosign Ruby Gem installed.</td>
</tr>
<tr>
<td>Puppet Agent</td>
<td>The Puppet Agent node. In fact I have two of these, an Amazon Linux 2 Puppet Agent node and a Windows 2012 Puppet Agent node. Installing and running Puppet on these nodes is what the Puppet provisioner assists with.</td>
</tr>
<tr>
<td>Puppet Bolt</td>
<td>The Puppet provisioner requires Puppet Bolt to be installed on the machine running Terraform. Puppet Bolt tasks are called to autosign certificates on the Puppet Master and install Puppet on the Puppet Agent.</td>
</tr>
<tr>
<td>danieldreier/autosign</td>
<td>A Puppet module used by Puppet Bolt for autosigning Puppet agent Certificate Signing Requests. This and the following module are dependencies of the Terraform Puppet provisioner. They are managed outside of Terraform by Puppet Bolt and installed using <code class="language-plaintext highlighter-rouge">bolt puppetfile install</code>.</td>
</tr>
<tr>
<td>puppetlabs/puppet_agent</td>
<td>A Puppet module used by Puppet Bolt for managing Puppet Agent configuration.</td>
</tr>
</tbody>
</table>
<h2 id="overview">Overview</h2>
<p>The proof of concept code spins up a Puppet Master node, configures it using a UserData shell script, and then spins up an Amazon Linux 2 agent and a Windows 2012 agent in parallel and uses the Puppet provisioner to configure them both. And by “configure” I really just mean a simple Puppet manifest that prints “hello world” in the log. Why Windows 2012? That’s what I found in Tim Sharpe’s (the provisioner author’s) <a href="https://github.com/rodjek/terraform-puppet-example">test code</a>.</p>
<p>Under the hood, the Terraform Puppet provisioner calls Puppet Bolt twice, once to sign the certificate signing request on the Puppet Master as the agent comes up for the first time and a second time to install the Puppet agent software on the node.</p>
<h2 id="usage">Usage</h2>
<h3 id="setting-up-puppet-bolt">Setting up Puppet Bolt</h3>
<p>Perhaps the most surprising feature of the Terraform Puppet provisioner is the requirement to have Puppet Bolt already set up on the machine where you run Terraform. So the first thing my code does is provide a simple shell script called setup.sh that installs and configures Puppet Bolt and then installs the Bolt Modules. (It assumes Mac OS X.) Here is that script:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="k">if</span> <span class="o">!</span> <span class="nb">command</span> <span class="nt">-v</span> bolt <span class="p">;</span> <span class="k">then
</span>brew cask <span class="nb">install </span>puppetlabs/puppet/puppet-bolt
<span class="k">fi
</span><span class="nb">mkdir</span> <span class="nt">-p</span> ~/.puppetlabs/bolt/
<span class="o">(</span><span class="nb">cd </span>bolt <span class="o">&&</span> <span class="nb">cp</span> <span class="se">\</span>
bolt.yaml <span class="se">\</span>
Puppetfile <span class="se">\</span>
~/.puppetlabs/bolt/<span class="o">)</span>
bolt puppetfile <span class="nb">install</span>
</code></pre></div></div>
<p>This is fairly self-explanatory.</p>
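<p>A quick way to confirm that the script did its job is to check that the two copied files landed in Bolt's config directory. The following is only a sketch: the function name <code class="language-plaintext highlighter-rouge">check_bolt_files</code> is made up for illustration and is not part of setup.sh.</p>

```shell
# Report which of the files that setup.sh copies are missing from a Bolt
# config directory. Prints one missing file name per line; prints nothing
# and returns 0 when both files are in place.
# (check_bolt_files is a hypothetical helper, not part of setup.sh.)
check_bolt_files() {
  local dir="${1:-$HOME/.puppetlabs/bolt}"
  local status=0 f
  for f in bolt.yaml Puppetfile; do
    if [ ! -f "$dir/$f" ]; then
      echo "$f"
      status=1
    fi
  done
  return "$status"
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">check_bolt_files</code> with no arguments checks ~/.puppetlabs/bolt, the directory used by the script above.</p>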
<h3 id="bolt-config">Bolt config</h3>
<h4 id="boltyaml">bolt.yaml</h4>
<p>The bolt.yaml meanwhile has this content:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">modulepath</span><span class="pi">:</span> <span class="s2">"</span><span class="s">~/.puppetlabs/bolt-code/modules:~/.puppetlabs/bolt-code/site-modules"</span>
<span class="na">concurrency</span><span class="pi">:</span> <span class="m">10</span>
<span class="na">format</span><span class="pi">:</span> <span class="s">human</span>
<span class="na">ssh</span><span class="pi">:</span>
<span class="na">host-key-check</span><span class="pi">:</span> <span class="no">false</span>
<span class="na">user</span><span class="pi">:</span> <span class="s">ec2-user</span>
<span class="na">private-key</span><span class="pi">:</span> <span class="s">~/.ssh/default.pem</span>
</code></pre></div></div>
<p>As the reader will observe below, I have had to specify some of these SSH connection details twice - here, and also in Terraform. The inconsistency seems to be that I have to tell both Terraform and Bolt about the SSH user ec2-user but I can only tell Bolt about the private key here.</p>
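<p>Because the key path lives in two places, the two copies can drift apart. A minimal sketch of a consistency check follows. It assumes the <code class="language-plaintext highlighter-rouge">private-key</code> value is unquoted, as in the file above, and the function names are hypothetical.</p>

```shell
# Print the value of the first private-key setting found in a bolt.yaml file.
# Assumes the value is written unquoted, as in the bolt.yaml shown above.
bolt_private_key() {
  sed -n 's/^[[:space:]]*private-key:[[:space:]]*//p' "$1" | head -n 1
}

# Succeed only when Bolt and Terraform name the same key file.
# The second argument stands in for the value of Terraform's key_file variable.
keys_match() {
  local bolt_yaml="$1" tf_key_file="$2"
  [ "$(bolt_private_key "$bolt_yaml")" = "$tf_key_file" ]
}
```

<p>For example, <code class="language-plaintext highlighter-rouge">keys_match ~/.puppetlabs/bolt/bolt.yaml '~/.ssh/default.pem'</code> succeeds for the configuration in this post. The comparison is a plain string match, so the second argument is quoted to keep the shell from expanding the tilde.</p>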
<h3 id="puppetfile-contents">Puppetfile contents</h3>
<p>I will also say something about the Puppetfile. Puppet Bolt uses the Puppetfile to install the two Bolt modules that the provisioner depends upon, as mentioned above. Note that the Puppetfile actually points to the branch behind an unmerged pull request:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Modules from the Puppet Forge.</span>
<span class="n">mod</span> <span class="s1">'danieldreier/autosign'</span>
<span class="n">mod</span> <span class="s1">'puppetlabs/puppet_agent'</span><span class="p">,</span>
<span class="ss">:git</span> <span class="o">=></span> <span class="s1">'https://github.com/alexharv074/puppetlabs-puppet_agent.git'</span><span class="p">,</span>
<span class="ss">:ref</span> <span class="o">=></span> <span class="s1">'MODULES-9981-add_amazon_linux_2_support_to_install_task'</span>
</code></pre></div></div>
<p>At the time of writing, there was no support for Amazon Linux 2 in the puppetlabs/puppet_agent <code class="language-plaintext highlighter-rouge">puppet_agent::install</code> task. I have added some support, although I foresee some delays in getting it merged. Hopefully it will be merged soon, in which case this file would become:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">mod</span> <span class="s1">'danieldreier/autosign'</span>
<span class="n">mod</span> <span class="s1">'puppetlabs/puppet_agent'</span>
</code></pre></div></div>
<h3 id="the-terraform-code">The Terraform code</h3>
<h4 id="maintf">main.tf</h4>
<p>I have all my code in main.tf. The full contents of that file are:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">variable</span> <span class="dl">"</span><span class="s2">key_name</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">description</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">The name of the EC2 key pair to use</span><span class="dl">"</span>
<span class="k">default</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">default</span><span class="dl">"</span>
<span class="p">}</span>
<span class="nx">variable</span> <span class="dl">"</span><span class="s2">key_file</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">description</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">The private key for the ec2-user used in SSH connections and by Puppet Bolt</span><span class="dl">"</span>
<span class="k">default</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">~/.ssh/default.pem</span><span class="dl">"</span>
<span class="p">}</span>
<span class="nx">locals</span> <span class="p">{</span>
<span class="nx">linux_instance_type</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">t2.micro</span><span class="dl">"</span>
<span class="nx">windows_instance_type</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">t2.large</span><span class="dl">"</span>
<span class="p">}</span>
<span class="nx">data</span> <span class="dl">"</span><span class="s2">aws_ami</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">amazon_linux_2</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">owners</span> <span class="o">=</span> <span class="p">[</span><span class="dl">"</span><span class="s2">amazon</span><span class="dl">"</span><span class="p">]</span>
<span class="nx">most_recent</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">filter</span> <span class="p">{</span>
<span class="nx">name</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">name</span><span class="dl">"</span>
<span class="nx">values</span> <span class="o">=</span> <span class="p">[</span><span class="dl">"</span><span class="s2">amzn2-ami-hvm-*-x86_64-ebs</span><span class="dl">"</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nx">data</span> <span class="dl">"</span><span class="s2">aws_ami</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">windows_2012R2</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">most_recent</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">owners</span> <span class="o">=</span> <span class="p">[</span><span class="dl">"</span><span class="s2">amazon</span><span class="dl">"</span><span class="p">]</span>
<span class="nx">filter</span> <span class="p">{</span>
<span class="nx">name</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">name</span><span class="dl">"</span>
<span class="nx">values</span> <span class="o">=</span> <span class="p">[</span><span class="dl">"</span><span class="s2">Windows_Server-2012-R2_RTM-English-64Bit-Base-*</span><span class="dl">"</span><span class="p">]</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nx">data</span> <span class="dl">"</span><span class="s2">template_file</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">user_data</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">template</span> <span class="o">=</span> <span class="nx">file</span><span class="p">(</span><span class="dl">"</span><span class="s2">${path.module}/user_data/master.sh</span><span class="dl">"</span><span class="p">)</span>
<span class="p">}</span>
<span class="nx">data</span> <span class="dl">"</span><span class="s2">template_file</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">winrm</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">template</span> <span class="o">=</span> <span class="nx">file</span><span class="p">(</span><span class="dl">"</span><span class="s2">${path.module}/user_data/win_agent.xml</span><span class="dl">"</span><span class="p">)</span>
<span class="p">}</span>
<span class="nx">resource</span> <span class="dl">"</span><span class="s2">aws_instance</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">master</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">ami</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">aws_ami</span><span class="p">.</span><span class="nx">amazon_linux_2</span><span class="p">.</span><span class="nx">id</span>
<span class="nx">instance_type</span> <span class="o">=</span> <span class="nx">local</span><span class="p">.</span><span class="nx">linux_instance_type</span>
<span class="nx">key_name</span> <span class="o">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">key_name</span>
<span class="nx">user_data</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">template_file</span><span class="p">.</span><span class="nx">user_data</span><span class="p">.</span><span class="nx">rendered</span>
<span class="nx">provisioner</span> <span class="dl">"</span><span class="s2">remote-exec</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">on_failure</span> <span class="o">=</span> <span class="k">continue</span>
<span class="nx">inline</span> <span class="o">=</span> <span class="p">[</span>
<span class="dl">"</span><span class="s2">sudo sh -c 'while ! grep -q Cloud-init.*finished /var/log/cloud-init-output.log; do sleep 20; done'</span><span class="dl">"</span>
<span class="p">]</span>
<span class="nx">connection</span> <span class="p">{</span>
<span class="nx">host</span> <span class="o">=</span> <span class="nb">self</span><span class="p">.</span><span class="nx">public_ip</span>
<span class="nx">user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ec2-user</span><span class="dl">"</span>
<span class="nx">private_key</span> <span class="o">=</span> <span class="nx">file</span><span class="p">(</span><span class="kd">var</span><span class="p">.</span><span class="nx">key_file</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nx">resource</span> <span class="dl">"</span><span class="s2">aws_instance</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">linux_agent</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">ami</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">aws_ami</span><span class="p">.</span><span class="nx">amazon_linux_2</span><span class="p">.</span><span class="nx">id</span>
<span class="nx">instance_type</span> <span class="o">=</span> <span class="nx">local</span><span class="p">.</span><span class="nx">linux_instance_type</span>
<span class="nx">key_name</span> <span class="o">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">key_name</span>
<span class="nx">provisioner</span> <span class="dl">"</span><span class="s2">puppet</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">use_sudo</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">server</span> <span class="o">=</span> <span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">.</span><span class="nx">public_dns</span>
<span class="nx">server_user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ec2-user</span><span class="dl">"</span>
<span class="nx">connection</span> <span class="p">{</span>
<span class="nx">host</span> <span class="o">=</span> <span class="nb">self</span><span class="p">.</span><span class="nx">public_ip</span>
<span class="nx">user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ec2-user</span><span class="dl">"</span>
<span class="nx">private_key</span> <span class="o">=</span> <span class="nx">file</span><span class="p">(</span><span class="kd">var</span><span class="p">.</span><span class="nx">key_file</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">]</span>
<span class="p">}</span>
<span class="nx">resource</span> <span class="dl">"</span><span class="s2">aws_instance</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">win_agent</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">ami</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">aws_ami</span><span class="p">.</span><span class="nx">windows_2012R2</span><span class="p">.</span><span class="nx">image_id</span>
<span class="nx">instance_type</span> <span class="o">=</span> <span class="nx">local</span><span class="p">.</span><span class="nx">windows_instance_type</span>
<span class="nx">key_name</span> <span class="o">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">key_name</span>
<span class="nx">get_password_data</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">timeouts</span> <span class="p">{</span>
<span class="nx">create</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">15m</span><span class="dl">"</span>
<span class="p">}</span>
<span class="nx">provisioner</span> <span class="dl">"</span><span class="s2">puppet</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">open_source</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">server</span> <span class="o">=</span> <span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">.</span><span class="nx">public_dns</span>
<span class="nx">server_user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ec2-user</span><span class="dl">"</span>
<span class="nx">connection</span> <span class="p">{</span>
<span class="nx">host</span> <span class="o">=</span> <span class="nb">self</span><span class="p">.</span><span class="nx">public_ip</span>
<span class="nx">type</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">winrm</span><span class="dl">"</span>
<span class="nx">user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">Administrator</span><span class="dl">"</span>
<span class="nx">password</span> <span class="o">=</span> <span class="nx">rsadecrypt</span><span class="p">(</span><span class="nb">self</span><span class="p">.</span><span class="nx">password_data</span><span class="p">,</span> <span class="nx">file</span><span class="p">(</span><span class="kd">var</span><span class="p">.</span><span class="nx">key_file</span><span class="p">))</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nx">user_data</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">template_file</span><span class="p">.</span><span class="nx">winrm</span><span class="p">.</span><span class="nx">rendered</span>
<span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>
<h4 id="about-the-puppet-master">About the Puppet Master</h4>
<p>The Puppet provisioner assumes that you already have a Puppet Master running somewhere, and it is not the provisioner’s job to help you build that. Also, building the Puppet Master threw up some of the biggest challenges, so keep that in mind when reviewing the overall complexity of this solution.</p>
<h5 id="user_data">user_data</h5>
<p>To configure the Puppet Master, I wrote the following shell script that is called from user_data:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="c"># Without $HOME, a message is seen in cloud-init-output.log during autosign:</span>
<span class="c"># couldn't find login name -- expanding `~'</span>
<span class="nb">export </span><span class="nv">HOME</span><span class="o">=</span><span class="s1">'/root'</span>
install_puppetserver<span class="o">()</span> <span class="o">{</span>
wget https://yum.puppet.com/puppet6-release-el-7.noarch.rpm
rpm <span class="nt">-Uvh</span> puppet6-release-el-7.noarch.rpm
yum-config-manager <span class="nt">--enable</span> puppet6
yum <span class="nt">-y</span> <span class="nb">install </span>puppetserver
<span class="o">}</span>
configure_puppetserver<span class="o">()</span> <span class="o">{</span>
<span class="nb">echo</span> <span class="s1">'export PATH=/opt/puppetlabs/puppet/bin:$PATH'</span> <span class="se">\</span>
<span class="o">>></span> /etc/profile.d/puppet-agent.sh
<span class="nb">.</span> /etc/profile.d/puppet-agent.sh
<span class="nb">sed</span> <span class="nt">-i</span> <span class="s1">'
s/JAVA_ARGS.*/JAVA_ARGS="-Xms512m -Xmx512m"/
'</span> /etc/sysconfig/puppetserver <span class="c"># workaround for t2.micro's 1GB RAM.</span>
<span class="nb">local </span><span class="nv">public_hostname</span><span class="o">=</span><span class="si">$(</span>curl <span class="se">\</span>
http://169.254.169.254/latest/meta-data/public-hostname<span class="si">)</span>
puppetserver ca setup <span class="se">\</span>
<span class="nt">--subject-alt-names</span> <span class="s2">"</span><span class="nv">$public_hostname</span><span class="s2">"</span>,localhost,puppet
<span class="nb">echo</span> <span class="s2">"127.0.0.1 puppet"</span> <span class="o">>></span> /etc/hosts
<span class="o">}</span>
configure_autosign<span class="o">()</span> <span class="o">{</span>
gem <span class="nb">install </span>autosign
<span class="nb">mkdir</span> <span class="nt">-p</span> <span class="nt">-m</span> 750 /var/autosign
<span class="nb">chown </span>puppet: /var/autosign
<span class="nb">touch</span> /var/log/autosign.log
<span class="nb">chown </span>puppet: /var/log/autosign.log
autosign config setup
<span class="nb">sed</span> <span class="nt">-i</span> <span class="s1">'
s!journalfile:.*!journalfile: "/var/autosign/autosign.journal"!
'</span> /etc/autosign.conf
puppet config <span class="nb">set</span> <span class="se">\</span>
<span class="nt">--section</span> master autosign /opt/puppetlabs/puppet/bin/autosign-validator
systemctl restart puppetserver
<span class="o">}</span>
deploy_code<span class="o">()</span> <span class="o">{</span>
yum <span class="nt">-y</span> <span class="nb">install </span>git
<span class="nb">rm</span> <span class="nt">-rf</span> /etc/puppetlabs/code/environments/production
git clone <span class="se">\</span>
https://github.com/alexharv074/terraform-puppet-provisioner-test.git <span class="se">\</span>
/etc/puppetlabs/code/environments/production
<span class="o">}</span>
main<span class="o">()</span> <span class="o">{</span>
install_puppetserver
configure_puppetserver
configure_autosign
deploy_code
<span class="o">}</span>
main
</code></pre></div></div>
<p>Notice the autosigning configuration provided by the autosign Ruby Gem. Your Puppet Master needs that configuration to support the Puppet Terraform provisioner. Also notice that I had to get the public hostname from the EC2 instance metadata using curl. That is to work around the fact that there seems to be no way to populate a Terraform template with the generated self.public_dns other than inside a connection or provisioner block!<sup>2</sup></p>
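<p>One risk with the metadata lookup is that an empty response would silently produce a certificate whose alt names start with an empty entry. A sketch of a more defensive version follows. The function names are hypothetical, and like the script above it assumes the instance allows metadata requests without a session token.</p>

```shell
# Build the value passed to `puppetserver ca setup --subject-alt-names`.
# Fails instead of emitting a list whose first entry is empty.
# (Both function names here are hypothetical.)
alt_names_for() {
  local public_hostname="$1"
  if [ -z "$public_hostname" ]; then
    echo "public hostname is empty; is the metadata service reachable?" >&2
    return 1
  fi
  printf '%s,localhost,puppet\n' "$public_hostname"
}

# Fetch the public hostname from EC2 instance metadata. --fail makes curl
# exit non-zero on HTTP errors and --max-time bounds how long it waits.
fetch_public_hostname() {
  curl --silent --fail --max-time 5 \
    http://169.254.169.254/latest/meta-data/public-hostname
}
```

<p>Inside configure_puppetserver, the two could be combined as <code class="language-plaintext highlighter-rouge">alt_names_for "$(fetch_public_hostname)"</code>, with the result passed to <code class="language-plaintext highlighter-rouge">--subject-alt-names</code>.</p>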
<h5 id="remote-exec-provisioner">Remote exec provisioner</h5>
<p>Also note the following “hack” to get Terraform to stop and wait before marking the Puppet Master’s aws_instance state as “created”. I refer to this code:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nx">provisioner</span> <span class="dl">"</span><span class="s2">remote-exec</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">on_failure</span> <span class="o">=</span> <span class="k">continue</span>
<span class="nx">inline</span> <span class="o">=</span> <span class="p">[</span>
<span class="dl">"</span><span class="s2">sudo sh -c 'while ! grep -q Cloud-init.*finished /var/log/cloud-init-output.log; do sleep 20; done'</span><span class="dl">"</span>
<span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The code uses a remote-exec provisioner to check /var/log/cloud-init-output.log every 20 seconds for a message that Cloud-init has finished. This is apparently the only way to do this because Terraform has no equivalent of CloudFormation’s <a href="https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/cfn-signal.html">cfn-signal</a> to signal that a resource has been “created”. Is there a better way? Let me know! See also the line in the agent configs:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">]</span>
</code></pre></div></div>
<p>That’s where I tell the agents to wait for the master to be created.</p>
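<p>The polling loop in the remote-exec provisioner has no upper bound, so it would spin forever if Cloud-init never reported success. A bounded variant could look like the sketch below. The function name <code class="language-plaintext highlighter-rouge">wait_for_log</code> and its arguments are made up for illustration.</p>

```shell
# Poll a log file until a pattern appears or the attempts run out.
# Usage: wait_for_log FILE PATTERN INTERVAL_SECONDS MAX_ATTEMPTS
# Returns 0 once the pattern is found, 1 if it never shows up.
# (wait_for_log is a hypothetical helper.)
wait_for_log() {
  local file="$1" pattern="$2" interval="$3" max_attempts="$4"
  local attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    # A missing log file is treated the same as "not finished yet".
    if grep -q "$pattern" "$file" 2>/dev/null; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1
}
```

<p>With an interval of 20 and 45 attempts, <code class="language-plaintext highlighter-rouge">wait_for_log /var/log/cloud-init-output.log 'Cloud-init.*finished' 20 45</code> gives up after roughly 15 minutes. Newer cloud-init releases also provide <code class="language-plaintext highlighter-rouge">cloud-init status --wait</code>, although whether it is available depends on the cloud-init version shipped in the AMI.</p>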
<h5 id="default-ec2-key-pair">Default EC2 key pair</h5>
<p>I have assumed that you have an EC2 key pair in your AWS account called “default”. If you don’t, you can create one using:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ aws ec2 create-key-pair --key-name default
</code></pre></div></div>
<p>Or you could set a Terraform variable to point to another key you want to use. For example:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ export TF_VAR_key_name='my_key'
</code></pre></div></div>
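<p>Note that create-key-pair prints the private key only once, so it needs to be saved at creation time for bolt.yaml and the key_file variable to find it. A sketch of a wrapper that does that follows. The function name is hypothetical, and the default path simply mirrors the ~/.ssh/default.pem used elsewhere in this post.</p>

```shell
# Create an EC2 key pair and save the private key to a file that only the
# owner can read. Refuses to overwrite an existing file.
# Usage: create_key_pair [KEY_NAME] [KEY_FILE]
# (create_key_pair is a hypothetical helper.)
create_key_pair() {
  local key_name="${1:-default}"
  local key_file="${2:-$HOME/.ssh/$key_name.pem}"
  if [ -e "$key_file" ]; then
    echo "$key_file already exists; not overwriting" >&2
    return 1
  fi
  # umask 077 inside the subshell means the file is created with mode 600.
  if ! (
    umask 077
    aws ec2 create-key-pair --key-name "$key_name" \
      --query 'KeyMaterial' --output text > "$key_file"
  ); then
    rm -f "$key_file"  # do not leave an empty key file behind
    return 1
  fi
}
```

<p>Running <code class="language-plaintext highlighter-rouge">create_key_pair default</code> would then leave the key in ~/.ssh/default.pem.</p>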
<h4 id="the-amazon-linux-2-agent">The Amazon Linux 2 agent</h4>
<p>The code for the Linux agent node is this:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">resource</span> <span class="dl">"</span><span class="s2">aws_instance</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">linux_agent</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">ami</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">aws_ami</span><span class="p">.</span><span class="nx">amazon_linux_2</span><span class="p">.</span><span class="nx">id</span>
<span class="nx">instance_type</span> <span class="o">=</span> <span class="nx">local</span><span class="p">.</span><span class="nx">linux_instance_type</span>
<span class="nx">key_name</span> <span class="o">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">key_name</span>
<span class="nx">provisioner</span> <span class="dl">"</span><span class="s2">puppet</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">use_sudo</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">server</span> <span class="o">=</span> <span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">.</span><span class="nx">public_dns</span>
<span class="nx">server_user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ec2-user</span><span class="dl">"</span>
<span class="nx">connection</span> <span class="p">{</span>
<span class="nx">host</span> <span class="o">=</span> <span class="nb">self</span><span class="p">.</span><span class="nx">public_ip</span>
<span class="nx">type</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ssh</span><span class="dl">"</span> <span class="c1">// This could be omitted after my above-mentioned patch is merged.</span>
<span class="nx">user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ec2-user</span><span class="dl">"</span>
<span class="nx">private_key</span> <span class="o">=</span> <span class="nx">file</span><span class="p">(</span><span class="kd">var</span><span class="p">.</span><span class="nx">key_file</span><span class="p">)</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Things to note here:</p>
<ul>
<li>The connection block is used by the provisioner to connect to the Puppet Agent node.</li>
<li>The settings <code class="language-plaintext highlighter-rouge">server</code> and <code class="language-plaintext highlighter-rouge">server_user</code> refer to the Puppet Master node. In my case, I have the Puppet Master managed in Terraform too, although I can foresee that others could have their Puppet Masters on long-lived pets etc.</li>
<li>At the time of writing, the private key needed to connect to the Puppet Master lives in Puppet Bolt’s configuration in the bolt.yaml file. I find this surprising and I’m going to raise a patch if I can to change this so that the private_key to connect to the Puppet Master will be specified in Terraform too.</li>
</ul>
<h4 id="connection-type-ssh">Connection type SSH</h4>
<p>I am one of those people who doesn’t like to overspecify things in code and I tend to use default values where possible. I tried to do that for the SSH connection blocks for the Puppet Master and Agent aws_instances. I then ran into a quite confusing bug that led me on a wild goose chase through both the Terraform and Bolt code bases! That’s why there’s a comment there that points to <a href="https://github.com/hashicorp/terraform/issues/23004">this</a> Terraform issue that I raised.</p>
<p>In the end I fixed that bug in the open pull request <a href="https://github.com/hashicorp/terraform/pull/23057">here</a>. At the time of writing, it is unmerged and will probably go into Terraform 0.12.11. If you have an earlier version of Terraform, just make sure you specify the connection type on Linux explicitly as “ssh”.</p>
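<p>To decide whether a given Terraform install needs the explicit connection type, the version can be compared in the shell. This is a sketch only: the function names are hypothetical, the 0.12.11 threshold is the anticipated release rather than a confirmed one, and it relies on a <code class="language-plaintext highlighter-rouge">sort</code> that supports <code class="language-plaintext highlighter-rouge">-V</code>, as GNU sort does.</p>

```shell
# Succeed when version $1 is strictly lower than version $2.
# Relies on `sort -V` (version sort) being available.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Succeed when the connection type should be spelled out for the given
# Terraform version. The default threshold of 0.12.11 is an assumption:
# it is the release the fix was expected to land in, not a confirmed one.
needs_explicit_ssh_type() {
  version_lt "$1" "${2:-0.12.11}"
}
```

<p>The installed version can be read from the first line of <code class="language-plaintext highlighter-rouge">terraform version</code>, for example with <code class="language-plaintext highlighter-rouge">terraform version | head -n 1 | sed 's/^Terraform v//'</code>.</p>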
<h4 id="the-windows-2012-node">The Windows 2012 node</h4>
<p>And here is the Windows agent node code:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">resource</span> <span class="dl">"</span><span class="s2">aws_instance</span><span class="dl">"</span> <span class="dl">"</span><span class="s2">win_agent</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">ami</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">aws_ami</span><span class="p">.</span><span class="nx">windows_2012R2</span><span class="p">.</span><span class="nx">image_id</span>
<span class="nx">instance_type</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">t2.large</span><span class="dl">"</span>
<span class="nx">key_name</span> <span class="o">=</span> <span class="kd">var</span><span class="p">.</span><span class="nx">key_name</span>
<span class="nx">get_password_data</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">timeouts</span> <span class="p">{</span>
<span class="nx">create</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">15m</span><span class="dl">"</span>
<span class="p">}</span>
<span class="nx">provisioner</span> <span class="dl">"</span><span class="s2">puppet</span><span class="dl">"</span> <span class="p">{</span>
<span class="nx">open_source</span> <span class="o">=</span> <span class="kc">true</span>
<span class="nx">server</span> <span class="o">=</span> <span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">.</span><span class="nx">public_dns</span>
<span class="nx">server_user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">ec2-user</span><span class="dl">"</span>
<span class="nx">connection</span> <span class="p">{</span>
<span class="nx">host</span> <span class="o">=</span> <span class="nb">self</span><span class="p">.</span><span class="nx">public_ip</span>
<span class="nx">type</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">winrm</span><span class="dl">"</span>
<span class="nx">user</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">Administrator</span><span class="dl">"</span>
<span class="nx">password</span> <span class="o">=</span> <span class="nx">rsadecrypt</span><span class="p">(</span><span class="nb">self</span><span class="p">.</span><span class="nx">password_data</span><span class="p">,</span> <span class="nx">file</span><span class="p">(</span><span class="kd">var</span><span class="p">.</span><span class="nx">key_file</span><span class="p">))</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nx">user_data</span> <span class="o">=</span> <span class="nx">data</span><span class="p">.</span><span class="nx">template_file</span><span class="p">.</span><span class="nx">winrm</span><span class="p">.</span><span class="nx">rendered</span>
<span class="nx">depends_on</span> <span class="o">=</span> <span class="p">[</span><span class="nx">aws_instance</span><span class="p">.</span><span class="nx">master</span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This is much the same as the Amazon Linux 2 configuration other than the password field that is passed in the connection block. There, I used the same EC2 user key to get the Administrator password, which is passed to the Puppet provisioner to be used by Bolt to connect to the Windows agent node and install the Puppet agent software.</p>
<h3 id="the-puppet-code">The Puppet code</h3>
<p>Finally, I have the Puppet code itself inside manifests/site.pp:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">node</span> <span class="k">default</span> <span class="p">{</span>
<span class="nx">notify</span> <span class="p">{</span> <span class="dl">"</span><span class="s2">Hello world from ${facts['hostname']}!</span><span class="dl">"</span><span class="p">:</span> <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Note that this code is made available in the deploy_code function in the Puppet Master UserData above.</p>
<h3 id="running-it">Running it</h3>
<p>First run the setup script.</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ bash -x setup.sh
</code></pre></div></div>
<p>Then run terraform apply:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ terraform init
▶ terraform apply -auto-approve
</code></pre></div></div>
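<p>Since the apply takes several minutes before the provisioner first calls Bolt, it can save time to confirm up front that the required tools are on the PATH. A minimal preflight sketch, with a hypothetical function name:</p>

```shell
# Print the name of each listed command that is not on the PATH.
# Returns non-zero when at least one is missing.
# (missing_commands is a hypothetical helper.)
missing_commands() {
  local status=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" > /dev/null 2>&1; then
      echo "$cmd"
      status=1
    fi
  done
  return "$status"
}
```

<p>For this proof of concept, <code class="language-plaintext highlighter-rouge">missing_commands terraform bolt</code> covers the two tools that run on the local machine.</p>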
<h3 id="expected-output">Expected output</h3>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ terraform apply -auto-approve
data.template_file.winrm: Refreshing state...
data.template_file.user_data: Refreshing state...
data.aws_ami.ami: Refreshing state...
data.aws_ami.windows_2012R2: Refreshing state...
aws_instance.master: Creating...
aws_instance.master: Still creating... [10s elapsed]
aws_instance.master: Still creating... [20s elapsed]
aws_instance.master: Still creating... [30s elapsed]
aws_instance.master: Provisioning with 'remote-exec'...
aws_instance.master (remote-exec): Connecting to remote host via SSH...
aws_instance.master (remote-exec): Host: 13.239.139.194
aws_instance.master (remote-exec): User: ec2-user
aws_instance.master (remote-exec): Password: false
aws_instance.master (remote-exec): Private key: true
aws_instance.master (remote-exec): Certificate: false
aws_instance.master (remote-exec): SSH Agent: true
aws_instance.master (remote-exec): Checking Host Key: false
aws_instance.master: Still creating... [40s elapsed]
aws_instance.master (remote-exec): Connecting to remote host via SSH...
aws_instance.master (remote-exec): Host: 13.239.139.194
aws_instance.master (remote-exec): User: ec2-user
aws_instance.master (remote-exec): Password: false
aws_instance.master (remote-exec): Private key: true
aws_instance.master (remote-exec): Certificate: false
aws_instance.master (remote-exec): SSH Agent: true
aws_instance.master (remote-exec): Checking Host Key: false
aws_instance.master: Still creating... [50s elapsed]
aws_instance.master: Still creating... [1m0s elapsed]
aws_instance.master (remote-exec): Connecting to remote host via SSH...
aws_instance.master (remote-exec): Host: 13.239.139.194
aws_instance.master (remote-exec): User: ec2-user
aws_instance.master (remote-exec): Password: false
aws_instance.master (remote-exec): Private key: true
aws_instance.master (remote-exec): Certificate: false
aws_instance.master (remote-exec): SSH Agent: true
aws_instance.master (remote-exec): Checking Host Key: false
aws_instance.master (remote-exec): Connecting to remote host via SSH...
aws_instance.master (remote-exec): Host: 13.239.139.194
aws_instance.master (remote-exec): User: ec2-user
aws_instance.master (remote-exec): Password: false
aws_instance.master (remote-exec): Private key: true
aws_instance.master (remote-exec): Certificate: false
aws_instance.master (remote-exec): SSH Agent: true
aws_instance.master (remote-exec): Checking Host Key: false
aws_instance.master (remote-exec): Connected!
aws_instance.master: Still creating... [1m10s elapsed]
aws_instance.master: Still creating... [1m20s elapsed]
aws_instance.master: Still creating... [1m30s elapsed]
aws_instance.master: Still creating... [1m40s elapsed]
aws_instance.master: Still creating... [1m50s elapsed]
aws_instance.master: Still creating... [2m0s elapsed]
aws_instance.master: Still creating... [2m10s elapsed]
aws_instance.master: Still creating... [2m20s elapsed]
aws_instance.master: Still creating... [2m30s elapsed]
aws_instance.master: Still creating... [2m40s elapsed]
aws_instance.master: Still creating... [2m50s elapsed]
aws_instance.master: Still creating... [3m0s elapsed]
aws_instance.master: Still creating... [3m10s elapsed]
aws_instance.master: Creation complete after 3m17s [id=i-0d126b0f634539c45]
aws_instance.linux_agent: Creating...
aws_instance.win_agent: Creating...
aws_instance.win_agent: Still creating... [10s elapsed]
aws_instance.linux_agent: Still creating... [10s elapsed]
aws_instance.linux_agent: Still creating... [20s elapsed]
aws_instance.win_agent: Still creating... [20s elapsed]
aws_instance.linux_agent: Provisioning with 'puppet'...
aws_instance.linux_agent (puppet): Connecting to remote host via SSH...
aws_instance.linux_agent (puppet): Host: 54.252.134.38
aws_instance.linux_agent (puppet): User: ec2-user
aws_instance.linux_agent (puppet): Password: false
aws_instance.linux_agent (puppet): Private key: true
aws_instance.linux_agent (puppet): Certificate: false
aws_instance.linux_agent (puppet): SSH Agent: true
aws_instance.linux_agent (puppet): Checking Host Key: false
aws_instance.win_agent: Still creating... [30s elapsed]
aws_instance.linux_agent: Still creating... [30s elapsed]
aws_instance.linux_agent (puppet): Connecting to remote host via SSH...
aws_instance.linux_agent (puppet): Host: 54.252.134.38
aws_instance.linux_agent (puppet): User: ec2-user
aws_instance.linux_agent (puppet): Password: false
aws_instance.linux_agent (puppet): Private key: true
aws_instance.linux_agent (puppet): Certificate: false
aws_instance.linux_agent (puppet): SSH Agent: true
aws_instance.linux_agent (puppet): Checking Host Key: false
aws_instance.win_agent: Still creating... [40s elapsed]
aws_instance.linux_agent: Still creating... [40s elapsed]
aws_instance.linux_agent (puppet): Connecting to remote host via SSH...
aws_instance.linux_agent (puppet): Host: 54.252.134.38
aws_instance.linux_agent (puppet): User: ec2-user
aws_instance.linux_agent (puppet): Password: false
aws_instance.linux_agent (puppet): Private key: true
aws_instance.linux_agent (puppet): Certificate: false
aws_instance.linux_agent (puppet): SSH Agent: true
aws_instance.linux_agent (puppet): Checking Host Key: false
aws_instance.linux_agent (puppet): Connecting to remote host via SSH...
aws_instance.linux_agent (puppet): Host: 54.252.134.38
aws_instance.linux_agent (puppet): User: ec2-user
aws_instance.linux_agent (puppet): Password: false
aws_instance.linux_agent (puppet): Private key: true
aws_instance.linux_agent (puppet): Certificate: false
aws_instance.linux_agent (puppet): SSH Agent: true
aws_instance.linux_agent (puppet): Checking Host Key: false
aws_instance.linux_agent (puppet): Connected!
aws_instance.linux_agent (puppet): ip-172-31-10-49.ap-southeast-2.compute.internal
aws_instance.linux_agent: Still creating... [50s elapsed]
aws_instance.win_agent: Still creating... [50s elapsed]
aws_instance.linux_agent: Still creating... [1m0s elapsed]
aws_instance.win_agent: Still creating... [1m0s elapsed]
aws_instance.win_agent: Still creating... [1m10s elapsed]
aws_instance.linux_agent: Still creating... [1m10s elapsed]
aws_instance.win_agent: Still creating... [1m20s elapsed]
aws_instance.linux_agent: Still creating... [1m20s elapsed]
aws_instance.win_agent: Provisioning with 'puppet'...
aws_instance.win_agent (puppet): Connecting to remote host via WinRM...
aws_instance.win_agent (puppet): Host: 13.211.55.90
aws_instance.win_agent (puppet): Port: 5985
aws_instance.win_agent (puppet): User: Administrator
aws_instance.win_agent (puppet): Password: true
aws_instance.win_agent (puppet): HTTPS: false
aws_instance.win_agent (puppet): Insecure: false
aws_instance.win_agent (puppet): NTLM: false
aws_instance.win_agent (puppet): CACert: false
aws_instance.win_agent (puppet): Connected!
aws_instance.win_agent (puppet): WIN-IPE5577KSBA
aws_instance.linux_agent (puppet): Info: Downloaded certificate for ca from ec2-13-239-139-194.ap-southeast-2.compute.amazonaws.com
aws_instance.linux_agent (puppet): Info: Downloaded certificate revocation list for ca from ec2-13-239-139-194.ap-southeast-2.compute.amazonaws.com
aws_instance.linux_agent (puppet): Info: Creating a new RSA SSL key for ip-172-31-10-49.ap-southeast-2.compute.internal
aws_instance.win_agent (puppet): ap-southeast-2.compute.internal
aws_instance.linux_agent (puppet): Info: csr_attributes file loading from /etc/puppetlabs/puppet/csr_attributes.yaml
aws_instance.linux_agent (puppet): Info: Creating a new SSL certificate request for ip-172-31-10-49.ap-southeast-2.compute.internal
aws_instance.linux_agent (puppet): Info: Certificate Request fingerprint (SHA256): E3:E8:AD:42:EC:76:EE:F0:DF:47:F9:D1:65:6B:8C:46:0B:59:B2:1A:26:5B:56:B7:55:87:1C:B9:7E:E6:BA:3E
aws_instance.linux_agent (puppet): Info: Downloaded certificate for ip-172-31-10-49.ap-southeast-2.compute.internal from ec2-13-239-139-194.ap-southeast-2.compute.amazonaws.com
aws_instance.win_agent: Still creating... [1m30s elapsed]
aws_instance.linux_agent: Still creating... [1m30s elapsed]
aws_instance.linux_agent (puppet): Info: Using configured environment 'production'
aws_instance.linux_agent (puppet): Info: Retrieving pluginfacts
aws_instance.linux_agent (puppet): Info: Retrieving plugin
aws_instance.linux_agent (puppet): Info: Retrieving locales
aws_instance.win_agent (puppet): Directory: C:\ProgramData\PuppetLabs\Puppet
aws_instance.win_agent (puppet): Mode LastWriteTime Length Name
aws_instance.win_agent (puppet): ---- ------------- ------ ----
aws_instance.win_agent (puppet): d---- 10/12/2019 11:47 AM etc
aws_instance.linux_agent (puppet): Info: Caching catalog for ip-172-31-10-49.ap-southeast-2.compute.internal
aws_instance.linux_agent (puppet): Info: Applying configuration version '1570880860'
aws_instance.linux_agent (puppet): Notice: Hello world from ip-172-31-10-49!
aws_instance.linux_agent (puppet): Notice: /Stage[main]/Main/Node[default]/Notify[Hello world from ip-172-31-10-49!]/message: defined 'message' as 'Hello world from ip-172-31-10-49!'
aws_instance.linux_agent (puppet): Info: Creating state file /opt/puppetlabs/puppet/cache/state/state.yaml
aws_instance.linux_agent (puppet): Notice: Applied catalog in 0.01 seconds
aws_instance.linux_agent: Creation complete after 1m33s [id=i-06b88138c2feda4cf]
aws_instance.win_agent: Still creating... [1m40s elapsed]
aws_instance.win_agent: Still creating... [1m50s elapsed]
aws_instance.win_agent: Still creating... [2m0s elapsed]
aws_instance.win_agent: Still creating... [2m10s elapsed]
aws_instance.win_agent: Still creating... [2m20s elapsed]
aws_instance.win_agent: Still creating... [2m30s elapsed]
aws_instance.win_agent: Still creating... [2m40s elapsed]
aws_instance.win_agent (puppet): Info: Downloaded certificate for ca from ec2-13-239-139-194.ap-southeast-2.compute.amazonaws.com
aws_instance.win_agent (puppet): Info: Downloaded certificate revocation list for ca from ec2-13-239-139-194.ap-southeast-2.compute.amazonaws.com
aws_instance.win_agent (puppet): Info: Creating a new RSA SSL key for win-ipe5577ksba.ap-southeast-2.compute.internal
aws_instance.win_agent: Still creating... [2m50s elapsed]
aws_instance.win_agent (puppet): Info: csr_attributes file loading from C:/ProgramData/PuppetLabs/puppet/etc/csr_attributes.yaml
aws_instance.win_agent (puppet): Info: Creating a new SSL certificate request for win-ipe5577ksba.ap-southeast-2.compute.internal
aws_instance.win_agent (puppet): Info: Certificate Request fingerprint (SHA256): A1:C0:D3:AD:24:C7:80:67:F1:F4:97:FC:06:E2:16:01:12:DA:02:5F:AA:2F:57:98:9F:7D:2A:34:42:3C:D3:50
aws_instance.win_agent (puppet): Info: Downloaded certificate for win-ipe5577ksba.ap-southeast-2.compute.internal from ec2-13-239-139-194.ap-southeast-2.compute.amazonaws.com
aws_instance.win_agent (puppet): Info: Using configured environment 'production'
aws_instance.win_agent (puppet): Info: Retrieving pluginfacts
aws_instance.win_agent (puppet): Info: Retrieving plugin
aws_instance.win_agent (puppet): Info: Retrieving locales
aws_instance.win_agent (puppet): Info: Caching catalog for win-ipe5577ksba.ap-southeast-2.compute.internal
aws_instance.win_agent (puppet): Info: Applying configuration version '1570880943'
aws_instance.win_agent (puppet): Notice: Hello world from WIN-IPE5577KSBA!
aws_instance.win_agent (puppet): Notice: /Stage[main]/Main/Node[default]/Notify[Hello world from WIN-IPE5577KSBA!]/message: defined 'message' as 'Hello world from WIN-IPE5577KSBA!'
aws_instance.win_agent (puppet): Info: Creating state file C:/ProgramData/PuppetLabs/puppet/cache/state/state.yaml
aws_instance.win_agent (puppet): Notice: Applied catalog in 0.02 seconds
aws_instance.win_agent: Creation complete after 2m55s [id=i-07da31c6a0bf6ce14]
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
</code></pre></div></div>
<h2 id="discussion">Discussion</h2>
<p>Setting all of this up has been a journey that, as mentioned already, led me to submit patches to both Terraform and one of the Bolt modules. I hope this post helps others get up to speed quickly.</p>
<p>For those who are already using Terraform, Puppet Bolt, and Puppet to manage their EC2 infrastructure, I expect that this provisioner will be useful and they probably will want to think about using it. When it is all set up it feels quite clean and it is a nice user experience. Remember, as I mentioned already, that most of the complexity in the solution relates to setting up the Puppet Master in Terraform. That will not be a consideration for most people who are already using Puppet Masters.</p>
<p>The more interesting question might be this: Who should use this if they are <em>not</em> already using Terraform, Puppet Bolt, and Puppet to manage their EC2 instances? How <em>should</em> you manage a fleet of EC2 instances using Terraform?</p>
<p>HashiCorp <a href="https://www.terraform.io/docs/provisioners/index.html">say</a> that provisioners - any provisioners, whether Chef, Puppet, local-exec etc. - should be used as a “last resort”, and of the configuration management provisioners specifically they say:</p>
<blockquote>
<p>As a convenience to users who are forced to use generic operating system distribution images, Terraform includes a number of specialized provisioners for launching specific configuration management products.</p>
<p>We strongly recommend not using these, and instead running system configuration steps during a custom image build process. For example, HashiCorp Packer offers a similar complement of configuration management provisioners and can run their installation steps during a separate build process, before creating a system disk image that you can deploy many times.</p>
<p>If you are using configuration management software that has a centralized server component, you will need to delay the registration step until the final system is booted from your custom image.</p>
</blockquote>
<p>This is, of course, a statement of HashiCorp’s strongly opinionated view of how to do configuration management, and it is debatable. Equally, it is possible to do configuration management using only Terraform’s features and UserData scripts. This also works fine … most of the time!</p>
<p>But suppose baking machine images isn’t the best fit for your problem and you foresee yourself also outgrowing UserData. Believe me, you need to think hard about this: if you ever run into the limits of UserData as your configuration management solution, you will probably discover that you have no choice other than to rewrite everything! In other words, if your use case may grow to include configuration management of complex applications running on Linux or Windows, then the use of a provisioner like the Puppet provisioner (or the Chef provisioner or some of the others) deserves consideration.</p>
<p>I can imagine that the requirement to also have Puppet Bolt on the machine running Terraform is going to be an issue for some users. If this provisioner is the <em>only</em> reason to use Puppet Bolt, you may decide to do your configuration management another way. But with that said, Puppet Bolt is quite a powerful tool that also deserves consideration.</p>
<h2 id="see-also">See also</h2>
<ul>
<li>
<p>Martez Reed, 10th July 2019, <a href="https://www.greenreedtech.com/terraform-puppet-provisioner/">Terraform Puppet Provisioner</a>.</p>
</li>
<li>
<p>Tim Sharpe’s (the provisioner author’s) <a href="https://github.com/rodjek/terraform-puppet-example">test code</a>.</p>
</li>
</ul>
<hr />
<p><sup>1</sup> Note that I refer throughout to the “Puppet Master” using the traditional terminology, although I probably should say “Puppet Server”.<br />
<sup>2</sup> See also Tim Sharpe’s apparent solution to the same problem <a href="https://github.com/rodjek/terraform-puppet-example/blob/master/example.tf#L33-L36">here</a>.</p>
<p>Alex Harvey</p>
<p>This post discussed a proof of concept of the Terraform 0.12.2 Puppet provisioner.</p>
<p>Data consistency testing in Puppet, Part II: Testing file content<br />
2019-04-13T00:00:00+00:00<br />
https://alexharv074.github.io//puppet/2019/04/13/data-consistency-testing-in-puppet-part-ii-testing-file-content</p>
<p>This post continues my blog series on data consistency testing in Puppet, where I add additional layers of automated testing around the file content in Puppet catalogs. For Part I of the series, see <a href="https://alexharv074.github.io/2018/09/30/data-consistency-testing-in-puppet-part-i-data-types.html">here</a>.</p>
<p>The source code for this blog is available at GitHub <a href="https://github.com/alexharv074/data_consistency_part_ii">here</a>. Step through the revision history to see the various examples.</p>
<ul id="markdown-toc">
<li><a href="#what-is-the-problem" id="markdown-toc-what-is-the-problem">What is the problem</a></li>
<li><a href="#example-1-json-data-in-an-erb-template" id="markdown-toc-example-1-json-data-in-an-erb-template">Example 1: JSON data in an ERB template</a> <ul>
<li><a href="#code-example-1" id="markdown-toc-code-example-1">Code example 1</a></li>
<li><a href="#testing-the-json-file-content" id="markdown-toc-testing-the-json-file-content">Testing the JSON file content</a> <ul>
<li><a href="#using-jq-on-the-compiled-catalog" id="markdown-toc-using-jq-on-the-compiled-catalog">Using JQ on the compiled catalog</a></li>
<li><a href="#using-rspec" id="markdown-toc-using-rspec">Using Rspec</a></li>
<li><a href="#testing-a-specific-field" id="markdown-toc-testing-a-specific-field">Testing a specific field</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#example-2-json-data-in-a-sourced-file" id="markdown-toc-example-2-json-data-in-a-sourced-file">Example 2: JSON data in a sourced file</a> <ul>
<li><a href="#code-example-2" id="markdown-toc-code-example-2">Code example 2</a></li>
<li><a href="#testing-the-json-file-using-pure-rspec" id="markdown-toc-testing-the-json-file-using-pure-rspec">Testing the JSON file using pure Rspec</a></li>
</ul>
</li>
<li><a href="#example-3-yaml-data" id="markdown-toc-example-3-yaml-data">Example 3: YAML data</a> <ul>
<li><a href="#code-example-3" id="markdown-toc-code-example-3">Code example 3</a></li>
</ul>
</li>
<li><a href="#example-4-ini-file-data" id="markdown-toc-example-4-ini-file-data">Example 4: INI file data</a> <ul>
<li><a href="#specific-issues-with-ini-files" id="markdown-toc-specific-issues-with-ini-files">Specific issues with INI files</a></li>
<li><a href="#code-example-4" id="markdown-toc-code-example-4">Code example 4</a></li>
</ul>
</li>
<li><a href="#example-5-java-properties" id="markdown-toc-example-5-java-properties">Example 5: Java Properties</a> <ul>
<li><a href="#code-example-5" id="markdown-toc-code-example-5">Code example 5</a></li>
</ul>
</li>
<li><a href="#discussion" id="markdown-toc-discussion">Discussion</a></li>
<li><a href="#see-also" id="markdown-toc-see-also">See also</a></li>
</ul>
<h2 id="what-is-the-problem">What is the problem</h2>
<p>Unit tests are great but they only test the logic of your code. In practice, however, mistakes are often made in data. Thinking back, I would say that missing or unexpected commas in JSON files have caused more errors in production than I can count.</p>
<p>In Part I, I looked at how to test Hiera data as it passes into Puppet manifests using Puppet’s data types. But no matter how hard we try to externalise our data in Hiera, some of it always stays inside manifests as file data.</p>
<p>That’s the problem I am trying to solve today. How do you test the data that lives in files in Puppet manifests?</p>
<h2 id="example-1-json-data-in-an-erb-template">Example 1: JSON data in an ERB template</h2>
<h3 id="code-example-1">Code example 1</h3>
<p>Suppose I have a manifest:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">class</span> <span class="nx">loopback</span> <span class="p">{</span>
<span class="nx">$rest_api_root</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">/api</span><span class="dl">'</span>
<span class="nx">$host</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">0.0.0.0</span><span class="dl">'</span>
<span class="nx">$port</span> <span class="o">=</span> <span class="mi">3000</span>
<span class="nx">file</span> <span class="p">{</span> <span class="dl">'</span><span class="s1">/server/config.json</span><span class="dl">'</span><span class="p">:</span>
<span class="nx">ensure</span> <span class="o">=></span> <span class="nx">present</span><span class="p">,</span>
<span class="nx">owner</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">group</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">mode</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">0444</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">content</span> <span class="o">=></span> <span class="nx">template</span><span class="p">(</span><span class="dl">'</span><span class="s1">loopback/config.json.erb</span><span class="dl">'</span><span class="p">),</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And an ERB template<sup>1</sup>:</p>
<div class="language-erb highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{
"restApiRoot": "<span class="cp"><%=</span> <span class="vi">@rest_api_root</span> <span class="cp">-%></span>",
"host": "<span class="cp"><%=</span> <span class="vi">@host</span> <span class="cp">-%></span>",
"port": <span class="cp"><%=</span> <span class="vi">@port</span> <span class="cp">-%></span>,
"remoting": {
"context": {"enableHttpContext": false},
"rest": {"normalizeHttpPath": false, "xml": false},
"json": {"strict": false, "limit": "100kb"},
"urlencoded": {"extended": true, "limit": "100kb"},
"cors": false
"errorHandler": {"disableStackTrace": false}
},
"legacyExplorer": false
}
</code></pre></div></div>
<p>And suppose I have the simplest Rspec-puppet test, one that just compiles and <a href="https://alexharv074.github.io/2016/03/16/dumping-the-catalog-in-rspec-puppet.html">writes</a> a catalog:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'spec_helper'</span>
<span class="n">describe</span> <span class="s1">'loopback'</span> <span class="k">do</span>
<span class="n">it</span> <span class="s1">'compiles'</span> <span class="k">do</span>
<span class="n">is_expected</span><span class="p">.</span><span class="nf">to</span> <span class="n">compile</span>
<span class="no">File</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s1">'catalogs/loopback.json'</span><span class="p">,</span> <span class="no">PSON</span><span class="p">.</span><span class="nf">pretty_generate</span><span class="p">(</span><span class="n">catalogue</span><span class="p">))</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>So I run the Rspec tests and everything passes:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ bundle exec rake spec
...
loopback
compiles
Finished in 0.83471 seconds (files took 1.27 seconds to load)
1 example, 0 failures
</code></pre></div></div>
<p>Great, everything passed. Release it to production!</p>
<h3 id="testing-the-json-file-content">Testing the JSON file content</h3>
<h4 id="using-jq-on-the-compiled-catalog">Using JQ on the compiled catalog</h4>
<p>Well, maybe not. The following JQ query on the compiled catalog shows that I just compiled a catalog with invalid JSON data in it:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ jq '
.resources[]
| select((.type == "File") and (.title=="/server/config.json"))
| .parameters.content | fromjson
' < catalogs/loopback.json
</code></pre></div></div>
<p>Yields:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jq: error (at <stdin>:95): Expected separator between values at line 11, column 18 (while parsing '{
"restApiRoot": "/api",
"host": "0.0.0.0",
"port": 3000,
"remoting": {
"context": {"enableHttpContext": false},
"rest": {"normalizeHttpPath": false, "xml": false},
"json": {"strict": false, "limit": "100kb"},
"urlencoded": {"extended": true, "limit": "100kb"},
"cors": false
"errorHandler": {"disableStackTrace": false}
},
"legacyExplorer": false
}
')
</code></pre></div></div>
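<p>For readers without JQ, the same check can be sketched in plain Ruby against the catalog file written by the spec above. This is only a minimal sketch and not part of the original code base: the helper names <code class="language-plaintext highlighter-rouge">file_content</code> and <code class="language-plaintext highlighter-rouge">valid_json?</code> are hypothetical, and it assumes the catalog JSON has the same resources, type, title and parameters layout that the JQ query relies on.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require 'json'

# Hypothetical helper: return the content parameter of a File resource
# from a catalog that has already been parsed into a Ruby Hash.
# Returns nil if no such resource exists.
def file_content(catalog, title)
  resource = catalog['resources'].find do |r|
    r.values_at('type', 'title') == ['File', title]
  end
  return nil if resource.nil?
  resource.dig('parameters', 'content')
end

# Hypothetical helper: true if the string parses as JSON, false otherwise.
def valid_json?(string)
  JSON.parse(string)
  true
rescue JSON::ParserError
  false
end

# Illustrative usage, assuming the catalog was written by the spec above:
#   catalog = JSON.parse(File.read('catalogs/loopback.json'))
#   valid_json?(file_content(catalog, '/server/config.json'))
</code></pre></div></div>
<p>With the template as shown, the last call would be expected to return false, because the content is missing the comma after the cors field.</p>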
<h4 id="using-rspec">Using Rspec</h4>
<p>But if the data is already inside the catalog, there must be a way to use Rspec-puppet to detect it earlier. And, of course, there is, although, as far as I am aware, how to do this has never been documented. I figured it out inside the Ruby debugger; it involves navigating Rspec-puppet’s <code class="language-plaintext highlighter-rouge">catalogue</code> object.</p>
<p>Here I add a failing test:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">it</span> <span class="s1">'/server/config.json should be valid JSON'</span> <span class="k">do</span>
<span class="nb">require</span> <span class="s1">'json'</span>
<span class="n">json_data</span> <span class="o">=</span> <span class="n">catalogue</span>
<span class="p">.</span><span class="nf">resource</span><span class="p">(</span><span class="s1">'file'</span><span class="p">,</span> <span class="s1">'/server/config.json'</span><span class="p">)</span>
<span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="ss">:parameters</span><span class="p">)[</span><span class="ss">:content</span><span class="p">]</span>
<span class="n">expect</span> <span class="p">{</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">json_data</span><span class="p">)</span> <span class="p">}.</span><span class="nf">to_not</span> <span class="n">raise_error</span>
<span class="k">end</span>
</code></pre></div></div>
<p>The key insight is that the <code class="language-plaintext highlighter-rouge">catalogue</code> object has a <code class="language-plaintext highlighter-rouge">#resource</code> method that can look up resources in the catalog by type/title to get their parameters. In fact, I recommend attaching the debugger at that line yourself and spending some time playing around with it to further understand the <code class="language-plaintext highlighter-rouge">catalogue</code> object. More is possible! But for now, that one line is all I need.</p>
<p>So, I run the tests again, and now the invalid JSON is detected:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1) loopback /server/config.json should be valid JSON
Failure/Error: expect { JSON.parse(json_data) }.to_not raise_error
expected no Exception, got #<JSON::ParserError: 743: unexpected token at '{
"restApiRoot": "/api",
"host": "0.0.0.0",
"por... "cors": false
"errorHandler": {"disableStackTrace": false}
},
"legacyExplorer": false
}
'> with backtrace:
# ./spec/classes/init_spec.rb:21:in `block (3 levels) in <top (required)>'
# ./spec/classes/init_spec.rb:21:in `block (2 levels) in <top (required)>'
# ./spec/classes/init_spec.rb:21:in `block (2 levels) in <top (required)>'
</code></pre></div></div>
<h4 id="testing-a-specific-field">Testing a specific field</h4>
<p>What if I also want to make assertions about specific fields in the JSON data? I can do that too.</p>
<p>Here is a test case that tests the contents of ERB interpolated fields against regular expressions:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">it</span> <span class="s1">'restApiRoot, host and port should look ok'</span> <span class="k">do</span>
<span class="n">json_data</span> <span class="o">=</span> <span class="n">catalogue</span><span class="p">.</span><span class="nf">resource</span><span class="p">(</span><span class="s1">'file'</span><span class="p">,</span> <span class="s1">'/server/config.json'</span><span class="p">).</span><span class="nf">send</span><span class="p">(</span><span class="ss">:parameters</span><span class="p">)[</span><span class="ss">:content</span><span class="p">]</span>
<span class="n">parsed</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">json_data</span><span class="p">)</span>
<span class="n">expect</span><span class="p">(</span><span class="n">parsed</span><span class="p">[</span><span class="s1">'restApiRoot'</span><span class="p">]).</span><span class="nf">to</span> <span class="n">match</span> <span class="sr">%r{^/[</span><span class="se">\w</span><span class="sr">/]+$}</span>
<span class="n">expect</span><span class="p">(</span><span class="n">parsed</span><span class="p">[</span><span class="s1">'host'</span><span class="p">]).</span><span class="nf">to</span> <span class="n">match</span> <span class="sr">/^(\d+(\.|$)){4}$/</span>
<span class="n">expect</span><span class="p">(</span><span class="n">parsed</span><span class="p">[</span><span class="s1">'port'</span><span class="p">]).</span><span class="nf">to</span> <span class="n">be_a</span><span class="p">(</span><span class="no">Integer</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>
<p>On running these new tests:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>loopback
compiles
should contain file /server/config.json
/server/config.json should be valid JSON
restApiRoot, host and port should look ok
Finished in 0.78467 seconds (files took 1.3 seconds to load)
4 examples, 0 failures
</code></pre></div></div>
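<p>The host regular expression in the test above only checks the shape of the value, so it would also accept something like 999.999.999.999. If stricter checks are wanted, one option is to lean on Ruby’s standard library. The following is a minimal sketch under that assumption; the helper names <code class="language-plaintext highlighter-rouge">ipv4?</code> and <code class="language-plaintext highlighter-rouge">tcp_port?</code> are hypothetical and not part of the original code.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require 'ipaddr'

# Hypothetical helper: true if host is a syntactically valid IPv4 address.
# IPAddr raises a subclass of IPAddr::Error for anything it cannot parse.
def ipv4?(host)
  IPAddr.new(host).ipv4?
rescue IPAddr::Error
  false
end

# Hypothetical helper: true if port is an Integer in the TCP port range.
def tcp_port?(port)
  return false unless port.is_a?(Integer)
  (1..65_535).cover?(port)
end
</code></pre></div></div>
<p>Inside the spec, these could then be used as <code class="language-plaintext highlighter-rouge">expect(ipv4?(parsed['host'])).to be true</code> and <code class="language-plaintext highlighter-rouge">expect(tcp_port?(parsed['port'])).to be true</code>. Whether the extra strictness is worth it depends on how the values reach the template.</p>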
<h2 id="example-2-json-data-in-a-sourced-file">Example 2: JSON data in a sourced file</h2>
<h3 id="code-example-2">Code example 2</h3>
<p>This all works fine if your data is in Puppet templates. But sometimes Puppet’s built-in file server is used.<sup>2</sup> What if our Loopback class looked like this:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">class</span> <span class="nx">loopback</span> <span class="p">{</span>
<span class="nx">file</span> <span class="p">{</span> <span class="dl">'</span><span class="s1">/server/config.json</span><span class="dl">'</span><span class="p">:</span>
<span class="nx">ensure</span> <span class="o">=></span> <span class="nx">present</span><span class="p">,</span>
<span class="nx">owner</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">group</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">mode</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">0444</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">source</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">puppet:///modules/loopback/config.json</span><span class="dl">'</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The JSON config file is saved at <code class="language-plaintext highlighter-rouge">files/config.json</code>. Imagine it has the same typo it had before.</p>
<h3 id="testing-the-json-file-using-pure-rspec">Testing the JSON file using pure Rspec</h3>
<p>I can’t use Rspec-puppet to test this file because the file content simply doesn’t end up inside the Puppet catalog. Rather, Puppet’s file server is used and the file content is retrieved when the catalog is actually applied.</p>
<p>But I can still test the file. I just use pure Rspec. Here’s how I do it:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">it</span> <span class="s1">'/server/config.json should be valid JSON'</span> <span class="k">do</span>
<span class="n">json_data</span> <span class="o">=</span> <span class="no">File</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="s1">'files/config.json'</span><span class="p">)</span> <span class="c1">## THIS LINE CHANGES</span>
<span class="n">expect</span> <span class="p">{</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">json_data</span><span class="p">)</span> <span class="p">}.</span><span class="nf">to_not</span> <span class="n">raise_error</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Actually, it’s easier to test plain text files served by Puppet’s file server, although, in practice, these kinds of files - because they are not generated dynamically by ERB - tend to be less error prone. Still, it’s good to “test all the things” and this is the method I typically use.</p>
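<p>The single-file check above extends naturally to every JSON file a module ships. The following is a minimal sketch under stated assumptions: the helper name <code class="language-plaintext highlighter-rouge">invalid_json_files</code> is hypothetical, and a temporary directory stands in for the module’s <code class="language-plaintext highlighter-rouge">files/</code> directory so that the snippet runs on its own.</p>

```ruby
require 'json'
require 'tmpdir'

# Return a hash of { path => error message } for every *.json file under
# +dir+ that JSON.parse rejects. An empty hash means every file parsed.
# (Hypothetical helper name, not part of Rspec or Rspec-puppet.)
def invalid_json_files(dir)
  Dir.glob(File.join(dir, '**', '*.json')).sort.each_with_object({}) do |path, errors|
    begin
      JSON.parse(File.read(path))
    rescue JSON::ParserError => e
      errors[path] = e.message
    end
  end
end

# Demonstration against a throwaway directory standing in for files/.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'good.json'), '{"port": 3000, "host": "0.0.0.0"}')
  # Missing comma between the two keys.
  File.write(File.join(dir, 'bad.json'), '{"port": 3000 "host": "0.0.0.0"}')
  puts invalid_json_files(dir).keys.map { |path| File.basename(path) }.inspect
end
```

<p>Inside a spec, an assertion along the lines of <code class="language-plaintext highlighter-rouge">expect(invalid_json_files('files')).to be_empty</code> would then cover all of the files at once, assuming the specs run from the module root.</p>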
<h2 id="example-3-yaml-data">Example 3: YAML data</h2>
<p>Of course, not all file content is JSON data, although the same general approach can be used for any type of file data, as long as there is a Ruby library that can parse it. And that means pretty much anything.</p>
<h3 id="code-example-3">Code example 3</h3>
<p>Here is a YAML example. The manifest:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">class</span> <span class="nx">hiera</span> <span class="p">{</span>
<span class="nx">$codedir</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">/etc/puppetlabs/code</span><span class="dl">'</span>
<span class="nx">$confdir</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">/etc/puppetlabs/puppet</span><span class="dl">'</span>
<span class="nx">file</span> <span class="p">{</span> <span class="dl">"</span><span class="s2">$confdir/hiera.yaml</span><span class="dl">"</span><span class="p">:</span>
<span class="nx">ensure</span> <span class="o">=></span> <span class="nx">present</span><span class="p">,</span>
<span class="nx">owner</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">group</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">mode</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">0444</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">content</span> <span class="o">=></span> <span class="nx">template</span><span class="p">(</span><span class="dl">'</span><span class="s1">hiera/hiera.yaml.erb</span><span class="dl">'</span><span class="p">),</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The template:</p>
<div class="language-erb highlighter-rouge"><div class="highlight"><pre class="highlight"><code>---
:yaml:
:datadir: "<span class="cp"><%=</span> <span class="vi">@codedir</span> <span class="cp">-%></span>/environments/%{::environment}/hieradata"
:backends:
- yaml
- json
:hierarchy:
- "nodes/%{::trusted.certname}"
- "virtual/%{::virtual}"
- "common"
</code></pre></div></div>
<p>And the Rspec example:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'spec_helper'</span>
<span class="nb">require</span> <span class="s1">'yaml'</span>
<span class="n">describe</span> <span class="s1">'hiera'</span> <span class="k">do</span>
<span class="n">it</span> <span class="s1">'compiles'</span> <span class="k">do</span>
<span class="n">is_expected</span><span class="p">.</span><span class="nf">to</span> <span class="n">compile</span>
<span class="no">File</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s1">'catalogs/hiera.json'</span><span class="p">,</span> <span class="no">PSON</span><span class="p">.</span><span class="nf">pretty_generate</span><span class="p">(</span><span class="n">catalogue</span><span class="p">))</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'datadir in hiera.yaml should be correct'</span> <span class="k">do</span>
<span class="n">yaml_data</span> <span class="o">=</span> <span class="n">catalogue</span>
<span class="p">.</span><span class="nf">resource</span><span class="p">(</span><span class="s1">'file'</span><span class="p">,</span> <span class="s1">'/etc/puppetlabs/puppet/hiera.yaml'</span><span class="p">)</span>
<span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="ss">:parameters</span><span class="p">)[</span><span class="ss">:content</span><span class="p">]</span>
<span class="n">parsed</span> <span class="o">=</span> <span class="no">YAML</span><span class="p">.</span><span class="nf">load</span><span class="p">(</span><span class="n">yaml_data</span><span class="p">)</span>
<span class="n">expect</span><span class="p">(</span><span class="n">parsed</span><span class="p">[</span><span class="ss">:"yaml"</span><span class="p">][</span><span class="ss">:"datadir"</span><span class="p">])</span>
<span class="p">.</span><span class="nf">to</span> <span class="n">eq</span> <span class="s1">'/etc/puppetlabs/code/environments/%{::environment}/hieradata'</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
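<p>To see why the assertion indexes the parsed data with symbols, it can help to render the template by hand outside of Puppet. This is only a sketch: the <code class="language-plaintext highlighter-rouge">TemplateScope</code> class and <code class="language-plaintext highlighter-rouge">render_hiera_yaml</code> method are hypothetical stand-ins for the scope that Puppet hands to a template, and the indentation of the YAML is assumed.</p>

```ruby
require 'erb'
require 'yaml'

# Hypothetical stand-in for the scope Puppet gives a template.
# It exposes only the @codedir instance variable.
class TemplateScope
  def initialize(codedir)
    @codedir = codedir
  end

  def template_binding
    binding
  end
end

# The template from the example, with assumed indentation.
TEMPLATE = <<~'EOT'
  ---
  :yaml:
    :datadir: "<%= @codedir -%>/environments/%{::environment}/hieradata"
  :backends:
    - yaml
    - json
  :hierarchy:
    - "nodes/%{::trusted.certname}"
    - "virtual/%{::virtual}"
    - "common"
EOT

# Render the template; trim_mode '-' enables the -%> closing tag.
def render_hiera_yaml(codedir)
  scope = TemplateScope.new(codedir)
  ERB.new(TEMPLATE, trim_mode: '-').result(scope.template_binding)
end

parsed = YAML.load(render_hiera_yaml('/etc/puppetlabs/code'))
puts parsed.keys.inspect          # the leading colons make these Symbols
puts parsed[:yaml][:datadir]
```

<p>The keys written as <code class="language-plaintext highlighter-rouge">:yaml:</code> and <code class="language-plaintext highlighter-rouge">:datadir:</code> come back from the YAML parser as Ruby symbols, which is why string keys would not find them.</p>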
<h2 id="example-4-ini-file-data">Example 4: INI file data</h2>
<h3 id="specific-issues-with-ini-files">Specific issues with INI files</h3>
<p>To validate INI files I have used the TwP <a href="https://github.com/twp/inifile">inifile</a> library in the past.</p>
<p>INI files, however, present a few specific challenges:</p>
<ol>
<li>
<p>Reading an INI file from string data as opposed to a file on disk isn’t documented in the inifile library’s docs, although it is documented in the source code <a href="https://github.com/TwP/inifile/blob/134595662bdb986a03dae075daeeb3734313645f/lib/inifile.rb#L59">here</a>.</p>
</li>
<li>
<p>The library hasn’t been committed to since 2014! It probably is not maintained.</p>
</li>
<li>
<p>It is almost impossible to produce a typo in an INI file that causes this parser to raise an error. Thus, I don’t bother testing for raised errors at all.</p>
</li>
</ol>
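<p>Given the third point, one possible complement is a deliberately strict line-by-line check that runs before the data is handed to a parser. The sketch below is an assumption of mine rather than a feature of the inifile library: the method name <code class="language-plaintext highlighter-rouge">suspicious_ini_lines</code> is hypothetical, and it treats both <code class="language-plaintext highlighter-rouge">;</code> and <code class="language-plaintext highlighter-rouge">#</code> as comment characters, which may not suit every application.</p>

```ruby
# Return [line_number, text] pairs for every line that is not blank, a
# comment, a [section] header, or a key = value pair. This is a strict,
# hypothetical check; it is not part of the inifile gem.
def suspicious_ini_lines(content)
  comment = /\A\s*[;#]/
  section = /\A\s*\[[^\]]+\]\s*\z/
  pair    = /\A\s*[^=\s][^=]*=.*\z/

  flagged = content.each_line.with_index(1).reject do |line, _number|
    text = line.chomp
    text.strip.empty? || text =~ comment || text =~ section || text =~ pair
  end
  flagged.map { |line, number| [number, line.chomp] }
end

sample = <<~INI
  [main]
  certname = agent01.example.com
  server puppet
  [agent
  runinterval = 1h
INI

# Reports the line missing its '=' and the unterminated section header.
suspicious_ini_lines(sample).each do |number, text|
  puts "line #{number}: #{text}"
end
```

<p>A check like this will reject some files that a lenient parser happily accepts, so it only makes sense where the files follow a simple, predictable layout.</p>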
<h3 id="code-example-4">Code example 4</h3>
<p>Here is an example inifile manifest:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">class</span> <span class="nx">puppet</span><span class="p">::</span><span class="nx">agent</span> <span class="p">{</span>
<span class="nx">$agent_certname</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">agent01.example.com</span><span class="dl">'</span>
<span class="nx">$puppet_server</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">puppet</span><span class="dl">'</span>
<span class="nx">file</span> <span class="p">{</span> <span class="dl">'</span><span class="s1">/etc/puppetlabs/puppet/puppet.conf</span><span class="dl">'</span><span class="p">:</span>
<span class="nx">ensure</span> <span class="o">=></span> <span class="nx">present</span><span class="p">,</span>
<span class="nx">owner</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">group</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">mode</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">0444</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">content</span> <span class="o">=></span> <span class="nx">template</span><span class="p">(</span><span class="dl">'</span><span class="s1">puppet/puppet.conf.erb</span><span class="dl">'</span><span class="p">),</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And the template:</p>
<div class="language-erb highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[main]
certname = <span class="cp"><%=</span> <span class="vi">@agent_certname</span> <span class="cp">%></span>
server = <span class="cp"><%=</span> <span class="vi">@puppet_server</span> <span class="cp">%></span>
environment = production
runinterval = 1h
</code></pre></div></div>
<p>And an Rspec example:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'spec_helper'</span>
<span class="nb">require</span> <span class="s1">'inifile'</span>
<span class="n">describe</span> <span class="s1">'puppet::agent'</span> <span class="k">do</span>
<span class="n">it</span> <span class="s1">'compiles'</span> <span class="k">do</span>
<span class="n">is_expected</span><span class="p">.</span><span class="nf">to</span> <span class="n">compile</span>
<span class="no">File</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s1">'catalogs/puppet__agent.json'</span><span class="p">,</span> <span class="no">PSON</span><span class="p">.</span><span class="nf">pretty_generate</span><span class="p">(</span><span class="n">catalogue</span><span class="p">))</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'certname in /etc/puppetlabs/puppet/puppet.conf should be correct'</span> <span class="k">do</span>
<span class="n">inifile_data</span> <span class="o">=</span> <span class="n">catalogue</span>
<span class="p">.</span><span class="nf">resource</span><span class="p">(</span><span class="s1">'file'</span><span class="p">,</span> <span class="s1">'/etc/puppetlabs/puppet/puppet.conf'</span><span class="p">)</span>
<span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="ss">:parameters</span><span class="p">)[</span><span class="ss">:content</span><span class="p">]</span>
<span class="n">parsed</span> <span class="o">=</span> <span class="no">IniFile</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">:content</span> <span class="o">=></span> <span class="n">inifile_data</span><span class="p">)</span>
<span class="n">expect</span><span class="p">(</span><span class="n">parsed</span><span class="p">.</span><span class="nf">sections</span><span class="p">).</span><span class="nf">to</span> <span class="n">eq</span> <span class="p">[</span><span class="s1">'main'</span><span class="p">]</span>
<span class="n">expect</span><span class="p">(</span><span class="n">parsed</span><span class="p">[</span><span class="s1">'main'</span><span class="p">][</span><span class="s1">'certname'</span><span class="p">]).</span><span class="nf">to</span> <span class="n">eq</span> <span class="s1">'agent01.example.com'</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Remember, unlike JSON and YAML, where the parsers ship with Ruby, you must also add the inifile library to your Gemfile.</p>
<h2 id="example-5-java-properties">Example 5: Java Properties</h2>
<h3 id="code-example-5">Code example 5</h3>
<p>I have tested Java Properties files in the past using Jonas Thiel’s <a href="https://github.com/jnbt/java-properties">java-properties</a> library.</p>
<p>Here is an example. Manifest:</p>
<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">class</span> <span class="nx">javaprops</span> <span class="p">{</span>
<span class="nx">file</span> <span class="p">{</span> <span class="dl">'</span><span class="s1">/home/webapp/config.properties</span><span class="dl">'</span><span class="p">:</span>
<span class="nx">ensure</span> <span class="o">=></span> <span class="nx">present</span><span class="p">,</span>
<span class="nx">owner</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">group</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">root</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">mode</span> <span class="o">=></span> <span class="dl">'</span><span class="s1">0444</span><span class="dl">'</span><span class="p">,</span>
<span class="nx">content</span> <span class="o">=></span> <span class="nx">template</span><span class="p">(</span><span class="dl">'</span><span class="s1">javaprops/config.properties.erb</span><span class="dl">'</span><span class="p">),</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Template:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// vim: set ft=java:</span>
<span class="n">dataSource</span><span class="o">.</span><span class="na">dialect</span> <span class="o">=</span> <span class="s">"org.hibernate.dialect.MySQL5InnoDBDialect"</span>
<span class="n">dataSource</span><span class="o">.</span><span class="na">driverClassName</span> <span class="o">=</span> <span class="s">"com.mysql.jdbc.Driver"</span>
<span class="n">dataSource</span><span class="o">.</span><span class="na">url</span> <span class="o">=</span> <span class="s">"jdbc:mysql://localhost:3306/icescrum?useUnicode=true&characterEncoding=utf8"</span>
<span class="n">dataSource</span><span class="o">.</span><span class="na">username</span> <span class="o">=</span> <span class="s">"root"</span>
<span class="n">dataSource</span><span class="o">.</span><span class="na">password</span> <span class="o">=</span> <span class="s">"myDbPass"</span>
</code></pre></div></div>
<p>And a test:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'spec_helper'</span>
<span class="nb">require</span> <span class="s1">'java-properties'</span>
<span class="n">describe</span> <span class="s1">'javaprops'</span> <span class="k">do</span>
<span class="n">it</span> <span class="s1">'compiles'</span> <span class="k">do</span>
<span class="n">is_expected</span><span class="p">.</span><span class="nf">to</span> <span class="n">compile</span>
<span class="no">File</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s1">'catalogs/javaprops.json'</span><span class="p">,</span> <span class="no">PSON</span><span class="p">.</span><span class="nf">pretty_generate</span><span class="p">(</span><span class="n">catalogue</span><span class="p">))</span>
<span class="k">end</span>
<span class="n">it</span> <span class="s1">'dataSource.username in /home/webapp/config.properties should be root'</span> <span class="k">do</span>
<span class="n">java_properties</span> <span class="o">=</span> <span class="n">catalogue</span>
<span class="p">.</span><span class="nf">resource</span><span class="p">(</span><span class="s1">'file'</span><span class="p">,</span> <span class="s1">'/home/webapp/config.properties'</span><span class="p">)</span>
<span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="ss">:parameters</span><span class="p">)[</span><span class="ss">:content</span><span class="p">]</span>
<span class="n">parsed</span> <span class="o">=</span> <span class="no">JavaProperties</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">java_properties</span><span class="p">)</span>
<span class="n">expect</span><span class="p">(</span><span class="n">parsed</span><span class="p">[</span><span class="ss">:"dataSource.username"</span><span class="p">]).</span><span class="nf">to</span> <span class="n">eq</span> <span class="s1">'"root"'</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>
<h2 id="discussion">Discussion</h2>
<p>In my experience of maintaining Puppet and other configuration management systems in production, data problems break production far more often than code and logic problems. Whether or not developers write tests for their code, bugs in software do tend to be found and fixed, whereas an INI file typo might not be detected until an end user sees strange behaviour in a system.</p>
<p>It may not always be the job of the infrastructure developers to detect these errors but I believe strongly that everything that can be tested that is likely to change and break should be tested. I am also aware that, in practice, a lot of data of this kind does unfortunately go untested. For that reason, I hope that the method I devised and documented here takes off and is copied by others.</p>
<p>In a subsequent part of this series I will give examples of how script content such as Bash shell scripts and Python scripts can be unit tested after extraction from a Puppet catalog via Rspec-puppet.</p>
<h2 id="see-also">See also</h2>
<ul>
<li>Paul Hammond and Samantha Stoller, Jul 28 2016, <a href="https://slack.engineering/data-consistency-checks-e73261318f96">Data Consistency Checks</a> (Slack Engineering).</li>
</ul>
<p><sup>1</sup> And the reader can <em>surely</em> notice that this isn’t going to actually generate valid JSON right? No, I doubt it. The dreaded missing JSON comma is hard to notice.<br />
<sup>2</sup> Although I personally recommend almost always keeping your file data inside Puppet catalogs by always using the <code class="language-plaintext highlighter-rouge">template()</code> or <code class="language-plaintext highlighter-rouge">file()</code> function.</p>
<p>Data consistency testing in Puppet, Part I: Data types. Alex Harvey, 2018-09-30. https://alexharv074.github.io//puppet/2018/09/30/data-consistency-testing-in-puppet-part-i-data-types</p>
<p>When maintaining Puppet (or any Infrastructure-as-code solution) in production, human errors are made most frequently in data. A class expected an array of strings but you just passed in a string. The data wasn’t valid JSON because you left out a comma. You added a comment to an INI file using a hash symbol instead of the semicolon. And so on.</p>
<p>In this first part of a blog series on data consistency testing in Puppet, I look at the benefits of properly using Puppet’s data types to prevent human errors and speed up your team’s velocity.</p>
<ul id="markdown-toc">
<li><a href="#puppet-data-types" id="markdown-toc-puppet-data-types">Puppet data types</a></li>
<li><a href="#using-puppet-types" id="markdown-toc-using-puppet-types">Using Puppet types</a> <ul>
<li><a href="#simple-example" id="markdown-toc-simple-example">Simple example</a> <ul>
<li><a href="#code-example-1" id="markdown-toc-code-example-1">Code example 1</a></li>
<li><a href="#making-this-better" id="markdown-toc-making-this-better">Making this better</a></li>
<li><a href="#better-still" id="markdown-toc-better-still">Better still</a></li>
</ul>
</li>
<li><a href="#real-life-example" id="markdown-toc-real-life-example">Real life example</a> <ul>
<li><a href="#code-example-2" id="markdown-toc-code-example-2">Code example 2</a></li>
<li><a href="#making-this-better-1" id="markdown-toc-making-this-better-1">Making this better</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#discussion" id="markdown-toc-discussion">Discussion</a> <ul>
<li><a href="#why-data-testing-matters" id="markdown-toc-why-data-testing-matters">Why data testing matters</a></li>
<li><a href="#errors-detected-early" id="markdown-toc-errors-detected-early">Errors detected early</a></li>
<li><a href="#improved-error-messages" id="markdown-toc-improved-error-messages">Improved error messages</a></li>
<li><a href="#types-as-documentation" id="markdown-toc-types-as-documentation">Types as documentation</a></li>
<li><a href="#a-reason-to-use-puppet" id="markdown-toc-a-reason-to-use-puppet">A reason to use Puppet</a></li>
</ul>
</li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
</ul>
<h2 id="puppet-data-types">Puppet data types</h2>
<p>When Puppet 4 was released back in 2015, a lot of features were added, including much-awaited ones like iteration, manifest ordering and all-in-one packaging. Also released was a feature that came as a surprise to many - data types. Data types seemed a bit like a solution to a problem you didn’t know you had. Considering that the other languages people were familiar with didn’t have them - e.g. Ruby, Python, Perl, Bash etc - why would Puppet need them?</p>
<p>Strictly speaking, Puppet remains a dynamically-typed language, but the data types bring the most important benefit of static typing, namely the ability of the compiler to detect unexpected data. This speeds up development, as data errors can be detected without a single line of test code.</p>
<h2 id="using-puppet-types">Using Puppet types</h2>
<h3 id="simple-example">Simple example</h3>
<h4 id="code-example-1">Code example 1</h4>
<p>With that said, the benefits of Puppet’s data types do not appear to be widely understood. Most of the time, I see code that looks like this:</p>
<div class="language-puppet highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">hostname</span> <span class="p">(</span>
<span class="nc">String</span> <span class="nv">$hostname</span><span class="p">,</span>
<span class="p">)</span> <span class="p">{</span>
<span class="n">host</span> <span class="p">{</span> <span class="s1">'hostname'</span><span class="p">:</span>
<span class="py">ensure</span> <span class="p">=></span> <span class="n">present</span><span class="p">,</span>
<span class="py">name</span> <span class="p">=></span> <span class="nv">$hostname</span><span class="p">,</span>
<span class="py">ip</span> <span class="p">=></span> <span class="nv">$facts</span><span class="p">[</span><span class="s1">'ipaddress'</span><span class="p">],</span>
<span class="py">host_aliases</span> <span class="p">=></span> <span class="nv">$hostname</span><span class="p">,</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And when I say most of the time, I mean in most of the Puppet Supported modules too, e.g. <a href="https://github.com/puppetlabs/puppetlabs-concat/blob/master/manifests/init.pp">concat</a>, <a href="https://github.com/puppetlabs/puppetlabs-ntp/blob/master/manifests/init.pp">ntp</a>, <a href="https://github.com/puppetlabs/puppetlabs-apt/blob/master/manifests/init.pp">apt</a> etc.</p>
<p>Many no doubt have seen code like this and wondered what the point of it is. Why declare the hostname as a “String”? What else would it be?</p>
<h4 id="making-this-better">Making this better</h4>
<p>To benefit from Puppet’s types, it is necessary to use them precisely to define a range of acceptable inputs. Declaring hostname as a String has the advantage of preventing compilation if an Array is passed in, although it’s unlikely someone would pass an Array into a hostname field. What about an empty string though? That’s more plausible.</p>
<p>Now imagine the following Rspec example:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">describe</span> <span class="s1">'hostname'</span> <span class="k">do</span>
<span class="n">let</span><span class="p">(</span><span class="ss">:hiera_config</span><span class="p">)</span> <span class="p">{</span> <span class="s1">'spec/fixtures/hiera/hiera.yaml'</span> <span class="p">}</span>
<span class="n">it</span> <span class="p">{</span> <span class="n">is_expected</span><span class="p">.</span><span class="nf">to</span> <span class="n">compile</span> <span class="p">}</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Notice two things here:</p>
<ol>
<li>I have configured Rspec to use real Hiera data</li>
<li>This is the only Rspec code I am going to write in this blog post.</li>
</ol>
<p>All testing is based on simply passing real Hiera data into the compiler.</p>
<p>So, I also create a common.yaml file with:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s">hostname::hostname: ''</span>
</code></pre></div></div>
<p>Now, running Rspec leads to:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error during compilation: Parameter host_aliases failed on Host[hostname]: Host aliases cannot be an em
pty string. Use an empty array to delete all host_aliases (file: /Users/alexharvey/git/home/puppet-tes
t/spec/fixtures/modules/hostname/manifests/init.pp, line: 4)
</code></pre></div></div>
<p>That’s pretty confusing. The user’s mistake was to pass an empty string for the hostname, whereas the error message directs them to look at host aliases on another line in the file.</p>
<p>To force compilation to abort if an empty string is passed in, we can do this instead:</p>
<div class="language-puppet highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">hostname</span> <span class="p">(</span>
<span class="nc">String</span><span class="p">[</span><span class="m">1</span><span class="p">]</span> <span class="nv">$hostname</span><span class="p">,</span>
<span class="p">)</span> <span class="p">{</span>
<span class="c"># ...
</span><span class="p">}</span>
</code></pre></div></div>
<p>The declaration <code class="language-plaintext highlighter-rouge">String[1]</code> means a string of minimum length 1.</p>
<p>If I run the test again, compilation now aborts with this error:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Error while evaluating a Resource Statement, Class[Hostname]: parameter 'hostname' expects a String[1]
value, got String (line: 2, column: 1)
</code></pre></div></div>
<p>This is better, even if the message is still cryptic. At least the error message has directed the user to the right line in the code, and has informed them that the problem is in the data.</p>
<h4 id="better-still">Better still</h4>
<p>Quite often, hostnames are expected to match a pattern. Suppose a hostnaming convention exists: <code class="language-plaintext highlighter-rouge">AAABCCCnnn</code> where:</p>
<ul>
<li>AAA = department</li>
<li>B = L or W (Linux or Windows)</li>
<li>CCC = app</li>
<li>nnn = a number from 1 to 999.</li>
</ul>
<p>We can now further improve our code using a <a href="https://puppet.com/docs/puppet/5.3/lang_data_abstract.html#pattern">Pattern</a> type as follows:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="n">hostname</span> <span class="p">(</span>
<span class="no">Pattern</span><span class="p">[</span><span class="sr">/^[A-Z]{3}[LW][A-Z]{3}\d{3}$/</span><span class="p">]</span> <span class="vg">$hostname</span><span class="p">,</span>
<span class="p">)</span> <span class="p">{</span>
<span class="c1"># ...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>If we pass in the empty string here, we now get a much better error message:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Error: Evaluation Error: Error while evaluating a Resource Statement, Class[Hostname]: parameter 'host
name' expects a match for Pattern[/^[A-Z]{3}[LW][A-Z]{3}\d{3}$/], got ''
</code></pre></div></div>
<p>By using the types, we have:</p>
<ol>
<li>Made it very difficult for compilation to proceed if invalid data is passed in</li>
<li>Set it up so that if bad data is passed in, Puppet’s compiler aborts with a helpful message.</li>
</ol>
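<p>Before committing to a pattern like this, it can be worth trying it against a few sample names in plain Ruby. The sketch below reuses the regular expression from the Pattern type; the constant and method names and the sample hostnames are made up for illustration.</p>

```ruby
# The same regular expression used in the Pattern type above.
HOSTNAME_PATTERN = /^[A-Z]{3}[LW][A-Z]{3}\d{3}$/

# Hypothetical helper: true if the name follows the AAABCCCnnn convention
# as the pattern expresses it.
def valid_hostname?(hostname)
  HOSTNAME_PATTERN.match?(hostname)
end

# Made-up sample names: one valid, then a bad OS letter, lower case,
# too few digits, and the all-zero number.
%w[FINLWEB001 FINXWEB001 finlweb001 FINLWEB01 FINLWEB000].each do |candidate|
  puts format('%-12s %s', candidate, valid_hostname?(candidate))
end

# The empty string from the earlier example.
puts valid_hostname?('')
```

<p>One thing this makes visible is that <code class="language-plaintext highlighter-rouge">\d{3}</code> also accepts <code class="language-plaintext highlighter-rouge">000</code>, although the convention describes a number from 1 to 999. Whether that matters is a judgment call; the pattern could be tightened if it does.</p>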
<h3 id="real-life-example">Real life example</h3>
<p>If the example of the hostnaming convention is a bit abstract, a second, more complex example using a nested Hash structure makes the value of Puppet’s types clearer.</p>
<h4 id="code-example-2">Code example 2</h4>
<p>Imagine an ELK data node that expects a Hash of volume groups:</p>
<div class="language-puppet highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">profile::elasticsearch::data_node</span> <span class="p">(</span>
<span class="nc">Hash</span> <span class="nv">$volume_groups</span><span class="p">,</span>
<span class="p">)</span> <span class="p">{</span>
<span class="nf">create_resources</span><span class="p">(</span><span class="n">lvm::volume_group</span><span class="p">,</span> <span class="nv">$volume_groups</span><span class="p">)</span>
<span class="c"># ...
</span><span class="p">}</span>
</code></pre></div></div>
<p>This is a modification of my open source ELK solution from <a href="https://github.com/alexharv074/elk">here</a>.</p>
<p>The declaration “Hash” here is not likely to detect actual errors and does not add much as documentation. The chances are that in the absence of some sample data, it is not going to be easy to figure out what the actual YAML Hash of volume groups really looks like. The user probably would need to carefully study the internals of the LVM module or its documentation to figure out that a structure like this would be required:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s">profile::elasticsearch::data_node::volume_groups:</span>
<span class="s">esvg00</span><span class="pi">:</span>
<span class="na">physical_volumes</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s2">"</span><span class="s">%{facts.espv}"</span>
<span class="na">logical_volumes</span><span class="pi">:</span>
<span class="na">eslv00</span><span class="pi">:</span>
<span class="na">mountpath</span><span class="pi">:</span> <span class="s">/srv/es</span>
</code></pre></div></div>
<p>But let’s suppose the user mucks up the LV struct:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s">profile::elasticsearch::data_node::volume_groups:</span>
<span class="s">esvg00</span><span class="pi">:</span>
<span class="na">physical_volumes</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s2">"</span><span class="s">%{facts.espv}"</span>
<span class="na">logical_volumes</span><span class="pi">:</span>
<span class="na">eslv00</span><span class="pi">:</span> <span class="s">/srv/es</span>
</code></pre></div></div>
<p>The mistake above is an easy one to make and not an easy one to spot.</p>
<p>Now let’s see what happens if I compile my actual code using these modifications:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error during compilation: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Er
ror: Error while evaluating a Function Call, no implicit conversion of String into Hash (file: /Users/
alexharvey/git/home/elk/spec/fixtures/modules/lvm/manifests/volume_group.pp, line: 34, column: 3) (fil
e: /Users/alexharvey/git/home/elk/spec/fixtures/modules/profile/manifests/elasticsearch/data_node.pp,
line: 43)
</code></pre></div></div>
<p>Huh? How did I cause an error at line 34 in spec/fixtures/modules/lvm/manifests/volume_group.pp? That’s a file from a Supported Puppet module.</p>
<h4 id="making-this-better-1">Making this better</h4>
<p>Confusing errors like this one can be avoided if Puppet’s type system is used. Here I refactor to declare the expected data types:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">profile::elasticsearch::data_node</span> <span class="p">(</span>
<span class="no">Hash</span><span class="p">[</span><span class="no">Pattern</span><span class="p">[</span><span class="sr">/^[a-z]+vg\d+$/</span><span class="p">],</span> <span class="no">Struct</span><span class="p">[{</span>
<span class="n">physical_volumes</span> <span class="o">=></span> <span class="no">Array</span><span class="p">[</span><span class="no">Stdlib</span><span class="o">::</span><span class="no">Absolutepath</span><span class="p">],</span>
<span class="n">logical_volumes</span> <span class="o">=></span> <span class="no">Hash</span><span class="p">[</span>
<span class="no">Pattern</span><span class="p">[</span><span class="sr">/^[a-z]+lv\d+$/</span><span class="p">],</span> <span class="no">Struct</span><span class="p">[{</span>
<span class="n">mountpath</span> <span class="o">=></span> <span class="no">Stdlib</span><span class="o">::</span><span class="no">Absolutepath</span>
<span class="p">}]]</span>
<span class="p">}]]</span> <span class="vg">$volume_groups</span><span class="p">,</span>
<span class="p">)</span> <span class="p">{</span>
<span class="n">create_resources</span><span class="p">(</span><span class="n">lvm</span><span class="o">::</span><span class="n">volume_group</span><span class="p">,</span> <span class="vg">$volume_groups</span><span class="p">)</span>
<span class="c1"># ...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Notice that I have declared everything from the structure of the data down to the naming convention of the logical volumes and the volume groups.</p>
<p>Running the tests now leads to an error message that explains to the user exactly what they did wrong and where:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error during compilation: Evaluation Error: Error while evaluating a Function Call, Class[Profile::Elas
ticsearch::Data_node]: parameter 'volume_groups' entry 'esvg00' entry 'logical_volumes' entry 'eslv00'
expects a Struct value, got String (file: /Users/alexharvey/git/home/elk/spec/fixtures/modules/role/man
ifests/elk_stack.pp, line: 3, column: 3)
</code></pre></div></div>
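<p>To make concrete what the type declaration asserts, here is a rough plain-Ruby sketch of equivalent checks. It is only an illustration: Puppet does not work this way internally, the function names are hypothetical, the absolute-path check is a Unix-only simplification of <code class="language-plaintext highlighter-rouge">Stdlib::Absolutepath</code>, and unlike a real <code class="language-plaintext highlighter-rouge">Struct</code> it does not reject unknown keys.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Plain-Ruby sketch of the checks that the type declaration expresses.
# Puppet does not work this way internally; all names here are hypothetical.
VG_NAME  = /^[a-z]+vg\d+$/
LV_NAME  = /^[a-z]+lv\d+$/
ABS_PATH = %r{^/} # Unix-only simplification of Stdlib::Absolutepath

def absolute_path?(value)
  value.is_a?(String) && value.match?(ABS_PATH)
end

# Returns a list of human-readable problems; an empty list means valid data.
def volume_group_errors(volume_groups)
  return ["expects a Hash value, got #{volume_groups.class}"] unless volume_groups.is_a?(Hash)

  volume_groups.flat_map do |vg, spec|
    where = "entry '#{vg}'"
    errors = []
    errors << "#{where} does not match #{VG_NAME.inspect}" unless vg.to_s.match?(VG_NAME)
    if spec.is_a?(Hash)
      pvs = spec['physical_volumes']
      unless pvs.is_a?(Array) && pvs.all? { |pv| absolute_path?(pv) }
        errors << "#{where} entry 'physical_volumes' expects an Array of absolute paths"
      end
      errors.concat(logical_volume_errors(where, spec['logical_volumes']))
    else
      errors << "#{where} expects a Struct value, got #{spec.class}"
    end
    errors
  end
end

# Checks the nested logical_volumes hash of one volume group.
def logical_volume_errors(where, lvs)
  where = "#{where} entry 'logical_volumes'"
  return ["#{where} expects a Hash value, got #{lvs.class}"] unless lvs.is_a?(Hash)

  lvs.flat_map do |lv, attrs|
    errors = []
    errors << "#{where} entry '#{lv}' does not match #{LV_NAME.inspect}" unless lv.to_s.match?(LV_NAME)
    if !attrs.is_a?(Hash)
      errors << "#{where} entry '#{lv}' expects a Struct value, got #{attrs.class}"
    elsif !absolute_path?(attrs['mountpath'])
      errors << "#{where} entry '#{lv}' entry 'mountpath' expects an absolute path"
    end
    errors
  end
end
</code></pre></div></div>
<p>Writing and maintaining something like this by hand is exactly the work that the type declaration saves.</p>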
<h2 id="discussion">Discussion</h2>
<h3 id="why-data-testing-matters">Why data testing matters</h3>
<p>I expect that many people will be skeptical of the claim that Puppet’s data types can significantly increase team velocity.</p>
<p>All the same, I have seen at site after site the same story recorded in revision histories everywhere - human errors, typos etc in the frequently-changing configuration data files causing immense wasted effort. And when you think about it, this is natural; data is supposed to change and this is the reason we separate it from the manifests in the first place. On the other hand, we are usually told to do behaviour-driven development and write unit tests that test only the logic and behaviour of classes. An important opportunity is missed, because tests are not useful if the thing they are testing is not expected to change.</p>
<p>(Of course, I am not suggesting that unit tests are not important either, but the emphasis is often in the wrong place.)</p>
<h3 id="errors-detected-early">Errors detected early</h3>
<p>Puppet’s types mean that most data errors can be detected early - in the pre-commit stage, in fact - while the developer is present and likely to be aware of the mistake made. Since the compilation tests take only seconds to run, mistakes picked up early end up costing only a little lost time. Compare this to the worst case where a data error finds its way all the way into production without detection by any other testing. In that case, a simple typo can end up costing the team quite a lot of lost time, not to mention a possible production outage.</p>
<h3 id="improved-error-messages">Improved error messages</h3>
<p>Another key benefit of allowing Puppet’s types to detect data errors is that we avoid a lot of Puppet’s otherwise sometimes confusing error messages. Next time someone is complaining that Puppet is hard to debug, ask them if the confusing error message could have been avoided if Puppet’s data types were used properly.</p>
<h3 id="types-as-documentation">Types as documentation</h3>
<p>Yet another benefit of using Puppet’s types is that the types document the assumptions about the data, and this documentation is unlikely to exist otherwise. The example I gave already was of the hostnaming convention. This makes the code easier for the team to use and leads to fewer questions about naming conventions etc, which may otherwise be part of the undocumented tribal knowledge, so people get more work done.</p>
<h3 id="a-reason-to-use-puppet">A reason to use Puppet</h3>
<p>In my opinion, the data types feature is a big reason to choose Puppet over other configuration management solutions like Ansible and Chef. None of the others have this feature - even newer tools like Terraform don’t have this<sup>1</sup> - and it is unlikely that they ever will. That a feature like this could be implemented is one of the benefits of Puppet’s early design choice to be its own special purpose configuration language.</p>
<p>Of course, this benefit of Puppet is not sold well if no one knows about it! I daresay that not many DevOps engineers out there have seen any kind of data consistency testing in their infrastructure code, and much less the highly efficient use of Puppet’s types.</p>
<p>I do hope this post helps to get the good word out.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I have argued in this post that proper use of Puppet’s data types increases team velocity for a number of reasons, and given a couple of examples of how to actually use them. I’ve shown that data errors can be detected without writing any explicit tests in Rspec aside from the compilation tests.</p>
<p>In the next part, I will discuss the use of Rspec to directly test the data in the Hiera files, for data errors that can’t be detected as easily using just the data types.</p>
<p><sup>1</sup> Actually, Amazon Cloudformation has a data types feature similar to Puppet’s. This is the only other tool I’m aware of that has it.</p>Alex HarveyWhen maintaining Puppet (or any Infrastructure-as-code solution) in production, human errors are made most frequently in data. A class expected an array of strings but you just passed in a string. The data wasn’t valid JSON because you left out a comma. You added a comment to an INI file using a hash symbol instead of the semicolon. And so on.The pros and cons of Puppet PDK2018-09-30T00:00:00+00:002018-09-30T00:00:00+00:00https://alexharv074.github.io//puppet/2018/09/30/pros-and-cons-of-pdk<p>It was about a year ago that Puppet released its Puppet Development Kit (PDK), to simplify and streamline the development of Puppet modules. This post investigates the pros and cons of using PDK compared to managing modules like any other Ruby project.</p>
<ul id="markdown-toc">
<li><a href="#example-project" id="markdown-toc-example-project">Example project</a></li>
<li><a href="#doing-things-the-old-way" id="markdown-toc-doing-things-the-old-way">Doing things the old way</a> <ul>
<li><a href="#setup-files" id="markdown-toc-setup-files">Setup files</a> <ul>
<li><a href="#fixturesyml" id="markdown-toc-fixturesyml">.fixtures.yml</a></li>
<li><a href="#gitignore" id="markdown-toc-gitignore">.gitignore</a></li>
<li><a href="#gemfile" id="markdown-toc-gemfile">Gemfile</a></li>
<li><a href="#rakefile" id="markdown-toc-rakefile">Rakefile</a></li>
<li><a href="#spec_helperrb" id="markdown-toc-spec_helperrb">spec_helper.rb</a></li>
</ul>
</li>
<li><a href="#running-the-tests" id="markdown-toc-running-the-tests">Running the tests</a></li>
<li><a href="#other-things" id="markdown-toc-other-things">Other things</a></li>
<li><a href="#shared-boilerplate-problem" id="markdown-toc-shared-boilerplate-problem">Shared boilerplate problem</a></li>
</ul>
</li>
<li><a href="#doing-things-the-new-way" id="markdown-toc-doing-things-the-new-way">Doing things the new way</a> <ul>
<li><a href="#pdk-new-module" id="markdown-toc-pdk-new-module">pdk new module</a></li>
<li><a href="#pdk-new-class" id="markdown-toc-pdk-new-class">pdk new class</a></li>
<li><a href="#running-the-tests-1" id="markdown-toc-running-the-tests-1">Running the tests</a></li>
<li><a href="#pdk-convert" id="markdown-toc-pdk-convert">pdk convert</a></li>
</ul>
</li>
<li><a href="#discussion" id="markdown-toc-discussion">Discussion</a> <ul>
<li><a href="#advantages" id="markdown-toc-advantages">Advantages</a></li>
<li><a href="#disadvantages" id="markdown-toc-disadvantages">Disadvantages</a> <ul>
<li><a href="#loss-of-control" id="markdown-toc-loss-of-control">Loss of control</a></li>
<li><a href="#pollution-of-config" id="markdown-toc-pollution-of-config">Pollution of config</a></li>
<li><a href="#barrier-to-understanding" id="markdown-toc-barrier-to-understanding">Barrier to understanding</a></li>
<li><a href="#barrier-to-advanced-testing" id="markdown-toc-barrier-to-advanced-testing">Barrier to advanced testing</a></li>
<li><a href="#only-solves-sync-for-puppet" id="markdown-toc-only-solves-sync-for-puppet">Only solves sync for Puppet</a></li>
</ul>
</li>
<li><a href="#how-i-will-use-it" id="markdown-toc-how-i-will-use-it">How I will use it</a></li>
</ul>
</li>
<li><a href="#conclusion" id="markdown-toc-conclusion">Conclusion</a></li>
<li><a href="#see-also" id="markdown-toc-see-also">See also</a></li>
</ul>
<h2 id="example-project">Example project</h2>
<p>Suppose I have a simple “hello world” module:</p>
<div class="language-puppet highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">foo</span> <span class="p">{</span>
<span class="n">notify</span> <span class="p">{</span> <span class="s1">'bar'</span><span class="p">:</span> <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And a unit test for it:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'spec_helper'</span>
<span class="n">describe</span> <span class="s1">'foo'</span> <span class="k">do</span>
<span class="n">it</span> <span class="p">{</span> <span class="n">is_expected</span><span class="p">.</span><span class="nf">to</span> <span class="n">contain_notify</span><span class="p">(</span><span class="s1">'bar'</span><span class="p">)</span> <span class="p">}</span>
<span class="k">end</span>
</code></pre></div></div>
<p>How do I set up Rspec to run the test?</p>
<h2 id="doing-things-the-old-way">Doing things the old way</h2>
<h3 id="setup-files">Setup files</h3>
<h4 id="fixturesyml">.fixtures.yml</h4>
<p>First, I need a simple <code class="language-plaintext highlighter-rouge">.fixtures.yml</code>:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">fixtures</span><span class="pi">:</span>
<span class="na">symlinks</span><span class="pi">:</span>
<span class="na">foo</span><span class="pi">:</span> <span class="s2">"</span><span class="s">#{source_dir}"</span>
</code></pre></div></div>
<p>This is needed to create symbolic links in the fixtures directory, in order to ensure that Puppet can find both dependent modules and the code under test. This requirement is documented in the <code class="language-plaintext highlighter-rouge">puppetlabs_spec_helper</code> README.</p>
<h4 id="gitignore">.gitignore</h4>
<p>I need to gitignore some files:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Gemfile.lock
spec/fixtures
</code></pre></div></div>
<p>It’s necessary to gitignore the spec/fixtures directory so that Git won’t see the temporary files used during testing. And gitignoring <code class="language-plaintext highlighter-rouge">Gemfile.lock</code> is a preference of mine, as I always want my tests to run against latest-everything. Not everyone would agree on that decision, and some people do revision control their Gemfile.locks.</p>
<h4 id="gemfile">Gemfile</h4>
<p>The simplest Gemfile I use for Puppet testing is:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">source</span> <span class="s1">'https://rubygems.org'</span>
<span class="n">group</span> <span class="ss">:tests</span> <span class="k">do</span>
<span class="n">gem</span> <span class="s1">'puppetlabs_spec_helper'</span>
<span class="k">end</span>
<span class="k">if</span> <span class="n">puppetversion</span> <span class="o">=</span> <span class="no">ENV</span><span class="p">[</span><span class="s1">'PUPPET_GEM_VERSION'</span><span class="p">]</span>
<span class="n">gem</span> <span class="s1">'puppet'</span><span class="p">,</span> <span class="n">puppetversion</span>
<span class="k">else</span>
<span class="n">gem</span> <span class="s1">'puppet'</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Note that I expect an optional environment variable there that allows me to test against different Puppet versions. This is important for modules I support on the Forge.</p>
<h4 id="rakefile">Rakefile</h4>
<p>My simplest Rakefile contains these lines:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'puppetlabs_spec_helper/rake_tasks'</span>
<span class="no">PuppetLint</span><span class="p">.</span><span class="nf">configuration</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="s1">'disable_2sp_soft_tabs'</span><span class="p">)</span>
<span class="no">PuppetLint</span><span class="p">.</span><span class="nf">configuration</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="s1">'disable_arrow_alignment'</span><span class="p">)</span>
<span class="no">PuppetLint</span><span class="p">.</span><span class="nf">configuration</span><span class="p">.</span><span class="nf">send</span><span class="p">(</span><span class="s1">'disable_variables_not_enclosed'</span><span class="p">)</span>
</code></pre></div></div>
<p>That just says to include the standard Rake tasks from puppetlabs_spec_helper, and sets up my linting preferences.</p>
<h4 id="spec_helperrb">spec_helper.rb</h4>
<p>Finally, my simplest spec helper is:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s1">'puppetlabs_spec_helper/module_spec_helper'</span>
<span class="no">RSpec</span><span class="p">.</span><span class="nf">configure</span> <span class="k">do</span> <span class="o">|</span><span class="n">c</span><span class="o">|</span>
<span class="n">c</span><span class="p">.</span><span class="nf">formatter</span> <span class="o">=</span> <span class="ss">:documentation</span>
<span class="n">c</span><span class="p">.</span><span class="nf">tty</span> <span class="o">=</span> <span class="kp">true</span>
<span class="k">end</span>
</code></pre></div></div>
<p>Again, I just include the default configuration from puppetlabs_spec_helper and set Rspec output to documentation mode. The tty setting is needed for coloured output in build pipelines like Travis, Bitbucket etc.</p>
<h3 id="running-the-tests">Running the tests</h3>
<p>We run the tests using commands familiar to Ruby developers:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ bundle exec rake spec
I, [2018-10-02T20:44:19.368027 #72078] INFO -- : Creating symlink from spec/fixtures/modules/foo to /Users/alexharvey/git/home/pdktest
/Users/alexharvey/.rvm/rubies/ruby-2.4.1/bin/ruby -I/Users/alexharvey/.rvm/gems/ruby-2.4.1/gems/rspec-core-3.8.0/lib:/Users/alexharvey/.rvm/gems/ruby-2.4.1/gems/rspec-
support-3.8.0/lib /Users/alexharvey/.rvm/gems/ruby-2.4.1/gems/rspec-core-3.8.0/exe/rspec --pattern spec/\{aliases,classes,defines,unit,functions,hosts,integration,plan
s,type_aliases,types\}/\*\*/\*_spec.rb --color
foo
should contain Notify[bar]
Finished in 0.15346 seconds (files took 1.24 seconds to load)
1 example, 0 failures
</code></pre></div></div>
<h3 id="other-things">Other things</h3>
<p>Okay, I simplified the problem a bit by focusing only on Rspec, didn’t I. To fully set up module testing, I probably want all this running in a CI pipeline like Travis CI; I could want Rubocop; perhaps I need something like Beaker or Test Kitchen or equivalent; maybe I want Rspec-puppet-facts for multi-OS testing; if I intend to publish this on the Forge, I’ll need metadata and probably Puppet-blacksmith.</p>
<p>For now, I’m just focusing on the pain-point I hear about most often, which is how to set up Rspec.</p>
<h3 id="shared-boilerplate-problem">Shared boilerplate problem</h3>
<p>I should also mention what I’m calling the “shared boilerplate problem”, a problem that has been solved in one way by <a href="https://github.com/voxpupuli/modulesync">modulesync</a>. This is the problem of how to keep these files like Gemfile, Rakefile etc in sync when you manage lots of projects that all need the same files.</p>
<p>I don’t use modulesync as I found it too complicated. Instead, I wrote a very simple custom Ruby script which I called <code class="language-plaintext highlighter-rouge">sync_spec</code>.</p>
<p>Well, it definitely should be noted that PDK automatically solves this problem too.</p>
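<p>To give a feel for how small a home-grown answer to the shared boilerplate problem can be, here is a hypothetical sketch. It is not the <code class="language-plaintext highlighter-rouge">sync_spec</code> script itself; the function name, the file list and the idea of a single template directory are all assumptions made for illustration.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require 'fileutils'

# Hypothetical sketch, not the sync_spec script mentioned above.
# Files that every module checkout is assumed to share.
BOILERPLATE = %w[Gemfile Rakefile .gitignore spec/spec_helper.rb].freeze

# Copies each boilerplate file from template_dir into every module directory,
# skipping files that are missing from the template or already identical.
# Returns the list of files that were written.
def sync_boilerplate(template_dir, module_dirs, files = BOILERPLATE)
  updated = []
  module_dirs.each do |mod|
    files.each do |file|
      source = File.join(template_dir, file)
      target = File.join(mod, file)
      next unless File.file?(source)
      next if File.file?(target) &amp;&amp; FileUtils.identical?(source, target)

      FileUtils.mkdir_p(File.dirname(target))
      FileUtils.cp(source, target)
      updated << target
    end
  end
  updated
end
</code></pre></div></div>
<p>Something like <code class="language-plaintext highlighter-rouge">sync_boilerplate('templates', Dir.glob('modules/*'))</code> would then bring every checkout under a hypothetical <code class="language-plaintext highlighter-rouge">modules</code> directory back in line. Because nothing here is specific to Puppet, the same approach works for other kinds of repos too.</p>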
<h2 id="doing-things-the-new-way">Doing things the new way</h2>
<h3 id="pdk-new-module">pdk new module</h3>
<p>The Puppet PDK automates generation of all the above boilerplate and much, much more. It’s very easy to use too and has a nice user interface. I started by running the following command:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ pdk new module pdktest
</code></pre></div></div>
<p>Then I was asked four questions. My Forge username, which happened to be the same as my laptop username, so PDK guessed that correctly. Then my full name. The license I use. And the operating systems I wished to support. This led to the creation of:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>17 files changed, 673 insertions(+)
</code></pre></div></div>
<p>And 17 files and 673 lines of code for free is either a lot of time saved or a lot of magic, depending how you look at it. On the other hand, to do what I had wanted I needed only 6 files and 30 lines of code. Still, it’s certainly easy.</p>
<h3 id="pdk-new-class">pdk new class</h3>
<p>Next, I created a class:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ pdk new class foo
pdk (INFO): Creating '/Users/alexharvey/git/home/pdktest/manifests/foo.pp' from template.
pdk (INFO): Creating '/Users/alexharvey/git/home/pdktest/spec/classes/foo_spec.rb' from template.
</code></pre></div></div>
<p>This created an empty foo class with documentation examples, and then an empty spec file with an assumption that I would use rspec-puppet-facts thrown in for free. That’s fine, I can refactor that out.</p>
<h3 id="running-the-tests-1">Running the tests</h3>
<p>Then, PDK also magically ran the tests for me:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ pdk test unit
pdk (INFO): Using Ruby 2.4.4
pdk (INFO): Using Puppet 5.5.3
[✔] Preparing to run the unit tests.
[✔] Running unit tests.
Evaluated 1 tests in 0.286737 seconds: 0 failures, 0 pending.
</code></pre></div></div>
<p>If I was new to Puppet and/or Ruby, I would have no idea what happened here. As it is, I assume that PDK actually ran all the Rspec tests for me. And hid all the output too! What if a test fails, I wondered. So I changed my Rspec assertion to something that would fail:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ pdk test unit
pdk (INFO): Using Ruby 2.4.4
pdk (INFO): Using Puppet 5.5.3
[✔] Preparing to run the unit tests.
[✖] Running unit tests.
Evaluated 1 tests in 0.383444 seconds: 1 failures, 0 pending.
failed: rspec: ./spec/classes/foo_spec.rb:4: expected that the catalogue would not contain Notify[bar]
foo should not contain Notify[bar]
Failure/Error:
describe 'foo' do
it { is_expected.to_not contain_notify('bar') }
end
</code></pre></div></div>
<p>That’s pretty clever. PDK knows which bits of the Rspec output are important, and it’s showing me only that.</p>
<h3 id="pdk-convert">pdk convert</h3>
<p>Another cool feature is <code class="language-plaintext highlighter-rouge">pdk convert</code>. I can go into any module that I maintain and in one command convert to the PDK way of doing things. For instance, my firewall_multi module:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ pdk convert
------------Files to be added-----------
appveyor.yml
.gitlab-ci.yml
.pdkignore
.yardopts
spec/default_facts.yml
.rubocop.yml
----------Files to be modified----------
metadata.json
spec/spec_helper.rb
Gemfile
.gitignore
.travis.yml
.rspec
Rakefile
----------------------------------------
You can find a report of differences in convert_report.txt.
pdk (INFO): Module conversion is a potentially destructive action. Ensure that you have committed your module to a version control system or have a backup, and review
the changes above before continuing.
Do you want to continue and make these changes to your module? Yes
------------Convert completed-----------
6 files added, 7 files modified.
</code></pre></div></div>
<p>And I found that nothing was broken and <code class="language-plaintext highlighter-rouge">pdk test unit</code> then ran all my tests.</p>
<p>Further inspection led me to realise that there are some improvements I can make in my <code class="language-plaintext highlighter-rouge">.travis.yml</code>, although I was inclined to reject the rest of the changes.</p>
<p>Still, it’s an impressive tool.</p>
<h2 id="discussion">Discussion</h2>
<h3 id="advantages">Advantages</h3>
<p>PDK is an opinionated tool for setting up Rspec and a whole range of other Puppet development tools in the way that Puppet likes it. I can see that it lowers the barrier to entry to a lot of automated testing and other best practices; it hides all the messy details of Ruby and Rspec, and replaces all that with a nice user experience. It is also supported by Puppet and the Puppet community, and it solves the shared Rspec boilerplate problem more cleanly than modulesync.</p>
<p>There are many, obvious reasons to use it. But there are also reasons to not use it.</p>
<h3 id="disadvantages">Disadvantages</h3>
<h4 id="loss-of-control">Loss of control</h4>
<p>PDK evidently works best if you accept its preferences and ways of doing things.</p>
<p>If, on the other hand, you have an opinionated module setup of your own, it will be necessary to use <a href="https://github.com/puppetlabs/pdk-templates">pdk-templates</a>, and I would expect that going in that direction leads quickly to a testing setup that is more complicated for novices and experts alike. And also, although I haven’t delved into it yet, I expect that many features of the PDK-managed tools simply can’t be used at all with PDK, and I expect that some PDK preferences can’t be turned off.</p>
<h4 id="pollution-of-config">Pollution of config</h4>
<p>Another problem with PDK is that it pollutes your repos with nearly 700 lines of config, most of which aren’t applicable to you. So, most users won’t be using the AppVeyor or the Gitlab CI Runner for instance.</p>
<p>On the other hand, it is my belief that code should also be documentation and for code to be good documentation, superfluous config that is unused needs to be removed. PDK appears to make this impossible. As a consequence, someone reading and trying to understand the design of PDK-managed tests would at times not know which Rspec options the tests actually depended on.</p>
<h4 id="barrier-to-understanding">Barrier to understanding</h4>
<p>Success with Puppet requires the user to understand Ruby and Rspec. (No, it really does.) And the Ruby community has produced wonderful tools, including Bundler, Rake, Yard etc. As a Puppet developer, I want people in my team to know how to write Gemfiles and Rakefiles and so on by themselves. This knowledge is going to be crucial, sooner or later, even if only in the context of debugging.</p>
<p>Likewise, learning how to design and write good unit tests - and I mean with the full understanding of what you are doing and why, rather than writing tests for the sake of it - is hard. Learning how to think of a piece of code as a unit, as a black or white box, and expressing its expected behaviour in Rspec is an art - a rewarding art too - and also the secret to the rapid development of infrastructure-as-code. But I don’t believe in falsely raising expectations that this is, or that it should be, easy. Lawyers, doctors, engineers, and others have no such demand that their work should be easy, and neither should the infrastructure-as-code developer. It is as easy or as difficult as it needs to be, but no more or less.</p>
<h4 id="barrier-to-advanced-testing">Barrier to advanced testing</h4>
<p>PDK also presents a barrier for advanced users who do testing that goes beyond what the engineers at Puppet currently do. For example, PDK gets in the way of the Rspec data testing pattern I advocate. What if I want to test that all file content representing JSON files inside Puppet catalogs is valid JSON? What if I want Bash shell scripts to be tested in shUnit2 or BATS?</p>
<p>PDK actually gets in the way and makes it hard to do this.</p>
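<p>As an illustration of the JSON case, the core of such a test could be a helper like the one sketched below. This is a minimal sketch with hypothetical names; it assumes the file titles and contents have already been collected from the compiled catalog into a plain Ruby hash, which is the part that depends on the test setup.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require 'json'

# True if text parses as JSON; nil or non-string content counts as invalid.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError, TypeError
  false
end

# files is assumed to map a file path to the content Puppet would write there.
# Returns the paths of all *.json files whose content is not valid JSON.
def invalid_json_files(files)
  files.select { |path, _content| path.end_with?('.json') }
       .reject { |_path, content| valid_json?(content) }
       .keys
end
</code></pre></div></div>
<p>An Rspec example would then only need to assert that <code class="language-plaintext highlighter-rouge">invalid_json_files</code> returns an empty list.</p>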
<h4 id="only-solves-sync-for-puppet">Only solves sync for Puppet</h4>
<p>For many, a solution to the shared boilerplate problem is likely to be a big reason to use PDK, but I believe that, in practice, most teams will eventually require a solution that handles both Puppet repos and other infrastructure-as-code repos that are unrelated to Puppet. If a site truly embraces infrastructure-as-code, it’s likely that they will have Bash scripts, Ruby and Python apps, so on, and in my experience, these other projects also end up with shared boilerplate that needs to be managed. In the past, I have used my <code class="language-plaintext highlighter-rouge">sync_spec</code> solution for this too, and I would expect that most sites do something similar. And if not, they probably should.</p>
<h3 id="how-i-will-use-it">How I will use it</h3>
<p>So, will I use PDK, and if so how?</p>
<p>The above may have caused the reader to assume that I am not going to use PDK at all, but that’s not the case. Rather, I intend to use it to keep abreast of best practices from Puppet and the community by periodically comparing my code to changes recommended by <code class="language-plaintext highlighter-rouge">pdk convert</code>. It’s also possible that I’ll use pdk-generated code as a starting point for my modules, and edit away the bits I don’t need.</p>
<p>And for most other users, too, this would be my recommended way of using PDK. Of course, I can understand why some groups - e.g. the Vox Pupuli community - might choose to fully embrace the PDK way of doing things. And Puppet Enterprise customers may use PDK in order to be better supported by Puppet.</p>
<p>But for most teams who aren’t already Ruby and Puppet power users, my recommendation is to do what I do.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I have tried out PDK and written about my experiences with it. I argue that many teams should think carefully about whether they really want to fully embrace this tool’s way of doing things, or if it’s still better to learn the Ruby way. Meanwhile, I personally plan to use the tool as a convenient way to keep informed of best practices in Puppet.</p>
<h2 id="see-also">See also</h2>
<p>For other views:</p>
<ul>
<li>Rob Nelson, <a href="https://rnelson0.com/2018/06/08/convert-a-puppet-module-from-bundle-based-testing-to-the-puppet-development-kit-pdk/">Convert a Puppet module from bundle-based testing to the Puppet Development Kit (PDK)</a>.</li>
</ul>Alex HarveyIt was about a year ago that Puppet released its Puppet Development Kit (PDK), to simplify and streamline the development of Puppet modules. This post investigates the pros and cons of using PDK compared to managing modules like any other Ruby project.Analysing Puppet module dependencies using JQ2018-09-06T00:00:00+00:002018-09-06T00:00:00+00:00https://alexharv074.github.io//puppet/2018/09/06/analysing-puppet-module-dependencies-using-jq<p>About a month ago, it seems, Puppet’s stdlib module version 5.0.0 was <a href="https://github.com/puppetlabs/puppetlabs-stdlib/commit/597769a73cc194ea9daa8a49b5707be45ad5240b">released</a>, and if my own ELK project is any indication, a lot of code bases out there that use Librarian Puppet to pull in stdlib will be confused and broken as mine was.</p>
<p>The main reason for this post, however, was just to document the method of analysing the dependencies I discovered.</p>
<h2 id="broken-build">Broken build</h2>
<p>The failed build is available online <a href="https://travis-ci.org/alexharv074/elk/jobs/420704510">here</a> and I ran it again in Librarian verbose mode <a href="https://travis-ci.org/alexharv074/elk/jobs/423859931">here</a>. Librarian can be seen failing like this:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Librarian] Resolving puppetlabs-concat (>= 0) <https://forgeapi.puppetlabs.com>
[Librarian] Checking manifests
[Librarian] Module puppetlabs-concat found versions: 5.0.0, 4.2.1, 4.2.0, 4.1.1, 4.1.0, 4.0.1, 4.0.0, 3.0.0, 2.2.1, 2.2.0, 2.1.0, 1.2.5, 1.2.4, 1.2.3, 1.2.2, 1.2.1, 1.2.0, 1.1.2, 1.1.1, 1.1.0, 1.1.0-rc1, 1.0.4, 1.0.3, 1.0.2, 1.0.1, 1.0.0, 1.0.0-rc1
[Librarian] Checking puppetlabs-concat/5.0.0 <https://forgeapi.puppetlabs.com>
[Librarian] Conflict between puppetlabs-concat/5.0.0 <https://forgeapi.puppetlabs.com> and puppetlabs-concat (< 5.0.0, >= 3.0.0) <https://forgeapi.puppetlabs.com>
[Librarian] Backtracking from puppetlabs-concat/5.0.0 <https://forgeapi.puppetlabs.com>
[Librarian] Checking puppetlabs-concat/4.2.1 <https://forgeapi.puppetlabs.com>
[Librarian] Resolved puppetlabs-concat (>= 0) <https://forgeapi.puppetlabs.com> at puppetlabs-concat/4.2.1 <https://forgeapi.puppetlabs.com>
[Librarian] Resolved puppetlabs-concat (>= 0) <https://forgeapi.puppetlabs.com>
[Librarian] Conflict between puppetlabs-stdlib (< 5.0.0, >= 4.13.1) <https://forgeapi.puppetlabs.com> and puppetlabs-stdlib/5.0.0 <https://forgeapi.puppetlabs.com>
Could not resolve the dependencies.
</code></pre></div></div>
<p>While it was clear that one of my modules wanted <code class="language-plaintext highlighter-rouge">puppetlabs-stdlib/5.0.0</code> and this conflicted with concat’s requirement for <code class="language-plaintext highlighter-rouge">< 5.0.0, >= 4.13.1</code>, it was less clear as to which one!</p>
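<p>The conflict itself is easy to reproduce in plain Ruby. The sketch below uses RubyGems’ own version classes as a stand-in; Librarian Puppet has its own resolver, so treat this as an illustration of the constraint rather than of Librarian’s internals. The 4.x version numbers are just sample values.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require 'rubygems'

# concat's constraint on stdlib, as reported in the Librarian output above.
CONCAT_NEEDS = Gem::Requirement.new('>= 4.13.1', '< 5.0.0')

# True if the given stdlib version satisfies concat's constraint.
def satisfies_concat?(stdlib_version)
  CONCAT_NEEDS.satisfied_by?(Gem::Version.new(stdlib_version))
end

%w[4.13.1 4.25.1 5.0.0].each do |version|
  puts "stdlib #{version}: #{satisfies_concat?(version) ? 'ok' : 'conflict'}"
end
</code></pre></div></div>
<p>Only 5.0.0 is reported as a conflict, which matches the last line of the Librarian output.</p>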
<h2 id="querying-stdlib-versions">Querying stdlib versions</h2>
<p>This JQ command here allowed me to view all dependencies conveniently:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ cat spec/**/metadata.json | \
> jq '.dependencies[] | select(.name=="puppetlabs/stdlib") | .version_requirement'
">= 4.16.0 < 5.0.0"
">= 4.13.1 < 5.0.0"
">=3.2.0 <5.0.0"
">= 4.13.1 < 5.0.0"
">= 4.13.0 < 5.0.0"
">= 3.0.0"
">=4.13.0 <5.0.0"
">= 4.0.0 < 5.0.0"
">=3.2.0 <5.0.0"
">= 4.13.1 < 5.0.0"
">= 4.22.0 <5.0.0"
">= 4.13.1 < 5.0.0"
">= 1.0.2 <5.0.0"
">= 4.13.0 < 5.0.0"
</code></pre></div></div>
<p>This led me to realise:</p>
<ol>
<li>Practically every module out there is specifying stdlib <code class="language-plaintext highlighter-rouge">< 5.0.0</code>!</li>
<li>No actual module was specifying <code class="language-plaintext highlighter-rouge">== 5.0.0</code>.</li>
</ol>
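<p>One limitation of the JQ one-liner is that it does not say which module each requirement came from. A small Ruby sketch along the following lines would show that too. The function name is hypothetical, and it assumes each <code class="language-plaintext highlighter-rouge">metadata.json</code> has the usual <code class="language-plaintext highlighter-rouge">name</code> and <code class="language-plaintext highlighter-rouge">dependencies</code> keys.</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require 'json'

# Returns a hash of module name => version requirement for one dependency,
# built from a list of metadata.json paths.
def dependency_requirements(metadata_paths, dependency = 'puppetlabs/stdlib')
  metadata_paths.each_with_object({}) do |path, found|
    metadata = JSON.parse(File.read(path))
    (metadata['dependencies'] || []).each do |dep|
      # Dependency names appear with either "/" or "-" as the separator.
      next unless dep['name'].to_s.tr('-', '/') == dependency

      found[metadata['name']] = dep['version_requirement']
    end
  end
end
</code></pre></div></div>
<p>Calling <code class="language-plaintext highlighter-rouge">dependency_requirements(Dir.glob('spec/**/metadata.json'))</code> and printing each pair would then give one line per module.</p>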
<h2 id="the-root-cause">The root cause</h2>
<p>It turns out that the problem was simply that I had a Puppetfile line that I thought was requesting stdlib, <em>any</em> version, like this:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mod 'puppetlabs/stdlib'
</code></pre></div></div>
<p>Whereas evidently such a line causes Librarian Puppet to require the <em>latest</em> version.</p>
<h2 id="the-fix">The fix</h2>
<p>So, I changed my Puppetfile to this and everything was fine:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code># All other dependencies specify < 5.0.0. If I allow the latest version,
# released ~ 18 days ago at the time of writing, librarian-puppet
# can't resolve the dependencies.
#
mod 'puppetlabs/stdlib', '< 5.0.0'
</code></pre></div></div>
<h2 id="conclusion">Conclusion</h2>
<p>I thought the behaviour of Librarian Puppet here was surprising enough to be worth documenting, and the <code class="language-plaintext highlighter-rouge">jq</code> command useful enough that I’ll probably want it again some day. I also hope this is helpful to others.</p>Alex HarveyAbout a month ago, it seems, Puppet’s stdlib module version 5.0.0 was released, and if my own ELK project is any indication, a lot of code bases out there that use Librarian Puppet to pull in stdlib will be confused and broken as mine was.Pretty-printing Puppet data2018-09-02T00:00:00+00:002018-09-02T00:00:00+00:00https://alexharv074.github.io//puppet/2018/09/02/pretty-printing-puppet-data<p>Sometimes it is useful to be able to pretty-print Puppet data when debugging. It would be great if there was native support for this, e.g. a built-in function <code class="language-plaintext highlighter-rouge">pp()</code> would be nice.</p>
<p>Until then, I found a useful <a href="https://gist.github.com/Cinderhaze/6d1e90dec0184284eb25910b5ce06b5f">Gist</a> by GitHub user Cinderhaze a.k.a. Daryl, which is the basis of this method.</p>
<h2 id="sample-code">Sample code</h2>
<p>Consider the following snippet:</p>
<div class="language-puppet highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># data.pp
</span><span class="nv">$data</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"test.ltd"</span> <span class="p">=></span> <span class="p">{</span>
<span class="s2">"ensure"</span> <span class="p">=></span> <span class="s2">"present"</span><span class="p">,</span>
<span class="s2">"zone_contact"</span> <span class="p">=></span> <span class="s2">"contact.test.ltd"</span><span class="p">,</span>
<span class="s2">"zone_ns"</span> <span class="p">=></span> <span class="p">[</span><span class="s2">"ns0.test.ltd"</span><span class="p">,</span> <span class="s2">"ns1.test.ltd"</span><span class="p">],</span>
<span class="s2">"zone_serial"</span> <span class="p">=></span> <span class="s2">"2018010101"</span><span class="p">,</span>
<span class="s2">"zone_ttl"</span> <span class="p">=></span> <span class="s2">"767200"</span><span class="p">,</span>
<span class="s2">"zone_origin"</span> <span class="p">=></span> <span class="s2">"test.ltd"</span><span class="p">,</span>
<span class="s2">"hash_data"</span> <span class="p">=></span> <span class="p">{</span>
<span class="s2">"newyork"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"11.22.33.44"</span><span class="p">},</span>
<span class="s2">"tokyo"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"22.33.44.55"</span><span class="p">},</span>
<span class="s2">"london"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"33.44.55.66"</span><span class="p">},</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nf">notice</span><span class="p">(</span><span class="nv">$data</span><span class="p">)</span>
</code></pre></div></div>
<p>If I apply this manifest, the output is all on one line and hard to read:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ puppet apply data.pp
Notice: Scope(Class[main]): {test.ltd => {ensure => present, zone_contact => contact.test.ltd, zone_ns => [ns0.test.ltd, ns1.test.ltd], zone_serial => 2018010101, zone_ttl => 767200, zone_origin => test.ltd, hash_data => {newyork => {owner => 11.22.33.44}, tokyo => {owner => 22.33.44.55}, london => {owner => 33.44.55.66}}}}
Notice: Compiled catalog for alexs-macbook-pro.local in environment production in 0.02 seconds
Notice: Applied catalog in 0.01 seconds
</code></pre></div></div>
<h2 id="pretty-printing">Pretty-printing</h2>
<p>Using the idea suggested by Cinderhaze:</p>
<div class="language-puppet highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># data.pp
</span><span class="nv">$data</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"test.ltd"</span> <span class="p">=></span> <span class="p">{</span>
<span class="s2">"ensure"</span> <span class="p">=></span> <span class="s2">"present"</span><span class="p">,</span>
<span class="s2">"zone_contact"</span> <span class="p">=></span> <span class="s2">"contact.test.ltd"</span><span class="p">,</span>
<span class="s2">"zone_ns"</span> <span class="p">=></span> <span class="p">[</span><span class="s2">"ns0.test.ltd"</span><span class="p">,</span> <span class="s2">"ns1.test.ltd"</span><span class="p">],</span>
<span class="s2">"zone_serial"</span> <span class="p">=></span> <span class="s2">"2018010101"</span><span class="p">,</span>
<span class="s2">"zone_ttl"</span> <span class="p">=></span> <span class="s2">"767200"</span><span class="p">,</span>
<span class="s2">"zone_origin"</span> <span class="p">=></span> <span class="s2">"test.ltd"</span><span class="p">,</span>
<span class="s2">"hash_data"</span> <span class="p">=></span> <span class="p">{</span>
<span class="s2">"newyork"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"11.22.33.44"</span><span class="p">},</span>
<span class="s2">"tokyo"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"22.33.44.55"</span><span class="p">},</span>
<span class="s2">"london"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"33.44.55.66"</span><span class="p">},</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nv">$content</span> <span class="o">=</span> <span class="nf">inline_template</span><span class="p">(</span><span class="s2">"
<%- require 'json' -%>
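<%= JSON.pretty_generate(@data) %>
"</span><span class="p">)</span>
<span class="nf">notice</span><span class="p">(</span><span class="nv">$content</span><span class="p">)</span>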
</code></pre></div></div>
<p>I now get nice readable JSON-formatted output:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ puppet apply data.pp
Notice: Scope(Class[main]):
{
"test.ltd": {
"ensure": "present",
"zone_contact": "contact.test.ltd",
"zone_ns": [
"ns0.test.ltd",
"ns1.test.ltd"
],
"zone_serial": "2018010101",
"zone_ttl": "767200",
"zone_origin": "test.ltd",
"hash_data": {
"newyork": {
"owner": "11.22.33.44"
},
"tokyo": {
"owner": "22.33.44.55"
},
"london": {
"owner": "33.44.55.66"
}
}
}
}
Notice: Compiled catalog for alexs-macbook-pro.local in environment production in 0.04 seconds
Notice: Applied catalog in 0.01 seconds
</code></pre></div></div>
<h2 id="using-awesome-print-instead">Using awesome-print instead</h2>
<p>Or we could use the Ruby awesome_print library:</p>
<div class="language-puppet highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$data</span> <span class="o">=</span> <span class="p">{</span>
<span class="s2">"test.ltd"</span> <span class="p">=></span> <span class="p">{</span>
<span class="s2">"ensure"</span> <span class="p">=></span> <span class="s2">"present"</span><span class="p">,</span>
<span class="s2">"zone_contact"</span> <span class="p">=></span> <span class="s2">"contact.test.ltd"</span><span class="p">,</span>
<span class="s2">"zone_ns"</span> <span class="p">=></span> <span class="p">[</span><span class="s2">"ns0.test.ltd"</span><span class="p">,</span> <span class="s2">"ns1.test.ltd"</span><span class="p">],</span>
<span class="s2">"zone_serial"</span> <span class="p">=></span> <span class="s2">"2018010101"</span><span class="p">,</span>
<span class="s2">"zone_ttl"</span> <span class="p">=></span> <span class="s2">"767200"</span><span class="p">,</span>
<span class="s2">"zone_origin"</span> <span class="p">=></span> <span class="s2">"test.ltd"</span><span class="p">,</span>
<span class="s2">"hash_data"</span> <span class="p">=></span> <span class="p">{</span>
<span class="s2">"newyork"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"11.22.33.44"</span><span class="p">},</span>
<span class="s2">"tokyo"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"22.33.44.55"</span><span class="p">},</span>
<span class="s2">"london"</span> <span class="p">=></span> <span class="p">{</span><span class="s2">"owner"</span> <span class="p">=></span> <span class="s2">"33.44.55.66"</span><span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="nv">$content</span> <span class="o">=</span> <span class="nf">inline_template</span><span class="p">(</span><span class="s2">"
<%- require 'awesome_print' -%>
<%= ap(@data) %>
"</span><span class="p">)</span>
<span class="nf">notice</span><span class="p">(</span><span class="nv">$content</span><span class="p">)</span>
</code></pre></div></div>
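<p>And get the following. Note that the formatted hash is printed straight to standard output by <code class="language-plaintext highlighter-rouge">ap</code> itself, while the <code class="language-plaintext highlighter-rouge">notice</code> line beneath it shows only the unformatted value that <code class="language-plaintext highlighter-rouge">ap</code> returns:</p>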
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ puppet apply data.pp
{
"test.ltd" => {
"ensure" => "present",
"zone_contact" => "contact.test.ltd",
"zone_ns" => [
[0] "ns0.test.ltd",
[1] "ns1.test.ltd"
],
"zone_serial" => "2018010101",
"zone_ttl" => "767200",
"zone_origin" => "test.ltd",
"hash_data" => {
"newyork" => {
"owner" => "11.22.33.44"
},
"tokyo" => {
"owner" => "22.33.44.55"
},
"london" => {
"owner" => "33.44.55.66"
}
}
}
}
Notice: Scope(Class[main]):
{"test.ltd"=>{"ensure"=>"present", "zone_contact"=>"contact.test.ltd", "zone_ns"=>["ns0.test.ltd", "ns1.test.ltd"], "zone_serial"=>"2018010101", "zone_ttl"=>"767200", "zone_origin"=>"test.ltd", "hash_data"=>{"newyork"=>{"owner"=>"11.22.33.44"}, "tokyo"=>{"owner"=>"22.33.44.55"}, "london"=>{"owner"=>"33.44.55.66"}}}}
Notice: Compiled catalog for alexs-macbook-pro.local in environment production in 0.08 seconds
Notice: Applied catalog in 0.01 seconds
</code></pre></div></div>Alex HarveySometimes it is useful to be able to pretty-print Puppet data when debugging. It would be great if there was native support for this, e.g. a built-in function pp() would be nice.JQ commands for Puppet catalogs2017-11-30T00:00:00+00:002017-11-30T00:00:00+00:00https://alexharv074.github.io//puppet/2017/11/30/jq-commands-for-puppet-catalogs<p>This is a page dedicated to useful JQ commands for querying a compiled Puppet catalog.</p>
<ul id="markdown-toc">
<li><a href="#general-note-about-puppet-3-v-puppet-45-catalogs" id="markdown-toc-general-note-about-puppet-3-v-puppet-45-catalogs">General note about Puppet 3 v Puppet 4/5 catalogs</a></li>
<li><a href="#list-all-file-resources-by-title" id="markdown-toc-list-all-file-resources-by-title">List all file resources by title</a></li>
<li><a href="#select-a-file-resource-based-on-title" id="markdown-toc-select-a-file-resource-based-on-title">Select a file resource based on title</a></li>
<li><a href="#select-all-files-resource-based-on-title-that-contains-a-string" id="markdown-toc-select-all-files-resource-based-on-title-that-contains-a-string">Select all file resources whose title contains a string</a></li>
<li><a href="#list-all-the-classes-in-a-catalog" id="markdown-toc-list-all-the-classes-in-a-catalog">List all the classes in a catalog</a></li>
<li><a href="#list-all-defined-types-in-a-catalog" id="markdown-toc-list-all-defined-types-in-a-catalog">List all defined types in a catalog</a></li>
<li><a href="#list-all-resource-types-in-a-catalog" id="markdown-toc-list-all-resource-types-in-a-catalog">List all resource types in a catalog</a></li>
<li><a href="#debugging-a-catalog-dependency-issue" id="markdown-toc-debugging-a-catalog-dependency-issue">Debugging a catalog dependency issue</a></li>
<li><a href="#change-the-rspec-to-generate-the-catalog-file" id="markdown-toc-change-the-rspec-to-generate-the-catalog-file">Change the Rspec to generate the catalog file</a></li>
<li><a href="#find-these-resources-in-the-catalogs-along-with-their-locations-in-the-manifests" id="markdown-toc-find-these-resources-in-the-catalogs-along-with-their-locations-in-the-manifests">Find these resources in the catalogs along with their locations in the manifests</a></li>
</ul>
<h2 id="general-note-about-puppet-3-v-puppet-45-catalogs">General note about Puppet 3 v Puppet 4/5 catalogs</h2>
<p>In Puppet 4 the catalog structure changed a little, and it is important to be aware that the crucial resources key is in the top level in Puppet 4/5, whereas it’s nested under the data key in Puppet 3.</p>
<p>All commands below are for Puppet 4/5.</p>
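<p>If a query has to cope with both layouts, jq’s alternative operator <code class="language-plaintext highlighter-rouge">//</code> can fall back from one location to the other. This is a minimal sketch that assumes the Puppet 3 layout is exactly as described above (<code class="language-plaintext highlighter-rouge">.data.resources</code>). The two catalog files are tiny hand-written stand-ins, not real catalogs, and <code class="language-plaintext highlighter-rouge">list_files</code> is a hypothetical helper name:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
set -e

# Hand-written stand-ins for a Puppet 4/5 and a Puppet 3 catalog.
tmp=$(mktemp -d)
printf '%s\n' '{"resources": [{"type": "File", "title": "/tmp/a"}]}' > "$tmp/puppet4.json"
printf '%s\n' '{"data": {"resources": [{"type": "File", "title": "/tmp/b"}]}}' > "$tmp/puppet3.json"

# Use the top-level resources key if present, otherwise the one under data.
list_files() {
  jq -r '(.resources // .data.resources)[] | select(.type == "File") | .title' "$1"
}

list_files "$tmp/puppet4.json"
list_files "$tmp/puppet3.json"
</code></pre></div></div>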
<h2 id="list-all-file-resources-by-title">List all file resources by title</h2>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq '.resources[] | select(.type == "File") | .title' < catalog.json
</code></pre></div></div>
<h2 id="select-a-file-resource-based-on-title">Select a file resource based on title</h2>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq '.resources[] | select((.type == "File") and (.title=="sources.list.d"))' < catalog.json
</code></pre></div></div>
<h2 id="select-all-files-resource-based-on-title-that-contains-a-string">Select all file resources whose title contains a string</h2>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq '.resources[] | select((.title | contains("/home")) and (.type == "File")) | .title' < catalog.json
</code></pre></div></div>
<h2 id="list-all-the-classes-in-a-catalog">List all the classes in a catalog</h2>
<p>Similar to listing resources:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq '.resources[] | select(.type=="Class") | .title' < catalog.json
</code></pre></div></div>
<h2 id="list-all-defined-types-in-a-catalog">List all defined types in a catalog</h2>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq -r '.resources[] | select(.type | contains("::")) | [.type, .title] | @csv' < catalog.json
</code></pre></div></div>
<h2 id="list-all-resource-types-in-a-catalog">List all resource types in a catalog</h2>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq '.resources[] | .type' < catalog.json | sort -u
</code></pre></div></div>
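<p>A small variation counts how many resources of each type the catalog contains, which can be a quick way to spot what dominates it. The catalog below is a made-up three-resource sample. With a real catalog, only the pipeline on the <code class="language-plaintext highlighter-rouge">type_counts</code> line is needed:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
set -e

# Made-up sample catalog with two File resources and one Exec.
catalog=$(mktemp)
printf '%s\n' '{"resources": [{"type": "File", "title": "/a"}, {"type": "File", "title": "/b"}, {"type": "Exec", "title": "apt_update"}]}' > "$catalog"

# -r strips the JSON quotes so that sort and uniq see plain type names.
type_counts=$(jq -r '.resources[].type' "$catalog" | sort | uniq -c | sort -rn)
echo "$type_counts"
</code></pre></div></div>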
<h2 id="debugging-a-catalog-dependency-issue">Debugging a catalog dependency issue</h2>
<p>Example error message:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1) rspec should compile into a catalogue without dependency cycles
Failure/Error: is_expected.to compile.with_all_deps
error during compilation: Could not retrieve dependency 'File[/etc/apt/sources.list.d]' of Exec[apt_update]
# ./spec/hosts/role_default_spec.rb:25:in `block (2 levels) in <top (required)>'
</code></pre></div></div>
<p>Rspec names the two resources but doesn’t tell us where they are declared, so we need some magic to figure this out.</p>
<h2 id="change-the-rspec-to-generate-the-catalog-file">Change the Rspec to generate the catalog file</h2>
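<p>Comment out the <code class="language-plaintext highlighter-rouge">compile</code> expectation so that the example doesn’t fail before the catalog file is written out:</p>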
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">it</span> <span class="s1">'should compile and write out a catalog file'</span> <span class="k">do</span>
<span class="c1"># is_expected.to compile.with_all_deps</span>
<span class="no">File</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span>
<span class="s1">'catalogs/default_default_vagrant_vagrant.json'</span><span class="p">,</span>
<span class="no">PSON</span><span class="p">.</span><span class="nf">pretty_generate</span><span class="p">(</span><span class="n">catalogue</span><span class="p">)</span>
<span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>
<h2 id="find-these-resources-in-the-catalogs-along-with-their-locations-in-the-manifests">Find these resources in the catalogs along with their locations in the manifests</h2>
<p>Find the File resource:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq '.resources[] |
> select((.type == "File") and (.title=="sources.list.d")) |
> {"type": .type, "title": .title, "parameters": .parameters}' < catalog.json
{
"type": "File",
"title": "sources.list.d",
"parameters": {
"path": "/etc/apt/sources.list.d/",
"ensure": "directory",
"owner": "root",
"group": "root",
"purge": false,
"recurse": false,
"notify": "Exec[apt_update]"
}
}
</code></pre></div></div>
<p>Now find the Exec resource:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ jq '.resources[] |
> select((.type == "Exec") and (.title=="apt_update")) |
> {"type": .type, "title": .title, "file": .file, "line": .line, "parameters": .parameters}' \
> < catalog.json
{
"type": "Exec",
"title": "apt_update",
"file": "/path/to/manifests/apt.pp",
"line": 30,
"parameters": {
"command": "/usr/local/sbin/apt_update",
"logoutput": "on_failure",
"refreshonly": false,
"subscribe": "File[/etc/apt/sources.list.d]",
"require": [
"File[/etc/apt/apt.conf.d/99auth]",
"File[/usr/local/sbin/apt_update]"
]
}
}
</code></pre></div></div>
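<p>Rather than looking up each resource by hand, a rough jq heuristic can list every relationship whose target does not match any resource title in the catalog. This is only a sketch: the catalog below is a cut-down, hand-written imitation of the one in this post, and matching on <code class="language-plaintext highlighter-rouge">Type[title]</code> alone is stricter than what Puppet does, because as far as I know Puppet can also resolve a reference through a resource’s name or path. Treat the hits as candidates to inspect, not as confirmed errors:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
set -e

# Hand-written sample imitating the resources discussed above.
catalog=$(mktemp)
printf '%s\n' '{"resources": [
  {"type": "File", "title": "sources.list.d",
   "parameters": {"path": "/etc/apt/sources.list.d/", "notify": "Exec[apt_update]"}},
  {"type": "File", "title": "/usr/local/sbin/apt_update", "parameters": {}},
  {"type": "Exec", "title": "apt_update",
   "parameters": {"subscribe": "File[/etc/apt/sources.list.d]",
                  "require": ["File[/usr/local/sbin/apt_update]"]}}
]}' > "$catalog"

# For each relationship parameter, report targets with no matching Type[title].
dangling=$(jq -r '
  [.resources[] | "\(.type)[\(.title)]"] as $known
  | .resources[]
  | "\(.type)[\(.title)]" as $from
  | (.parameters // {})
  | to_entries[]
  | select(.key == "require" or .key == "subscribe" or .key == "notify" or .key == "before")
  | .key as $rel
  | .value
  | if type == "array" then .[] else . end
  | select(. as $ref | $known | index($ref) | not)
  | "\($from) \($rel) \(.)"
' "$catalog")
echo "$dangling"
</code></pre></div></div>
<p>On this sample, the only relationship reported is the Exec’s <code class="language-plaintext highlighter-rouge">subscribe</code>, since the File it points at is titled <code class="language-plaintext highlighter-rouge">sources.list.d</code> and not <code class="language-plaintext highlighter-rouge">/etc/apt/sources.list.d</code>.</p>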
<p>So we can see the bug and know which file to edit to fix it.</p>Alex HarveyThis is a page dedicated to useful JQ commands for querying a compiled Puppet catalog.Merge a Git repository and its history into a subdirectory of a second Git repository2017-10-04T00:00:00+00:002017-10-04T00:00:00+00:00https://alexharv074.github.io//puppet/2017/10/04/merge-a-git-repository-and-its-history-into-a-subdirectory-of-a-second-git-repository<p>On more than one occasion, I have needed to merge a Git repository and its history into a subdirectory of a second Git repository.</p>
<p>In this post, I document how to merge a Git repo git@git.example.com:BAR/repo2.git (“second repo”) into the subdir/ directory of another Git repo git@git.example.com:FOO/repo1.git (“first repo”). After the merge, I explain how to filter the history so that commands like git log, git blame and git show work as expected and show a history as if the files in the subdirectory had always been there.</p>
<h2 id="set-up-a-test-environment">Set up a test environment</h2>
<p>I begin by cloning the first repo into /var/tmp and changing into the clone as follows:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ cd /var/tmp
▶ git clone git@git.example.com:FOO/repo1.git
▶ cd repo1
</code></pre></div></div>
<h2 id="merge-the-second-repo-into-a-subdirectory">Merge the second repo into a subdirectory</h2>
<p>Now merge the second repo into the subdir subdirectory of the first repo. First, add the second repo as a remote and fetch its content:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ git remote add -f repo2 git@git.example.com:BAR/repo2.git
</code></pre></div></div>
<p>The -f option here means “after adding the remote, also fetch”.</p>
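<p>Next, perform the merge. The <code class="language-plaintext highlighter-rouge">-s ours</code> strategy records repo2/master as a parent of the merge while leaving the first repo’s files untouched, and <code class="language-plaintext highlighter-rouge">--no-commit</code> tells Git to pretend the merge failed and stop before committing, giving us a chance to inspect and further tweak the merge result.</p>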
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ git merge -s ours --no-commit repo2/master --allow-unrelated-histories
</code></pre></div></div>
<p>There should be a message printed to the screen, “Automatic merge went well; stopped before committing as requested”.</p>
<p>Stopping before the commit is what allows us to modify the tree using the read-tree command below.</p>
<p>A note about the <code class="language-plaintext highlighter-rouge">--allow-unrelated-histories</code> option. Since Git 2.9, the default behaviour of git merge has changed:</p>
<blockquote>
<p>“git merge” used to allow merging two branches that have no common base by default, which led to a brand new history of an existing project created and then get pulled by an unsuspecting maintainer, which allowed an unnecessary parallel history merged into the existing project. The command has been taught not to allow this by default, with an escape hatch <code class="language-plaintext highlighter-rouge">--allow-unrelated-histories</code> option to be used in a rare event that merges histories of two projects that started their lives independently.</p>
</blockquote>
<p>See <a href="https://github.com/git/git/blob/master/Documentation/RelNotes/2.9.0.txt#L58-L68">here</a> in the Git change log.</p>
<p>Naturally, the object of the git merge command, repo2/master, means “the master branch of the Git repo at the remote named ‘repo2’”.</p>
<p>Next, read the tree at the root of repo2/master into the index and check out its files under subdir:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ git read-tree --prefix=subdir -u repo2/master:
</code></pre></div></div>
<p>Note the colon at the end there. The colon is actually a delimiter, with remote/branch on the left-hand side and a path on the right-hand side. Our path is an empty string, which means to use the root.</p>
<p>This command returns more or less instantly and should produce no output.</p>
<p>The working tree should now have a directory at subdir. But the merge commit has not been done yet; the changes have only been read onto the current tree. Finally, we add the merge commit:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ git commit
</code></pre></div></div>
<p>Now I am prompted to use the default merge commit message, which is: “Merge remote-tracking branch ‘repo2/master’”.</p>
<p>Also, in the comments below the message in the editor, I see that this commit will add all of the files from repo2 under subdir.</p>
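<p>To rehearse the whole merge without touching a real repository, the steps can be run against two throwaway repos. This is a sketch under some assumptions: the repos live in a temporary directory, each contains a single made-up file, the second repo is added by local path instead of an SSH URL, and <code class="language-plaintext highlighter-rouge">make_repo</code> is a hypothetical helper. I have also written the prefix with a trailing slash, which I believe older Git versions required:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
set -e
work=$(mktemp -d)

# Helper: create a throwaway repo with one committed file on master.
make_repo() {
  git init -q "$work/$1"
  git -C "$work/$1" symbolic-ref HEAD refs/heads/master
  git -C "$work/$1" config user.email demo@example.com
  git -C "$work/$1" config user.name Demo
  echo "$2" > "$work/$1/$2"
  git -C "$work/$1" add "$2"
  git -C "$work/$1" commit -q -m "add $2"
}
make_repo repo1 a.txt
make_repo repo2 b.txt

cd "$work/repo1"
# Same steps as above, with a local path standing in for the remote URL.
git remote add -f repo2 "$work/repo2" > /dev/null 2>&1
git merge -q -s ours --no-commit --allow-unrelated-histories repo2/master
git read-tree --prefix=subdir/ -u repo2/master:
git commit -q --no-edit

# List the tracked files: a.txt from repo1 plus subdir/b.txt from repo2.
git ls-files
</code></pre></div></div>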
<h2 id="repairing-the-history">Repairing the history</h2>
<p>At this point, all might appear to be fine until we try to inspect the history of the files we have added.</p>
<p>If I run git log on one of those files, the only history is the merge commit from above that added them. If I run git blame on those files, they are shown to have their old paths. Same with git show.</p>
<p>To repair the history, I wrote a git filter-branch script, based on a Stack Overflow post to which I have since lost the reference.</p>
<p>To write it, I started with this:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>▶ git filter-branch --tree-filter \
> '(echo === $GIT_COMMIT:; git ls-tree $GIT_COMMIT) >> /tmp/tree.log'
</code></pre></div></div>
<p>This helped me get my head around what was going on inside the filter and allowed me to make the observation that all of the commits from the second repo are bunched together and not ordered by date.</p>
<p>So, the next step is to get the initial commit’s and the latest commit’s SHA1s from the original repo2 repo and save them as $first and $last. Then:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/usr/bin/env bash</span>
<span class="nv">first</span><span class="o">=</span>c4096edb47f3a07e4f9d670c7edff564329b82f9
<span class="nv">last</span><span class="o">=</span>01d14f0a82c860849e9cfb5884f5b54e8486b248
<span class="nv">subdir</span><span class="o">=</span>subdir
git filter-branch <span class="nt">--tree-filter</span> <span class="s1">'
first='</span><span class="s2">"</span><span class="nv">$first</span><span class="s2">"</span><span class="s1">'
last='</span><span class="s2">"</span><span class="nv">$last</span><span class="s2">"</span><span class="s1">'
subdir='</span><span class="s2">"</span><span class="nv">$subdir</span><span class="s2">"</span><span class="s1">'
log_file=/tmp/filter.log
[ "$GIT_COMMIT" = "$first" ] && seen_first=true
if [ "$seen_first" = "true" ] && [ "$seen_last" != "true" ]; then
echo "=== $GIT_COMMIT: making changes"
files=$(git ls-tree --name-only $GIT_COMMIT)
mkdir -p $subdir
for i in $files; do
mv $i $subdir || echo "ERR: mv $i $subdir failed"
done
else
echo "=== $GIT_COMMIT: ignoring"
fi \
>> $log_file
[ "$GIT_COMMIT" = "$last" ] && seen_last=true
status=0 # tell tree-filter never to fail
'</span>
</code></pre></div></div>
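<p>The two SHA1s hard-coded in <code class="language-plaintext highlighter-rouge">first</code> and <code class="language-plaintext highlighter-rouge">last</code> can be looked up with git rev-list and git rev-parse. The sketch below does so in a throwaway repo with two made-up commits. In the merged repo, <code class="language-plaintext highlighter-rouge">HEAD</code> would be replaced by <code class="language-plaintext highlighter-rouge">repo2/master</code>. One caveat: if the history has more than one root commit, <code class="language-plaintext highlighter-rouge">first</code> will contain several SHA1s and one has to be picked by hand:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
set -e

# Throwaway repo with two commits, standing in for repo2.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
echo one > f.txt
git add f.txt
git commit -q -m "first"
echo two >> f.txt
git commit -q -am "second"

# Root commit: the commit that has no parents.
first=$(git rev-list --max-parents=0 HEAD)
# Latest commit: the tip of the branch.
last=$(git rev-parse HEAD)
echo "first=$first"
echo "last=$last"
</code></pre></div></div>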
<p>A few notes about the script:</p>
<p>Obviously, its purpose is to rewrite the history, moving the files from their original locations in the history to their new locations in their new repo.</p>
<p>The variable $status is used internally by Git to cause the behaviour documented in git help filter-branch:</p>
<blockquote>
<p>If any evaluation of &lt;command&gt; returns a non-zero exit status, the whole operation will be aborted.</p>
</blockquote>
<p>I discovered the status variable by using set -x.</p>
<p>Finally, the /tmp/filter.log gives me confidence that I know what has changed and what hasn’t changed, before I finally rewrite the original repo using <code class="language-plaintext highlighter-rouge">git push origin --force</code>.</p>
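<p>To get a feel for the tree-filter before running it on a real repository, the core idea can be tried on a toy repo. This is a simplified sketch, not the script above: it moves the files in <em>every</em> commit, with no <code class="language-plaintext highlighter-rouge">first</code>/<code class="language-plaintext highlighter-rouge">last</code> window and no logging, and the repo and file names are made up. The <code class="language-plaintext highlighter-rouge">FILTER_BRANCH_SQUELCH_WARNING</code> variable only silences the deprecation notice that newer Git versions print:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#!/usr/bin/env bash
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repo with two commits that both touch a.txt.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email demo@example.com
git config user.name Demo
echo one > a.txt
git add a.txt
git commit -q -m "add a.txt"
echo two >> a.txt
git commit -q -am "edit a.txt"

# Rewrite every commit so that its top-level files live under subdir/.
git filter-branch --tree-filter '
  mkdir -p subdir
  for f in $(git ls-tree --name-only "$GIT_COMMIT"); do
    mv "$f" subdir/
  done
' > /dev/null 2>&1

# Both commits should now appear in the history of the new path.
rewritten=$(git log --oneline -- subdir/a.txt)
echo "$rewritten"
</code></pre></div></div>
<p>After the rewrite, git log on <code class="language-plaintext highlighter-rouge">subdir/a.txt</code> lists both commits, which is the effect the full script aims for on the commits that came from the second repo.</p>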
<h2 id="a-better-way">A better way?</h2>
<p>I have no doubt there is an easier way to do this, but at the moment, it’s a procedure that has worked for me. Do let me know if you know of that better way!</p>Alex HarveyOn more than one occasion, I have needed to merge a Git repository and its history into a subdirectory of a second Git repository.