Managing AWS Default VPC Security Groups with Terraform

When it comes to Amazon Web Services support, Terraform has coverage that’s second to none. It includes most of Amazon’s current services, rapidly adds newly released ones, and even helps granularise existing resources by adding Terraform-specific extensions for things like individual rules with aws_security_group_rule. This awesome coverage makes it even more jarring when you encounter one of the rare edge cases, such as VPC default security groups.

It’s worth taking a step back and thinking about how Terraform normally works. When you write code to manage resources, Terraform expects to fully own the created resource’s life cycle. It will create it, ensure that changes made are correctly reflected (and remove those made manually), and, when the resource’s code is removed from the .tf files, destroy it. While this is fine for 99% of the supported Amazon resources, the VPC default security group is a little different.

Each Amazon Virtual Private Cloud (VPC) comes with a default security group. This is created by Amazon itself and cannot be deleted. Rather than leaving it unmanaged, which happens all too often, we can instead bring it under Terraform’s control with the special aws_default_security_group resource. This resource works a little differently than most others: Terraform doesn’t attempt to create the group; instead the existing one is adopted under its management umbrella. This allows you to control which rules are placed in the default group and stops the “security group already exists” errors you’ll hit if you try to manage it as a normal group.

The terraform code to add the default VPC security group looks surprisingly normal:

resource "aws_vpc" "myvpc" {
  cidr_block = "10.2.0.0/16"
}

resource "aws_default_security_group" "default" {
  vpc_id = "${aws_vpc.myvpc.id}"

  # ... snip ...
  # security group rules can go here
}

One nice little tweak I’ve found useful is to customise the default security group to only allow inbound access on port 22 from my current (very static) IP address.

# use the swiss army knife http data source to get your IP
data "http" "my_local_ip" {
  url = "https://ipv4.icanhazip.com"
}

resource "aws_security_group_rule" "ssh_from_me" {
  type            = "ingress"
  from_port       = 22
  to_port         = 22
  protocol        = "tcp"
  cidr_blocks     = ["${chomp(data.http.my_local_ip.body)}/32"]

  security_group_id = "${aws_default_security_group.default.id}"
}

Automatic Terraform documentation with terraform-docs

Terraform code reuse leads to modules. Modules lead to variables and outputs. Variables and outputs lead to massive amounts of boilerplate documentation. terraform-docs lets you shortcut some of these steps and jump straight to consistent, easy to use, automatically generated documentation instead.

Terraform-docs, a self-contained binary implemented in Go and released by Segment, provides an efficient way to add documentation to your terraform code without requiring large changes to your workflow or massive amounts of additional boilerplate. In its simplest invocation it reads the descriptions provided in your variables and outputs and displays them on the command line:

/**
 *
 * A sample terraform file with a variable and output
 *
 */

variable "greeting" {
  type        = "string"
  description = "The string used as a greeting"
  default     = "hello"
}

output "introduction" {
  description = "The full, polite, introduction"
  value       = "${var.greeting} from terraform"
}

Running terraform-docs against this code produces:

A sample terraform file with a variable and output

  var.greeting (hello)
  The string used as a greeting

  output.introduction
  The full, polite, introduction

This basic usage makes it simpler to use existing code by presenting the official interface without over-burdening you with implementation details. Once you’ve added descriptions to your variables and outputs, something you should really already be doing, you can start to expose the documentation in other ways. By adding the markdown option -

terraform-docs markdown .

you can generate the docs in a GitHub friendly way that provides an easy, web based, introduction to what your code accepts and returns. We used this quite heavily in the GOV.UK AWS repo and it’s been invaluable. The ability to browse an overview of the terraform code makes it simpler to determine if a specific module does what you actually need without requiring you to read all of the implementation.

A collection of terraform variables and their defaults
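
The screenshot above is GitHub rendering that generated markdown. In raw form the output is a set of simple tables; for the greeting example earlier it looks something like this (the exact layout varies between terraform-docs versions):

## Inputs

| Name | Description | Type | Default |
|------|-------------|------|---------|
| greeting | The string used as a greeting | string | `hello` |

## Outputs

| Name | Description |
|------|-------------|
| introduction | The full, polite, introduction |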

When we first adopted terraform-docs we hit issues with the code being updated without the documentation changing to match it. We soon settled on using git pre-commit hooks, such as this terraform-docs githook script by Laura Martin or the heavy handed GOV.UK update-docs script. Once we had these in place the little discrepancies stopped slipping through and the reference documentation became a lot more trusted.
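
If neither of those quite fits, the core of such a hook is only a few lines. A minimal sketch, assuming terraform-docs is on the PATH and that each module keeps its docs in a README.md under a modules/ directory:

#!/bin/sh
# pre-commit: regenerate module docs and refuse the commit if they are stale
set -e

for dir in modules/*/; do
  terraform-docs markdown "$dir" > "${dir}README.md"
done

# any difference means the committed docs no longer match the code
if ! git diff --exit-code -- 'modules/*/README.md'; then
  echo "terraform-docs output is stale: git add the regenerated READMEs" >&2
  exit 1
fi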

As an aside, if you plan on using terraform-docs as part of your automated continuous integration pipeline you’ll probably want to create a terraform-docs package. I personally use FPM Cookery for this and it’s been an easy win so far.
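
If you’d rather not maintain a full FPM Cookery recipe, plain fpm can do the job too. A rough sketch, with the version number and paths as placeholders, assuming you’ve already fetched the terraform-docs binary you want to ship:

# wrap a downloaded terraform-docs binary into a deb package
fpm -s dir -t deb \
  -n terraform-docs -v 0.3.0 \
  --description "Generate documentation from Terraform modules" \
  ./terraform-docs=/usr/local/bin/terraform-docs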

I’ve become a big fan of terraform-docs and it’s great to see such self-contained tools making such a positive impact on the terraform ecosystem. If you’re writing tf code for consumption by more than just yourself (and even then) it’s well worth a second look.

Automatic datasource configuration with Grafana 5

When I first started my Prometheus experiments with docker-compose, one of the most awkward parts of the process, especially to document, was the manual steps required to click around the Grafana dashboard in order to add the Prometheus datasource. Thanks to the wonderful people behind Grafana there has been a push in the newest major version, 5 at the time of writing, to make Grafana easier to automate. And it really does pay off.

Instead of forcing you to load the UI and play clicky clicky games with vague instructions to go here, and then the tab on the left, no, the other left, down a bit… you can now configure the data source with a YAML file that’s loaded on startup.

# from datasource.yaml
apiVersion: 1

datasources:
- name: Prometheus
  type: prometheus
  access: proxy
  isDefault: true
  url: http://prometheus:9090
  # don't set this to true in production
  editable: true

Because I’m using this code base in a tinkering lab I set editable to true, which allows me to make ad hoc changes. In production you’d want to make this false so people can’t accidentally break your backing store.

It only takes a little code to link everything together: add the config file and expose it to the container. You can see all the changes required in the Upgrade grafana and configure datasource via a YAML file pull request. Getting the exact YAML syntax right, and confusing myself over access proxy vs direct, was the hardest part. It’s only a single step along the way to a more automation-friendly Grafana, but it is an important one and a positive sign that they are heading in the right direction.
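
If you’re wiring this up by hand rather than through docker-compose, the only requirement is that the file ends up in Grafana’s provisioning directory. Something along these lines, with the image tag as an example:

# mount the datasource definition into the container's provisioning directory
docker run -d -p 3000:3000 \
  -v "$(pwd)/datasource.yaml:/etc/grafana/provisioning/datasources/datasource.yaml" \
  grafana/grafana:5.1.0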

Aqua Security microscanner – a first look

I’m a big fan of baking testing into build and delivery pipelines, so when a new tool pops up in that space I like to take a look at what features it brings to the table and how much effort it’s going to take to roll out. The Aqua Security microscanner, from a company you’ve probably seen at least one excellent tech talk from in the last year, is quite a new release that surfaces vulnerable operating system packages in your container builds.

To experiment with microscanner I’m going to add it to my simple Gemstash Dockerfile.

FROM ubuntu:16.04
MAINTAINER dean.wilson@gmail.com

RUN apt-get update && \
    apt-get -y upgrade && \
    apt-get install -y \
      build-essential \
      ruby \
      ruby-dev \
      libsqlite3-dev \
      curl \
    && gem install --no-ri --no-rdoc gemstash

EXPOSE 9292

HEALTHCHECK --interval=15s --timeout=3s \
  CMD curl -f http://localhost:9292/ || exit 1

CMD ["gemstash", "start", "--no-daemonize"]

This is a conceptually simple Dockerfile. We update the Ubuntu package list, upgrade packages where needed, add dependencies required to build our rubygems and then install gemstash. From this very boilerplate base we only need to make a few changes for microscanner to run.

> git diff Dockerfile
diff --git a/gemstash/Dockerfile b/gemstash/Dockerfile
index 741838f..bab819a 100644
--- a/gemstash/Dockerfile
+++ b/gemstash/Dockerfile
@@ -2,7 +2,6 @@ FROM ubuntu:16.04
 MAINTAINER dean.wilson@gmail.com

 RUN apt-get update && \
-    apt-get -y upgrade && \
     apt-get install -y \
       build-essential \
       ruby \
@@ -11,6 +10,14 @@ RUN apt-get update && \
       curl \
     && gem install --no-ri --no-rdoc gemstash

+ARG token
+RUN apt-get update && apt-get -y install ca-certificates wget && \
+    wget -O /microscanner https://get.aquasec.com/microscanner && \
+    chmod +x /microscanner && \
+    /microscanner ${token} && \
+    rm -rf /microscanner
+

Firstly we remove the package upgrade step, as we want to ensure vulnerabilities are present in our container. We then use the newer ARG directive to tell Docker we will be passing a value named token in at build time. Lastly we add microscanner and its dependencies, run it, and remove it, all in a single image layer. As we’re using the wget and ca-certificates packages this does have a small impact on container size, but microscanner itself is added, used and removed without a trace.

You’ll notice we specify a token when running the scanner. This grants access to the Aqua scanning servers, and is rate limited. How do you get a token? You request one by calling out to the Aqua Security container with your email address:

docker run --rm -it aquasec/microscanner --register foo@mailinator.com
# ... snip ...
Aqua Security MicroScanner, version 2.6.4
Community Edition

Accept and proceed? Y/N:
y
Please check your email for the token.

Once you have the token (mine came through in seconds) you can build the container:

docker build --build-arg=token=A1A1Aaa1AaAaAAA1 --no-cache .

For this experiment I’ve taken the big hammer of --no-cache to ensure all the packages are tested on each build. This has a build-time cost and should be weighed against the other Dockerfile best practices. If your container has vulnerable package versions you’ll get a massive dump of JSON in your build output. Individual packages will show their vulnerabilities:

{
      "resource": {
        "format": "deb",
        "name": "systemd",
        "version": "229-4ubuntu21.1",
        "arch": "amd64",
        "cpe": "pkg:/ubuntu:16.04:systemd:229-4ubuntu21.1",
        "name_hash": "2245b39c177e93fc015ba051be4e8574"
      },
      "scanned": true,
      "vulnerabilities": [
        {
          "name": "CVE-2018-6954",
          "description": "systemd-tmpfiles in systemd through 237 mishandles symlinks present in non-terminal path components, which allows local users to obtain ownership of arbitrary files via vectors involving creation of a directory and a file under that directory, and later replacing that directory with a symlink. This occurs even if the fs.protected_symlinks sysctl is turned on.",
          "nvd_score": 7.2,
          "nvd_score_version": "CVSS v2",
          "nvd_vectors": "AV:L/AC:L/Au:N/C:C/I:C/A:C",
          "nvd_severity": "high",
          "nvd_url": "https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2018-6954",
          "vendor_score": 5,
          "vendor_score_version": "Aqua",
          "vendor_severity": "medium",
          "vendor_url": "https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-6954.html",
          "publish_date": "2018-02-13",
          "modification_date": "2018-03-16",
          "fix_version": "any in ubuntu 17.04",
          "solution": "Upgrade operating system to ubuntu version 17.04 (includes fixed versions of systemd)"
        }
      ]
}

You’ll also see some summary information: the total number of issues, run time, and container operating system values, for example.

  "vulnerability_summary": {
    "total": 147,
    "medium": 77,
    "low": 70,
    "negligible": 6,
    "score_average": 4.047619,
    "max_score": 5,
    "max_fixable_score": 5,
    "max_fixable_severity": "medium"
  },

If any of the vulnerabilities are considered to be high severity the build will fail, preventing you from going live with known issues.
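
That failure mode is what makes it CI friendly: the idea is that microscanner exits non-zero inside the RUN step, the image build aborts, and the pipeline stops with it. A minimal sketch, with the token variable and image tag as placeholders:

# a failed scan fails the docker build, which in turn fails the pipeline
docker build --build-arg=token="${MICROSCANNER_TOKEN}" --no-cache -t gemstash:latest . || {
  echo "build aborted: microscanner found high severity vulnerabilities" >&2
  exit 1
}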

It’s very early days for microscanner and there’s a certain amount of inflexibility that will shake out with use, such as being able to fail builds on medium or even low severity issues, or only showing packages with vulnerabilities, but it’s a very easy way to add this kind of safety net to your containers and well worth keeping an eye on.

Validate AWS CIS security benchmarks with prowler

Despite the number of Amazon Web Services that have the word simple in their titles, keeping on top of a large cloud deployment isn’t an easy ask. There are a lot of important, complex, aspects to consider so it’s advisable to pay attention to the best practices, reference architectures, and benchmarks published by AWS and their partners. In this post we’ll take a look at the CIS security benchmark and a tool that will save you a lot of manual verifying.

CIS, the “Center for Internet Security”, publishes best practice security configuration guides that present a number of recommendations you should be aware of if you’re running production workloads in AWS. You don’t have to change your environment to suit every recommendation, or even agree with them, but you should read through the guide once and note where you consciously differ from their advice. The guide itself, which you can find on the CIS AWS Benchmark page, or as an AWS static whitepaper link that doesn’t require an email address to read, is quite low level but well worth a read. Being aware of all the potential issues will help shape your cloud environments for the better. But, as good, lazy, admins we won’t go and check each of the recommendations by hand. Instead we’ll use a tool called Prowler.

The recommendations are terse but mostly clear. As the screenshot shows, they aid verification and remediation by presenting instructions for reaching the given values in the web console or via the CLI.

AWS CIS example policy
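
As a flavour of the CLI route, verifying a recommendation like “ensure CloudTrail is enabled in all regions” boils down to a one-liner against the AWS API (a sketch, not the benchmark’s exact wording):

# list each trail and whether it covers all regions
aws cloudtrail describe-trails \
  --query 'trailList[*].[Name,IsMultiRegionTrail]' --output table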

Prowler however provides us a third way. It has checks for most of the recommendations, and even some bonus extras, and will iterate through them, assigning a pass or fail to each. Let’s install it and run some experiments.

Installing prowler

Prowler itself is a shell script, but its dependencies, such as the AWS CLI, are Python, so we’ll install them into a virtualenv to keep the versions isolated.

# create a new virtual env
virtualenv prowler-sweep
cd prowler-sweep
source bin/activate

# Get prowler from github
git clone https://github.com/toniblyx/prowler
cd prowler

# install the dependencies
pip install ansi2html awscli

You now have all the code required for prowler to run a sweep of your security settings.

Running prowler

I use different profiles, configured in .aws/credentials, for most of my experiments, so for now I’ll run prowler as myself, but with read-only access. If you want to run it as a dedicated user or under EC2 the installation guide lists the required IAM permissions.

./prowler -p full-readonly

  _ __  _ __ _____      _| | ___ _ __
 | '_ \| '__/ _ \ \ /\ / / |/ _ \ '__|
 | |_) | | | (_) \ V  V /| |  __/ |
 | .__/|_|  \___/ \_/\_/ |_|\___|_|v2.0-beta2
 |_| the handy cloud security tool

 Date: Wed 30 May 18:47:25 BST 2018

In its most basic mode prowler will run from the command line and show its results in glorious, colourful, ANSI.

Prowler output in glorious ANSI colour

In addition to text with control characters it can also produce basic HTML reports, or JSON and CSV for further processing and integration into your existing tools. Once you’ve finished a full sweep in your format of choice you can start to prioritise the findings, and often add remediation to your Terraform or CloudFormation code bases.
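
The output format is a command line switch away. From memory the invocation looks like this, but check ./prowler -h for the current flags:

# run the full sweep and emit JSON for further processing
./prowler -p full-readonly -M json > prowler-report.json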

Above and beyond

In addition to the CIS recommendations Prowler adds some of its own checks, for example for services that didn’t exist when the last benchmark was published, and for common operational practices that are worth following. You can even extend it yourself if you have local rules or compliance requirements. There’s a list of additional prowler checks, with descriptions, on the GitHub repository.

AWS security is a big, sprawling topic with many moving parts, and while no third party resource will ever cover all your use cases, documents like the CIS benchmark and tools like prowler can quickly provide a baseline and safety net to ensure that if you do get breached it won’t be because of a simple oversight.

The simple vims – code comments

After finding a bug in my custom written, bulk code comment / uncomment, vim function I decided to invest a little time to find a mature replacement that would remove my maintenance burden. In addition to removing my custom code I wanted a packaged solution, to make it easier to include across all of my vim installs.

After a little googling I found the ideal solution, the vim-commentary plugin. It ticks all my check boxes:

  • mature enough that all the obvious bugs should have been found
  • receives attention when it needs it
  • has a narrow, well defined, focus
  • as a user it works the way I’d have approached it
  • and while it’s not a selection criterion, Tim Pope having written it is a big plus

I use the Vundle package manager for vim so installing commentary was quick and painless. I already have the vundle boilerplate in my .vimrc config file:

" set the runtime path to include Vundle and initialise
set rtp+=~/.vim/bundle/Vundle.vim
call vundle#begin()

" let Vundle manage Vundle, required
Plugin 'VundleVim/Vundle.vim'
" ... snip ... Lots of other plugins

call vundle#end()            " required

So all I had to do was add the new Plugin directive

" ... snip ...
Plugin 'VundleVim/Vundle.vim'
Plugin 'tpope/vim-commentary'
" ... snip ...

and then re-source the configuration and install the new plugin

:source %
:PluginInstall

Once it’s installed, using it is as easy as selecting the text you want to comment out and typing gc. You can also use gcc (which can take a count) to comment out the current line. To uncomment code repeat the operation. It’s predictable enough that your muscle memory will learn it quickly. If you want to change the comment style, for example puppet code defaults to the horrible /* file { '/tmp/foo': */ format, you can override the default by adding an autocmd line to your .vimrc, as shown below:

    autocmd FileType puppet setlocal commentstring=#\ %s
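
With that override in place, visually selecting a resource and pressing gc toggles it between the two states below:

" before gc
file { '/tmp/foo':
  ensure => present,
}

" after gc
# file { '/tmp/foo':
#   ensure => present,
# }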

I replaced my own custom code with commentary a few weeks ago and it’s quickly become a great, intuitive, replacement. If you use vim for writing code and want a simple way to comment and uncomment blocks it’s an excellent choice.

Choria Progress Update

It’s been a while since my previous update and quite a bit has happened since.

Choria Server

As previously mentioned, the Choria Server will eventually aim to replace mcollectived. Thus far I have focussed on its registration subsystem, Golang-based MCollective RPC compatible agents, and being able to embed it into other software for IoT and management backplanes.

Over the last few weeks I learned that MCollective will no longer be shipped in Puppet Agent version 6, which is currently due around Fall 2018. This means we have to accelerate making Choria standalone in its own right.

A number of things have to happen to get there:

  • Choria Server should support Ruby agents
  • The Ruby libraries Choria Server needs either need to be embedded and placed dynamically or provided via a Gem
  • The Ruby client needs to be provided via a Gem
  • New locations for these Ruby parts are needed outside of AIO Ruby

Yesterday I released the first step in this direction: you can now replace mcollectived with choria server. For now I am marking this as a preview/beta feature while we deal with the issues the community finds.

The way this works is that we provide a small shim that uses just enough of MCollective to get the RPC framework running – luckily this was initially developed as a MCollective plugin and so retained a quite separate code base. When the Go code needs to invoke a Ruby agent it calls the shim, and the shim in turn provides the result from the agent – in JSON format – back to Go.

This works for me with every agent I’ve tried, and I am quite pleased with the results:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     10820  0.0  1.1 1306584 47436 ?       Sl   13:50   0:06 /opt/puppetlabs/puppet/bin/ruby /opt/puppetlabs/puppet/bin/mcollectived

MCollective would of course load the entirety of Puppet as soon as any agent that uses Puppet is loaded – service, package, puppet – and so over time things only get worse. Here is Choria:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     32396  0.0  0.5 296436  9732 ?        Ssl  16:07   0:03 /usr/sbin/choria server --config=/etc/choria/server.conf

I have run a couple of hundred thousand invocations like this and this is what you get – it never really changes. This is because Choria spawns the Ruby code, which exits when done.

This has an unfortunate side effect: the service, package and puppet agents are around 1 second slower per invocation, because loading Puppet is really slow. Agents that do not load Puppet are only marginally slower.

irb(main):001:0> require "benchmark"
=> true
irb(main):002:0> Benchmark.measure { require "puppet" }.real
=> 0.619865644723177

There is a page set up dedicated to the Beta that details how to run it and what to look out for.

JSON pure protocol

Some of the breakage you might run into – like mco facts not working with Choria Server – is due to a hugely significant change in the background: Choria, both plugged into MCollective and standalone, is JSON safe. The Ruby plugin is optionally so (and it’s off by default), but the Choria daemon only supports JSON.

Traditionally MCollective has used YAML on the wire. MCollective is quite old – JSON was really not that big a deal back in the early 2000s when the foundation for this choice was laid down, and XML was more important. Worse, MCollective has exposed Ruby-specific data types and YAML extensions on the wire, which have made creating cross platform support nearly impossible.

YAML is also, of course, capable of carrying any object – which means some agents are just never going to be compatible with anything but Ruby. This was the case with the process agent, but I fixed that before shipping it in Choria. It also means YAML can invoke things you might not have anticipated, and so opens the door to big security problems.

For quite some time now the Choria protocol has been defined and versioned, and JSON schemas are available. The protocol makes the separation between Payload, Security, Transport and Federation much clearer, and it can now ride on anything that can move JSON – middleware, REST, SSH, postal doves are all capable of carrying Choria packets.

There is a separate Golang implementation of the protocol that is transport agnostic, and the schemas are there. Version 1 of the protocol is a tad skewed towards MCollective but Version 2 (not yet planned) will drop those shackles. A single Choria Server is capable of serving multiple versions of the network protocol and communicating with old and new clients.

Golang being a static language, and having a really solid and completely compatible implementation of the protocol, means making implementations for other languages like Python will not be hard. However, I think the better long-term option for other languages is still a capable REST gateway.

I did some POC work on a very lightweight protocol suitable for devices like Arduinos, and will provide bridging between the worlds in our Federation Brokers. You’ll be able to run mco rpc wallplug off; your client will talk full Choria Protocol while the wall plug speaks a super lightweight MQTT-based protocol, and you will not even know it.

There are some gotchas as a result of these changes, also captured in the Choria Server evaluation documentation. To resolve some of them I need to be much more aggressive with what I do to the MCollective libraries, something I can do once they are liberated out of Puppet Agent.

Adding rich object data types to Puppet

Extending Puppet using types, providers, facts and functions is well known and widely done. Something newer is adding entire new data types to the Puppet DSL to create entirely new language behaviours.

I’ve done a bunch of this recently with the Choria Playbooks and some other fun experiments, today I’ll walk through building a small network wide spec system using the Puppet DSL.

Overview


A quick look at what we want to achieve here: I want to be able to do Choria RPC requests and assert their outcomes, I want to write the tests using the Puppet DSL, and they should run against a specially prepared environment. In my case I have an AWS environment with CentOS, Ubuntu, Debian and Archlinux machines.

Below I test the File Manager Agent:

  • Get status for a known file and make sure it finds the file
  • Create a brand new file, ensure it reports success
  • Verify that the file exists and is empty using the status action

cspec::suite("filemgr agent tests", $fail_fast, $report) |$suite| {
 
  # Checks an existing file
  $suite.it("Should get file details") |$t| {
    $results = choria::task("mcollective", _catch_errors => true,
      "action" => "filemgr.status",
      "nodes" => $nodes,
      "silent" => true,
      "fact_filter" => ["kernel=Linux"],
      "properties" => {
        "file" => "/etc/hosts"
      }
    )
 
    $t.assert_task_success($results)
 
    $results.each |$result| {
      $t.assert_task_data_equals($result, $result["data"]["present"], 1)
    }
  }
 
  # Make a new file and check it exists
  $suite.it("Should support touch") |$t| {
    $fname = sprintf("/tmp/filemgr.%s", strftime(Timestamp(), "%s"))
 
    $r1 = choria::task("mcollective", _catch_errors => true,
      "action" => "filemgr.touch",
      "nodes" => $nodes,
      "silent" => true,
      "fact_filter" => ["kernel=Linux"],
      "fail_ok" => true,
      "properties" => {
        "file" => $fname
      }
    )
 
    $t.assert_task_success($r1)
 
    $r2 = choria::task("mcollective", _catch_errors => true,
      "action" => "filemgr.status",
      "nodes" => $nodes,
      "silent" => true,
      "fact_filter" => ["kernel=Linux"],
      "properties" => {
        "file" => $fname
      }
    )
 
    $t.assert_task_success($r2)
 
    $r2.each |$result| {
      $t.assert_task_data_equals($result, $result["data"]["present"], 1)
      $t.assert_task_data_equals($result, $result["data"]["size"], 0)
    }
  }
}

I also want to be able to test other things like lets say discovery:

  cspec::suite("${method} discovery method", $fail_fast, $report) |$suite| {
    $suite.it("Should support a basic discovery") |$t| {
      $found = choria::discover(
        "discovery_method" => $method,
      )
 
      $t.assert_equal($found.sort, $all_nodes.sort)
    }
  }

So we want to make a Spec-like system that can drive Puppet Plans (aka Choria Playbooks) and do various assertions on the outcomes.

We want to run it with mco playbook run and it should write a JSON report to disk with all suites, cases and assertions.

Adding a new Data Type to Puppet


I’ll show how to add the Cspec::Suite data Type to Puppet. This comes in 2 parts: You have to describe the Type that is exposed to Puppet and you have to provide a Ruby implementation of the Type.

Describing the Objects


Here we create the signature for Cspec::Suite:

# modules/cspec/lib/puppet/datatypes/cspec/suite.rb
Puppet::DataTypes.create_type("Cspec::Suite") do
  interface <<-PUPPET
    attributes => {
      "description" => String,
      "fail_fast" => Boolean,
      "report" => String
    },
    functions => {
      it => Callable[[String, Callable[Cspec::Case]], Any],
    }
  PUPPET
 
  load_file "puppet_x/cspec/suite"
 
  implementation_class PuppetX::Cspec::Suite
end

As you can see from the line of code cspec::suite(“filemgr agent tests”, $fail_fast, $report) |$suite| {….}, we pass 3 arguments: a description of the test, whether the test should fail immediately on any error or keep going, and where to write the report of the suite to. These correspond to the attributes here. A function shown later takes these and makes our instance.

We then have to add our it() function, which again takes a description and yields out a `Cspec::Case`; it returns any value.

When Puppet needs the implementation of this code it will call the Ruby class PuppetX::Cspec::Suite.

Here is the same for the Cspec::Case:

# modules/cspec/lib/puppet/datatypes/cspec/case.rb
Puppet::DataTypes.create_type("Cspec::Case") do
  interface <<-PUPPET
    attributes => {
      "description" => String,
      "suite" => Cspec::Suite
    },
    functions => {
      assert_equal => Callable[[Any, Any], Boolean],
      assert_task_success => Callable[[Choria::TaskResults], Boolean],
      assert_task_data_equals => Callable[[Choria::TaskResult, Any, Any], Boolean]
    }
  PUPPET
 
  load_file "puppet_x/cspec/case"
 
  implementation_class PuppetX::Cspec::Case
end

Adding the implementation


The implementation is a Ruby class that provides the logic we want. I won’t show the entire thing with reporting and everything, but you’ll get the basic idea:

# modules/cspec/lib/puppet_x/cspec/suite.rb
module PuppetX
  class Cspec
    class Suite
      # Puppet calls this method when it needs an instance of this type
      def self.from_asserted_hash(description, fail_fast, report)
        new(description, fail_fast, report)
      end
 
      attr_reader :description, :fail_fast
 
      def initialize(description, fail_fast, report)
        @description = description
        @fail_fast = !!fail_fast
        @report = report
        @testcases = []
      end
 
      # what puppet file and line the Puppet DSL is on
      def puppet_file_line
        fl = Puppet::Pops::PuppetStack.stacktrace[0]
 
        [fl[0], fl[1]]
      end
 
      def outcome
        {
          "testsuite" => @description,
          "testcases" => @testcases,
          "file" => puppet_file_line[0],
          "line" => puppet_file_line[1],
          "success" => @testcases.all?{|t| t["success"]}
        }
      end
 
      # Writes the memory state to disk, see outcome above
      def write_report
        # ...
      end
 
      def run_suite
        Puppet.notice(">>>")
        Puppet.notice(">>> Starting test suite: %s" % [@description])
        Puppet.notice(">>>")
 
        begin
          yield(self)
        ensure
          write_report
        end
 
 
        Puppet.notice(">>>")
        Puppet.notice(">>> Completed test suite: %s" % [@description])
        Puppet.notice(">>>")
      end
 
      def it(description, &blk)
        require_relative "case"
 
        t = PuppetX::Cspec::Case.new(self, description)
        t.run(&blk)
      ensure
        @testcases << t.outcome
      end
    end
  end
end

And here is the Cspec::Case:

# modules/cspec/lib/puppet_x/cspec/case.rb
module PuppetX
  class Cspec
    class Case
      # Puppet calls this to make instances
      def self.from_asserted_hash(suite, description)
        new(suite, description)
      end
 
      def initialize(suite, description)
        @suite = suite
        @description = description
        @assertions = []
        @start_location = puppet_file_line
      end
 
      # assert 2 things are equal and show sender etc in the output
      def assert_task_data_equals(result, left, right)
        if left == right
          success("assert_task_data_equals", "%s success" % result.host)
          return true
        end
 
        failure("assert_task_data_equals: %s" % result.host, "%snntis not equal tonn %s" % [left, right])
      end
 
      # checks the outcome of a choria RPC request and make sure its fine
      def assert_task_success(results)
        if results.error_set.empty?
          success("assert_task_success:", "%d OK results" % results.count)
          return true
        end
 
        failure("assert_task_success:", "%d failures" % [results.error_set.count])
      end
 
      # assert 2 things are equal
      def assert_equal(left, right)
        if left == right
          success("assert_equal", "values matches")
          return true
        end
 
        failure("assert_equal", "%snntis not equal tonn %s" % [left, right])
      end
 
      # the puppet .pp file and line Puppet is on
      def puppet_file_line
        fl = Puppet::Pops::PuppetStack.stacktrace[0]
 
        [fl[0], fl[1]]
      end
 
      # show a OK message, store the assertions that ran
      def success(what, message)
        @assertions << {
          "success" => true,
          "kind" => what,
          "file" => puppet_file_line[0],
          "line" => puppet_file_line[1],
          "message" => message
        }
 
        Puppet.notice("&#x2714;︎ %s: %s" % [what, message])
      end
 
      # show a Error message, store the assertions that ran
      def failure(what, message)
        @assertions << {
          "success" => false,
          "kind" => what,
          "file" => puppet_file_line[0],
          "line" => puppet_file_line[1],
          "message" => message
        }
 
        Puppet.err("✘ %s: %s" % [what, @description])
        Puppet.err(message)
 
        raise(Puppet::Error, "Test case %s fast failed: %s" % [@description, what]) if @suite.fail_fast
      end
 
      # this will show up in the report JSON
      def outcome
        {
          "testcase" => @description,
          "assertions" => @assertions,
          "success" => @assertions.all? {|a| a["success"]},
          "file" => @start_location[0],
          "line" => @start_location[1]
        }
      end
 
      # invokes the test case
      def run
        Puppet.notice("==== Test case: %s" % [@description])
 
        # runs the puppet block
        yield(self)
 
        success("testcase", @description)
      end
    end
  end
end

Finally I am going to need a little function to create the suite – the cspec::suite function. It really just creates an instance of PuppetX::Cspec::Suite for us:

# modules/cspec/lib/puppet/functions/cspec/suite.rb
Puppet::Functions.create_function(:"cspec::suite") do
  dispatch :handler do
    param "String", :description
    param "Boolean", :fail_fast
    param "String", :report
 
    block_param
 
    return_type "Cspec::Suite"
  end
 
  def handler(description, fail_fast, report, &blk)
    suite = PuppetX::Cspec::Suite.new(description, fail_fast, report)
 
    suite.run_suite(&blk)
    suite
  end
end

Bringing it together


So that’s about it. It’s very simple really – the code above is pretty basic stuff, and I hacked it together in a day basically.

Let’s see how we turn these building blocks into a test suite.

I need an entry point that drives the suite – imagine I will have many different plans to run, one per agent, and that I want to do some pre and post run tasks etc.

plan cspec::suite (
  Boolean $fail_fast = false,
  Boolean $pre_post = true,
  Stdlib::Absolutepath $report,
  String $data
) {
  $ds = {
    "type"   => "file",
    "file"   => $data,
    "format" => "yaml"
  }
 
  # initializes the report
  cspec::clear_report($report)
 
  # force a puppet run everywhere so PuppetDB is up to date, disables Puppet, wait for them to finish
  if $pre_post {
    choria::run_playbook("cspec::pre_flight", ds => $ds)
  }
 
  # Run our test suite
  choria::run_playbook("cspec::run_suites", _catch_errors => true,
    ds => $ds,
    fail_fast => $fail_fast,
    report => $report
  )
    .choria::on_error |$err| {
      err("Test suite failed with a critical error: ${err.message}")
    }
 
  # enables Puppet
  if $pre_post {
    choria::run_playbook("cspec::post_flight", ds => $ds)
  }
 
  # reads the report from disk and creates a basic overview structure
  cspec::summarize_report($report)
}

Here’s the cspec::run_suites Playbook that takes data from a Choria data source and drives the suite dynamically:

plan cspec::run_suites (
  Hash $ds,
  Boolean $fail_fast = false,
  Stdlib::Absolutepath $report,
) {
  $suites = choria::data("suites", $ds)
 
  notice(sprintf("Running test suites: %s", $suites.join(", ")))
 
  choria::data("suites", $ds).each |$suite| {
    choria::run_playbook($suite,
      ds => $ds,
      fail_fast => $fail_fast,
      report => $report
    )
  }
}

And finally a YAML file defining the suite. This file describes my AWS environment that I use to do integration tests for Choria; you can see there’s a bunch of other tests in the suites list, and some of them take data like what nodes to expect etc.

suites:
  - cspec::discovery
  - cspec::choria
  - cspec::agents::shell
  - cspec::agents::process
  - cspec::agents::filemgr
  - cspec::agents::nettest
 
choria.version: mcollective plugin 0.7.0
 
nettest.fqdn: puppet.choria.example.net
nettest.port: 8140
 
discovery.all_nodes:
  - archlinux1.choria.example.net
  - centos7.choria.example.net
  - debian9.choria.example.net
  - puppet.choria.example.net
  - ubuntu16.choria.example.net
 
discovery.mcollective_nodes:
  - archlinux1.choria.example.net
  - centos7.choria.example.net
  - debian9.choria.example.net
  - puppet.choria.example.net
  - ubuntu16.choria.example.net
 
discovery.filtered_nodes:
  - centos7.choria.example.net
  - puppet.choria.example.net
 
discovery.fact_filter: operatingsystem=CentOS

Conclusion


So this, then, is a rather quick walk through of extending Puppet in ways many of us will not have seen before. I spent about a day getting this all working, which included figuring out a way to maintain the mutating report state internally etc; the outcome is a test suite I can run that will thoroughly drive a working 5 node network and assert the outcomes against real machines running real software.

I used to have a MCollective integration test suite, but I think this is a LOT nicer, mainly due to the Choria Playbooks and the extensibility of modern Puppet.

$ mco playbook run cspec::suite --data `pwd`/suite.yaml --report `pwd`/report.json

The current code for this is on GitHub along with some Terraform code to stand up a test environment. It’s a bit barren right now but I’ll add details in the next few weeks.

Overriding yum variables

If you work with rpm-based systems you will probably have seen content like this in the repo config files:

[base]
name=CentOS-$releasever - Base
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

The $releasever, $basearch and $infra items are yum variables.

Today, I needed to install i386 packages on a system running an x86_64 kernel (don't ask!).

Here's how I did it:

echo i386 > /etc/yum/vars/basearch
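
To confirm the override is being picked up you can ask yum itself; the exact output format differs between versions, but something like this shows the expanded repo URLs:

# clear cached metadata and check the repo URLs now reference i386
yum clean all
yum -v repolist base | grep -i baseurl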

Documentation here: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/sec-using_yum_variables