Show server side response timings in chrome developer tools

While trying to add additional performance annotations to one of my side projects I recently stumbled over the exceptionally promising Server-Timing HTTP header and specification. It’s a simple way to add semi-structured values describing aspects of the response generation and how long they each took. These can then be processed and displayed in your normal web development tools.

In this post I’ll show a simplified example, using Flask, of adding timings to a single page response and displaying them in the Google Chrome developer tools. The sample Python Flask application below returns a web page consisting of a single string, along with some fake timing information detailing the actions that assembling the page could have required.

# cat hello.py

from flask import Flask, make_response
app = Flask(__name__)


@app.route("/")
def hello():

    # Build the response we'll attach the timings to
    resp = make_response("Hello World")

    # Collect all the timings you want to expose
    # each string contains the metric name, how long it
    # took in milliseconds and the human readable name to display
    sub_requests = [
        'redis=0.1; "Redis"',
        'mysql=2.1; "MySQL"',
        'elasticsearch=1.2; "ElasticSearch"'
    ]

    # Convert the timings to a single header value
    timings = ', '.join(sub_requests)

    resp.headers.set('Server-Timing', timings)

    return resp

Once you’ve started the application, with FLASK_APP=hello.py flask run, you can request this page via curl to confirm the header and values are present.

    $ curl -sI http://127.0.0.1:5000/ | grep Timing
    ...
    Server-Timing: redis=0.1; "Redis", mysql=2.1; "MySQL", elasticsearch=1.2; "ElasticSearch"
    ...

Now we’ve added the header, and some sample data, to our tiny Flask application, let’s view it in Chrome devtools. Open the developer tools with Ctrl-Shift-I and click on the Network tab. If you hover the mouse pointer over the coloured section in the “Waterfall” column you should see an overlay like this:

Chrome devtools response performance graph

The values provided by our header are at the bottom under “Server Timing”.
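The values above are obviously hard coded; in a real application you’d measure each step as it happens. A minimal sketch of that, timing a hypothetical fetch_from_redis call with Python’s time.perf_counter and formatting the result in the same name=duration; "description" style, might look like this:

import time

def timed_entry(name, description, func, *args):
    # run func, time how long it takes, and return its result along
    # with a Server-Timing entry in the 'name=duration; "description"' format
    start = time.perf_counter()
    result = func(*args)
    duration_ms = (time.perf_counter() - start) * 1000
    return result, '%s=%.1f; "%s"' % (name, duration_ms, description)

# inside the view function you would then build the header up as you go:
# data, entry = timed_entry('redis', 'Redis', fetch_from_redis, 'some-key')
# sub_requests.append(entry)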

Support for displaying the values provided with this header isn’t yet widespread. The example, and screenshot, presented here are from Chrome 62.0.3202.75 (Official Build) (64-bit) and may require changes as the spec progresses from its current draft status. The full potential of the Server-Timing header won’t be obvious for a while, but even with only a few supporting tools it’s still a great way to add some extra visibility to your projects.

Containers will not fix your broken culture

From DevOps, Docker, and Empathy by Jérôme Petazzoni (title shamelessly stolen from a talk by Bridget Kromhout):

The DevOps movement is more about a culture shift than embracing a new set of tools. One of the tenets of DevOps is to get people to talk together.

Implementing containers won’t give us DevOps.

You can’t buy DevOps by the pound, and it doesn’t come in a box, or even in intermodal containers.

It’s not just about merging “Dev” and “Ops,” but also getting these two to sit at the same table and talk to each other.

Docker doesn’t enforce these things (I pity the fool who preaches or believes it) but it gives us a table to sit at, and a common language to facilitate the conversation. It’s a tool, just a tool indeed, but it helps people share context and thus understanding.

Sorting terraform variable blocks

When developing terraform code, it is easy to end up with a bunch of variable definitions that are listed in no particular order.

Here's a bit of Python code that will sort terraform variable definitions. Use it as a filter from inside vim, or as a standalone tool if you have all your variable definitions in one file.

eg:

tf_sort < variables.tf > variables.tf.sorted
mv variables.tf.sorted variables.tf
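
To use it as a filter from inside vim, assuming tf_sort is somewhere on your PATH, you can filter the whole buffer through it:

:%!tf_sort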

Here's the code:

#!/usr/bin/env python3
# sort terraform variables

import sys
import re

# this regex matches terraform variable definitions
# we capture the variable name so we can sort on it
pattern = r'(variable ")([^"]+)(" {[^{]+})'


def process(content):
    # sort the content (a list of tuples) on the second item of the tuple
    # (which is the variable name)
    matches = sorted(re.findall(pattern, content), key=lambda x: x[1])

    # iterate over the sorted variable blocks and output them
    for match in matches:
        print(''.join(match))

        # print a blank separator line between blocks,
        # but not after the last one
        if match != matches[-1]:
            print()


# check if we're reading from stdin
if not sys.stdin.isatty():
    stdin = sys.stdin.read()
    if stdin:
        process(stdin)

# process any filenames on the command line
for filename in sys.argv[1:]:
    with open(filename) as f:
        process(f.read())

Use your GitHub SSH key with AWS EC2 (via Terraform)

Like most people I have too many credentials in my life. Passwords, passphrases and key files seem to grow in number almost without bound. So, in an act of laziness, I decided to try to remove one of them: my AWS EC2 SSH key, by instead reusing my GitHub public key when setting up my base AWS infrastructure.

Once you start using EC2 on Amazon Web Services you’ll need to create, or supply an existing, SSH key pair to allow you to log in to the Linux hosts. While this is an easy enough process to click through, I decided to automate the whole thing and reuse an existing key: one of those I use for GitHub. One of GitHub’s lesser known features is that it exposes each user’s SSH public keys. These are available to everyone, without authenticating against anything, and so seemed like a prime candidate for reuse.
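These keys can be fetched with a plain, unauthenticated HTTP request to the user name with .keys appended; mine, for example, are available with:

$ curl https://github.com/deanwilson.keys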

The Terraform code to do this was a lot quicker to write than the README. As this is for my own use I could use a newer Terraform release (0.10.*) and harness the locals functionality to keep the actual resources simpler to read by hiding all the variable composition in a single place. You can find the result of this, the terraform-aws-github-ssh-keys module, on GitHub, and see an example of its usage here:

module "github-ssh-keys" {
  source = "deanwilson/github-ssh-keys/aws"

  # fetch the ssh key from this user name
  github_user = "deanwilson"

  # create the key with a specific name in AWS
  aws_key_pair_name = "deanwilson-from-github"
}

I currently use this for my own test AWS accounts. The common baseline setup of these doesn’t get run that often in comparison to the services running in the environment, so I’m only tied to GitHub being available occasionally. Once the key’s created it has a long lifespan and no external network dependencies.

After the module was (quite fittingly) available on GitHub I decided to go a step further and publish it to the Terraform Module Registry. I’ve never used it before so after a brief read about the module format requirements, which all seem quite sensible, I decided to blunder my way through and see how easy it was. The Answer? Very.

Screen shot of my module on the Terraform registry

The process was pleasantly straightforward. You sign in using your GitHub account, select your Terraform module from a drop down, and then you’re live. You can see how github-ssh-keys looks as an example. Adding a module was quick and easy, and well worth doing to finish your modules off.

Prevent commits to the local git master branch

I’ve been a fan of Yelp’s pre-commit git hook manager ever since I started using it to Prevent AWS credential leaks. After a recent near miss involving a push to master I decided to take another look and see if it could provide a safety net that would only allow commits on non-master branches. It turns out it can, and it’s actually quite simple to enable if you follow the instructions below.

Firstly we’ll install pre-commit globally.

pip install pre-commit

Before we enable the plugin we’ll make a commit to an unprotected local master branch to ensure everything’s working the way we think it is.

# confirm we're on master
$ git branch
* master

# create a local change we can work with
$ echo "Text" >> text

$ git add text

# successfully commit the change
$ git commit -v -m "Add text"
[master e1b84e5] Add text
 1 file changed, 1 insertion(+)
 create mode 100644 text

Now we’ve confirmed we can commit to master normally, we’ll add the pre-commit config to prevent it.

$ cat .pre-commit-config.yaml

- repo: https://github.com/pre-commit/pre-commit-hooks.git
  sha: v0.9.5
  hooks:
  -  id: no-commit-to-branch

and then we activate the config.

$ pre-commit install
pre-commit installed at ~/protected-branch-test/.git/hooks/pre-commit

If anything fails you’ll probably need to read through ~/.pre-commit/pre-commit.log to find the issue. Now we’ve installed pre-commit via pip, added its config, and enabled it, we should be protected. No more accidental commits to the master branch for us! But let’s verify.

# make a change to the checkout
echo "More text" >> text

git commit -m "Added more text"
... snip ...
Don't commit to branch.............Failed
... snip ...

# and the change is not committed.

By default this plugin protects the master branch. If you have other branches you want to deny commits on you can add the args key to the config as shown in this snippet.

  hooks:
  -  id: no-commit-to-branch
     args:
     - --branch=release

If you need to commit to master while this plugin is enabled you can use the --no-verify argument to git commit to disable all pre-commit hooks.
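For example, to intentionally commit to master with the hook still installed:

$ git commit --no-verify -m "Emergency fix to master"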

It’s worth noting you can also prevent inadvertent pushes to master at the remote end by enabling branch protection; a number of the popular git providers, including both GitHub and Bitbucket, support this. This approach has the advantage of not needing any client side configuration but does require that all your providers support it, and that you actually enable it on each of them and their repositories. While you can of course do that by hand there are also a few tools that will manage this for you, but that’s something for a different post.

I wrote a book

A few months ago, while stunningly bored, I decided in a massive fit of hubris that I was going to write and publish a technical book. I wrote a pile of notes and todo items and, after a good night’s sleep, decided it’d be a lot more work than I had time for. So I decided to repurpose Puppet CookBook and try going through the publication process with that instead, but (disclaimer) with a different title, as there is already an excellent real book called Puppet Cookbook that goes into a lot more depth than my site does.

My Puppet CookBook has always been a hand built site. It’s Ruby, ERB templates and currently Blueprint for the CSS. My hope was to just add another small wrapper and some slightly modified templates and pass that to a tool that actually knows how to format ebooks, which ended up being very close to what actually happened. I did some research on both asciidoctor and pandoc and ended up using the latter, mostly because its desired input format was closest to how I already produce the site.

Completely skipping the monotonous part of the process, updating and rewording a number of recipes, we soon get to the interesting part: tooling. I generated markdown from my own tooling (but as one massive file instead of many pages) and then ran pandoc against it and some supporting information, such as a cover image and book specific material. The actual commands look a lot like this:

#!/bin/bash
set -xue

book=$foo.epub

# generate the raw markdown text from my custom template
bundle exec ruby -I lib bin/build-markdown.rb > ebook/book-source.md


# Generate the ebook itself
pandoc \
  -t epub3 \
  -o ebook/$book ebook/metadata.yaml ebook/introduction.md ebook/book-source.md ebook/changelog.md \
  --table-of-contents --toc-depth 2 \
  --epub-cover-image ebook/${book}-cover.png

# open the reader to see if it looks... passable
FBReader ebook/$book

Once the book was generating, and looked readable, it was time to attempt a submission to see what else I was missing. I’m not going to detail all the text fields you have to complete when submitting, as it’s dull, self-explanatory and will probably change over time. It is worth noting that uploading the epub here is the first time I actually received any feedback on how valid my submission was. And it very much wasn’t. While Amazon does offer their tool chain on Linux it’s all pre-built 32 bit binaries, and even when I managed to make them run they didn’t return any errors and seemed to be validating against old versions of the ebook specs. After a few iterations of blind fix and upload I had to swallow my pride and ask a friend to use his Apple laptop to run some of the Amazon publishing tool chain, such as Kindle Previewer, for me so I could see the reasons my submission was being rejected. This was worth the shame as it gave me actionable error messages and cut the cycle time down significantly.

Once the previewer ran clean I re-submitted the book and it went through on the first try. I then went and did other stuff for a few hours, came back, searched for the book’s name and lo, there was a page for it on Amazon.co.uk.

There are still a few oddities. I have no clue how royalties work when publishing via the Amazon Kindle publisher; I think I’m now due about 3 cans of coke worth but it’d cost me more in time to figure out how to get that sweet 2.750 than I’ll ever make from it. You don’t get a free copy: as an author, if I want to see how the book looks to customers I have to buy a copy. You also can’t submit a book as ‘free’. If you’re in the UK the minimum you can sell it for is 99p. There is a way around this, as I’d like to offer mine for free, but you need to set the book up on another site and then have Amazon price match, which is a massive pain to set up. I also had to do a lot of googling for things like “how to insert a hard page break” in a kindle book (<mbp:pagebreak /> for the record) but that might be an indication of how unprepared I was.

I’m not going to link to the book here, and I’m also not going to recommend that anyone actually buy it. All the material contained inside it can be seen, with actual coloured syntax, on Puppet CookBook. Which is something I state very clearly in the foreword.

I’ve found the sales numbers interesting. Firstly, I’ve done zero promotion of it and I’ve not linked to it from anywhere. The google results for it are amazingly terrible; I have problems finding it with a search engine and I wrote it. There’s also no way to see how many copies you’ve sold since release. In terms of raw numbers I think I’m still in double rather than treble digits, and as the picture below shows, I won’t be retiring on my royalties.

Picture of very low royalties

Writing a tech book is never going to be a retirement plan. In my case it’s even slightly worse, as the material is online, free, and has been viewed by about 1500 unique visitors a day for the last six or so years. This was never intended to be anything beyond an experiment, but I’ve had some nice feedback from readers and it’s something I always wanted to do, so I think it was worth investing a weekend in it. The technology is freely available, the process is actually quite easy (if you have something other than Linux available) and I’d heartily recommend it to anyone interested.

Managing multiple puppet modules with modulesync

With the exception of children, puppies and medical compliance frameworks, managing one of something is normally much easier than managing a lot of them. If you have a lot of puppet modules, and you’ll eventually always have a lot of puppet modules, you’ll get bitten by this and find yourself spending as much time managing the supporting functionality as the puppet code itself.

Luckily you’re not the first person to have a horde of puppet modules that share a lot of common scaffolding. The fine people at Vox Pupuli had the same issue and maintain an excellent tool, modulesync, that solves this very problem. With modulesync and a little YAML you’ll soon have a consistent, easy to iterate on, set of modules.

To get started with modulesync you need three things (well, four if you count the horde of puppet modules you want to manage): a modulesync.yml config file, a managed_modules.yml list of modules, and a moduleroot directory of files to sync to them.

I’ve been using modulesync for some of my projects for a while but we recently adopted it for the GDS Operations Puppet Modules so there’s now a full, but nascent, example we can look at. You can find all the modulesync code in our public repo.

First we set up the basic modulesync config in modulesync.yml:

---
git_base: 'git@github.com:'
namespace: gds-operations
branch: modulesync
...
# vim: syntax=yaml

This YAML mostly controls how we interact with our upstream. git_base is the base of the URL to run git operations against; in our case we explicitly specify GitHub (which is also the default) but this is easy to change if you use Bitbucket, GitLab or a local server. We treat namespace as the GitHub organisation the modules live under. As we never push directly to master we specify a branch our changes should be pushed to, for later processing as a pull request.

The second config file, managed_modules.yml, contains a list of all the modules we want to manage:

---
- puppet-aptly
- puppet-auditd
- puppet-goenv

By default modulesync will perform any operations against every module in this file. It’s possible to filter this down to specific modules but there’s only really value in doing that as a simple test; after all, keeping the modules in sync is pretty core to the tool’s purpose.

The last thing to configure is a little more abstract. Any files you want to manage across the modules should be placed in the moduleroot directory and given a .erb extension. At the moment we’re treating all the files in this directory as basic, static files, but modulesync does expand them as templates and provides a @configs hash, which contains any values you specify in the base config_defaults.yml file. These values can also be overridden with more specific values stored alongside the module itself in the remote repository.

Once you’ve created the config files and added at least a basic file to moduleroot (a LICENSE file is often a safe place to start) you can run modulesync to see what will be changed. In this case I’m going to be working with the gds-operations/puppet_modulesync_config repo.

bundle install

# run the module sync against a single module and show potential changes
bundle exec msync update -f puppet-rbenv --noop

This command will filter the managed modules (using the -f flag to select them), clone the remote git repo(s), placing them under modules/, switch to either master or the branch specified in modulesync.yml, and then present a diff of the changes between the expanded templates contained in moduleroot and the cloned remote repo. None of the changes are actually made thanks to the --noop flag. If you’re happy with the diff you can add a commit message (with -m message), remove --noop, and run the command again to push the amended branch.

bundle exec msync update -m "Add LICENSE file" -f puppet-rbenv

Once the branch is pushed you can review and create a pull request as usual.

Screen shot of GitHub pull request from modulesync change

We’re at a very early stage of adoption so there is a large swathe of functionality we’re not using and so I’ve not mentioned. If you’re actually using the moduleroot templates as actual templates you can have a local override, in each remote module/GitHub repo, that localises the configuration and is correctly merged with the main configuration. This allows you to push settings out to where they’re needed while still keeping most modules baselined. You can also customise the syncing workflow to bump the minor version, update the CHANGELOG and take advantage of a number of other helpful shortcuts provided by modulesync.

Once you get above half-a-dozen modules it’s a good time to take a step back and think about how you’re going to manage dependencies, versions, spec_helpers and such in an ongoing, iterative way and modulesync presents one very helpful possible solution.

The Choria Emulator

In my previous posts I discussed what goes into load testing a Choria network: what connections are made, what subscriptions are created and so on.

From this it’s obvious the things we should be able to emulate are:

  • Connections to NATS
  • Subscriptions – which implies number of agents and sub collectives
  • Message payload sizes

To make it realistically affordable to emulate many more machines than I have, I made an emulator that can start a number of Choria daemons on a single node.

I’ve been slowly rewriting the MCollective daemon side in Go, which means I already had all the networking and connectors available there, so a daemon was written:

usage: choria-emulator --instances=INSTANCES [<flags>]
 
Emulator for Choria Networks
 
Flags:
      --help                 Show context-sensitive help (also try --help-long and --help-man).
      --version              Show application version.
      --name=""              Instance name prefix
  -i, --instances=INSTANCES  Number of instances to start
  -a, --agents=1             Number of emulated agents to start
      --collectives=1        Number of emulated subcollectives to create
  -c, --config=CONFIG        Choria configuration file
      --tls                  Enable TLS on the NATS connections
      --verify               Enable TLS certificate verifications on the NATS connections
      --server=SERVER ...    NATS Server pool, specify multiple times (eg one:4222)
  -p, --http-port=8080       Port to listen for /debug/vars

You can see here it takes a number of instances, agents and collectives. The instances will all respond with ${name}-${instance} on any mco ping or RPC commands. They can be discovered using the normal mc discovery, though only agent and identity filters are supported.
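Each emulator process also listens for HTTP requests on the --http-port you give it and serves its stats at /debug/vars, the standard Go expvar path, so you can take a quick peek at a running emulator with something like:

$ curl -s http://localhost:8080/debug/vars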

Every instance will be a Choria daemon with the exact same network connections and NATS subscriptions as real ones. Thus 50 000 emulated Choria daemons will put exactly the same load on your NATS brokers as normal ones would. Performance wise, even with high concurrency, the emulator does quite well – it’s many orders of magnitude faster than the Ruby Choria client anyway, so it’s real enough.

The agents they start are all copies of this one:

emulated0
=========
 
Choria Agent emulated by choria-emulator
 
      Author: R.I.Pienaar <rip@devco.net>
     Version: 0.0.1
     License: Apache-2.0
     Timeout: 120
   Home Page: http://choria.io
 
   Requires MCollective 2.9.0 or newer
 
ACTIONS:
========
   generate
 
   generate action:
   ----------------
       Generates random data of a given size
 
       INPUT:
           size:
              Description: Amount of text to generate
                   Prompt: Size
                     Type: integer
                 Optional: true
            Default Value: 20
 
 
       OUTPUT:
           message:
              Description: Generated Message
               Display As: Message

You can see this has a basic data generator action – you give it a desired size and it makes you a message of that size. It will run as many of these as you wish, all named emulated0 and so on.

It has an mcollective agent that goes with it; the idea is that you create a pool of machines, all with your normal mcollective on them plus this agent. Using that agent you then build up a different, new mcollective network comprising the emulators, federation and NATS.

Here are some example commands – you’ll see these again later when we talk about scenarios:

We download the dependencies onto all our nodes:

$ mco playbook run setup-prereqs.yaml --emulator_url=https://example.net/rip/choria-emulator-0.0.1 --gnatsd_url=https://example.net/rip/gnatsd --choria_url=https://example.net/rip/choria

We start NATS on our first node:

$ mco playbook run start-nats.yaml --monitor 8300 --port 4300 -I test1.example.net

We start the emulator with 1500 instances per node all pointing to our above NATS:

$ mco playbook run start-emulator.yaml --agents 10 --collectives 10 --instances 750 --monitor 8080 --servers 192.168.1.1:4300

You’ll then set up a client config for the built network and can interact with it using normal mco commands and the test suite I’ll show later. Similarly there are playbooks to stop all the various parts and so on. The playbooks just interact with the mcollective agent so you could use mco rpc directly too.

I found I can easily run 700 to 1000 instances on basic VMs – they need around 1.5GB of RAM – so it’s fairly light. Using 400 nodes I managed to build a 300 000 node Choria network and could easily interact with it.

Finally I made an EC2 environment where you can stand up a Puppet Master, Choria, the emulator and everything you need and do load tests on your own dime. I was able to do many runs with 50 000 emulated nodes on EC2 and the whole lot cost me less than $20.

The code for this emulator is very much a work in progress as is the Go code for the Choria protocol and networking but the emulator is here if you want to take a peek.

Job applications and GitHub profile oddities

I sift through a surprising number, to me at least, of curricula vitae / resumes each month, and one pattern I’ve started to notice is the ‘fork only’ GitHub profile.

There’s been a lot written over the last few years about using your GitHub profile as an integral part of your job application. Some in favour, some very much not. While each side has valid points, when recruiting I like to have all the information I can to hand, so if you include a link to your profile I will probably have a rummage around. When it comes to what I’m looking for there are a lot of different things to consider. Which languages do you use? Is the usage idiomatic? Do you have docs or tests? How do you respond to people in issues and pull requests? Which projects do you have an interest in? Have you solved any of the same problems we have?

Recently however I’ve started seeing a small but growing percentage of people with an essentially fork only profile, often of the bigger, trendier projects (Docker, Kubernetes and Terraform for example) and with no contributed code. In the most blatant case there were a few amended CONTRIBUTORS files with the applicant’s name and email but no actual changes to the code base.

Although you shouldn’t place undue weight on an applicant’s GitHub profile in most cases, and in the Government we deliberately don’t consider it in any phase past the initial CV screen, it can be quite illuminating. In the past it provided an insight into people’s attitudes, aptitudes and areas of interest; now it can also act as a warning sign that someone may be more of a system gamer than a system administrator.