Viewing AlertManager Email Alerts via MailHog

After adding AlertManager to my Prometheus test stack in a previous post I spent some time triggering different failure cases and generating test messages. While it’s slightly satisfying seeing rows change from green to red, I soon wanted to actually send real alerts, with all their values, somewhere I could easily view them. My criteria were:

  • must be easy to integrate with AlertManager
  • must not require external network access
  • must be easy to use from docker-compose
  • should have as few moving parts as possible

A few short web searches later I stumbled back onto a small server I’ve used for this in the past: MailHog. MailHog is an awesome little server that listens for SMTP traffic and then displays it using an internal HTTP server. It has sensible defaults, so no configuration was required, comes as a single binary and even has a working Docker Hub image. My solution was found!

The amount of work to include it was even less than I’d hoped. A new docker-compose.yaml file for MailHog itself, a very basic AlertManager configuration file, a few lines of Docker config to put the right configs in each of the containers, and we have a working email alert view:

MailHog screen shot of Alertmanager emails
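
The two new pieces are small enough to sketch here. The service names, network and email addresses below are assumptions rather than the exact files from the repo; MailHog listens for SMTP on port 1025 and serves its web UI on 8025 by default.

# mailhog-server/docker-compose.yaml (sketch)
services:
  mailhog:
    image: mailhog/mailhog
    ports:
      - 8025:8025          # web UI for reading the captured mail
    networks:
      - public

# AlertManager config snippet (sketch) pointing email alerts at MailHog
route:
  receiver: email

receivers:
  - name: email
    email_configs:
      - to: 'alerts@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'mailhog:1025'    # MailHog's SMTP listener
        require_tls: false           # MailHog doesn't speak TLS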

Adding AlertManager to docker-compose Prometheus

What’s the use of monitoring if you can’t raise alerts? It’s half a solution at best. Now that I have basic monitoring working, as discussed in Prometheus experiments with docker-compose, it felt like it was time to add AlertManager, Prometheus’ often-used partner in crime, so I can investigate raising, handling and resolving alerts. Unfortunately this turned out to be a lot harder than ‘just’ adding a basic exporter.

Before we delve into the issues, and how I worked around them in my implementation, let’s see the result of all the work: adding a redis alert and forcing it to trigger. Ignoring all the implementation details for now, we need to do four things to add AlertManager to our experiments:

  • add the AlertManager container
  • tell Prometheus how to contact AlertManager
  • tell Prometheus where the alert rules files are located
  • add an alerting rule to confirm everything is connected

Assuming we’re in the root of docker-compose-prometheus we’ll run our docker-compose command to create all the instances we need for testing:

docker-compose \
  -f prometheus-server/docker-compose.yaml \
  -f alertmanager-server/docker-compose.yaml \
  -f redis-server/docker-compose.yaml \
  up -d

You can confirm all the containers are available by running:

docker-compose \
  -f prometheus-server/docker-compose.yaml \
  -f alertmanager-server/docker-compose.yaml \
  -f redis-server/docker-compose.yaml \
  ps

Screen shot of Prometheus alerting rule

In this screenshot you can see the Prometheus alerting page, with our RedisDown alert against a green background as everything is working correctly. We also show the RedisDown alerting rule configuration. This rule checks the redis_up value returned by the redis exporter. If redis is down it will be 0, and if it doesn’t recover in the next minute it will trigger an alert. It’s worth noting here that you can confirm your rules files are valid using this (less scary than it looks) promtool command:

# the left hand argument to `-v` is the local file from this repo.
docker run \
  -v `pwd`/redis-server/redis.rules:/fileof.rules \
  -it --entrypoint=promtool prom/prometheus:v2.1.0 check rules /fileof.rules

Checking /fileof.rules
  SUCCESS: 1 rules found
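
For reference, the rule inside redis.rules is only a few lines. This is a sketch based on the description above rather than the exact file in the repo, so the labels and annotation wording may differ:

groups:
  - name: redis
    rules:
      - alert: RedisDown
        # redis_up comes from the redis exporter; 0 means redis is unreachable
        expr: redis_up == 0
        # only fire if it stays down for a minute
        for: 1m
        annotations:
          summary: Redis Availability alert.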

Everything seems to be configured correctly, so let’s break it and confirm alerting is working. First we will kill the redis container. This will cause the exporter to change the value of redis_up.

# kill the container
docker kill prometheusserver_redis-server_1

# check it has exited
docker ps -a | grep prometheusserver_redis-server_1

# simplified output
library/redis:4.0.8    Exited (137) 2 minutes ago    prometheusserver_redis-server_1

The alert will then change to “State PENDING” on the Prometheus alerts page. Once the minute is up it will change to “State FIRING” and, if everything is working, appear in AlertManager too.

Screen shot of a triggered Prometheus alerting rule

In addition to using the web UI you can directly query AlertManager from the command line using the docker container:

docker exec -ti prometheusserver_alert-manager_1 amtool \
  --alertmanager.url http://127.0.0.1:9093 alert

Alertname  Starts At                Summary
RedisDown  2018-03-09 18:33:58 UTC  Redis Availability alert.

At this point we have a basic but working AlertManager running alongside our local Prometheus. It’s far from a complete or comprehensive configuration, and the alerts don’t yet go anywhere, but it’s a solid base to start your own experiments from. You can see all the code to make this work in the add_alert_manager branch.

Now we’ve covered how AlertManager fits into our tests and how to confirm it’s working, we will delve into how it’s configured, something that was much more work than I expected. Prometheus, by design, runs with a single configuration file. While this is fine for a number of use cases, my design goal of combining any combination of docker-compose files to create a test environment doesn’t play well with it. This became clear to me when I needed to add the AlertManager configuration to the main config file, but only when AlertManager is included. The config to enable AlertManager and its alerting rules is concise:

rule_files:
  - "/etc/prometheus/*.rules"

alerting:
  alertmanagers:
    - static_configs:
      - targets: ['alert-manager:9093']

The first part, rule_files:, accepts wild card selection of alert rule files. Each of these files contains one or more alert rules, such as our RedisDown example above. This globbing makes it easy to add rules to Prometheus from each included component. The second part tells Prometheus where it can find the AlertManager instance it should raise alerts with.

In order to use these configs I had to add another step to running Prometheus: collecting all the configuration snippets and combining them into a single file before starting the process. My first thought was to create my own Prometheus container and preprocess the configuration before starting the daemon. I quickly decided against this as I don’t want to be responsible for maintaining my own fork of the Dockerfile. I was also worried about timing issues and start up race conditions from all the other containers adding their configs. Instead I decided to add another container.

This tiny busybox-based container, which I named promconf-concat, runs a short shell script in a loop. The script concatenates all the configuration fragments together, starting with the base config. If the complete config file has changed it replaces the existing, volume-mounted, file, which Prometheus then detects as changed and reloads.
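
A minimal sketch of that loop looks something like this; the fragment and output paths are assumptions for illustration, and the real script in the repo is a little more involved:

#!/bin/sh
# build a candidate config from the base file plus any fragments,
# then only replace the live config if something actually changed
while true; do
  cat /fragments/base.yml /fragments/*.fragment.yml > /tmp/prometheus.yml.new

  if ! cmp -s /tmp/prometheus.yml.new /shared/prometheus.yml; then
    cp /tmp/prometheus.yml.new /shared/prometheus.yml
  fi

  sleep 5
done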

I have a strong suspicion I’ll be revisiting this part of the project again and splitting the fragments more. Adding ordering will probably be required as some of the exporters (such as MySQL) can’t be configured as targets via the file_sd_configs mechanism. However for now it’s allowed me to test the basic alerting functionality and continue to delve more deeply into Prometheus.

Green system percentage vs user visible issues

How much of your system does your internal monitoring need to consider down before something is user visible? While there will always be the perfect chain of three or four things that can cripple a chunk of your customer visible infrastructure, there are often a lot of low importance checks that will flare up and consume time and attention. But what’s the ratio?

As a small thought experiment on one project I’ve recently started to leave a new, very simple, four panel Grafana dashboard open on a Raspberry Pi driven monitor. It shows the percentage of the internal monitoring checks that are currently in a successful state next to the number of user visible issues and incidents. I’ve found watching the percentage of the system that’s working rise and fall, without anyone outside the company, and often the team, noticing, to be strangely hypnotic. I’ve also added a couple of panels to show the number of events of each of those types over the last hour.

Fugly Dashboard showing 4 panels described in the page

I was hoping the numbers would provide some inspiration towards questions like “Are we monitoring at the right level?” and “Do we need to be running all of these at this frequency?”, but so far I’ve mostly found it reassuring that the system can withstand small internal failures, while also worrying about the amount of state churn it seems to detect. While it’s not been as helpful as alert summary roll ups it has been a great source of visual white noise while thinking about other alerting issues.

Prometheus experiments with docker-compose

As 2018 rolls along the time has come to rebuild parts of my homelab again. This time I’m looking at my monitoring and metrics setup, which is based on sensu and graphite, and planning some experiments and evaluations using Prometheus. In this post I’ll show how I’m setting up my tests and provide the Prometheus experiments with docker-compose source code in case it makes your own experiments a little easier to run.

My starting requirements were fairly standard. I want to use containers where possible. I want to test lots of different backends and I want to be able to pick and choose which combinations of technologies I run for any particular tests. As an example I have a few little applications that make use of redis and some that use memcached, but I don’t want to be committed to running all of the backing services for each smaller experiment. In terms of technology I settled on docker-compose to help keep the container sprawl in check while also enabling me to specify all the relationships. While looking into compose I found Understanding multiple Compose files and my basic structure began to emerge.

Starting with prometheus and grafana themselves, I created the prometheus-server directory and added a basic prometheus config file to configure the service. I then added configuration for each of the things it was to collect from: prometheus and grafana in this case. Once these were in place I added the prometheus and grafana docker-compose.yaml file and created the stack.

docker-compose -f prometheus-server/docker-compose.yaml up -d

docker-compose -f prometheus-server/docker-compose.yaml ps

> docker-compose -f prometheus-server/docker-compose.yaml ps
            Name                    Command      State           Ports
--------------------------------------------------------------------------------
prometheusserver_grafana_1       /run.sh        Up      0.0.0.0:3000->3000/tcp
prometheusserver_prometheus_1    /bin/prom ...  Up      0.0.0.0:9090->9090/tcp

After manually configuring the prometheus data source in Grafana, all of which is covered in the README, you have a working prometheus scraping itself and grafana, and a grafana that allows you to experiment with presenting the data.
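
The prometheus.yml driving this can stay tiny because the actual scrape targets live in the mounted JSON files. A rough sketch of the idea, not necessarily the exact file from the repo:

global:
  scrape_interval: 15s

scrape_configs:
  # one job whose targets come from the mounted files; each included
  # component drops its own target file into this directory
  - job_name: monitoring
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.json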

While this is a good first step I need visibility into more than the monitoring system itself, so it’s time to add another service. Keeping our goal of being modular in mind I decided to break everything out into separate directories and isolate the configuration. Adding a new service is as simple as adding a redis-server directory and writing a docker-compose file to run redis and the prometheus exporter we use to get metrics from it. This part is simple as most of the work is done for us; we use third party docker containers and everything is up and running. But how do we add the redis exporter to the prometheus targets? That’s where docker-compose’s merging behaviour shines.

In our base docker-compose.yaml file we define the prometheus service and the volumes assigned to it:

services:
  prometheus:
    image: prom/prometheus:v2.1.0
    ports:
      - 9090:9090
    networks:
      - public
    volumes:
      - prometheus_data:/prometheus
      - ${PWD}/prometheus-server/config/prometheus.yml:/etc/prometheus/prometheus.yml
      - ${PWD}/prometheus-server/config/targets/prometheus.json:/etc/prometheus/targets/prometheus.json
      - ${PWD}/prometheus-server/config/targets/grafana.json:/etc/prometheus/targets/grafana.json
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'

You can see we’re mounting individual target files into prometheus for it to probe. Now in our docker-compose-prometheus/redis-server/docker-compose.yaml file we’ll reference back to the existing prometheus service and add to the volumes array.

  prometheus:
    volumes:
      - ${PWD}/redis-server/redis.json:/etc/prometheus/targets/redis.json

Rather than overriding the array, this incomplete service configuration adds another element to it, allowing us to build up our config over multiple docker-compose files. In order for this to work we have to run the compose commands with each config specified every time, resulting in the slightly hideous:

docker-compose \
  -f prometheus-server/docker-compose.yaml \
  -f redis-server/docker-compose.yaml \
  up -d

Once you’re running a stack with 3 or 4 components you’ll probably reach for aliases and add a base docker-compose replacement:

alias dc='docker-compose -f prometheus-server/docker-compose.yaml -f redis-server/docker-compose.yaml'

and then call that with actual commands like dc up -d and dc logs. Adding your own application to the testing stack is as easy as adding a backing resource. Create a directory and the two config files, a docker-compose.yaml and a target file like the one sketched below, and everything should be hooked in.
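
For completeness, those target files are standard Prometheus file_sd JSON. A minimal redis.json could look like this; the service name and labels are my assumptions, and 9121 is the stock redis exporter’s default port:

[
  {
    "targets": ["redis-exporter:9121"],
    "labels": {
      "job": "redis"
    }
  }
]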

It’s early in the process and I’m sure to find issues with this naive approach, but it’s enabled me to create arbitrarily complicated prometheus test environments and start evaluating its ecosystem of plugins and exporters. I’ll add more to it and refine where possible; the manual steps should hopefully be reduced by Grafana 5 for example. Hopefully it’ll remain a viable way for myself and others to run quick, ad hoc tests.

A short 2017 review

It’s time for a little 2017 navel gazing. Prepare for a little self-congratulation and a touch of gushing; you’ve been warned. In general my 2017 was a decent one in terms of tech. I was fortunate to be presented with a number of opportunities to get involved in projects and chat to people, and I’m immensely thankful for that. I’m going to mention some of them here to remind myself how lucky you can be.

Let’s start with conferences: I was fortunate enough to attend a handful of them in 2017. Scale Summit was, as always, a great place to chat about our industry. In addition to the usual band of rascals I met Sarah Wells in person for the first time and was blown away by the breadth and depth of her knowledge. She gave a number of excellent talks over 2017 and they’re well worth watching. The inaugural Jeffcon filled in for a lack of Serverless London (fingers crossed for 2018) and was inspiring throughout, from the astounding Simon Wardley keynote all the way to the after-conference chats.

I attended two DevopsDays, London, more about which later, and Stockholm. It was the first in Sweden and the organisers did the community proud. In a moment of annual leave burning I also attended the Google Cloud and AWS Summits at the ExCeL centre. It’s nice to see tech events so close to where I’m from. I finished the year off with the GDS tech away day, DockerCon Europe and Velocity EU.

DevopsDays holds a special place in my heart as the conference and community that introduced me to so many of the peers I heartily respect. The biggest lasting contribution of Patrick’s, for me, is building those bridges. When the last “definition of Devops” post is made I’ll still cherish the people I met from that group of very talented folk. That’s one of the reasons I was happy to be involved in the organisation of my second London DevOps. You’d be amazed at the time, energy and passion the organisers, speakers and audience put into a DevopsDays event, but it really does show on the day(s).

I was also honoured to be included in the Velocity Europe Program Committee. Velocity has always been one of the important events of the industry, and to go from budgeting most of a year in advance to attend to being asked to help select from the submitted papers, and even more than that be a session chair, was something I’m immensely proud of, and I’m thankful to James Turnbull for even thinking of me. The speakers, some of whom were old hands at large events and some giving their first conference talk (in their second language no less!), were a pleasure to work with and made a nerve-wracking day so much better than I could have hoped. It was also a stark reminder of how much I hate speaking in front of a room full of people.

Moving away from gushing over conferences: I published a book. It was a small experiment and it’s been very educational. It’s sold a few copies, made enough to pay for the domain for a few years and led to some interesting conversations with readers. I also wrote a few Alexa skills. While they’re not the most complicated or interesting bits of code from last year they have a special significance to me. I’m from a very non-technical background so it’s nice for my family to actually see, or in this case hear, something I’ve built.

Other things that helped keep me sane were tech reviewing a couple of books, hopefully soon to be published, and reviewing talk submissions, some for conferences I was heavily involved in and some for events I wasn’t able to attend. It’s a significant investment of time but nearly every one of them taught me something, even about technology I consider myself competent in.

I still maintain a small quarterly Pragmatic Investment Plan (PiP), which I started a few years ago, and while it’s more motion than progress these days it does keep me honest and ensure I do at least a little bit of non-work technology each month. Apart from Q1 2017 I surprisingly managed to read a tech book each month, post a handful of articles on my blog, and attend a few user groups here and there. I’ve kept the basics of the PiP for 2018 and I’m hoping it keeps me moving.

My general reading for the year was the worst it’s been for five years. I managed to read, from start to finish, 51 books, totalling under 15,000 pages. I did have quite a few false starts and unfinished books at the end, which didn’t help.

Oddly, my most popular blog post of the year was Non-intuitive downtime and possibly not lost sales. It was mentioned in a lot of weekly newsletters and resulted in quite a bit of traffic. SRE weekly also included it, which was a lovely change of pace from my employer being mentioned in the “Outages” section.

All in all 2017 was a good year for me personally and contained at least one career highlight. In closing I’d like to thank you for reading UnixDaemon, especially if you made it this far down, and let’s hope we both have an awesome 2018.

Terraform testing thoughts

As your terraform code grows in both size and complexity you should invest in tests and other ways to ensure everything is doing exactly what you intended. Although there are existing ways to exercise parts of your code, I think Terraform is currently missing an important piece of testing functionality, and I hope by the end of this post you’ll agree.

I want puppet catalog compile testing in terraform

Our current terraform testing process looks a lot like this:

  • precommit hooks to ensure the code is formatted and valid before it’s checked in
  • run terraform plan and apply to ensure the code actually works
  • execute a sparse collection of AWSSpec / InSpec tests against the created resources
  • visually check the AWS Console to ensure everything “looks correct”

We ensure the code is all syntactically valid (and pretty) before it’s checked in. We then run a plan, which often finds issues with module paths, names and such, and then the slow, all-encompassing, and cost-increasing apply happens. And then you spot an unexpanded variable. Or that something didn’t get included correctly with a count.

I think there is a missed opportunity to add a separate phase, between plan and apply above, to expose the compiled plan in an easy to integrate format such as JSON or YAML. This would allow existing testing tools, and things like custom rspec matchers and cucumber test cases, to verify your code before progressing to the often slow, and cash consuming, apply phase. There are a number of things you could usefully test in a serialised plan output. Are your “fake if” counts doing what you expect? Are those nested data structures translating to all the tags you expect? How about the stringified splats and local composite variables? And what are the actual values hidden behind those computed properties? All of this would be visible at this stage. Having these tests would allow you to catch a lot of the more subtle logic issues before you invoke the big hammer of actually creating resources.
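
To make that concrete, here’s a hypothetical example of the kind of check a serialised plan would enable. The plan layout and field names are invented for illustration, not an existing Terraform format; the point is that ordinary tools such as jq could then assert on planned values before anything is created.

# hypothetical: assumes terraform could write the compiled plan out as plan.json
# fail if any aws_instance in the plan is missing an environment tag
jq -e '
  [ .resource_changes[]
    | select(.type == "aws_instance")
    | .change.after.tags.environment ]
  | all(. != null)
' plan.json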

I’m far from the first person to request this, and upstream have been fair and considerate, but it’s not something that’s on the short term road map. Workarounds do exist but they all have expensive limitations. The current plan file is in a binary format that isn’t guaranteed to be backwards compatible for external clients. Writing a plan output parser is possible, but “a tool like this is very likely to be broken by future Terraform releases, since we don’t consider the human-oriented plan output to be a compatibility constraint”, and hooking the plan generation code, an approach taken by palantir/tfjson, will be a constant re-investment as terraform’s core rapidly changes.

Adding a way to publish the plan in an easy to process format would allow many other testing tools and approaches to bloom, and I hope I’ve managed to convince you that it’d be a great addition to terraform.

Show server side response timings in chrome developer tools

While trying to add additional performance annotations to one of my side projects I recently stumbled over the exceptionally promising Server-Timing HTTP header and specification. It’s a simple way to add semi-structured values describing aspects of the response generation and how long they each took. These can then be processed and displayed in your normal web development tools.

In this post I’ll show a simplified example, using Flask, to add timings to a single page response and display them using Google Chrome developer tools. The sample python flask application below returns a web page consisting of a single string and some fake information detailing all the actions assembling the page could have required.

# cat hello.py

from flask import Flask, make_response
app = Flask(__name__)


@app.route("/")
def hello():

    # Collect all the timings you want to expose
    # each string is how long it took in microseconds
    # and the human readable name to display
    sub_requests = [
        'redis=0.1; "Redis"',
        'mysql=2.1; "MySQL"',
        'elasticsearch=1.2; "ElasticSearch"'
    ]

    # Convert timings to a single string
    timings = ', '.join(sub_requests)

    # Build the response (the body text is arbitrary) and
    # attach the Server-Timing header to it
    resp = make_response('A Server-Timing demo')
    resp.headers.set('Server-Timing', timings)

    return resp

Once you’ve started the application, with FLASK_APP=hello.py flask run, you can request this page via curl to confirm the header and values are present.

    $ curl -sI http://127.0.0.1:5000/ | grep Timing
    ...
    Server-Timing: redis=0.1; "Redis", mysql=2.1; "MySQL", elasticsearch=1.2; "ElasticSearch"
    ...

Now that we’ve added the header, and some sample data, to our tiny Flask application, let’s view it in Chrome devtools. Open the developer tools with Ctrl-Shift-I and then click on the network tab. If you hover the mouse pointer over the coloured section in “Waterfall” you should see an overlay like this:

Chrome devtools response performance graph

The values provided by our header are at the bottom under “Server Timing”.

Support for displaying the values provided with this header isn’t yet widespread. The example, and screenshot, presented here are from Chrome 62.0.3202.75 (Official Build) (64-bit) and may require changes as the spec progresses from its current draft status. The full potential of the Server-Timing header won’t be obvious for a while, but even with only a few supporting tools it’s still a great way to add some extra visibility to your projects.
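
If you want real numbers rather than the canned ones above, a tiny helper that times a block of work and formats it in the same style is enough. This is a sketch using my own naming, not part of Flask or the spec:

import time

def timed_call(name, func):
    # run func, returning its result plus an entry formatted like the
    # sub_requests strings used earlier
    start = time.perf_counter()
    result = func()
    elapsed = (time.perf_counter() - start) * 1000  # scale to taste
    return result, '{}={:.1f}; "{}"'.format(name.lower(), elapsed, name)

# usage inside the view function, assuming a fetch_from_redis callable:
# data, timing = timed_call('Redis', fetch_from_redis)
# sub_requests.append(timing)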

Use your GitHub SSH key with AWS EC2 (via Terraform)

Like most people I have too many credentials in my life. Passwords, passphrases and key files seem to grow in number almost without bound. So, in an act of laziness, I decided to try and remove one of them. In this case it’s my AWS EC2 SSH key, which I’ll replace by reusing my GitHub public key when setting up my base AWS infrastructure.

Once you start using EC2 on Amazon Web Services you’ll need to create, or supply an existing, SSH key pair to allow you to log in to the Linux hosts. While this is an easy enough process to click through, I decided to automate the whole thing and use an existing key, one of those I use for GitHub. One of GitHub’s lesser known features is that it exposes a user’s SSH public keys. These are available from anywhere, without authenticating against anything, and so seemed like a prime candidate for reuse.

The terraform code to do this was a lot quicker to write than the README. As this is for my own use I could use a newer version of 0.10.* and harness the locals functionality to keep the actual resources simpler to read, by hiding all the variable composing in a single place. You can find the result of this, the terraform-aws-github-ssh-keys module, on GitHub, and see an example of its usage here:

module "github-ssh-keys" {
  source = "deanwilson/github-ssh-keys/aws"

  # fetch the ssh key from this user name
  github_user = "deanwilson"

  # create the key with a specific name in AWS
  aws_key_pair_name = "deanwilson-from-github"
}
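
Conceptually the module only has two jobs: fetch the public key from GitHub’s https://github.com/<username>.keys endpoint and register it as an aws_key_pair. A rough sketch of that approach, rather than the module’s actual code, which does more of its composing via locals:

# sketch only: grab the user's first public key from GitHub
data "http" "github_keys" {
  url = "https://github.com/${var.github_user}.keys"
}

resource "aws_key_pair" "github" {
  key_name   = "${var.aws_key_pair_name}"
  public_key = "${element(split("\n", chomp(data.http.github_keys.body)), 0)}"
}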

I currently use this for my own test AWS accounts. The common baseline setup of these doesn’t get run that often in comparison to the services running in the environment so I’m only tied to GitHub being available occasionally. Once the key’s created it has a long life span and has no external network dependencies.

After the module was (quite fittingly) available on GitHub I decided to go a step further and publish it to the Terraform Module Registry. I’d never used it before, so after a brief read about the module format requirements, which all seem quite sensible, I decided to blunder my way through and see how easy it was. The answer? Very.

Screen shot of my module on the Terraform registry

The process was pleasantly straightforward. You sign in using your GitHub account, select your Terraform modules from a drop down and then you’re live. You can see how github-ssh-keys looks as an example. Adding a module was quick, easy to follow, and well worth finishing off your modules with.

Prevent commits to the local git master branch

I’ve been a fan of Yelp’s pre-commit git hook manager ever since I started using it to Prevent AWS credential leaks. After a recent near miss involving a push to master I decided to take another look and see if it could provide a safety net that would only allow commits on non-master branches. It turns out it can, and it’s actually quite simple to enable if you follow the instructions below.

Firstly we’ll install pre-commit globally.

pip install pre-commit

Before we enable the plugin we’ll make a commit to an unprotected local master branch to ensure everything’s working the way we think it is.

# confirm we're on master
$ git branch
* master

# create a local change we can work with
$ echo "Text" >> text

$ git add text

# successfully commit the change
$ git commit -v -m "Add text"
[master e1b84e5] Add text
 1 file changed, 1 insertion(+)
 create mode 100644 text

Now we’ve confirmed we can commit to master normally we’ll add the pre-commit config to prevent it.

$ cat .pre-commit-config.yaml

- repo: https://github.com/pre-commit/pre-commit-hooks.git
  sha: v0.9.5
  hooks:
  -  id: no-commit-to-branch

and then we activate the config.

$ pre-commit install
pre-commit installed at ~/protected-branch-test/.git/hooks/pre-commit

If anything fails then you’ll probably need to read through ~/.pre-commit/pre-commit.log to find the issue. Now that we’ve installed the pre-commit pip, added its config, and enabled it, we should be protected. No more accidental committing to the master branch for us! But let’s verify.

# make a change to the checkout
echo "More text" >> text

git commit -m "Added more text"
... snip ...
Don't commit to branch.............Failed
... snip ...

# and the change is not committed.

By default this plugin protects the master branch. If you have other branches you want to deny commits on you can add the args key to the config as shown in this snippet.

  hooks:
  -  id: no-commit-to-branch
     args:
     - --branch=release

If you need to commit to master while this plugin is enabled you can use the --no-verify argument to git commit to disable all pre-commit hooks.
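
For example, when you really do intend to commit straight to master:

git commit --no-verify -m "Update the release notes on master"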

It’s worth noting you can also prevent inadvertent pushes to master at the remote end by enabling branch protection; a number of the popular git providers, including GitHub and BitBucket, support this. This approach has the advantage of not needing client side configuration, but it does require that all your providers support it, and that you actually enable it on each of them and their repositories. While you can of course do that by hand there are also a few tools that will manage this for you, but that’s something for a different post.

I wrote a book

A few months ago, while stunningly bored, I decided, in a massive fit of hubris, that I was going to write and publish a technical book. I wrote a pile of notes and todo items and, after a good night’s sleep, decided it’d be a lot more work than I had time for. So I decided to repurpose Puppet CookBook and try going through the publication process with that instead, but (disclaimer) with a different title, as there is already an excellent real book called Puppet Cookbook that goes into a lot more depth than my site does.

My Puppet CookBook has always been a hand built site. It’s ruby, erb templates and currently uses blueprint for the CSS. My hope was to just add another small wrapper and some slightly modified templates and pass that to a tool that actually knows how to format ebooks. Which ended up being very close to what actually happened. I did some research on both asciidoctor and pandoc and ended up using the latter, mostly because its desired input format was closest to how I already produce the site.

Completely skipping the monotonous updating and rewording of a number of recipes, we soon get to the interesting part of the process: tooling. I generated markdown from my own tooling (but as one massive file instead of many pages) and then ran pandoc against it and some supporting information, such as a cover image and book specific material. The actual commands look a lot like this:

#!/bin/bash
set -xue

book=$foo.epub

# generate the raw markdown text from my custom template
bundle exec ruby -I lib bin/build-markdown.rb > ebook/book-source.md


# Generate the ebook itself
pandoc \
  -t epub3 \
  -o ebook/$book ebook/metadata.yaml ebook/introduction.md ebook/book-source.md ebook/changelog.md \
  --table-of-contents --toc-depth 2 \
  --epub-cover-image ebook/${book}-cover.png

# open the reader to see if it looks... passable
FBReader ebook/$book

Once the book was generating, and looked readable, it was time to attempt a submission and see what else I was missing. I’m not going to detail all the text fields you have to complete when submitting, as it’s dull, self-explanatory and will probably change over time. It is worth noting that uploading the epub was the first time I actually received any feedback on how valid my submission was. And it very much wasn’t. While Amazon does offer their tool chain on Linux it’s all pre-built 32 bit binaries. Even when I managed to make them run they didn’t return any errors and seemed to be validating against old versions of the ebook specs. After a few iterations of blind fix and upload I had to swallow my pride and ask a friend to use his Apple laptop to run some of the Amazon publishing tool chain, such as Kindle Previewer, for me so I could see the reasons my submission was being rejected. This was worth the shame as it gave me actionable error messages and cut the cycle time down significantly.

Once the previewer ran clean I re-submitted the book and it went through on the first try. I then went and did other stuff for a few hours, came back, searched for the book’s name and ‘lo, there was a page for it on Amazon.co.uk.

There are still a few oddities. I have no clue how royalties work when publishing via the Amazon Kindle publisher; I think I’m now due about 3 cans of coke worth, but it’d cost me more in time to figure out how to get that sweet 2.750 than I’ll ever make from it. You don’t get a free copy: as an author, if I want to see how the book looks to customers I have to buy a copy. You also can’t submit a book as ‘free’. If you’re in the UK then the minimum you can sell it for is 99p. There is a way around this, as I’d like to offer mine for free, but you need to set the book up on another site and then have Amazon price match, which is a massive pain to set up. I also had to do a lot of googling for things like “how to insert a hard page break” in a kindle book (<mbp:pagebreak /> for the record), but that might be an indication of how unprepared I was.

I’m not going to link to the book here, and I’m also not going to recommend that anyone actually buy it. All the material contained inside it can be seen, with actual coloured syntax, on Puppet CookBook, something I state very clearly in the foreword.

I’ve found the sales numbers interesting. Firstly, I’ve done zero promotion of it and I’ve not linked to it from anywhere. The google results for it are amazingly terrible; I have problems finding it with a search engine and I wrote it. There’s also no way to see how many copies you’ve sold since release. In terms of raw numbers I think I’m still in double rather than treble digits, and you can also see that I won’t be retiring on my royalties.

Picture of very low royalties

Writing a tech book is never going to be a retirement plan. In my case it’s even slightly worse as the material is online, free, and viewed by about 1500 unique visitors a day for the last 6 or so years. This was never intended to be anything beyond an experiment, but I’ve had some nice feedback from readers and it’s something I always wanted to do, so I think it was worth investing a weekend in it. The technology is freely available, the process is actually quite easy (if you have something other than Linux available) and I’d heartily recommend it to anyone interested.