
Building an Infinite Procedurally-Generated World

I had a lot of fun writing my last blog post: All Work & No Play – Taking Time to Code for Fun. In it I talked about writing fun code that keeps you interested in programming and keeps you creative. I used the example of writing a 2D procedurally-generated, infinite world. In this post, I am going to explain details of how that example works.


Bring the Noise

To build our terrain, we need something better than just randomly selecting a tile for each set of coordinates. Since we want the world to be infinite, we can’t design it by hand: that’s where Perlin noise comes in. Perlin noise is an algorithm for generating organic-looking, multidimensional noise quickly (at least in 2D). You’ll want to know a few terms when playing with Perlin noise:

  • Octaves: the number of subsequent generations to run; each octave usually doubles the frequency and halves the amplitude.
  • Persistence: how heavily the additional octaves are weighted when they are added.

These can be fun to play with when generating terrain; they will make it smooth and flat or bumpy and spiky. Further explanation can be found here.
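To make octaves and persistence concrete, here is a minimal, hypothetical sketch in Ruby of the octave-summing loop. The `noise2d` stand-in is just a hashed lattice so the example is self-contained; real terrain code would call a proper Perlin or simplex noise function instead:

```ruby
# Stand-in base noise: a deterministic hash of the integer lattice point,
# scaled into 0.0..1.0. Replace with real Perlin/simplex noise in practice.
def noise2d(x, y)
  n = (x.floor * 374_761_393 + y.floor * 668_265_263) & 0xffffffff
  n = ((n ^ (n >> 13)) * 1_274_126_177) & 0xffffffff
  (n & 0xffff) / 65_535.0
end

# Sum several octaves of the base noise. Each octave doubles the frequency
# and multiplies the amplitude by `persistence` (0..1), then the total is
# normalised back into 0.0..1.0.
def fractal_noise(x, y, octaves: 4, persistence: 0.5)
  total = 0.0
  max_amplitude = 0.0
  amplitude = 1.0
  frequency = 1.0
  octaves.times do
    total += noise2d(x * frequency, y * frequency) * amplitude
    max_amplitude += amplitude
    amplitude *= persistence
    frequency *= 2.0
  end
  total / max_amplitude
end
```

With a smooth base function, raising `persistence` keeps more weight in the high-frequency octaves, which is what makes terrain bumpier and spikier.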

Homework assignment: try merging two noise datasets that have been generated with different octave / persistence values.

From Noise to Tiles

Once you have access to random 2D data, it’s pretty straightforward to convert that noise into usable tiles. Simply set up thresholds for each tile type you want to support, e.g.:

  • water if < 0.3
  • grass if >= 0.3 and <= 0.6
  • mountain if > 0.6

I recommend leaving the upper and lower cases open. If they are not left open, you may end up with holes in your map for unexpectedly high or low values. An alternative is to clamp or scale the noise values when you generate them.
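In Ruby, that mapping is a one-method affair; a minimal sketch using the thresholds above, with both ends left open:

```ruby
# Map a noise value (nominally 0.0..1.0) to a tile type. The first and
# last branches are open-ended, so out-of-range values still get a tile
# instead of punching a hole in the map.
def tile_for(value)
  if value < 0.3
    :water
  elsif value <= 0.6
    :grass
  else
    :mountain
  end
end
```

For example, `tile_for(0.45)` returns `:grass`, and even an out-of-range `tile_for(1.7)` safely returns `:mountain`.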

Going Beyond Tiles

An infinite world of terrain isn’t all that interesting for very long without something in it. We want to add things like trees, caves, and towns to our game. We can simply use the same random number generator (RNG) that we used to generate our noise to determine when to place and how to build objects. We have to be careful, though: if we use a single RNG for the whole world, the order in which we discover things will change how they are generated.


The trick to deterministically creating the world based on a seed is to break it up into chunks (see screenshot above). A chunk is simply a range of tiles in the world. We keep 9 chunks loaded at a time, and each chunk is a 50×50 grid of tiles. When the player enters a place with no chunk, we generate a new one. The first thing we do is seed a new RNG for that chunk with the global seed and the coordinates of that chunk. This allows us to throw away the chunk when the player leaves it, knowing it will be recreated identically when the player comes back. Just make sure your code is deterministic and only pulls random values from the chunk’s RNG, and things should work out well.
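A sketch of that seeding scheme in Ruby; the exact way the global seed and chunk coordinates are mixed here is illustrative, not necessarily what the linked project does:

```ruby
CHUNK_SIZE = 50

# Derive a deterministic RNG for the chunk containing tile (x, y).
# Ruby's integer division floors toward negative infinity, so negative
# world coordinates map to sensible chunk coordinates too.
def chunk_rng(global_seed, x, y)
  cx = x / CHUNK_SIZE
  cy = y / CHUNK_SIZE
  # Mix seed and chunk coordinates into one number (arbitrary odd constants).
  Random.new(global_seed * 1_000_003 + cx * 2_654_435_761 + cy * 40_503)
end
```

Because the RNG depends only on the seed and chunk coordinates, a chunk discarded when the player walks away regenerates identically on return, regardless of discovery order.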

The source for this project can be found on GitHub. Please leave a comment if you are doing something similar and want an excuse to show off.

Example uses tileset from Oryx Design Lab. Thanks!

The post Building an Infinite Procedurally-Generated World appeared first on Atomic Spin.

Some travlrmap updates

It’s been a while since I posted here about my travlrmap web app; I’ve been out of town the whole of February, first to Config Management Camp and then on holiday to Spain and Andorra.

I released version 1.5.0 last night, which brought a fair few tweaks and changes: updating to the latest Bootstrap, improved Ruby 1.9.3 UTF-8 support, a visual spruce-up using the Map Icons Collection, and gallery support.

I take a lot of photos, and of course these photos often coincide with travels. I wanted to make it easy to put my travels and photos on the same map, so I have started adding a gallery ability to the map app. For now it’s very simplistic: it makes a point with a custom HTML template that just opens a new tab to the Flickr slideshow feature. This is not exactly what I am after; ideally, clicking view gallery would open an overlay above the map and show the gallery, with escape to close taking you right back to the map. There are some Bootstrap plugins for this, but they all seem to have some pain points, so that’s not done yet.

Today there’s only Flickr support, and a gallery takes a spec like :gallery: flickr,user=ripienaar,set=12345, from which it renders the Flickr set. Once I get the style of popup gallery figured out, I’ll make that pluggable through gems so other photo gallery tools can be supported with plugins.
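That spec string splits apart easily; a hypothetical sketch of the parsing (the method name and return shape here are mine, not necessarily travlrmap’s):

```ruby
# Parse a gallery spec such as "flickr,user=ripienaar,set=12345" into the
# provider name and an options hash. Illustrative only.
def parse_gallery_spec(spec)
  provider, *pairs = spec.split(",")
  options = pairs.map { |pair| pair.split("=", 2) }.to_h
  [provider, options]
end
```

So `parse_gallery_spec("flickr,user=ripienaar,set=12345")` returns `["flickr", {"user" => "ripienaar", "set" => "12345"}]`, which is enough to pick a provider plugin and hand it its options.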

As you can see from the above, the trip to Spain was a road trip. I kept GPX tracks of almost the entire trip and will be adding support for showing and rendering those on the map. Again they’ll appear as a point, just like galleries, and clicking on one will show its details, like a map view of the route and stats. That’s the aim for the 1.6.0 release, hopefully.

Some travlrmap updates

Last weekend I finally got to 1.0.0 of my travel map software; this week, in between other things, I made a few improvements:

  • Support for 3rd-party tile sets like OpenStreetMap, MapQuest, Watercolor, Toner, Dark Matter and Positron. These let you customise your look a bit; the Demo Site has them all enabled.
  • Map sets are supported; I use this to track my Travel Wishlist vs Places I’ve Been.
  • Rather than listing every individual YAML file in a directory to define a set, you can now just point at a directory and everything in it will get loaded.
  • You can designate a single YAML file as writable; the geocoder can then save points to disk directly without you having to do any YAML things.
  • The geocoder renders better on mobile devices and supports geocoding based on your current position, to make it easy to add points on the go.
  • Lots of UX improvements to the geocoder.

Seems like a huge amount of work, but it was all quite small additions, mainly done in an hour or so after work.

Travlrmap 1.0.0

As mentioned in my previous 2 posts I’ve been working on rebuilding my travel tracker app. It’s now reached something I am happy to call version 1.0.0 so this post introduces it.

I’ve been tracking major travels, day trips, etc. since 1999, plotting them on maps using various tools like the defunct Xerox PARC Map Viewer and XPlanet, and eventually I wrote a PHP-based app to draw them on Google Maps. Over the years I’ve also looked at various services to use instead, so I don’t have to keep doing this myself, but they all die, change business focus or hold data to ransom, so I am still fairly happy doing this myself.

The latest iteration of this can be seen on my travel site. It’s a Ruby app that you can host on the free tier at Heroku quite easily. Feature-wise, version 1.0.0 has:

  • Responsive design that works on mobile and PC
  • A menu of pre-defined views so you can draw attention to a certain area of the map
  • Points can be categorized by type of visit, like places you've lived, visited or transited through, each with its own icon
  • Points can have URLs, text, images and dates associated with them
  • Point clustering that combines many points into one when zoomed out with extensive configuration options
  • Several sets of colored icons for point types and clusters. Ability to add your own.
  • A web based tool to help you visually construct the YAML snippets needed using search
  • Optional authentication around the geocoder
  • Google Analytics support
  • Export to KML for viewing in tools like Google Earth
  • Full control over the Google Map like enabling or disabling the street view options

It’s important to note the intended use isn’t something like a private Foursquare or Facebook checkin service, it’s not about tracking every coffee shop. Instead it’s for tracking major city or attraction level places you’ve been to. I’m contemplating adding a mobile app to make it easier to log visits while you’re out and about but it won’t become a checkin type service.

I specifically do not use a database or anything like that; it’s just YAML files that you can check into GitHub, easily back up and hopefully never lose. Data longevity is the most important aspect for me, so the input format is simple and easy to convert to others like JSON or KML. This also means I do not currently let the app write into any data files where it’s hosted: I do not want to have to figure out the mechanics of not losing a YAML file that sits nowhere else but on a web server. Though I am planning to enable writing to an incoming YAML file as mentioned above.
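That convertibility is easy to show with nothing but the standard library; a small sketch (the sample point borrows the app’s YAML point format shown later):

```ruby
require 'yaml'
require 'json'

# A single point in the travlrmap YAML format.
yaml = <<~EOY
  - :type: :visit
    :title: New York
    :lat: 40.784506
    :lon: -73.961334
EOY

# unsafe_load because the format uses Symbol keys, which the default
# safe loader on modern Rubies rejects.
points = YAML.unsafe_load(yaml)
puts JSON.pretty_generate(points)
```

The same parsed hashes are equally easy to feed into a KML builder, which is what keeps the plain-files approach low-risk.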

Getting going with your own is really easy. Open up a free Heroku account and set up a free app with one dyno. Clone the demo site into your own GitHub and push to Heroku. That’s it: you should have your own copy up and running with placeholder content, ready to start receiving your own points, which you can make using the included geocoder. You can also host it on any Ruby app server like Passenger without any modification from the Heroku one.

The code is on GitHub as ripienaar/travlrmap under Apache 2. Docs for using it and configuration references are on its dedicated gh-pages page.

Marker clustering using GMaps.js

In a previous post I showed that I am using a KML file as input to GMaps.js to put a bunch of points on a map for my travels site. This worked great, but I really want to do some marker clustering, since too many points looks pretty bad, as can be seen below.

I’d much rather do some clustering and expand out to multiple points when you zoom in like here:

Turns out there are a few libraries for this already. I started out with one called Marker Clusterer but ended up with an improved version of it called Marker Clusterer Plus. And GMaps.js supports cluster libraries natively, so this should be easy, right?

Turns out Google Maps loads KML files as a layer over the map, so the individual points are simply not accessible to any libraries, and the cluster libraries do nothing. OK, so back to drawing points with my own code.

I added an endpoint to the app that emits my points as JSON:

   "popup_html":"<p>\n<font size=\"+2\">Helsinki</font>\n<hr>\nBusiness trip in 2005<br /><br />\n\n</p>\n",
   "comment":"Business trip in 2005",

Now adding all the points and getting them clustered is pretty easy:

<script type="text/javascript">
    var map;

    function addPoints(data) {
      var markers_data = [];

      if (data.length > 0) {
        for (var i = 0; i < data.length; i++) {
          markers_data.push({
            lat: data[i].lat,
            lng: data[i].lon,
            title: data[i].title,
            icon: data[i].icon,
            infoWindow: {
              content: data[i].popup_html
            }
          });
        }
      }

      map.addMarkers(markers_data);
    }

    $(document).ready(function(){
      infoWindow = new google.maps.InfoWindow({});
      map = new GMaps({
        div: '#main_map',
        zoom: 15,
        lat: 0,
        lng: 20,
        markerClusterer: function(map) {
          var options = {
            gridSize: 40
          };

          return new MarkerClusterer(map, [], options);
        }
      });

      points = $.getJSON("/points/json");
      points.done(addPoints);
    });
</script>

This is pretty simple: the GMaps() object takes a markerClusterer option that expects an instance of the clusterer. I fetch the JSON data and each row gets added as a point; then it all just happens automagically. Marker Clusterer Plus can take a ton of options that let you specify custom icons and grid sizes, tweak when clustering kicks in, etc. Here I am just setting the gridSize to show how to do that. In this example I have custom icons used for the clustering; I might blog about that later once I’ve figured out how to get them to behave perfectly.

You can see this in action on my travel site. As an aside I’ve taken a bit of time to document how the Sinatra app works and put together a demo deployable to Heroku that should give people hints on how to get going if anyone wants to make a map of their own.

Ruby, Google Maps and KML

Since 1999 I’ve kept a record of most places I’ve traveled to. In the old days I used a map viewer from Xerox PARC to view these travels; then I used XPlanet, which made a static image. Back in 2005, as Google Maps became usable from JavaScript, I made something to show my travels on an interactive map, using GMapsEZ and PHP to draw points from an XML file.

Since then Google made their v2 API defunct, and something went bad with the old PHP code, so the time came to revisit all of this in the 4th iteration of a map tracking my travels.

Google Earth came out in 2005 as well – just a bit too late for me to use its data formats – but today it seems obvious that the data belongs in a KML file. Hand-building KML files, though, is not on, so I needed something to build the KML file in Ruby.

My new app maintains points in YAML files, which have a more or less identical format to the old PHP system’s.

First, to let people come up with categories of points, you define a bunch of point types:
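
A hypothetical sketch of such a types block, inferred from how the KML-generation code later reads @config[:types] (a hash of type name to settings including an :icon); the icon URLs here are invented for illustration:

```yaml
# Hypothetical reconstruction: each point type maps to display settings,
# at minimum the icon used for its markers.
:types:
  :visit:
    :icon: https://example.com/icons/visit.png
  :transit:
    :icon: https://example.com/icons/transit.png
```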


And then we have a series of points each referencing a type:

- :type: :visit
  :lon: -73.961334
  :title: New York
  :lat: 40.784506
  :country: United States
  :comment: Sample Data
  :linktext: Wikipedia
- :type: :transit
  :lon: -71.046524
  :title: Boston
  :lat: 42.363871
  :country: United States
  :comment: Sample Data

Here we have two points, one a visit and one a transit; the first also carries a :linktext field (points can link out using either text or an image).

I use the ruby_kml gem to convert this into KML:

First we set up the basic document and we set up the types using KML styles:

kml = KMLFile.new
document = KML::Document.new(:name => "Travlrmap Data")

@config[:types].each do |k, t|
  document.styles << KML::Style.new(
    :id         => "travlrmap-#{k}-style",
    :icon_style => KML::IconStyle.new(:icon => KML::Icon.new(:href => t[:icon]))
  )
end

This sets up the types and gives them names like travlrmap-visited-style.

We’ll now reference these in the KML file for each point:

folder = KML::Folder.new(:name => "Countries")
folders = {}

@points.sort_by{|p| p[:country]}.each do |point|
  unless folders[point[:country]]
    folder.features << folders[point[:country]] = KML::Folder.new(:name => point[:country])
  end

  folders[point[:country]].features << KML::Placemark.new(
    :name        => point[:title],
    :description => point_comment(point),
    :geometry    => KML::Point.new(:coordinates => {:lat => point[:lat], :lng => point[:lon]}),
    :style_url   => "#travlrmap-#{point[:type]}-style"
  )
end

document.features << folder
kml.objects << document

The points are put in folders by individual country. So in Google Earth I get a nice list of countries to enable/disable as I please etc.

I am not showing how I create the comment HTML here – it’s the point_comment method – it’s just boring code with a bunch of ifs around linkimg, linktext and href. KML documents do not support all of HTML, but the basics are there, so this is pretty easy.
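For the curious, a hedged sketch of what a point_comment helper along those lines might look like; the field handling is guessed from the YAML format above, not taken from the actual app:

```ruby
# Illustrative only: build description HTML for a point from its optional
# :comment, :href, :linktext and :linkimg fields.
def point_comment(point)
  html = ""
  html << "#{point[:comment]}<br />" if point[:comment]
  if point[:href]
    if point[:linkimg]
      html << "<a href=\"#{point[:href]}\"><img src=\"#{point[:linkimg]}\"/></a>"
    elsif point[:linktext]
      html << "<a href=\"#{point[:href]}\">#{point[:linktext]}</a>"
    end
  end
  html
end
```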

So this is the basics of making a KML file from your own data. It’s fairly easy, though the docs for ruby_kml aren’t that hot and specifically don’t tell you that you have to wrap all the points and styles and so forth in a document as I have done here – it seems to be a recent requirement of the KML spec though.

Next up, we have to get this stuff onto a Google map in a browser. As KML is the format Google Earth uses, it’s safe to assume the Google Maps API supports it directly. Still, a bit of sugar around the Google APIs is nice because they can be a bit verbose. Previously I used GMapsEZ – which I ended up really hating, as the author did all kinds of things like refusing to make it available for download, instead hosting it on an unstable host. Now I’d say you must use gmaps.js to make it really easy.

For viewing a KML file, you basically just need this – more or less directly from their docs. There’s some ERB template stuff in here to set up the default viewport etc.:

<script type="text/javascript">
    var map;
    $(document).ready(function(){
      infoWindow = new google.maps.InfoWindow({});
      map = new GMaps({
        div: '#main_map',
        zoom: <%= @map_view[:zoom] %>,
        lat: <%= @map_view[:lat] %>,
        lng: <%= @map_view[:lon] %>
      });
      map.loadFromKML({
        url: '',
        suppressInfoWindows: true,
        preserveViewport: true,
        events: {
          click: function(point){
            infoWindow.setContent(point.featureData.infoWindowHtml);
            infoWindow.setPosition(point.latLng);
            infoWindow.open(map.map);
          }
        }
      });
    });
</script>

Make sure there’s a main_map div set up with your desired size, and the map will show up there. Really easy.

You can see this working on my new travel site. The code is on GitHub as usual, but it’s a bit early days for general use or release. The generated KML file can be fetched here.

Right now it supports a subset of the older PHP code’s features – mainly, drawing lines is missing. I hope to add a way to provide some kind of index of GPX files to show tracks, as I have a few of those. Turning a GPX file into a KML file is pretty easy, and the above JS code should show it without modification.

I’ll post a follow-up here once the code is sharable; if you’re brave though and know Ruby, you can grab the travlrmap gem to install your own.

passenger native libs on CentOS 7

I'm setting up a new puppet master running under Passenger on CentOS 7, using packages from the puppetlabs and foreman repos. I used a fork of Stephen Johnson's puppet module to set everything up (with puppet apply). All went swimmingly, except I would see this error in the logs the first time the puppet master app loaded (i.e. the first time it got a request):

[ 2014-11-07 23:22:13.2600 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] *** Phusion Passenger: no passenger_native_support.so found for the current Ruby interpreter. Compiling one (set PASSENGER_COMPILE_NATIVE_SUPPORT_BINARY=0 to disable)...
[ 2014-11-07 23:22:13.2600 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] # mkdir -p /usr/share/gems/gems/passenger-4.0.18/lib/phusion_passenger/locations.ini/buildout/ruby/ruby-2.0.0-x86_64-linux
[ 2014-11-07 23:22:13.2600 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] Not a valid directory. Trying a different one...
[ 2014-11-07 23:22:13.2600 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] -------------------------------
[ 2014-11-07 23:22:13.2600 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] # mkdir -p /var/lib/puppet/.passenger/native_support/4.0.18/ruby-2.0.0-x86_64-linux
[ 2014-11-07 23:22:13.2600 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] # cd /var/lib/puppet/.passenger/native_support/4.0.18/ruby-2.0.0-x86_64-linux
[ 2014-11-07 23:22:13.2600 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] # /usr/bin/ruby '/usr/share/gems/gems/passenger-4.0.18/ruby_extension_source/extconf.rb'
[ 2014-11-07 23:22:13.3048 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] /usr/bin/ruby: No such file or directory -- /usr/share/gems/gems/passenger-4.0.18/ruby_extension_source/extconf.rb (LoadError)
[ 2014-11-07 23:22:13.3156 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] Compilation failed.
[ 2014-11-07 23:22:13.3156 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] -------------------------------
[ 2014-11-07 23:22:13.3157 2603/7f1a0660e700 Pool2/Spawner.h:159 ]: [App 2643 stderr] Ruby native_support extension not loaded. Continuing without native_support.

I double checked, and I do have the native libs installed – they're in the rubygem-passenger-native-libs rpm – the main library is in /usr/lib64/gems/ruby/passenger-4.0.18/native/

Digging in the passenger code, it tries to load the native libs by doing:

require 'native/passenger_native_support'

If I hacked this to:

require '/usr/lib64/gems/ruby/passenger-4.0.18/native/passenger_native_support'

then it loaded correctly.

It seems that /usr/lib64/gems/ruby/passenger-4.0.18 is not in the ruby load path.

Additional directories can be added to the ruby load path by setting an environment variable, RUBYLIB.

To set RUBYLIB for the apache process, I added the following line to /etc/sysconfig/httpd and restarted apache:

RUBYLIB=/usr/lib64/gems/ruby/passenger-4.0.18

The passenger native libraries now load correctly.

Acquiring a Modern Ruby (Part One)

Last month marked the 21st anniversary of the programming language Ruby. I use Ruby pretty much all the time. If I need to write a command line tool, a web app, pretty much anything, I'll probably start with Ruby. This means I've always got Ruby to hand on whatever machine I'm using. However, in practice, getting a recent Ruby on any platform isn't actually as simple as it sounds. Ruby is a fast-moving language, with frequent releases. Mainstream distributions often lag behind the current release. Let's have a quick look at the history of the releases of Ruby over the last year:

require 'nokogiri'
require 'open-uri'

news = Nokogiri::HTML(open(""))
news.xpath("//div[@class='post']/following-sibling::*").each do |item|
  match = item.text.match(/Ruby (\S+) is released.*Posted by.*on (\d{1,2} [a-zA-Z]{3} \d{4})/m)
  if match
    puts "Ruby #{match[1]} was announced on #{match[2]}"
  end
end

Ruby 2.1.0-rc1 was announced on 20 Dec 2013
Ruby 2.1.0-preview2 was announced on 22 Nov 2013
Ruby 1.9.3-p484 was announced on 22 Nov 2013
Ruby 2.0.0-p353 was announced on 22 Nov 2013
Ruby 2.1.0-preview1 was announced on 23 Sep 2013
Ruby 2.0.0-p247 was announced on 27 Jun 2013
Ruby 1.9.3-p448 was announced on 27 Jun 2013
Ruby 1.8.7-p374 was announced on 27 Jun 2013
Ruby 1.9.3-p429 was announced on 14 May 2013
Ruby 2.0.0-p195 was announced on 14 May 2013
Ruby 2.0.0-p0 was announced on 24 Feb 2013
Ruby 1.9.3-p392 was announced on 22 Feb 2013
Ruby 2.0.0-rc2 was announced on 8 Feb 2013
Ruby 1.9.3-p385 was announced on 6 Feb 2013
Ruby 1.9.3-p374 was announced on 17 Jan 2013

So 15 releases in 2013, including a major version (2.0.0) in February, and a release candidate of 2.1 shortly before Christmas. For a Ruby developer today, the current releases are:

  • Ruby 2.1.1
  • Ruby 2.0.0-p451
  • Ruby 1.9.3-p545

Let's compare to what's available in Debian stable:

$ apt-cache show ruby
Package: ruby
Source: ruby-defaults
Version: 1:1.9.3
Installed-Size: 31
Maintainer: akira yamada <>
Architecture: all
Replaces: irb, rdoc
Provides: irb, rdoc
Depends: ruby1.9.1 (>=

So the current version in Debian is older than January 2013? What about Ubuntu? The latest 'saucy' offers us 2.0.0. That's fractionally better, but really, it's pretty old. What about my trusty Fedora? At the time of writing, that gives me 2.0.0p353. Still not exactly current, and not a sniff of a 2.1 package. Archlinux offers a Ruby 2.1, but even that's not right up to date:

$ ruby --version
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-linux]

Now, I will grant that there might be third-party repositories, or backports or other providers of packages which might be more up-to-date, but my experience of this hasn't always been that positive. On the whole, it looks like we need to look somewhere else. On a Unix-derived system, this typically means building from source.

Building From Source

Building Ruby from source isn't difficult, not if you have some idea what you're doing, and can read documentation. However, equally, it's not entirely trivial. It does require you to know how to provide libraries for things like ffi, ncurses, readline, openssl, yaml etc. And, if you're using different systems, they have different names, and you might be working with different compilers and versions of make. In recognition of this, several tools have emerged to shield the complexity, and make it easier to get a modern Ruby on your system. The most popular are RVM, ruby-build, and ruby-install. Let's review each of them, briefly. I'm going to use a CentOS 6 machine as my example Linux machine, but each of these tools will work on all popular Linux distributions and Mac OSX. I've had success on FreeBSD and Solaris too, but I've not tested this recently, so YMMV.

RVM


RVM, the Ruby Version Manager, is the father of the tools designed to make managing modern Ruby versions easier. It's what one might call a monolithic tool - it does a huge range of things all in one place. We'll discuss it several times in this series, as it provides functionality beyond that of simply installing a modern Ruby, but for the purposes of this series, we're only going to use it to install Ruby.

RVM is a shell script. The most popular way to install it is via the fashionable 'curl pipe through bash' approach:

$ curl -sSL | bash
Creating group 'rvm'

Installing RVM to /usr/local/rvm/
Installation of RVM in /usr/local/rvm/ is almost complete:

  * First you need to add all users that will be using rvm to 'rvm' group,
    and logout - login again, anyone using rvm will be operating with `umask u=rwx,g=rwx,o=rx`.

  * To start using RVM you need to run `source /etc/profile.d/`
    in all your open shell windows, in rare cases you need to reopen all shell windows.

# Administrator,
#   Thank you for using RVM!
#   We sincerely hope that RVM helps to make your life easier and more enjoyable!!!
# ~Wayne, Michal & team.

In case of problems: and

Hmm, ok, let's try that.

# gpasswd -a sns rvm
Adding user sns to group rvm

I also added:

source /etc/profile.d/

to the end of ~/.bash_profile

$ rvm --version

rvm 1.25.19 (master) by Wayne E. Seguin <>, Michal Papis <> []

First let's see what Rubies it knows about:

$ rvm list known
# MRI Rubies

# GoRuby

# Topaz

# TheCodeShop - MRI experimental patches

# jamesgolick - All around gangster

# Minimalistic ruby implementation - ISO 30170:2012

# JRuby

# Rubinius

# Ruby Enterprise Edition

# Kiji

# MagLev

# Mac OS X Snow Leopard Or Newer

# Opal

# IronRuby

Wow - that's a lot of Rubies. Let's just constrain ourselves to MRI, and install the latest stable 2.1:

$ rvm install ruby
Searching for binary rubies, this might take some time.
No binary rubies available for: centos/6/x86_64/ruby-2.1.1.
Continuing with compilation. Please read 'rvm help mount' to get more information on binary rubies.
Checking requirements for centos.
Installing requirements for centos.
Updating system.
Installing required packages: patch, libyaml-devel, libffi-devel, glibc-headers, gcc-c++, glibc-devel, patch, readline-devel, zlib-devel, openssl-devel, autoconf, automake, libtool, bison
sns password required for 'yum install -y patch libyaml-devel libffi-devel glibc-headers gcc-c++ glibc-devel patch readline-devel zlib-devel openssl-devel autoconf automake libtool bison':

The first thing to notice, and this is pretty cool, is that RVM will try to locate a precompiled binary. Unfortunately there isn't one for our platform, so it's going to build one from source instead. It's going to install the various required packages, and then crack on.

This assumes that I've got sudo set up for my user. As it happens, I don't, but we can fix that. For a Mac or an Ubuntu machine this would be in place anyway. Please hold, caller... ...ok done. My sns user now has sudo privileges, and we can continue:

$ rvm install ruby
Searching for binary rubies, this might take some time.
No binary rubies available for: centos/6/x86_64/ruby-2.1.1.
Continuing with compilation. Please read 'rvm help mount' to get more information on binary rubies.
Checking requirements for centos.
Installing requirements for centos.
Updating system.
Installing required packages: patch, libyaml-devel, libffi-devel, glibc-headers, gcc-c++, glibc-devel, patch, readline-devel, zlib-devel, openssl-devel, autoconf, automake, libtool, bison
sns password required for 'yum install -y patch libyaml-devel libffi-devel glibc-headers gcc-c++ glibc-devel patch readline-devel zlib-devel openssl-devel autoconf automake libtool bison':
Requirements installation successful.
Installing Ruby from source to: /usr/local/rvm/rubies/ruby-2.1.1, this may take a while depending on your cpu(s)...
ruby-2.1.1 - #downloading ruby-2.1.1, this may take a while depending on your connection...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11.4M  100 11.4M    0     0  61.0M      0 --:--:-- --:--:-- --:--:-- 64.9M
ruby-2.1.1 - #extracting ruby-2.1.1 to /usr/local/rvm/src/ruby-2.1.1.
ruby-2.1.1 - #configuring....................................................
ruby-2.1.1 - #post-configuration.
ruby-2.1.1 - #compiling...................................................................................
ruby-2.1.1 - #installing................................
ruby-2.1.1 - #making binaries executable.
Rubygems 2.2.2 already installed, skipping installation, use --force to reinstall.
ruby-2.1.1 - #gemset created /usr/local/rvm/gems/ruby-2.1.1@global
ruby-2.1.1 - #importing gemset /usr/local/rvm/gemsets/global.gems.....
ruby-2.1.1 - #generating global wrappers.
ruby-2.1.1 - #gemset created /usr/local/rvm/gems/ruby-2.1.1
ruby-2.1.1 - #importing gemsetfile /usr/local/rvm/gemsets/default.gems evaluated to empty gem list
ruby-2.1.1 - #generating default wrappers.
ruby-2.1.1 - #adjusting #shebangs for (gem irb erb ri rdoc testrb rake).
Install of ruby-2.1.1 - #complete
Ruby was built without documentation, to build it run: rvm docs generate-ri

Great - let's see what we have:

$ ruby --version
ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-linux]

In order to verify our installation, we're going to test OpenSSL and Nokogiri. If these two work, we can be pretty confident that we have a functional Ruby:

$ gem install nokogiri --no-ri --no-rdoc
Fetching: mini_portile-0.5.2.gem (100%)
Successfully installed mini_portile-0.5.2
Fetching: nokogiri-1.6.1.gem (100%)
Building native extensions.  This could take a while...
Successfully installed nokogiri-1.6.1
2 gems installed

Let's test this now. Here's a simple test to prove that both SSL and Nokogiri are working:

require 'nokogiri'
require 'open-uri'
require 'openssl'
https_url = ''
puts Nokogiri::HTML(open(https_url)).css('input')

If this works, we should see the HTML of various inputs on the Google homepage printed to the screen:

$ ruby ruby_test.rb
<input name="ie" value="ISO-8859-1" type="hidden">
<input value="en-GB" name="hl" type="hidden">
<input name="source" type="hidden" value="hp">
<input autocomplete="off" class="lst" value="" title="Google Search" maxlength="2048" name="q" size="57" style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top">
<input class="lsb" value="Google Search" name="btnG" type="submit">
<input class="lsb" value="I'm Feeling Lucky" name="btnI" type="submit" onclick="if(this.form.q.value)this.checked=1; else top.location='/doodles/'">
<input type="hidden" id="gbv" name="gbv" value="1">

So, RVM was fairly painless. We had to fiddle about with users and sudo, and add a line to our shell profile, but once that was done, we were easily able to install Ruby.

Ruby Build

Ruby-build is a dedicated tool, also written in shell, designed specifically to install Ruby, and provide fine-grained control over the configuration, build and installation. In order to get it we need to install Git. As with RVM, I'll set up a user with sudo access in order to permit this. Once we have Git installed, we can clone the project and install it.

$ git clone
Initialized empty Git repository in /home/sns/ruby-build/.git/
remote: Reusing existing pack: 3077, done.
remote: Total 3077 (delta 0), reused 0 (delta 0)
Receiving objects: 100% (3077/3077), 510.82 KiB | 386 KiB/s, done.
Resolving deltas: 100% (1404/1404), done.
$ cd ruby-build
$ sudo ./

Like RVM, we can find out what Rubies ruby-build knows about:

$ ruby-build --definitions

Let's opt for ruby 2.1 again. Now, unlike RVM, ruby-build doesn't make any attempt to install development tools suitable to allow Ruby to be built from source. It does provide a wiki with information on what is needed, but if you expect ruby-build to magically install your toolkit, you're going to be disappointed. Additionally, ruby-build requires you to state exactly where you want Ruby to be installed. It doesn't have a default, so you need to work out where it should go. Realistically this boils down to two choices - do you want to try to install it somewhere global, where everyone can see it, or just locally for your own use. The former adds some more complications around permissions and so forth, so in this instance we'll install to a local directory.

First, let's check the wiki for the build dependencies:

$ yum install gcc-c++ glibc-headers openssl-devel readline libyaml-devel readline-devel zlib zlib-devel

Now that's all installed, we can go ahead with building Ruby:

$ ruby-build 2.1.1 ~/local/mri-2.1.1
Downloading ruby-2.1.1.tar.gz...
Installing ruby-2.1.1...
Installed ruby-2.1.1 to /home/sns/local/mri-2.1.1

Not a very verbose output... in fact, one might be forgiven for wondering at times if anything's happening at all! But now we have a Ruby, so let's add it to our shell path and do our nokogiri/openssl test.

$ export PATH=$PATH:/home/sns/local/mri-2.1.1/bin/
$ ruby --version
ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-linux]

$ gem install nokogiri --no-rdoc --no-ri
$ vi ruby_test.rb
$ ruby ruby_test.rb
<input name="ie" value="ISO-8859-1" type="hidden">
<input value="en-GB" name="hl" type="hidden">
<input name="source" type="hidden" value="hp">
<input autocomplete="off" class="lst" value="" title="Google Search" maxlength="2048" name="q" size="57" style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top">
<input class="lsb" value="Google Search" name="btnG" type="submit">
<input class="lsb" value="I'm Feeling Lucky" name="btnI" type="submit" onclick="if(this.form.q.value)this.checked=1; else top.location='/doodles/'">
<input type="hidden" id="gbv" name="gbv" value="1">
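The `ruby_test.rb` script itself is never shown in the post, but judging by its output it fetches a page over HTTPS and prints the `<input>` elements it finds. A hypothetical, stdlib-only stand-in (the real script presumably uses nokogiri) might look like this:

```ruby
require "net/http"
require "uri"

# Hypothetical stand-in for ruby_test.rb: exercises openssl by fetching
# a page over HTTPS, then scrapes out <input> tags with a naive regex.
def extract_inputs(html)
  html.scan(/<input[^>]*>/)
end

def fetch(url)
  uri  = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = true # raises at runtime if Ruby was built without openssl support
  http.get(uri.request_uri).body
end

# Network access is guarded so the file can be loaded without side effects.
if ENV["RUBY_TEST_FETCH"]
  puts extract_inputs(fetch("https://www.google.com/"))
end
```

Either way, the point of the test is simply to prove that native extensions compile and that the openssl bindings were built.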

So, ruby-build did what it said on the tin. It didn't need us to create a group, or source any files in our profile. All we needed to do was update our shell path, which, of course, we'd add to our shell profile to make permanent. We did have to install the build dependencies manually, but this is documented on the wiki. It's a lighter-weight solution than RVM, optimised for local users.

Ruby Install

The third of our 'install from source' options is 'Ruby Install'. Another shell utility, it is more aligned with 'Ruby Build' than RVM, as it only really has one purpose in life - to install Ruby. It doesn't have all the extra features of RVM, but it does install build dependencies for most platforms.

To install, we obtain a tarball of the latest release:

$ wget -O ruby-install-0.4.0.tar.gz
$ tar xzvf ruby-install-0.4.0.tar.gz
$ cd ruby-install-0.4.0
$ sudo make install
[sudo] password for sns:
for dir in `find etc lib bin sbin share -type d 2>/dev/null`; do mkdir -p /usr/local/$dir; done
for file in `find etc lib bin sbin share -type f 2>/dev/null`; do cp $file /usr/local/$file; done
mkdir -p /usr/local/share/doc/ruby-install-0.4.0
cp -r *.md *.txt /usr/local/share/doc/ruby-install-0.4.0/

This will make the ruby-install tool available. Again, we can see which Rubies are available:

$ ruby-install
Known ruby versions:
  ruby:
    1:      1.9.3-p484
    1.9:    1.9.3-p484
    1.9.1:  1.9.1-p431
    1.9.2:  1.9.2-p320
    1.9.3:  1.9.3-p484
    2.0:    2.0.0-p353
    2.0.0:  2.0.0-p353
    2:      2.1.0
    2.1:    2.1.0
    stable: 2.1.0
  jruby:
    1.7:    1.7.10
    stable: 1.7.10
  rbx:
    2.1:    2.1.1
    2.2:    2.2.5
    stable: 2.2.5
  maglev:
    1.0:    1.0.0
    1.1:    1.1RC1
    stable: 1.0.0
  mruby:
    1.0:    1.0.0
    stable: 1.0.0

Because we obtained the stable release of the ruby-install tool, we don't have the very latest Ruby available. We could do as we did with ruby-build and get the latest version of the tool straight from Git; the process would be the same. For now, we're going to opt for the latest stable release the tool offers us, which is 2.1.0:

$ ruby-install ruby
>>> Installing ruby 2.1.0 into /home/sns/.rubies/ruby-2.1.0 ...
>>> Installing dependencies for ruby 2.1.0 ...
[sudo] password for sns:
... grisly details ...

The install is substantially more verbose than either ruby-build or RVM, showing the grisly details of the compiling and linking. Once installed, let's check the version and run our test:

$ export PATH=$PATH:/home/sns/.rubies/ruby-2.1.0/bin/
$ ruby --version
ruby 2.1.0p0 (2013-12-25 revision 44422) [x86_64-linux]

$ gem install nokogiri --no-ri --no-rdoc
Building native extensions.  This could take a while...
Successfully installed nokogiri-1.6.1
1 gem installed
$ vi ruby_test.rb
$ ruby ruby_test.rb
<input name="ie" value="ISO-8859-1" type="hidden">
<input value="en-GB" name="hl" type="hidden">
<input name="source" type="hidden" value="hp">
<input autocomplete="off" class="lst" value="" title="Google Search" maxlength="2048" name="q" size="57" style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top">
<input class="lsb" value="Google Search" name="btnG" type="submit">
<input class="lsb" value="I'm Feeling Lucky" name="btnI" type="submit" onclick="if(this.form.q.value)this.checked=1; else top.location='/doodles/'">
<input type="hidden" id="gbv" name="gbv" value="1">

So, ruby-install was pretty easy to use. We didn't need to mess about with groups or sourcing extra files in our shell profile. There was a default, local path for the Rubies, the dependencies were identified and installed for us, and everything worked. The only downside was that, on account of using the stable version of the tool, we didn't get access to the very latest Ruby. That's easily fixed by simply getting the tool straight from Git - which we'll look at when we come to discuss managing multiple versions of Ruby.


So at this stage in our journey, we've explored three approaches to simplifying the process of installing a Ruby from source - RVM, ruby-build and ruby-install. Given that the stated objective was to install a recent version of Ruby, from source, as simply as possible, in my view the order of preference goes:

  1. Ruby Install: This does the simplest thing that could possibly work. It has sane defaults, solves and installs dependencies, and requires nothing more than a path setting.

  2. RVM: This requires only slightly more setup than Ruby Install, and certainly meets the requirements. It's much more heavyweight, as it does many more things than just install Ruby, and these capabilities will come into consideration later in the series.

  3. Ruby Build: This brings up the rear. It works, but it doesn't have sane defaults, and needs extra manual steps to install build dependencies.

As we continue the series, we'll look into ways to manage multiple versions of Ruby on a workstation, strategies for getting the relevant versions of Ruby onto servers, running Ruby under Microsoft Windows, and automating the whole process. Next time we'll talk about Ruby on Windows. Until then, bye for now.

Better Puppet Modules Using Hiera Data

When writing Puppet modules there tends to be a ton of configuration data – generally things like different paths for different operating systems. Today the general pattern is to manage this data in a module::params class with a bunch of logic in it.

Here’s a simplistic example below – for an example of the full horror of this pattern see the puppetlabs-ntp module.

# ntp/manifests/init.pp
class ntp (
  $config    = $ntp::params::config,
  $keys_file = $ntp::params::keys_file
) inherits ntp::params {
  ...
}

# ntp/manifests/params.pp
class ntp::params {
  case $::osfamily {
    'AIX': {
      $config    = '/etc/ntp.conf'
      $keys_file = '/etc/ntp.keys'
    }
    'Debian': {
      $config    = '/etc/ntp.conf'
      $keys_file = '/etc/ntp/keys'
    }
    'RedHat': {
      $config    = '/etc/ntp.conf'
      $keys_file = '/etc/ntp/keys'
    }
    default: {
      fail("The ${module_name} module is not supported on an ${::osfamily} based system.")
    }
  }
}

This is the exact reason Hiera exists: to remove this kind of spaghetti code and move it into data. Instinctively now, whenever anyone sees code like this, they think they should refactor it and move the data into Hiera.

But there’s a problem. This works for your own modules in your own repos: you’d just use the Puppet 3 automatic parameter bindings and override the values in the ntp class – not ideal, but many people do it. If you wanted to write a module for the Forge, though, there’s a hitch: the module author has no idea what kind of hierarchy exists where the module is used – or whether the site even uses Hiera – and today the module author can’t ship data with the module. So the only sensible thing to do is to embed a bunch of data in your code – the exact thing Hiera is supposed to avoid.

I proposed a solution to this problem that would allow module authors to embed data in their modules, as well as control the hierarchy that would be used when accessing this data. Unfortunately, a year on, we’re still nowhere, and the community – and the Forge – are suffering as a result.

The proposed solution would be an always-on Hiera backend that, as a last resort, would look for data inside the module. Critically, the module author controls the hierarchy when it gets to the point of accessing data in the module. Consider the ntp::params class above: it is a code version of a Hiera hierarchy keyed on the $::osfamily fact. If we just allowed the module to supply data, the module author would have to hope that everyone has this tier in their site hierarchy – not realistic. My proposal therefore adds a module-specific hierarchy and data that gets consulted after the site hierarchy.
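The lookup order being proposed – site hierarchy first, module-shipped data only as a last resort – can be illustrated with a simplified sketch. This is not the actual backend code; each hierarchy is just modelled as a list of data levels (hashes), and the hypothetical site override path is made up:

```ruby
# Simplified illustration of the proposed lookup order, not the real
# Hiera backend: consult every site data level first, then every level
# shipped inside the module, returning the first match.
def lookup(key, site_hierarchy, module_hierarchy)
  (site_hierarchy + module_hierarchy).each do |level|
    return level[key] if level.key?(key)
  end
  nil
end

site_data   = [{ "ntp::config" => "/srv/ntp/ntp.conf" }] # hypothetical site override
module_data = [{ "ntp::config"    => "/etc/ntp.conf",
                 "ntp::keys_file" => "/etc/ntp/keys" }]  # shipped inside the module

lookup("ntp::config", site_data, module_data)    # site hierarchy wins
lookup("ntp::keys_file", site_data, module_data) # falls through to module data
```

The key property is that a site can override any module default without touching the module, while the module still works standalone.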

So let’s look at how to rework this module around this proposed solution:

# ntp/manifests/init.pp
class ntp ($config, $keys_file) {
  ...
}

Next you configure Hiera to consult a hierarchy based on the $::osfamily fact; note the new data directory that goes inside the module:

# ntp/data/hiera.yaml
---
:hierarchy:
  - "%{::osfamily}"

And finally we create some data files, here’s just the one for RedHat:

# ntp/data/RedHat.yaml
ntp::config: /etc/ntp.conf
ntp::keys_file: /etc/ntp/keys

Users of the module could add a new OS without contributing back to the module or forking it, simply by providing similar data in the site-specific hierarchy – leaving the downloaded module 100% untouched!

This is a very simple view of what this pattern allows, time will tell what the community makes of it. There are many advantages to this over the ntp::params pattern:

This helps the contributor to a public module:

  • Adding a new OS is easy: just drop in a new YAML file. This can be done with confidence, as it will only be read on machines of the new OS, so it cannot break existing code. No complex case statements or hundreds of braces to get right.
  • On a busy module, when adding a new OS they do not have to worry about complex merge problems, hard rebasing or any git esoterica – they’re just adding a file.
  • Syntactically it’s very easy: it’s just a YAML file, with no complex case statements.
  • The contributor does not have to worry about breaking other operating systems he could not test on, like AIX here. The change is contained to machines of the new OS.
  • In large environments this helps with change control, as it’s just data – no logic changes.

This helps the maintainer of a module:

  • Module maintenance is easier when it comes to adding new operating systems, as it’s just simple, single files
  • Easier contribution reviews
  • Fewer merge commits, less git magic needed, cleaner commit history
  • The code is a lot easier to read and maintain. Fewer tests and validations are needed.

This helps the user of a module:

  • Well written modules now properly support supplying all data from Hiera
  • He has a single place to look for the overridable data
  • When using a module that does not support his OS he can deploy it into his site and just provide data instead of forking it

Today I am releasing my proposed code as a standalone module. It provides all the advantages above including the fact that it’s always on without any additional configuration needed.

It works exactly as above by adding a data directory with a hiera.yaml inside it. The only configuration being considered in this hiera.yaml is the hierarchy.

This module is new and does some horrible things to get itself activated automatically without any configuration. I’ve only tested it on Puppet 3.2.x, but I think it will work on 3.x as is. I’d love to get feedback on this from users.

If you want to write a Forge module that uses this feature, simply add a dependency on the ripienaar/module_data module; as soon as someone installs this dependency along with your module, the backend gets activated. Similarly, if you just want to use this feature in your own modules, just puppet module install ripienaar/module_data.

Note, though, that if you do, your module will only work on Puppet 3 or newer.

It’s unfortunate that my pull request is now over a year old and has not been merged, and no real progress is being made. I hope that if enough users adopt this solution we can force progress rather than sit by and watch nothing happen. Please send me your feedback and use this widely.

CLI Report viewer for Puppet

When using Puppet you often run it in single-run mode on the CLI and then go AFK. When you return you might notice it was slow for some reason or other, but you did not run it with --evaltrace and in debug mode, so the information to help you answer this simply isn’t present – or has scrolled off or been rotated away from your logs.

Typically you’d deploy something like Foreman or report handlers on your masters, which would receive and display reports. But while you’re on the shell it’s a big context switch to go and find the report there.

Puppet now saves reports in its state dir – including with apply, if you ran it with --write-catalog-summary – and in recent versions these reports include the performance data that you’d previously only find via --evaltrace.

So to solve this problem I wrote a little tool to show reports on the CLI. It’s designed to run on the shell of the node in question, as root. If you do this it will automatically pick up the latest report and print it, and it will also go through and check the sizes of managed files and show you stats. You can run it against saved reports on some other node, but you’ll lose some utility. The main focus of the information presented is to let you see logs from the past run, but also information that helps you answer why it was slow.
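The core idea is simple: load the saved report YAML and summarise it. A hedged sketch follows – real reports in /var/lib/puppet/state/last_run_report.yaml are serialized Ruby objects (Puppet::Transaction::Report), so loading them for real needs the Puppet libraries; this stand-in uses a plain hash with similar field names:

```ruby
require "yaml"

# Simplified stand-in for reading a Puppet run report. The field names
# mirror those in a real report, but the YAML here is plain data rather
# than a tagged Puppet::Transaction::Report object.
def report_summary(yaml_text)
  report = YAML.load(yaml_text)
  {
    "host"      => report["host"],
    "kind"      => report["kind"],
    "version"   => report["puppet_version"],
    "log_lines" => Array(report["logs"]).size
  }
end

# Hypothetical sample report data.
sample = <<~YAML
  host: web01.example.com
  kind: apply
  puppet_version: "3.3.1"
  logs:
    - level: notice
      message: created
YAML

s = report_summary(sample)
puts "Report for #{s['host']} (#{s['kind']}), Puppet #{s['version']}, #{s['log_lines']} log lines"
```

The actual tool does considerably more (parsing puppet.conf to locate the report, metrics, file stats), but this is the shape of it.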

It’s designed to work well with very recent versions of Puppet, perhaps even only 3.3.0 and newer; I’ve not tested it on older versions, but will gladly accept patches.

Here are some snippets of a report of one of my nodes and some comments about the sections. A full sample report can be found here.

First it’s going to show you some metadata about the report, what node, when for etc:

$ sudo report_print.rb
Report for in environment production at Thu Oct 10 13:37:04 +0000 2013
             Report File: /var/lib/puppet/state/last_run_report.yaml
             Report Kind: apply
          Puppet Version: 3.3.1
           Report Format: 4
   Configuration Version: 1381412220
                    UUID: 99503fe8-38f2-4441-a530-d555ede9067b
               Log Lines: 350 (show with --log)

Some important information here: you can see it figured out where to find the report by parsing the Puppet config (agent section), what version of Puppet produced it, and what report format it uses. You can also see the report has 350 lines of logs in it, but it isn’t showing them by default.

Next up it shows you a bunch of metrics from the report:

Report Metrics:
   Changes:
                        Total: 320
   Events:
                        Total: 320
                      Success: 320
                      Failure: 0
   Resources:
                        Total: 436
                  Out of sync: 317
                      Changed: 317
                    Restarted: 7
            Failed to restart: 0
                      Skipped: 0
                       Failed: 0
                    Scheduled: 0
   Time:
                        Total: 573.671295
                      Package: 509.544123
                         Exec: 33.242635
      Puppetdb conn validator: 22.767754
             Config retrieval: 4.096973
                         File: 1.343388
                         User: 1.337979
                      Service: 1.180588
                  Ini setting: 0.127856
                       Anchor: 0.013984
            Datacat collector: 0.008954
                         Host: 0.003265
             Datacat fragment: 0.00277
                     Schedule: 0.000504
                        Group: 0.00039
                   Filebucket: 0.000132

These are numerically sorted and the useful stuff is in the last section – what types were to blame for the biggest slowness in your run. Here we can see we spent 509 seconds just doing packages.
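The sorting itself is trivial; how the per-type timings end up ordered largest-first can be sketched like this (the figures are taken from the report output above):

```ruby
# Illustrative only: sort Puppet's per-type time metrics so the biggest
# contributors to a slow run appear first, as in the printout above.
times = {
  "Config retrieval" => 4.096973,
  "Exec"             => 33.242635,
  "File"             => 1.343388,
  "Package"          => 509.544123
}

sorted = times.sort_by { |_type, seconds| -seconds }
sorted.each { |type, seconds| printf("%20s: %.6f\n", type, seconds) }
```

With Package at the top, the next question is which individual package resources were responsible, which the evaluation-time listing below answers.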

Having seen how long each type of resource took, it then shows a little report of how many resources of each type were found:

Resources by resource type:
    288 File
     30 Datacat_fragment
     25 Anchor
     24 Ini_setting
     22 User
     18 Package
      9 Exec
      7 Service
      6 Schedule
      3 Datacat_collector
      1 Group
      1 Host
      1 Puppetdb_conn_validator
      1 Filebucket

From here you’ll see detail about resources and files, times, sizes etc. By default it’s going to show you 20 of each, but you can increase that using the --count argument.

First we see the evaluation time by resource, this is how long the agent spent to complete a specific resource:

Slowest 20 resources by evaluation time:
    356.94 Package[activemq]
     41.71 Package[puppetdb]
     33.31 Package[apache2-prefork-dev]
     33.05 Exec[compile-passenger]
     23.41 Package[passenger]
     22.77 Puppetdb_conn_validator[puppetdb_conn]
     22.12 Package[libcurl4-openssl-dev]
     10.94 Package[httpd]
      4.78 Package[libapr1-dev]
      3.95 Package[puppetmaster]
      3.32 Package[ntp]
      2.75 Package[puppetdb-terminus]
      2.71 Package[mcollective-client]
      1.86 Package[ruby-stomp]
      1.72 Package[mcollective]
      0.58 Service[puppet]
      0.30 Service[puppetdb]
      0.18 User[jack]
      0.16 User[jill]
      0.16 User[ant]

You can see by far the longest here was the activemq package that took 356 seconds and contributed most to the 509 seconds that Package types took in total. A clear indication that maybe this machine is picking the wrong mirrors or that I should create my own nearby mirror.

File serving in Puppet is notoriously slow, so when run as root on the node in question it will look for all File resources and print their sizes. Unfortunately it can’t know if a file’s contents came from source or content, as that information isn’t in the report. Still, this might give you some information on where to target optimization. In this case nothing really stands out:

20 largest managed files (only those with full path as resource name that are readable)
     6.50 KB /usr/local/share/mcollective/mcollective/util/actionpolicy.rb
     3.90 KB /etc/mcollective/facts.yaml
     3.83 KB /var/lib/puppet/concat/bin/
     2.78 KB /etc/sudoers
     1.69 KB /etc/apache2/conf.d/puppetmaster.conf
     1.49 KB /etc/puppet/fileserver.conf
     1.20 KB /etc/puppet/rack/
    944.00 B /etc/apache2/apache2.conf
    573.00 B /etc/ntp.conf
    412.00 B /usr/local/share/mcollective/mcollective/util/actionpolicy.ddl
    330.00 B /etc/apache2/mods-enabled/passenger.conf
    330.00 B /etc/apache2/mods-available/passenger.conf
    262.00 B /etc/default/puppet
    215.00 B /etc/apache2/mods-enabled/worker.conf
    215.00 B /etc/apache2/mods-available/worker.conf
    195.00 B /etc/apache2/ports.conf
    195.00 B /var/lib/puppet/concat/_etc_apache2_ports.conf/fragments.concat
    195.00 B /var/lib/puppet/concat/_etc_apache2_ports.conf/fragments.concat.out
    164.00 B /var/lib/puppet/concat/_etc_apache2_ports.conf/fragments/10_Apache ports header
    158.00 B /etc/puppet/hiera.yaml
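A small hypothetical helper shows how the size column in that listing could be produced – values under 1 KB print in bytes, everything else in kilobytes:

```ruby
# Hypothetical formatter mirroring the size column in the file listing:
# "944.00 B" for small files, "6.50 KB" for larger ones.
def human_size(bytes)
  if bytes < 1024
    format("%.2f B", bytes)
  else
    format("%.2f KB", bytes / 1024.0)
  end
end

puts human_size(File.size?("/etc/hosts") || 0) rescue nil # size of a real managed file
puts human_size(944)
puts human_size(6656)
```

Collecting the sizes is then just a matter of calling File.size on each File resource whose title is a full path and sorting descending.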

And finally, if I ran it with --log I’d get the individual log lines:

350 Log lines:
   Thu Oct 10 13:37:06 +0000 2013 /Stage[main]/Concat::Setup/File[/var/lib/puppet/concat]/ensure (notice): created
   Thu Oct 10 13:37:06 +0000 2013 /Stage[main]/Concat::Setup/File[/var/lib/puppet/concat/bin]/ensure (notice): created
   Thu Oct 10 13:37:06 +0000 2013 /Stage[main]/Concat::Setup/File[/var/lib/puppet/concat/bin/]/ensure (notice): defined content as '{md5}2fbba597a1513eb61229551d35d42b9f'

The code is on GitHub. I’d like to make it available as a Puppet Forge module, but there really is no usable option to achieve this. The Puppet Face framework is the best available option, but the UX is so poor that I would not like to expose anyone to it in order to use my code.