
Syndication of blogs and tweets by users of the Freenode ##infra-talk IRC channel

26 January 2012

Oldskool: A Gem extendible search engine

Back in the day The Well had a text-based conference system: you used dial-in, then telnet and later ssh to reach their server and interacted with other members through a text system called PicoSpan. Eventually things moved to the web and it became a lot more forum-like. The thing that I really loved was that the web version of the forums had a command line. You could type many of the same commands into the web CLI as you would into the Unix one and have the same effects: posting, searching, jumping through conferences. It was the web with the CLI power for those who wanted it.

The browser is more and more our interface to all things online and frankly it sux a bit; I want CLI speed for accessing the web sites that I like. Ages ago I created a PHP system I called cmd that simply routed a command like “guk greenwich” to the Google UK search engine with results restricted to those from the UK. There are of course various online tools that do the same, but I found that their ‘book’ keyword would search Amazon US while I wanted UK, so I just did one that I can tweak to my liking.

Recently, thanks to Google’s widely hated changes to their Search UI, simply redirecting to Google searches with keywords filled in just was not enough anymore. I want web search back the way it was before they made it suck. So I did what hackers do and wrote a Ruby-based pluggable search system. You can see a screenshot of it here showing a Google search.

What you’re seeing here is the oldskool-gcse plugin in action. It uses the Google JSON API to query a Google Custom Search Engine and formats the results in a way that does not suck. The Custom Search Engines are quite nice: you can customize all sorts of things in them, like which sites to exclude, which to favor, and whether to limit results to certain countries or languages, allowing you to really tailor your search experience. The only downside to the GCSE approach is that Google limits API calls to 100 a day. For me that’s enough for searching, but ymmv.

Using this method of searching can have some privacy wins. Google recently announced that they are merging all their online accounts into one and will have all your online activity influence your searches. I wasn’t too worried, since by then I had already written Oldskool and will simply use a different Google Account to access their search API than the one I use to read my work mail, for example. Simple, effective win.

My default search in oldskool is a GCSE that resembles a normal Google search, but I can also search for “puppet exec” and oldskool will route that request to a specific GCSE that bumps the official Puppet Labs docs to the top, excludes some annoying things etc. Having oldskool as a single-entry frontend to many different GCSE backends is quite powerful.

As I said, it’s pluggable and I’ve written one other plugin that uses my Passmakr gem to generate random passwords. I can just search for pass 10 to get a 10-character password:

Writing your own plugins is very easy and I hope to see ones that query Redmine instances or other internal databases that you might have, using the Oldskool framework to display all the data in one handy place.

It retains the most basic feature of simple keyword-based redirects, so I can search for book reamde to get Amazon UK book results instantly.

Config is through a simple YAML file:

---
:google_api_key: your.key
:username: http_auth_user
:password: http_auth_pass
:keywords:
- :type: :gcse
  :cx: your_gcse
  :keywords:
  - :default
- :type: :gcse
  :cx: your_gcse
  :keywords:
  - puppet
- :type: :url
  :url: http://amazon.co.uk/exec/obidos/search-handle-url/index=books-uk&field-keywords=%Q%
  :keywords:
  - book
  - books
- :type: :password
  :keywords: pass

This sets up 2 GCSE searches – one marked as my default search – plus the book search mentioned earlier and one entry that uses the password plugin I’ve shown above.
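The routing this config describes can be sketched in a few lines. This is only an illustration in Python, not Oldskool's actual Ruby code: the `ROUTES` list, `find_route` and `dispatch` are hypothetical names, and the rule "first word is the keyword, otherwise fall back to the default entry" is an assumption based on the examples in this post.

```python
from urllib.parse import quote_plus

# Hypothetical in-memory copy of the YAML config above. The real Oldskool
# is written in Ruby and may resolve keywords differently.
ROUTES = [
    {"type": "gcse", "cx": "your_gcse", "keywords": ["default"]},
    {"type": "gcse", "cx": "your_gcse", "keywords": ["puppet"]},
    {"type": "url",
     "url": "http://amazon.co.uk/exec/obidos/search-handle-url/"
            "index=books-uk&field-keywords=%Q%",
     "keywords": ["book", "books"]},
    {"type": "password", "keywords": ["pass"]},
]


def find_route(keyword):
    """Return the first route that lists this keyword, or None."""
    for route in ROUTES:
        if keyword in route["keywords"]:
            return route
    return None


def dispatch(query):
    """Return (route type, payload) for a search box query."""
    query = query.strip()
    first, _, rest = query.partition(" ")
    route = find_route(first)
    if route is None:
        # Assumption: unknown first words send the whole query to the default.
        route, rest = find_route("default"), query
    if route["type"] == "url":
        # Plain redirects substitute the search terms for the %Q% marker.
        return "url", route["url"].replace("%Q%", quote_plus(rest))
    return route["type"], rest


if __name__ == "__main__":
    for example in ("book reamde", "puppet exec", "pass 10", "greenwich park"):
        print(example, "->", dispatch(example))
```

With this sketch, "puppet exec" goes to the puppet GCSE with the term "exec", while anything without a known keyword falls through to the default search.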

It needs no writable access to the webserver it runs on and it’s all managed by Bundler and Sinatra – perfect for hosting on the free Heroku tier.

As this is effectively my Web CLI I want it integrated in as many places as possible. I use a lot of desktops – 3 regularly – so the browser is my unified UI to all of this. Your instance will publish OpenSearch metadata, which makes it integrate seamlessly into Firefox, Chrome, IE, Gnome DO, Gnome Shell and many other places.
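For readers who have not seen OpenSearch metadata before, here is a rough Python sketch of what such a description document contains. It is not taken from Oldskool: the function name, the host `oldskool.example.com` and the `/search?q=` path are all made up for illustration. Only the namespace, the element names and the `{searchTerms}` placeholder come from the OpenSearch 1.1 format.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OpenSearch 1.1 description format.
NS = "http://a9.com/-/spec/opensearch/1.1/"


def opensearch_description(name, url_template):
    """Build a minimal OpenSearch description document as a string."""
    ET.register_namespace("", NS)
    root = ET.Element(f"{{{NS}}}OpenSearchDescription")
    ET.SubElement(root, f"{{{NS}}}ShortName").text = name
    ET.SubElement(root, f"{{{NS}}}Description").text = f"Search with {name}"
    # The browser replaces {searchTerms} with whatever the user typed.
    ET.SubElement(root, f"{{{NS}}}Url",
                  {"type": "text/html", "template": url_template})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical instance URL and query path, for illustration only.
    print(opensearch_description(
        "Oldskool", "http://oldskool.example.com/search?q={searchTerms}"))
```

Browsers typically discover a document like this through a `<link rel="search" type="application/opensearchdescription+xml">` element in the page head.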

Here’s the Firefox search box the first time you browse to a new instance:

And here is Chrome. You do not even have to add it: just start typing the URL to your instance and press tab, and the URL bar magically transforms into an Oldskool search box. You can add it permanently and make it the default by right-clicking on the URL bar and choosing Edit Search Engines….

The code is in my GitHub – Oldskool, Oldskool GCSE and Oldskool Password. I will blog again tomorrow or on another day about creating your own plugins etc.

25 January 2012

How GitHub Uses GitHub to Build GitHub

I wrote a post a while back linking to an interesting video about the culture at GitHub, entitled: Optimizing for Happiness – why you want to go work at Github!.

Since then, I’ve watched a few other interesting talks about the culture and how they work at GitHub, and two in particular are worth noting here.

Firstly, Zach Holman, one of the early “Githubbers” recently gave a talk about “How GitHub Uses GitHub to Build GitHub“:

Build features fast. Ship them. That’s what we try to do at GitHub. Our process is the anti-process: what’s the minimum overhead we can put up with to keep our code quality high, all while building features as quickly as possible? It’s not just features, either: faster development means happier developers. This talk will dive into how GitHub uses GitHub: we’ll look at some of our actual Pull Requests, the internal apps we build on our own API, how we plan new features, our Git branching strategies, and lots of tricks we use to get everyone – developers, designers, and everyone else involved with new code. We think it’s a great way to work, and we think it’ll work in your company, too.

You can watch the video here and also check out a series of blog posts he wrote on the same subject.

The second talk I’d recommend is one I had the pleasure of seeing live at a local conference I attended (DIBI Conference). It’s by Corey Donohoe (@atmos):

The talk will cover the metrics driven approach GitHub uses to analyze performance and growth of our product. It will cover deployment strategies for rapid customer feedback as well as configuration management to ensure reproducibility.

You can watch the video here.

Both are great talks and well worth a watch.

20 January 2012

OpenBSD PPPoE and RFC 4638

I upgraded my Internet connection from ADSL 2+ to FTTC a while ago. I’m with Eclipse as an ISP, but it’s basically the same product as BT Infinity, right down to the Openreach-branded modem (a Huawei Echolife HG612 to be exact).

With this modem, you need to use a router or some software that can do RFC 2516 PPPoE, so I simply kept using OpenBSD on my trusty Soekris net4501 and set up a pppoe(4) interface, job done. However, it became apparent that the 133 MHz AMD Elan CPU couldn’t fully utilise the 40 Mb/s bandwidth I now had; at best I could get 16-20 Mb/s with a favourable wind. An upgrade was needed.

Given I’d had around 8 years of flawless service from the net4501, another Soekris board was the way to go. Enter the net6501 with comparatively loads more CPU grunt, RAM and interestingly Gigabit NIC chips; not necessarily for the faster speed, but because they can naturally handle a larger MTU.

The reason for this was that I had read that the Huawei modem and BT FTTC network fully supported RFC 4638, which means you can have an MTU of 1,500 bytes on your PPPoE connection which matches what you’ll have on your internal network. Traditionally a PPPoE connection only allowed 1,492 bytes on account of the overhead of 8 bytes of PPPoE headers in every Ethernet frame payload. Because of this it was almost mandatory to perform MSS clamping on traffic to prevent problems. So having an MTU of 1,500 bytes should avoid the need for any clamping trickery, but means your Ethernet interface needs to cope with an MTU of 1,508 bytes, hence the Gigabit NIC (which can accommodate an MTU of 9,000 bytes with no problems).
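The byte counting in the paragraph above can be checked with a few lines of Python. This is only a worked example: the function names are made up, and the MSS figures assume plain IPv4 and TCP headers of 20 bytes each with no options.

```python
PPPOE_OVERHEAD = 8   # PPPoE header (6 bytes) plus PPP protocol ID (2 bytes)
IPV4_HEADER = 20     # assumption: no IP options
TCP_HEADER = 20      # assumption: no TCP options


def ethernet_mtu_for(ppp_mtu):
    """Ethernet MTU needed to carry a PPPoE payload of ppp_mtu bytes."""
    return ppp_mtu + PPPOE_OVERHEAD


def tcp_mss_for(ppp_mtu):
    """Largest TCP segment that fits in one packet of ppp_mtu bytes."""
    return ppp_mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    for label, ppp_mtu in (("classic PPPoE", 1492), ("RFC 4638", 1500)):
        print(f"{label}: PPP MTU {ppp_mtu}, "
              f"Ethernet MTU {ethernet_mtu_for(ppp_mtu)}, "
              f"TCP MSS {tcp_mss_for(ppp_mtu)}")
```

With a 1,492 byte PPP MTU the usable MSS drops to 1,452, which is why clamping was needed for hosts on a 1,500 byte LAN. With RFC 4638 the MSS is the usual 1,460 and the Ethernet interface has to carry 1,508 bytes.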

One small problem remained: pppoe(4) on OpenBSD 5.0 didn’t support RFC 4638. When I sat down and started to add support I noticed someone had already added this to the NetBSD driver (which is where the OpenBSD driver originated from), so based on their changes I created a similar patch and, with some necessary improvements based on feedback from OpenBSD developers, it has now been committed to CVS in time for the 5.1 release.

Making use of the larger MTU is fairly straightforward: simply set the MTU explicitly on both the Ethernet and PPPoE interfaces to 8 bytes higher than their default. As an example, my /etc/hostname.em0 now contains:

mtu 1508 up

And similarly my /etc/hostname.pppoe0 contains:

inet 0.0.0.0 255.255.255.255 NONE mtu 1500 \
        pppoedev em0 authproto chap \
        authname 'user' authkey 'secret' up
dest 0.0.0.1
!/sbin/route add default -ifp \$if 0.0.0.1

I also added support to tcpdump(8) to display the additional PPPoE tag used to negotiate the larger MTU, so when you bring the interface up, watch for PPP-Max-Payload tags going back and forth during the discovery phase.

With that done the remaining step is to remove any scrub max-mss rules in pf.conf(5) as with any luck they should no longer be required.

10 January 2012

Troubleshooting Headless Tests on a Remote CI Server

I just ran into a problem that was causing the Jasmine tests on our Jenkins CI box to hang forever, and I figured I should document this handy little troubleshooting tip in case someone else might find it helpful.

If you hop onto your CI box while your headless browser tests are running, you should see an Xvfb process that looks something like this:

# ps -efww | grep Xvfb
jenkins  18833     1  1 21:18 ?        00:00:00 /usr/bin/Xvfb :99 -screen 0 1280x1024x24 -ac

Since Xvfb is using display :99, you’ll want to run x11vnc accordingly:

$ x11vnc -display :99

Now you should have a VNC server listening on port 5900. Just fire up your VNC viewer and connect as usual:

$ vncviewer host.example.com:5900

But what if you’re accessing your CI server over an insecure network? You can use SSH local port forwarding to create a secure tunnel:

$ ssh -L 5900:localhost:5900 host.example.com x11vnc -display :99

Now you can connect to the VNC server over the secure tunnel:

$ vncviewer localhost:5900

09 January 2012

Upgrading RHEL 6.2 to CentOS 6.2

I had a utility server running RHEL 6.2 (I installed it as part of a RHEV evaluation process). However, I have no RHEL entitlements so am not able to get updates.

So, I converted it to CentOS 6.2, with a little help from this post:

yum clean all
mkdir ~/centos
cd ~/centos
wget http://mirror.centos.org/centos/6.2/os/x86_64/RPM-GPG-KEY-CentOS-6
wget http://mirror.centos.org/centos/6.2/os/x86_64/Packages/centos-release-6-2.el6.centos.7.x86_64.rpm
wget http://mirror.centos.org/centos/6.2/os/x86_64/Packages/yum-3.2.29-22.el6.centos.noarch.rpm
wget http://mirror.centos.org/centos/6.2/os/x86_64/Packages/yum-utils-1.1.30-10.el6.noarch.rpm
wget http://mirror.centos.org/centos/6.2/os/x86_64/Packages/yum-plugin-fastestmirror-1.1.30-10.el6.noarch.rpm
rpm --import RPM-GPG-KEY-CentOS-6
rpm -e --nodeps redhat-release-server
rpm -e yum-rhn-plugin rhn-check rhnsd rhn-setup
rpm -Uhv --force *.rpm
yum upgrade
reboot

Nice!

08 January 2012

Benchmarking Puppet Stacks

I decided this weekend to try the more popular puppet master stacks and benchmark them with puppet-load (which is a tool I wrote to simulate concurrent clients).

My idea was to check the common stacks and see which one would deliver the best concurrency. This article is a follow-up to my previous post about puppet-load and puppet master benchmarking.

Methodology

I decided to try the following stacks:

  • Apache and Passenger, which is the blessed stack, with MRI 1.8.7 and 1.9.2
  • Nginx and Mongrel
  • JRuby with mizuno

The setup is the following:

  • one m1.large ec2 instance as the master
  • one m1.small ec2 instance as the client (in the same availability zone if that matters)

To recap, m1.large instances are:

  • 2 CPUs with 2 virtual cores each
  • 8 GiB of RAM

All the benchmarks were run on the same pair of instances to prevent skew in the numbers.

The master uses my own production manifests, consisting of about 100 modules. The node for which we’ll compile a catalog contains 1902 resources exactly (which makes it a big catalog).

No storeconfigs are involved at all (this was to reduce setup complexity).

The methodology is to set up the various stacks on the master instance and run puppet-load on the client instance. To ensure everything is hot on the master, a first run of the benchmark is performed at full concurrency. Then multiple runs of puppet-load are performed, simulating an increasing number of clients. The pre-heat phase also makes sure the manifests are already parsed and no I/O is involved.

Tuning has been done as best as I could on all stacks, and care was taken for the master instance to never swap (all the benchmarks involved consumed about 4GiB of RAM or less).

Puppet Master workload

Essentially, a puppet master compiling catalogs is a CPU-bound process (just because a master speaks HTTP doesn’t mean its workload is a webserver workload). That means during the compilation phase of a client connection, you can be guaranteed that puppet will consume 100% of a CPU core.

This essentially means that there is usually little benefit in using more puppet master processes than CPU cores on a server.

A little bit of scaling math

When we want to scale a puppet master server, there is a rough computation that allows us to see how it will work.

Here are the elements of our problem:

  • 2000 clients
  • 30 minutes sleep interval, clients evenly distributed in time
  • master with 8 CPU cores and 8GiB of RAM
  • our average catalog compilation is 10s

A 30-minute interval means that every 30 minutes we must compile 2000 catalogs for our 2000 nodes. That leaves us with 2000/30 ≈ 66 catalogs per minute.

That’s about one new client checking in every second.

Since we have 8 CPU cores, we can accommodate 8 catalog compilations in parallel, not more (because CPU time is a finite quantity).

Since 66/8 = 8.25, each core has to handle about 8 clients per minute, which means each client must be serviced in less than 60/8.25 = 7.27s.

Since our catalogs take about 10s to compile (in my example), we’re clearly in trouble and we would need to either add more master servers or increase our client sleep time (or not compile catalogs, but that’s another story).
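The same back-of-the-envelope computation can be written as a small Python helper, which makes it easy to plug in your own numbers. The function names are made up for this sketch, and it keeps the same simplifying assumptions as the text: clients are evenly spread over the interval and each compilation keeps one core fully busy.

```python
import math


def max_compile_seconds(clients, interval_minutes, cores):
    """Longest average compile time (in seconds) the master can sustain."""
    catalogs_per_minute = clients / interval_minutes
    catalogs_per_core_per_minute = catalogs_per_minute / cores
    return 60 / catalogs_per_core_per_minute


def cores_needed(clients, interval_minutes, compile_seconds):
    """Minimum number of cores so compilations fit within the interval."""
    cpu_seconds_required = clients * compile_seconds
    cpu_seconds_per_core = interval_minutes * 60
    return math.ceil(cpu_seconds_required / cpu_seconds_per_core)


if __name__ == "__main__":
    budget = max_compile_seconds(clients=2000, interval_minutes=30, cores=8)
    print(f"each catalog must compile in under {budget:.2f}s")
    print("cores needed at 10s per catalog:",
          cores_needed(clients=2000, interval_minutes=30, compile_seconds=10))
```

Without rounding 2000/30 down to 66, the budget comes out at 7.2s per catalog, and 10s catalogs would need 12 cores instead of 8.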

Results

Comparing our stacks

Let’s first compare our favorite stacks for an increasing number of concurrent clients (increasing concurrency).

The setups that require a fixed number of workers (Passenger, Mongrel) were set up with 25 puppet master workers. This fit in the available RAM.

For JRuby, I had to use jruby-head (current at the time of writing) because of a bug in 1.6.5.1. I also had to comment out the Puppet execution system (in lib/puppet/util.rb).

Normally this sub-system is in use only on clients, but when the master loads the types it knows for validation, it also autoloads the providers. Those are checking if some support commands are available by trying to execute them (yes I’m talking to you rpm and yum providers).

I also had to comment out when puppet tries to become the puppet user, because that’s not supported under JRuby.

JRuby was run with Sun java 1.6.0_26, so it couldn’t benefit from the invokedynamic work that went into Java 1.7. I fully expect this feature to improve performance dramatically.

The main metric I’m using to compare stacks is the TPS (transactions per second). This is in fact the number of catalogs a master stack can compile in one second. The higher the better. Since compiling a catalog on our server takes about 12s, we have TPS numbers less than 1.

Here are the main results:

Puppet Master Stack / Catalog compiled per Seconds

And, here is the failure rate:

Puppet Master Stack / Failure rate

First notice that some of the stacks exhibited failures at high concurrency. The errors I could observe were client timeouts, even though I configured a large client-side timeout (around 10 minutes). This is what happens when too many clients connect at the same time: everything slows down until the client times out.

Fairness

In this graph, I plotted the min, average, median and max time of compilation for a concurrency of 16 clients.

Puppet Master Stack / fairness

Of course, the best case is when min and max are almost the same.

Digging into the number of workers

For the stacks that support a configurable number of workers (mongrel and passenger), I wanted to check what impact it could have. I strongly believe that there’s no reason to use a large number (compared to I/O bound workloads).

Puppet Master Stack / Worker # influence

Conclusions

Besides being fun, this project shows why Passenger is still the best stack to run Puppet. JRuby shows great promise, but I had to massage the Puppet codebase to make it run (I might publish the patches later).

It would be really awesome if we could settle on a corpus of manifests to allow comparing benchmark results between Puppet users. Anyone want to try to fix this?

03 January 2012

Graphite, JMXTrans, Ganglia, Logster, Collectd, say what ?

Given that @patrickdebois is working on improving data collection I thought it would be a good idea to describe the setup I currently have hacked together.

(Something which can be used as a starting point to improve stuff, and I have to write documentation anyhow.)

I currently have 3 sources and one target, which will eventually expand to at least another target and most probably more sources too.

The first of the 3 sources is typical system data, which I collect using collectd. However, I'm using collectd-carbon from https://github.com/indygreg/collectd-carbon.git to send that data to Graphite.

I'm parsing the Apache and Tomcat logfiles with logster, currently sending them only to Graphite, but logster has an option to send them to Ganglia too.

And I'm using JMXTrans to collect JMX data from Java apps that have this data exposed and send it to Graphite. (JMXTrans also comes with a Ganglia target option.)

Rather than going in depth over the config, it's probably easier to point to a Vagrant box I built, https://github.com/KrisBuytaert/vagrant-graphite, which brings up a machine that does pretty much all of this on localhost.

Obviously it's still a work in progress and lots of classes will need to be parametrized and cleaned up. But it's a working setup, and not just on my machine ..

03 January 2012

#monitoringsucks and we’ll fix it !

If you are hacking on monitoring solutions and want to talk to your peers solving the problem, block the Monday and Tuesday after FOSDEM in your calendar!

That's right, on February 6 and 7 a bunch of people interested in fixing the problem will be meeting, discussing and hacking stuff together in Antwerp.

In short a #monitoringsucks hackathon

Inuits is opening up their offices for everybody who wants to join the effort. Please let us (@KrisBuytaert and @patrickdebois) know if you want to join us in Antwerp.

Obviously if you can't make it to Antwerp you can join the effort on ##monitoringsucks on Freenode or on Twitter.

The location will be Duboistraat 50, Antwerp.
It is about a 10 minute walk from the Antwerp Central train station.
Depending on traffic, Antwerp is about half an hour north of Brussels, and there are hotels within walking distance of the venue.

Plenty of parking space is available on the other side of the Park

03 January 2012

RESTful way to manage your databases

I have a need in my development environment to easily create/drop MySQL databases and users. Initially I was gonna implement a simple hacky HTTP GET method but was dissuaded by Ben Black from doing so. He suggested I write a proper RESTful interface. Without further ado I present to you dbrestadmin:

https://github.com/vvuksan/dbrestadmin

It is my first foray into writing RESTful services so things may be rough around the edges. However, it allows you to do the following:

  • manage multiple database servers
  • create/drop databases
  • list databases
  • create/drop users
  • list users
  • give user grants
  • view grants given to the user
  • view database privileges on a particular database given to a user

For example, if you need to create a database called testdb on dbserver ID=0, use this cURL command:

curl -X POST http://myhost/dbrestadmin/v1/databases/0/dbs/testdb

Create a user test2 with password test

curl -X POST "http://localhost:8000/dbrestadmin/v1/databases/0/users/test2@localhost" -d "password=test"

Give test2 user all privileges on testdb

curl -X POST "http://localhost:8000/dbrestadmin/databases/0/users/test2@'localhost'/grants" -d "grants=all privileges&database=testdb"
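If you would rather script this than call cURL by hand, the first two requests can be built with Python's standard library. This is a rough sketch, not part of dbrestadmin: the helper names are made up, and the base URL simply reuses the host, port and v1 prefix from the cURL examples above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Assumption: same host, port and API version as the cURL examples.
BASE = "http://localhost:8000/dbrestadmin/v1"


def create_database(server_id, name):
    """Build the POST request that creates a database on a given server."""
    url = f"{BASE}/databases/{server_id}/dbs/{quote(name)}"
    return Request(url, method="POST")


def create_user(server_id, user, host, password):
    """Build the POST request that creates user@host with a password."""
    url = f"{BASE}/databases/{server_id}/users/{quote(user)}@{quote(host)}"
    # Form-encoded body, equivalent to curl's -d "password=...".
    body = urlencode({"password": password}).encode()
    return Request(url, data=body, method="POST")


if __name__ == "__main__":
    for req in (create_database(0, "testdb"),
                create_user(0, "test2", "localhost", "test")):
        print(req.get_method(), req.full_url, req.data)
```

The functions only build the requests. Passing one to `urllib.request.urlopen` would actually send it to a running dbrestadmin instance.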

There is more. You can see all of the methods here

https://github.com/vvuksan/dbrestadmin/blob/master/API.md

Improvements and constructive criticism welcome

01 January 2012

Goodbye, 2011.

This year's been pretty good, but the last two months were pretty lame.

In the last six weeks, I found out Caramel has lymphoma, got unemployed, and had emergency surgery to remove my appendix on Christmas Day. The unemployment caused me to lose an in-progress mortgage refinance.

I'll pick up the mortgage thing once I remedy the employment problem, but I'm staying quite happily unemployed until after my kid is born - should be any day now!

Most of my career-growing moves were outside of work: at meetups, in open source efforts, or in networking with folks on IRC or twitter. Lots of awesome folks out there, so go introduce yourself. Don't be a dick. :)

I didn't write much on this site, but mainly, that was due to an increase in my activities on IRC and twitter. Most of what I published this year was code and was less writing about said code. I'd like to fix that, though.

This year's successes were topped by two new major projects, fpm and logstash. I also released some major improvements to xdotool and other tools.

The current implementation of logstash isn't very old, but prototypes, hacks, and other incarnations of pretty much the same thing date back to at least 2005 and probably earlier. This project has been a long time coming, and Pete Fritchman and I have been talking about logstash for years, so it's nice to finally have some code shipped and a community building around it.

FPM had a crazy positive response. I wrote it as a hack, and it's used all over the place now. Bonus that people are contributing patches and other improvements as well.

Sysadvent was another excellent success, the end of which marked the 4th year and 100th article posted to the project. It is awesome seeing such community involvement from so many different authors.

This year also cemented my move to git from svn. Why? Github, mostly, and not really the features of git itself. Sharing code and patches is so much easier on github than it is with other services.

I went to CarolinaCon and OSCON to talk about logstash. I also went to DevOps Days Mountain View and gave a lightning talk on logstash.

My OSCON talk was overflowing with people standing at the back of the room, etc; it went awesomely. I've also been able to do lunchtime logstash presentations at places like Square and others. I also gave talks at BayLISA meetings. It was a good year for getting out of the house and talking about code.

I tried to get a count of how much code I'd written this year, but I had lots of web-based projects that included third-party stuff like jquery, and I'm too lazy to pick through the results and trim that stuff out. I'm up to about 70 different projects on github now, some useful; some not; all fun!

Looking forward to 2012 :)