MCollective Plugin – FileMD5er

I've been watching the Marionette Collective for a while, and even gave it a small trial in a couple of testing environments, but this weekend was the first time I've experimented with it at a slightly larger scale (just over a hundred small VM nodes - you have to love EC2) and I'm still impressed.

I can see how it's going to make parts of my workflow easier, and in an attempt to learn a little more about how the plugin system works under the hood I decided to write a small agent, FileMD5er. The agent itself is very simple and addresses a small annoyance I've scripted around for a while. When you're bringing files under Puppet (or Chef) management you need to dig through the hosts and locate any files that differ from the most common ad hoc version. With a quick mc-filemd5er /path/to/file I can easily spot any machines that have a slightly different version of the file, and then fold them into centralised management.
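The heart of the agent is nothing more than checksumming a file and handing the digest back. This isn't the actual FileMD5er code, just a minimal Ruby sketch of the one action it performs (the `file_md5` helper name is mine):

```ruby
require 'digest/md5'
require 'tempfile'

# Compute the MD5 checksum of a file - the single piece of work the
# agent does on each node; the client then compares digests across hosts.
def file_md5(path)
  return nil unless File.file?(path)
  Digest::MD5.file(path).hexdigest
end

# Quick demonstration against a throwaway file.
Tempfile.create('filemd5er-demo') do |f|
  f.write("hello\n")
  f.flush
  puts file_md5(f.path)
end
```

Wrapped in an MCollective agent, that digest comes back from every node at once, which is what makes spotting the odd one out across a hundred hosts trivial.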

Writing the plugin itself was quite easy. The two problems I encountered were finding the right generation of existing plugin to crib from (some of the official MCollective plugins are of a newer format than others) and not naming the class and the .rb file the same, which caused the agent to only half work.

I'll be putting more of my MCollective plugins on GitHub as they become a little more generic and hopefully useful to someone else.

PHP: the Good Parts – Short Review

After working my way through JavaScript: The Good Parts I decided to put away all my misconceptions and give PHP a try. While I'm not actually looking to write any projects in the language at the moment, I was interested to see how much of the PHP bashing is still based in fact, and to learn what an expert in the language could show me. So I bought PHP: The Good Parts, which is a completely different book from the previous title in the series.

While JavaScript: The Good Parts was a little dry (and a lot opinionated) it did provide an excellent overview of the language and showed why it shouldn't be overlooked or treated as a toy. The PHP version, on the other hand, is a decent beginner's guide for the very inexperienced.

If you're looking for an introduction to the language then this book's an acceptable choice. It's short, seems to cover all the basics and is readable in a single sitting. If you're coming from another programming language, looking for in-depth advanced PHP knowledge, or want a book similar to Crockford's, then this isn't the book for you.

Score: 6/10 as a beginner's book.

Zabbix GUIs and Automation

In the T-DOSE Zabbix talk, which I'm happy to say was both well presented and showed some interesting features, I got called out for a quote I made on Twitter (which just goes to show - you never know where what you said is going to show up and haunt you) about the relevance, and I'd say overemphasis, of the GUI to the Zabbix monitoring system - and to monitoring systems in general. Rather than argue with the speaker (for the record, I hate it when the audience does that) I thought I'd note my objections here instead.

My monitoring system of choice is Nagios. It's starting to get a little long in the tooth (where can I add a new host on the fly?) but it's survived this long because it got a lot of things right, including its loose coupling and the fact it can read directories of config files. Zabbix, and to a degree Hyperic (which has a command line interface that only a satanist could love), are GUI-focused tools (let's ignore auto-detect for now). To add a host you click around. To add another host, you click around. To add a new check you click around. To add a new group you click around. To... you get the idea.
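That "directories of config files" point is what makes Nagios so automation-friendly: a single directive in the main config pulls in anything a tool like Puppet drops into place. A sketch (the paths here are illustrative, not required):

```
# nagios.cfg - recursively read every .cfg file under this directory,
# so generated host/service definitions can simply be dropped in
cfg_dir=/etc/nagios/conf.d

# /etc/nagios/conf.d/web01.cfg - a generated host definition
define host {
    use        generic-host
    host_name  web01
    address    10.0.0.11
}
```

Adding a hundred hosts is then a template loop and a reload, not a hundred trips through a GUI.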

Now that might have been fine a couple of years ago (well, not really), and it's an easy, intuitive way to add new config (well, not in Hyperic's case), but it bothers me on two levels. Firstly, I like the Unix approach that nearly everything is a file; secondly, I no longer have a handful of hosts, I have hundreds to thousands of the things, spread around multiple production sites for all the usual reasons (load distribution, geographical locality, resilience etc.), and I generate all the configs for those from my Puppet modelling.

Using the right Puppet modules my servers know about my clients, my clients can aggregate their services and everything stays in sync. While Zabbix allows you to define templates and associate checks using groups (which Nagios can also do), that's the wrong level for me. My servers have lots of traits I need applied to them - monitoring, trending, logging etc. - and I want to define that once and have the artifacts actioned where they're needed, not have to work around half an API or click through a GUI. To be honest, the fact that the API seems like such an afterthought bothers me (possibly unreasonably so), as I think it shows a community with different needs to mine.

And now on to using the GUI to actually display information. From the presentation I understand that the Zabbix team are moving in the direction I consider correct - anything you can do from the GUI you'll be able to do from the API. Your monitoring system (and your trending systems) are too important to only be accessed in the way other people think you should. The information in them needs to be presented in different ways to different audiences - and here too I think Nagios (with a little help from MK Livestatus and Nagvis) is currently doing an OK job. It's extensible, I have a full query language for retrieving monitoring state and I can convey the information on a screen that highlights my information in the way I like - without making people use the still very unloved Nagios CGIs. Forcing me through a single GUI that allows no comprehensive API access (other than raw SQL) is a losing bet for me.
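To show what I mean by a full query language: Livestatus exposes monitoring state over a socket using its LQL syntax, so pulling every host currently in a problem state is a three-line query rather than a screen-scrape or a raw SQL dig:

```
GET hosts
Columns: name state
Filter: state != 0
```

Any dashboard, report or one-off script can speak that protocol directly, which is exactly the kind of access I want from a monitoring system.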

Hopefully some of that explains my issues with monitoring in general and Zabbix (as it is at the moment) in particular. You may not agree with it - but with the tool chain I have these days I think a nicer interface without full API access is a bug, not a feature.

Adventures in Cronologger

Cronjobs are one of those necessary evils of any decent-sized Unix setup; they provide often essential pieces of a site's data flows but are treated as second-class citizens. While I've already mentioned my Cron commandments, I'm always looking for improvements in the rest of my cron tool set and, with Vladimir Vuksan's cronologger, I may have found another piece of the puzzle.

The concept is simple: you add a command to the front of your crontabs and it invokes your actual cron command. This wrapper script collects the stdout, stderr and some other details such as exit code and run time. The backend is a CouchDB data store and the simple reporting pages are written in PHP; they're easy to work through, crib from and base your own reports on. Having all this cron information also helps provide a talking point with development: it's easy to show progress and instil a sense of actually getting somewhere when the number of cronjobs with errors drops each day, rather than the systems team mentioning that their inboxes are a little emptier since the last release.
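The wrapper idea is small enough to fit in a few lines. This isn't cronologger itself (which is where the CouchDB and PHP pieces live), just a minimal Ruby sketch of what such a wrapper captures; the `run_logged` name is mine:

```ruby
require 'open3'
require 'json'

# Run a command the way a cronologger-style wrapper would: capture
# stdout, stderr, the exit code and the wall-clock run time, then
# emit a record a document store could swallow as-is.
def run_logged(*cmd)
  started = Time.now
  stdout, stderr, status = Open3.capture3(*cmd)
  {
    'command'   => cmd.join(' '),
    'stdout'    => stdout,
    'stderr'    => stderr,
    'exit_code' => status.exitstatus,
    'runtime'   => Time.now - started
  }
end

puts JSON.generate(run_logged('echo', 'hello'))
```

In a crontab the real thing sits in front of the existing command, so the job itself doesn't change at all - which is why it's so easy to roll out.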

While our initial tests seem positive, there are a couple of reports, and tweaks to the command line data injector, that we want for our local usage. The biggest problem with the project may well be that the idea is so obviously correct that we end up re-implementing it in something a little more suitable for our environment - maybe a Python command line client and Perl Template Toolkit driven reports to replace the PHP. But that's a possibility for later - for now cronologger is a great 80% solver.

The ThoughtWorks Anthology – Short Review

The ThoughtWorks Anthology is a collection of short articles and essays written by a number of their employees (some of whom are now ex-employees) about software development, with a heavily agile slant. The topics range from the very high level "Lush Landscape of Languages" and "What is an Iteration manager anyway" to the more technical and technique-focused "Refactoring Ant Build Files" and "Object Calisthenics".

While the general quality of the writing is very good, especially my favourite, 'Object Calisthenics', the biggest problem with a book like this is that many of the essays' authors, and some of their equally knowledgeable co-workers, have personal blogs where this quality of information is available on a (near) daily basis, in both greater depth and a more conversational style.

Netbeans vs Commandline

The last time we interviewed for Java developers (a couple of jobs ago) it came as quite a surprise how few of them could function without their IDE of choice. A high percentage of the candidates struggled to compile using javac, had problems navigating the docs and made a large number of very simple syntax errors that they were obviously used to their editor dealing with.

At the time the more Unix-focused team, most of whom were very long-term vim and emacs users, had a number of discussions about how this should impact our rating of the candidates. One school of thought was that people should use the tools that make them most productive. The other was that people should understand their tool chain. How can you diagnose issues on a production server if you can't even compile a class on the command line? You can tell which side I was on.

I've recently joined a small Java project and, after some awkward fiddling around with ant, junit and half a dozen other jars, decided to give Netbeans a chance. I was pleasantly surprised at how quickly and easily I got the same project up and running in the IDE. I don't yet have a clue how it stores the files on disk, constructs the build or test targets, or handles a dozen other little details, but at this stage in my basic use of Java it doesn't seem to matter.

It's strange how quickly seductive all the optional extras can be, and how easy it is to lose track of what you don't know while adapting to the features they offer. I'm not sure how much of it is better tooling, the benefits of a strongly typed static language, or just having a dedicated team producing a consistent development environment, but it felt very easy to take baby steps with. And I'm hoping the tool continues to show me more power as my needs when using it grow.

While I'm at no risk of giving up vim for my day-to-day work, I think I'll be investing some time in learning one of the big three Java editors (Eclipse, Netbeans or IntelliJ) for while I'm away in this strange world.

Obese Provisioning – Antipattern

One antipattern I'm seeing with increasing frequency is that of obese (or fat, or bloated) system provisioning. It seems as common in people who are just getting used to having an automated provisioning system and are enthusiastic about its power as it is in longer-term users who have added layer upon layer of cruft to their host builder.

The basic problem is that of adding too much work and intelligence to the actual provisioning stage. Large postrun sections or after_install command blocks should be a warning sign and point to tasks that may well be better off inside a system like Puppet or Chef. It's a seductive problem because it's an easy way to add additional functionality to a host, especially when it allows you to avoid thinking about applying or modifying a general role; even more so if it's one that's already in use on other hosts. Adding a single line in a kickstart or preseed file is quicker, requires no long term thinking and is immediately available.

Unfortunately, by going down this path you end up with a lot of one-off host modifications, nearly-but-not-quite common additional behaviour and a build process that's difficult to refactor. A tight coupling between these two stages can make trivial tasks unwieldy and, in some cases, force extra work to remove or modify the change for day-to-day operation after the build has completed.

A good provisioning system should do the bare minimum required to get a machine built. It should be lean, do as little as possible and prepare the host to run its configuration management system. Everything else should be managed from inside that.
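As a concrete sketch of what "bare minimum" means, a kickstart %post section can do nothing more than install the configuration management agent and enable it; every package names and service details below are illustrative, and the equivalent preseed late_command would be just as short:

```
%post
# Lean provisioning: install the config management agent, enable it,
# and let it apply every role, package and file on first run.
yum -y install puppet
/sbin/chkconfig puppet on
%end
```

Everything role-specific then lives in one place, version-controlled and refactorable, instead of being smeared across the build process.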
