Copying an rsnapshot archive

When using rsync to copy rsnapshot archives you should use the --hard-links option:

       -H, --hard-links
              This tells rsync to look for hard-linked files in the source and
              link together the corresponding files on the destination. Without
              this option, hard-linked files in the source are treated as though
              they were separate files.

Thanks to this post: https://www.cyberciti.biz/faq/linux-unix-apple-osx-bsd-rsync-copy-hard-links/

Masterzen’s Blog 2014-01-11 16:45:00

It all started a handful of months ago, when it appeared that we’d need to build some of our native software on Windows. Until then, all our desktop software at Days of Wonder was mostly cross-platform Java code that could be cross-compiled on Linux. Unfortunately, we badly needed a Windows build machine.

In this blog post, I’ll tell you the whole story, from my zero knowledge of Windows administration to an almost fully automated Windows build machine image construction.

Jenkins

But, first let’s digress a bit to explain in which context we operate our builds.

Our CI system is built around Jenkins, with a specific twist. We run the Jenkins master on our own infrastructure and our build slaves on AWS EC2. The reason behind this choice is out of the scope of this article (but you can still ask me, I’ll happily answer).

So, we’re using the Jenkins EC2 plugin, and a Jenkins S3 Plugin revamped by yours truly. We produce somewhat large binary artifacts when building our client software, and the bandwidth between EC2 and our master is not that great (and is expensive), so using the aforementioned patch I contributed, we host all our artifacts in S3, fully managed by our out-of-AWS Jenkins master.

The problem I faced when starting to explore the intricate world of Windows in relation to Jenkins slaves is that we wanted to keep the Linux model we had: on-demand slaves spawned by the master when scheduling a build. Unfortunately, the current state of the Jenkins EC2 plugin only supports Linux slaves.

Enter WinRM and WinRS

The EC2 plugin for Linux slaves works like this:

  1. it starts the slave
  2. using an internal scp implementation, it copies ‘slave.jar’, which implements the client side of the Jenkins remoting protocol
  3. using an internal ssh implementation, it executes java -jar slave.jar. The stdin and stdout of the slave.jar process are then connected to the Jenkins master through an ssh tunnel.
  4. now, Jenkins does its job (basically sending more jars, classes)
  5. at this stage the slave is considered up

I needed to replicate this behavior. In the Windows world, ssh is nonexistent. You can find some native implementations (like FreeSSHd or some commercial ones), but those options weren’t easy to implement, or simply didn’t work.

In the Windows world, remote process execution is achieved through Windows Remote Management, called WinRM for short. WinRM is an implementation of the WSMAN specifications. It gives access to the Windows Management Instrumentation, which exposes hardware counters (à la SNMP or IPMI in the Unix world).

One component of WinRM is WinRS: Windows Remote Shell. This is the part that allows running remote commands. Recent Windows versions (at least since Server 2003) ship with WinRM installed (but not started by default).

WinRM is an HTTP/SOAP based protocol. By default, the payload is encrypted if the protocol is used in a Domain Controller environment (in this case, it uses Kerberos), which will not be our case on EC2.

Digging further, I found two client implementations:

I started integrating Overthere into the ec2-plugin but encountered several incompatibilities, most notably that Overthere depended on more recent versions of some libraries than Jenkins itself.

I finally decided to create my own WinRM client implementation and released Windows support for the EC2 plugin. This hasn’t been merged upstream, and should still be considered experimental.

We’ve been using this version of the plugin for about a couple of months and it works, but to be honest WinRM doesn’t seem to be as stable as ssh would be. There are times the slave is unable to start correctly because WinRM abruptly stops working (especially shortly after the machine boots).

WinRM, the bootstrap

So all is great: we know how to execute commands remotely from Jenkins. But that’s not enough for our sysadmin needs. In particular, we need to be able to create a Windows AMI that contains all the software required to build our own applications.

Since I’m a long-time Puppet user (which you certainly noticed if you’ve read this blog in the past), using Puppet to configure our Windows build slaves was the only possibility. So we need to run Puppet on a Windows base AMI, then create an AMI from there that will be used for our build slaves. And if we can make this process repeatable and automatic, that’d be wonderful.

In the Linux world, this task is usually devoted to tools like Packer or Veewee (which BTW supports provisioning Windows machines). Unfortunately Packer, which is written in Go, doesn’t yet support Windows, and Veewee doesn’t support EC2.

That’s the reason I ported the small implementation I wrote for the Jenkins EC2 plugin to a WinRM Go library. This was the perfect pet project to learn a new language :)

Windows Base AMI

So, armed with all those tools, we’re ready to start our project. But there’s a caveat: WinRM is not enabled by default on Windows. So before automating anything we need to create a Windows base AMI that has the necessary tools to later allow automating the installation of our build tools.

Windows boot on EC2

There’s a service running on the AWS Windows AMI called EC2config that does the following at the first boot:

  1. sets a random password for the ‘Administrator’ account
  2. generates and installs the host certificate used for Remote Desktop Connection
  3. executes the specified user data (and cloud-init if installed)

On first and subsequent boots, it also:

  1. might set the computer host name to match the private DNS name
  2. configures the key management server (KMS), checks the Windows activation status, and activates Windows as necessary
  3. formats and mounts any Amazon EBS volumes and instance store volumes, and maps volume names to drive letters
  4. performs some other administrative tasks

One thing that is problematic with Windows on EC2 is that the Administrator password is unfortunately set randomly at first boot. That means that to do anything further on the machine (usually administering it through Remote Desktop), you first need to ask AWS for the password (with the command line: aws ec2 get-password-data).
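
For instance, if you kept the private key of the keypair used at launch, the AWS CLI can decrypt the password in one go (instance ID and key path below are placeholders):

aws ec2 get-password-data --instance-id <instanceid> --priv-launch-key <YOURKEY>.pem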

Next, we might also want to set a custom password instead of this dynamic one. We might also want to enable WinRM and install several utilities that will help us later.

To do that we can inject specific AMI user data at the first boot of the Windows base AMI. This user data can contain one or more cmd.exe or Powershell scripts that will be executed at boot.

I created this Windows bootstrap Gist (actually I forked and edited the part I needed) to prepare the slave.

First bootstrap

First, we’ll create a Windows security group allowing incoming WinRM, SMB and RDP:

aws ec2 create-security-group --group-name "Windows" --description "Remote access to Windows instances"
# WinRM
aws ec2 authorize-security-group-ingress --group-name "Windows" --protocol tcp --port 5985 --cidr <YOURIP>/32
# Incoming SMB/TCP 
aws ec2 authorize-security-group-ingress --group-name "Windows" --protocol tcp --port 445 --cidr <YOURIP>/32
# RDP
aws ec2 authorize-security-group-ingress --group-name "Windows" --protocol tcp --port 3389 --cidr <YOURIP>/32

Now, let’s start our base image with the following user-data (let’s put it into userdata.txt):

<powershell>
Set-ExecutionPolicy Unrestricted
icm $executioncontext.InvokeCommand.NewScriptBlock((New-Object Net.WebClient).DownloadString('https://gist.github.com/masterzen/6714787/raw')) -ArgumentList "VerySecret"
</powershell>

This powershell script will download the Windows bootstrap Gist and execute it, passing the desired administrator password.

Next we launch the instance:

aws ec2 run-instances --image-id ami-4524002c --instance-type m1.small --security-groups Windows --key-name <YOURKEY> --user-data "$(cat userdata.txt)"

Unlike what is written in the ec2config documentation, the user-data must not be encoded in Base64.

Note, the first boot can be quite long :)

After that we can connect through WinRM with the “VerySecret” password. To check we’ll use the WinRM Go tool I wrote and talked about above:

./winrm -hostname <publicip> -username Administrator -password VerySecret "ipconfig /all"

We should see the output of the ipconfig command.

Note: in the next winrm commands, I’ve omitted the various credentials to improve legibility (a future version of the tool will allow reading a config file; meanwhile we can create an alias).
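
For instance, a simple shell alias keeps the following invocations short (same placeholder credentials as above):

alias winrm='./winrm -hostname <publicip> -username Administrator -password VerySecret'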

A few caveats:

  • BITS doesn’t work in the user-data powershell script, because it requires a logged-in user, which is not possible during boot; that’s why downloading is done through System.Net.WebClient instead
  • WinRM enforces some resource limits; you might have to increase the allowed shell resources to run memory-hungry commands: winrm set winrm/config/winrs @{MaxMemoryPerShellMB="1024"}. Unfortunately, this is completely broken in Windows Server 2008 unless you install this Microsoft hotfix. The linked bootstrap code doesn’t install this hotfix, because I’m not sure I can redistribute the file; that’s an exercise left to the reader :)
  • the WinRM traffic is neither encrypted nor otherwise protected (if you use my tool). Use at your own risk. It’s possible to set up WinRM over HTTPS, but it’s a bit more involved. The current version of my WinRM tool doesn’t support HTTPS yet (but it’s very easy to add).

Baking our base image

Now that we have our base system with WinRM and Puppet installed by the bootstrap code, we need to create a derived AMI that will become our base image later, when we create our different Windows machines.

aws ec2 create-image --instance-id <ourid> --name 'windows-2008-base'

For a real-world image, we might have defragmented and blanked the free space of the root volume before creating the image (on Windows you can use sdelete for this task).

Note that we don’t run the Ec2config sysprep prior to creating the image, which means the first boot of any instances created from this image won’t run the whole boot sequence and our Administrator password will not be reset to a random password.

Where does Puppet fit?

Now that we have this base image, we can start deriving it to create other images, but this time using Puppet instead of a powershell script. Puppet has been installed on the base image, by virtue of the powershell bootstrap we used as user-data.

First, let’s get rid of the current instance and run a fresh one coming from the new AMI we just created:

aws ec2 run-instances --image-id <newami> --instance-type m1.small --security-groups Windows --key-name <YOURKEY>

Anatomy of running Puppet

We’re going to run Puppet in masterless mode for this project. So we need to upload our set of manifests and modules to the target host.

One way to do this is to connect to the host with SMB over TCP (which our base image supports):

sudo mkdir -p /mnt/win
sudo mount -t cifs -o user="Administrator%VerySecret",uid="$USER",forceuid "//<instance-ip>/C\$/Users/Administrator/AppData/Local/Temp" /mnt/win

Note how we’re using an Administrative Share (the C$ above). On Windows, the Administrator user has access to the local drives through Administrative Shares without having to explicitly share them, as would be needed for normal users.

The user-data script we ran in the base image opens the windows firewall to allow inbound SMB over TCP (port 445).

We can then just zip our manifests/modules, send the file over there, and unzip remotely:

zip -q -r /mnt/win/puppet-windows.zip manifests/jenkins-steam.pp modules -x .git
./winrm "7z x -y -oC:\\Users\\Administrator\\AppData\\Local\\Temp\\ C:\\Users\\Administrator\\AppData\\Local\\Temp\\puppet-windows.zip | FIND /V \"ing  \""

And finally, let’s run Puppet there:

./winrm "\"C:\\Program Files (x86)\\Puppet Labs\\Puppet\\bin\\puppet.bat\" apply --debug --modulepath C:\\Users\\Administrator\\AppData\\Local\\Temp\\modules C:\\Users\\Administrator\\AppData\\Local\\Temp\\manifests\\site.pp"

And voilà: shortly we’ll have a configured instance running. Now we can create a new image from it and use it as our Windows build slave in the ec2 plugin configuration.

Puppet on Windows

Puppet on Windows is not like your regular Puppet on Unix. Let’s focus on what works or not when running Puppet on Windows.

Core resources known to work

The obvious ones known to work:

  • File: aside from symbolic links, which are supported only on Puppet > 3.4 and Windows 2008+, there are a few things to take care of when using files:

    • NTFS is case-insensitive (but not the file resource namevar)
    • Managing permissions: octal unix permissions are mapped to Windows permissions, but the translation is imperfect. Puppet doesn’t manage Windows ACL (for more information check Managing File Permissions on Windows)
  • User: Puppet can create/delete/modify local users. The Security Identifier (SID) can’t be set. User names are case-insensitive on Windows. To my knowledge you can’t manage domain users.

  • Group: Puppet can create/delete/modify local groups. Puppet can’t manage domain groups.

  • Package: Puppet can install MSI or exe installers present on a local path (you need to specify the source). For a more comprehensive package system, check the paragraph about Chocolatey below.

  • Service: Puppet can start/stop/enable/disable services. You need to specify the short service name, not the human-readable display name.

  • Exec: Puppet can run executables (any .exe, .com or .bat). But unlike on Unix, there is no shell, so you might need to wrap commands with cmd /c. Check the Powershell exec provider module for a more comprehensive Exec system on Windows.

  • Host: works the same as for Unix systems.

  • Cron: there’s no cron system on Windows. Instead you must use the Scheduled_task type.
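
To make the list above a bit more concrete, here is a minimal, untested sketch (package title, installer path and service are made up for the example; for an MSI the resource title generally has to match the installer’s DisplayName):

package { '7-Zip 9.20 (x64 edition)':
  ensure => installed,
  source => 'C:/Temp/7z920-x64.msi',          # local MSI path; forward slashes are fine here
}

service { 'wuauserv':                         # short service name, not the display name
  ensure => running,
  enable => true,
}

exec { 'flush-dns':
  command => 'cmd.exe /c ipconfig /flushdns', # no shell on Windows, so wrap with cmd /c
  path    => 'C:\\Windows\\System32',
}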

Do not expect your average unix module to work out-of-the-box

Of course that’s expected, mostly because of the packages being used. Most Forge modules, for instance, target Unix systems. Some Forge modules are Windows-only, but they tend to cover specific Windows aspects (like the registry, Powershell, etc…); still, make sure to check those, as they are invaluable in your module portfolio.

My Path is not your Path!

You certainly know that Windows paths are not like Unix paths. They use \ where Unix uses /.

The problem is that in most languages (including the Puppet DSL) ‘\’ is treated as an escape character when used in double-quoted string literals, so it must be doubled: \\.

Puppet single-quoted strings don’t understand all of the escape sequences double-quoted strings know (it only parses \' and \\), so it is safe to use a lone \ as long as it is not the last character of the string.

Why is that?

Let’s take this path C:\Users\Administrator\: when enclosed in a single-quoted string 'C:\Users\Administrator\' you will notice that the last 2 characters are \' which form an escape sequence, and thus for Puppet the string is not correctly terminated by a single quote. The safe way to write a single-quoted path like the above is to double the final backslash: 'C:\Users\Administrator\\', which looks a bit strange. My suggestion is to double all \ in all kinds of strings for simplicity.

Finally, when writing a UNC path in a string literal you need to use four backslashes: \\\\host\\path.

Back to the slash/anti-slash problem, there’s a simple rule: if the path is directly interpreted by Puppet, then you can safely use /. If the path is destined for a Windows command (like in an Exec), use \.

Here’s a list of possible type of paths for Puppet resources:

  • Puppet URL: this is an url, so /
  • template paths: this is a path for the master, so /
  • File path: it is preferred to use / for consistency
  • Exec command: it is preferred to use /, but beware that most Windows executables require \ paths (especially cmd.exe)
  • Package source: it is preferred to use /
  • Scheduled task command: use \ as this will be used directly by Windows.
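
For example (module name, paths and task are hypothetical):

file { 'C:/Temp/app.conf':                        # file path: / works fine
  ensure => file,
  source => 'puppet:///modules/myapp/app.conf',   # Puppet URL: always /
}

scheduled_task { 'nightly-cleanup':
  ensure    => present,
  command   => 'C:\\Windows\\System32\\cmd.exe',  # consumed by Windows itself: use \
  arguments => '/c del /q C:\\Temp\\*.tmp',
  trigger   => { schedule => 'daily', start_time => '03:00' },
}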

Windows facts to help detection of windows

To identify a Windows client in Puppet manifests you can use the kernel, operatingsystem and osfamily facts, which all resolve to windows.
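
For example, to guard Windows-only resources:

if $::osfamily == 'windows' {
  notice("Windows node: ${::hostname} (${::operatingsystem} ${::architecture})")
}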

Other facts, like hostname, fqdn, domain or memory*, processorcount, architecture, hardwaremodel and so on, work like their Unix counterparts.

Networking facts also work, but with the Windows interface name (i.e. Local_Area_Connection), so for instance the local IP address of a server will be in ipaddress_local_area_connection. The ipaddress fact also works, but on my Windows EC2 server it returns a link-local IPv6 address instead of the IPv4 Local Area Connection address (that might be because it’s running on EC2).

Do yourself a favor and use Chocolatey

We’ve seen that the Puppet Package type has a Windows provider that knows how to install MSI and/or exe installers when provided with a local source. Unfortunately this model is very far from what Apt or Yum are able to do on Linux servers, allowing access to multiple repositories of software and on-demand download and installation (on the same subject, we’re still missing something like that for OSX).

Fortunately, in the Windows world there’s Chocolatey. Chocolatey is a package manager (based on NuGet) and a public repository of software (there’s no easy way to have a private repository yet). If you read the bootstrap code I used earlier, you’ve seen that it installs Chocolatey.

Chocolatey is quite straightforward to install (beware that it doesn’t work for Windows Server Core, because it is missing the shell Zip extension, which is the reason the bootstrap code installs Chocolatey manually).

Once installed, the chocolatey command allows installing/removing software that comes in several flavors: either command-line packages or install packages. The first only gives access to the tool through the command line, whereas the second does a full installation of the software.

So for instance to install Git on a Windows machine, it’s as simple as:

chocolatey install git.install

To make things much more enjoyable for Puppet users, there’s a Chocolatey Package Provider Module on the Forge allowing you to do the following:

package {
  "cmake":
    ensure => installed,
    provider => "chocolatey"
}

Unfortunately at this stage it’s not possible to easily host your own Chocolatey repository. But it is possible to host your own Chocolatey packages and use the source parameter. In the following example we assume that I packaged cmake version 2.8.12 (which I did by the way), and hosted this package on my own webserver:

# download_file uses powershell to emulate wget
# check here: http://forge.puppetlabs.com/opentable/download_file
download_file { "cmake":
  url                   => "http://chocolatey.domain.com/packages/cmake.2.8.12.nupkg",
  destination_directory => "C:\\Users\\Administrator\\AppData\\Local\\Temp\\",
}
->
package {
  "cmake":
    ensure => installed,
    source => "C:\\Users\\Administrator\\AppData\\Local\\Temp\\"
}

You can also decide that chocolatey will be the default provider by adding this to your site.pp:

Package {
  provider => "chocolatey"
}

Finally, read how to create chocolatey packages if you wish to build your own.

Line endings and character encodings

There’s one final thing that the Windows Puppet user must take care of: line endings and character encodings. If you use Puppet File resources to install files on a Windows node, you must be aware that file content is transferred verbatim from the master (whether you use content or source).

That means that if the file uses Unix LF line endings, the file content on your Windows machine will use the same. If you need Windows line endings, make sure your file on the master (or the content in the manifest) uses Windows \r\n line endings.
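
For example, a batch file generated from inline content with explicit Windows line endings (hypothetical file):

file { 'C:/Temp/run.bat':
  ensure  => file,
  content => "@echo off\r\necho build ok\r\n",   # \r\n produces CRLF in double-quoted strings
}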

That also means that your text files might not use a windows character set. It’s less problematic nowadays than it could have been in the past because of the ubiquitous UTF-8 encoding. But be aware that the default character set on western Windows systems is CP-1252 and not UTF-8 or ISO-8859-15. It’s possible that cmd.exe scripts not encoded in CP-1252 might not work as intended if they use characters out of the ASCII range.

Conclusion

I hope this article will help you tackle the hard task of provisioning Windows VMs and running Puppet on Windows. It is the result of several hours of hard work to find the right tools and acquire the necessary Windows knowledge.

During this journey, I started learning a new language (Go), remembered how I dislike Windows (and its administration), contributed to several open-source projects, discovered a whole lot on Puppet on Windows, and finally learnt a lot on WinRM/WinRS.

Stay tuned on this channel for more articles (when I have the time) about Puppet, programming and/or system administration :)

Chef & FreeBSD : use pkgng

Baptiste Daroussin did an incredible job on FreeBSD with the new packages system, named PkgNG. It brings modern workflow, options and shiny features that were needed for a long time. Say goodbye to painfully long upgrades.

However, Chef is not yet able to use this packaging system out of the box, as it does not have a PkgNG provider (or didn’t until now). It’s a bit hacky, but here is a way to use PkgNG with Chef, making it the default provider for your packages.

Read more : poudriere & pkgng

Restoring Deleted Files in Linux from the ext3 Journal

Someone just `rm -rf *`-ed from `/` on a production server.

Fortunately, you have backups. Unfortunately, the server included a database with important business data that was written just before the disaster. That most recent data is not included in the last database backup.

You panic: “This is a Linux server, there’s no Trash, no Recycle Bin, no ‘undelete’…” Is there any chance of recovering all of the data?

Breathe.

Think.

“This is a Linux server… there should be plenty of ways to access those bits and bytes more directly…”

You start to think about trying to `grep` through the block device, but you don’t know exactly what you’re looking for — and much of it is likely to be binary data.

Can ext3 Retrieve the Deleted Files?

Suddenly you remember that ext3’s big advantage over ext2 is its journal. It would be unorthodox, but…

If the files you need were accessed recently enough, there’s a chance the block pointers to the files might still exist in the filesystem journal.

While you shut down the wounded VM and attach the virtual disk to another VM to begin investigating, you also search the Internet with the hope that someone besides DenverCoder9 has been in these same circumstances.

Lo and behold! Carlo Wood created just the tool you had imagined! ext3grep can reconstruct (many) deleted files based on entries from the filesystem journal. Read HOWTO recover deleted files on an ext3 file system for all of the gory details of how to wield this powerful tool.

ext3 and Deleted Files

Like many UNIX filesystems, ext3 represents files with a data structure called an inode. The inode contains metadata such as what user owns a file and the last time it was modified. It also contains pointers to the “blocks” where the contents of the file actually reside. When a file is deleted from disk, the blocks containing the file contents are not modified immediately; only the inode is changed. (The blocks are simply freed up to be overwritten as space is needed in the future.) On ext2 filesystems, the inode is marked as deleted, but the pointers are left intact. Ext3 actually zeroes-out the inode pointers, making it impossible to retrieve the file contents from a deleted inode.

Inode Pointer Structure

The brutish method of `grep`-ing through the disk directly works for small files where you know some unique contents (accidentally deleted configuration files, for example) because there’s a good chance that the blocks have not yet been overwritten. However, trying to restore large or non-plaintext files via this method (e.g. MySQL binary logs) is a recipe for sorrow.
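
For the record, the brute-force search looks something like this (device, search string and output path are placeholders), relying on grep's -a flag to treat the block device as text and -C to keep some surrounding context:

grep -a -C 100 'unique string from the lost file' /dev/sdb3 > /mnt/rescue/fragments.txt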

But if the block pointers are zeroed-out in the inode, how can we reconstruct the blocks into a complete file?

As you might have guessed from the title of this post, the answer is “from the journal”. By default on most modern Linux systems, ext3 is configured to log all metadata changes (like file creations and deletions) to the journal. Carlo Wood’s ext3grep utility facilitates reading these journal entries and using them to reconstruct files. Whether this will contain the block pointers we need depends on how big the journal is and how recently the files were last modified, but in our case, the database binlogs are contained there and can be reconstructed in their entirety.

Using the Journal to Restore Files

First we’ll look in the journal for deletion events. If we know the approximate time that tragedy struck our poor server, this step will be much easier. We’ll find a huge glut of deletions and look at the first set, working back until we’re confident that we know the time when the filesystem was in its happier state.

root@datarecovery:~# ext3grep /dev/sdb3 --histogram=dtime --after 1335555802 --before 1335555805
Running ext3grep version 0.10.1
Only show/process deleted entries if they are deleted on or after Fri Apr 27 15:43:22 2012 and before Fri Apr 27 15:43:25 2012.
Number of groups: 156
Minimum / maximum journal block: 1544 / 35884
Loading journal descriptors... sorting... done
The oldest inode block that is still in the journal, appears to be from 1335458082 = Thu Apr 26 12:34:42 2012
Journal transaction 19354469 wraps around, some data blocks might have been lost of this transaction.
Number of descriptors in journal: 30581; min / max sequence numbers: 19354439 / 19359432
Only show/process deleted entries if they are deleted on or after 1335555802 and before 1335555805.
Only showing deleted entries.
Fri Apr 27 15:43:22 2012  1335555802        0
Fri Apr 27 15:43:23 2012  1335555803     1575 ===================================================
Fri Apr 27 15:43:24 2012  1335555804     3029 ====================================================================================================
Fri Apr 27 15:43:25 2012  1335555805
Totals:
1335555802 - 1335555804     4604

Also note that this output tells us the oldest inode block still in the journal. We have a shot at restoring any files accessed after that time, but not necessarily things that have not been accessed after that time.

If we were looking for a particular file, we could attempt to recover it using --restore-file:

  --restore-file 'path' [--restore-file 'path' ...]
                         Will restore file 'path'. 'path' is relative to the
                         root of the partition and does not start with a '/' (it
                         must be one of the paths returned by --dump-names).
                         The restored directory, file or symbolic link is
                         created in the current directory as 'RESTORED_FILES/path'.

Be prepared to wait. This process attempts to find inodes matching the path provided and (if there are multiple matches) reason about which one to attempt restoring.
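
For example, to attempt recovery of a hypothetical MySQL binlog (the path is relative to the partition root, with no leading '/'):

root@datarecovery:~# ext3grep /dev/sdb3 --restore-file var/lib/mysql/mysql-bin.000042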

We can also have ext3grep attempt to recover all the files it can by using --restore-all:

  --restore-all          As --restore-file but attempts to restore everything.
                         The use of --after is highly recommended because the
                         attempt to restore very old files will only result in
                         them being hard linked to a more recently deleted file
                         and as such polute the output.

For example:

root@datarecovery:~# ext3grep /dev/sdb3 --restore-all --after 1335555800
Running ext3grep version 0.10.1
Only show/process deleted entries if they are deleted on or after Fri Apr 27 15:43:20 2012.
Number of groups: 156
Minimum / maximum journal block: 1544 / 35884
Loading journal descriptors... sorting... done
The oldest inode block that is still in the journal, appears to be from 1335458082 = Thu Apr 26 12:34:42 2012
Journal transaction 19354469 wraps around, some data blocks might have been lost of this transaction.
Number of descriptors in journal: 30581; min / max sequence numbers: 19354439 / 19359432
Loading sdb3.ext3grep.stage2..............................

If it works, the files will be restored to ./RESTORED_FILES/.



Masterzen’s Blog 2011-12-25 17:49:18

In the same way that I created mysql-snmp, a small Net-SNMP subagent that exports MySQL performance data through SNMP, I’m proud to announce the first release of redis-snmp to monitor Redis servers. It is also inspired by the Cacti MySQL Templates (which also cover Redis).

I originally created this Net-SNMP perl subagent to monitor some Redis performance metrics with OpenNMS.

The where

You’ll find the sources (which allow building a Debian package) in the redis-snmp github repository.

The what

Here are the kinds of graphs and metrics you can export from a Redis server:

Redis Connections

Redis Commands

Redis Memory

The how

Like mysql-snmp, you need to run redis-snmp on a host that has connectivity to the monitored Redis server (the same host makes sense). You also need the following dependencies:

  • Net-SNMP >= 5.4.2.1 (older versions contain a 64-bit varbind issue)
  • perl (tested under perl 5.10 from debian squeeze)

Once running, you should be able to ask your snmpd about redis values:

$ snmpbulkwalk -m'REDIS-SERVER-MIB' -v 2c  -c public redis-server.domain.com .1.3.6.1.4.1.20267.400
REDIS-SERVER-MIB::redisConnectedClients.0 = Gauge32: 1
REDIS-SERVER-MIB::redisConnectedSlaves.0 = Gauge32: 0
REDIS-SERVER-MIB::redisUsedMemory.0 = Counter64: 154007648
REDIS-SERVER-MIB::redisChangesSinceLastSave.0 = Gauge32: 542
REDIS-SERVER-MIB::redisTotalConnections.0 = Counter64: 6794739
REDIS-SERVER-MIB::redisCommandsProcessed.0 = Counter64: 37574019

Of course you must adjust the hostname and community. SNMP v2c (or better) is mandatory since we’re reporting 64-bit values.

Note that you will get the OID-to-name translation only if the REDIS-SERVER-MIB is installed on the host where you run the above command.

OpenNMS integration

To integrate to OpenNMS, it’s as simple as adding the following group to your datacollection-config.xml file:

<!-- REDIS-SERVER MIB -->
<group name="redis" ifType="ignore">
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.1" instance="0" alias="redisConnectedClnts" type="Gauge32" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.2" instance="0" alias="redisConnectedSlavs" type="Gauge32" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.3" instance="0" alias="redisUsedMemory" type="Gauge64" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.4" instance="0" alias="redisChangsSncLstSv" type="Gauge32" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.5" instance="0" alias="redisTotalConnectns" type="Counter64" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.6" instance="0" alias="redisCommandsPrcssd" type="Counter64" />
</group>

And the following graph definitions to your snmp-graph.properties file:

report.redis.redisconnections.name=Redis Connections
report.redis.redisconnections.columns=redisConnectedClnts,redisConnectedSlavs,redisTotalConnectns
report.redis.redisconnections.type=nodeSnmp
report.redis.redisconnections.width=565
report.redis.redisconnections.height=200
report.redis.redisconnections.command=--title "Redis Connections" \
 --width 565 \
 --height 200 \
 DEF:redisConnectedClnts={rrd1}:redisConnectedClnts:AVERAGE \
 DEF:redisConnectedSlavs={rrd2}:redisConnectedSlavs:AVERAGE \
 DEF:redisTotalConnectns={rrd3}:redisTotalConnectns:AVERAGE \
 LINE1:redisConnectedClnts#9B2B1B:"REDIS Connected Clients         " \
 GPRINT:redisConnectedClnts:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisConnectedClnts:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisConnectedClnts:MAX:"Max \\: %8.2lf %s\\n" \
 LINE1:redisConnectedSlavs#4A170F:"REDIS Connected Slaves          " \
 GPRINT:redisConnectedSlavs:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisConnectedSlavs:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisConnectedSlavs:MAX:"Max \\: %8.2lf %s\\n" \
 LINE1:redisTotalConnectns#38524B:"REDIS Total Connections Received" \
 GPRINT:redisTotalConnectns:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisTotalConnectns:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisTotalConnectns:MAX:"Max \\: %8.2lf %s\\n"

report.redis.redismemory.name=Redis Memory
report.redis.redismemory.columns=redisUsedMemory
report.redis.redismemory.type=nodeSnmp
report.redis.redismemory.width=565
report.redis.redismemory.height=200
report.redis.redismemory.command=--title "Redis Memory" \
  --width 565 \
  --height 200 \
  DEF:redisUsedMemory={rrd1}:redisUsedMemory:AVERAGE \
  AREA:redisUsedMemory#3B7AD9:"REDIS Used Memory" \
  GPRINT:redisUsedMemory:AVERAGE:"Avg \\: %8.2lf %s" \
  GPRINT:redisUsedMemory:MIN:"Min \\: %8.2lf %s" \
  GPRINT:redisUsedMemory:MAX:"Max \\: %8.2lf %s\\n"

report.redis.rediscommands.name=Redis Commands
report.redis.rediscommands.columns=redisCommandsPrcssd
report.redis.rediscommands.type=nodeSnmp
report.redis.rediscommands.width=565
report.redis.rediscommands.height=200
report.redis.rediscommands.command=--title "Redis Commands" \
 --width 565 \
 --height 200 \
 DEF:redisCommandsPrcssd={rrd1}:redisCommandsPrcssd:AVERAGE \
 AREA:redisCommandsPrcssd#FF7200:"REDIS Total Commands Processed" \
 GPRINT:redisCommandsPrcssd:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisCommandsPrcssd:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisCommandsPrcssd:MAX:"Max \\: %8.2lf %s\\n"

report.redis.redisunsavedchanges.name=Redis Unsaved Changes
report.redis.redisunsavedchanges.columns=redisChangsSncLstSv
report.redis.redisunsavedchanges.type=nodeSnmp
report.redis.redisunsavedchanges.width=565
report.redis.redisunsavedchanges.height=200
report.redis.redisunsavedchanges.command=--title "Redis Unsaved Changes" \
  --width 565 \
  --height 200 \
  DEF:redisChangsSncLstSv={rrd1}:redisChangsSncLstSv:AVERAGE \
  AREA:redisChangsSncLstSv#A88558:"REDIS Changes Since Last Save" \
  GPRINT:redisChangsSncLstSv:AVERAGE:"Avg \\: %8.2lf %s" \
  GPRINT:redisChangsSncLstSv:MIN:"Min \\: %8.2lf %s" \
  GPRINT:redisChangsSncLstSv:MAX:"Max \\: %8.2lf %s\\n"

Do not forget to register the new graphs in the report list at the top of snmp-graph.properties file.

Restart OpenNMS, and it should start graphing your redis performance metrics. You’ll find those files in the opennms directory of the source distribution.

Enjoy :)

Masterzen’s Blog 2011-12-11 10:34:00

This article is a follow-up to the previous two articles in this series on Puppet Internals:

Today we’ll cover the Indirector. I believe that by the end of this post, you’ll know exactly what the Indirector is and how it works.

The scene

The puppet source code needs to deal with lots of different abstractions to do its job. Among those abstractions you’ll find:

  • Certificates
  • Nodes
  • Facts
  • Catalogs

Each one of those abstractions can be found in the Puppet source code in the form of a model class. For instance when Puppet needs to deal with the current node, it in fact deals with an instance of the node model class. This class is called Puppet::Node.

Each model can exist physically under different forms. For instance Facts can come from Facter or a YAML file, or Nodes can come from an ENC, LDAP, site.pp and so on. This is what we call a Terminus.

The Indirector allows the Puppet programmer to deal with model instances without having to manage the gory details of where a given instance comes from or goes to.

For instance, the client call-site code to find a node is the same whether the node comes from an ENC or from LDAP, because the source is irrelevant to the client code.

Actions

So you might be wondering what the Indirector allows us to do with our models. Basically the Indirector implements a basic CRUD (Create, Retrieve, Update, Delete) system. In fact it implements 4 verbs (that map to the CRUD and REST verb sets):

  • Find: retrieves a specific instance, given its key
  • Search: retrieves a set of instances matching a search term
  • Destroy: removes a given instance
  • Save: stores a given instance

You’ll see a little bit later how it is wired, but those verbs exist as class and/or instance methods on the model classes.

So back to our Puppet::Node example, we can say this:

  # Finding a specific node
  node = Puppet::Node.find('test.daysofwonder.com')

  # here I can use node, being an instance of Puppet::Node
  puts "node: #{node.name}"

  # I can also save the given node (if the terminus allows it of course)
  # Note: save is implemented as an instance method
  node.save

  # we can also destroy a given node (if the terminus implements it):
  Puppet::Node.destroy('unwanted.daysofwonder.com')

And this works for all the managed models; I could have done the exact same code with certificates instead of nodes.

Terminii

For the Latin illiterate out-there, terminii is the latin plural for terminus.

So a terminus is a concrete class that knows how to deal with a specific model type. A terminus exists only for a given model. For instance the catalog indirection can use the Compiler or the YAML terminus, among a half-dozen available termini.

The terminus is a class that should inherit, somewhere in its class hierarchy, from Puppet::Indirector::Terminus. This last sentence might be obscure, but note that if your terminus for a given model inherits directly from Puppet::Indirector::Terminus, it is considered an abstract terminus and won’t work.

  def find(request)
    # request.key contains the instance to find
  end

  def destroy(request)
  end

  def search(request)
  end

  def save(request)
    # request.instance contains the model instance to save
  end

The request parameter used above is an instance of Puppet::Indirector::Request. This request object contains a handful of properties that might be of interest when implementing a terminus. The first one is the key method, which returns the name of the instance we want to manipulate. The other is instance, which is available only when saving and is the concrete model instance to save.

Implementing a terminus

To implement a new terminus for a given model, you need to add a Ruby file named after the terminus at puppet/indirector/<indirection>/<terminus>.rb.

For instance if we want to implement a new source of Puppet nodes, like storing node classes in DNS TXT resource records, we’d create a puppet/indirector/node/dns.rb file whose find method would ask for TXT RRs using request.key.
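
Sticking with that hypothetical DNS terminus, a sketch (untested, assuming a TXT record that simply lists class names) could look like this:

require 'resolv'
require 'puppet/node'
require 'puppet/indirector/code'

class Puppet::Node::Dns < Puppet::Indirector::Code
  desc "Build nodes from class names stored in DNS TXT resource records."

  def find(request)
    # request.key is the node name we were asked for
    classes = Resolv::DNS.open do |dns|
      dns.getresources(request.key, Resolv::DNS::Resource::IN::TXT).map(&:strings).flatten
    end
    node = Puppet::Node.new(request.key)
    node.classes = classes unless classes.empty?
    node
  end
end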

Puppet already defines some common behavior like yaml-based files, REST-based, code-based or executable-based termini. A new terminus can inherit from one of those abstract termini to reuse its behavior.

I contributed (though it hasn’t been merged yet) an OCSP system for Puppet. This one defines a new indirection: ocsp. This indirection contains two termini:

The real concrete one inherits from Puppet::Indirector::Code; it in fact delegates the OCSP request verification to the OCSP layer:

require 'puppet/indirector/ocsp'
require 'puppet/indirector/code'
require 'puppet/ssl/ocsp/responder'

class Puppet::Indirector::Ocsp::Ca < Puppet::Indirector::Code
  desc "OCSP request revocation verification through the local CA."

  def save(request)
    Puppet::SSL::Ocsp::Responder.respond(request.instance)
  end
end

It also has a REST terminus. This allows for a given implementation to talk to a remote puppet process (usually a puppetmaster) using the indirector without modifying client or server code:

require 'puppet/indirector/ocsp'
require 'puppet/indirector/rest'

class Puppet::Indirector::Ocsp::Rest < Puppet::Indirector::REST
  desc "Remote OCSP certificate REST remote revocation status."

  use_server_setting(:ca_server)
  use_port_setting(:ca_port)
end

As you can see we can do a REST client without implementing any network stuff!

Indirection creation

To tell Puppet that a given model class can be indirected, it’s just a matter of adding a little bit of Ruby metaprogramming.

To keep my OCSP system example, the OCSP request model class is declared like this:

class Puppet::SSL::Ocsp::Request < Puppet::SSL::Base
  ...

  extend Puppet::Indirector
  # this will tell puppet that we have a new indirection
  # and our default terminus will be found in puppet/indirector/ocsp/ca.rb
  indirects :ocsp, :terminus_class => :ca

  ...
end

Basically we’re saying that our model Puppet::SSL::Ocsp::Request declares an indirection ocsp, whose default terminus class is ca. That means that if we simply call Puppet::SSL::Ocsp::Request.find, the puppet/indirector/ocsp/ca.rb file will be used.

Terminus selection

There’s something I didn’t talk about: you might ask yourself how Puppet knows which terminus it should use when we call one of the indirector verbs. As seen above, if nothing is done to configure it, it will default to the terminus given in the indirects call.

But it is configurable. The Puppet::Indirector module defines the terminus_class= method. This method, when called, changes the active terminus.

For instance in the puppet agent, the catalog indirection has a REST terminus, but in the master the same indirection uses the compiler:

  # puppet agent equivalent code
  Puppet::Resource::Catalog.terminus_class = :rest

  # puppet master equivalent code
  Puppet::Resource::Catalog.terminus_class = :compiler

In fact the code is a little bit more complicated than this for the catalog but in the end it’s equivalent.

There’s also the possibility for a puppet application to specify a routing table between indirection and terminus to simplify the wiring.

More than one type of terminii

There’s something I left aside earlier. There are in fact two types of terminii per indirection:

  • regular terminus as we saw earlier
  • cache terminus

For every model class we can define the regular indirection terminus and an optional cache terminus.

Then, when finding an instance, the cache terminus is asked first. If the instance is not found in the cache (or we asked to bypass the cache), the regular terminus is used. Afterwards the instance is saved in the cache terminus.

This cache is exploited in lots of places in the Puppet code base.

Among those, the catalog cache terminus is set to :yaml on the agent. The effect is that when the agent retrieves the catalog from the master through the :rest regular terminus, it is locally saved by the yaml terminus. This way, if the next agent run fails to retrieve the catalog through REST, it will use the one locally cached during the previous run.
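
In code, this is wired the same way as the terminus selection shown earlier (2.6-style syntax, assuming the cache_class= setter that accompanies terminus_class=):

  # puppet agent equivalent code: fetch through REST, cache locally as YAML
  Puppet::Resource::Catalog.terminus_class = :rest
  Puppet::Resource::Catalog.cache_class    = :yaml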

Most of the certificate stuff is handled along the line of the catalog, with local caching with a file terminus.

REST Terminus in details

There is a direct translation between the REST verbs and the indirection verbs. Thus the :rest terminus:

  1. transforms the indirection and key into a URI: /<environment>/<indirection>/<key>
  2. does an HTTP GET|PUT|DELETE|POST depending on the indirection verb

On the server side, the Puppet network layer does the reverse, calling the right indirection methods based on the URI and the REST verb.

It’s also possible to send parameters to the indirection; with REST, those are transformed into URL query parameters.

The indirection name used in the URI is pluralized by adding a trailing ’s’ when doing a search, to be more RESTful. For example:

  • GET /production/certificate/test.daysofwonder.com is find
  • GET /production/certificates/unused is a search

When indirecting a model class, Puppet mixes in the Puppet::Network::FormatHandler module. This module allows rendering and converting an instance to and from a serialized format. The most used one in Puppet is called pson, which is in fact JSON under a disguised name.

During a REST transaction, the instance can be serialized and deserialized using this format. Each model can define its preferred serialization format (for instance catalogs use pson, but certificates prefer raw encoding).

On the HTTP level, we correctly add the various encoding headers reflecting the serialization used.

You will find a comprehensive list of all REST endpoints in Puppet here

Puppet 2.7 indirection

The syntax I used in my samples is derived from the 2.6 Puppet source. In Puppet 2.7, the dev team introduced (and is now contemplating removing) an indirection property on the model class which implements the indirector verbs (instead of them being implemented directly in the model class).

This translates to:

  # 2.6 way, and possibly 2.8 onward
  Puppet::Node.find(...)

  # 2.7 way
  Puppet::Node.indirection.find(...)

Gory details anyone?

OK, so how it works?

Let’s focus on Puppet::Node.find call:

  1. Ruby loads the Puppet::Node class
  2. When mixing in Puppet::Indirector we created a bunch of find/destroy… methods in the current model class
  3. Ruby executes the indirects call from the Puppet::Indirector module
    1. This one creates a Puppet::Indirector::Indirection stored locally in the indirection class instance variable
    2. This also registers the given indirection in a global indirection list
    3. This also registers the given default terminus class. The termini are loaded with a Puppet::Util::Autoloader through a set of Puppet::Util::InstanceLoader instances
  4. When this terminus class is loaded, since it somewhat inherits from Puppet::Indirector::Terminus, the Puppet::Indirector::Terminus#inherited Ruby callback is executed. This one, after doing a bunch of safety checks, registers the terminus class as a valid terminus for the loaded indirection.
  5. We’re now ready to really call Puppet::Node.find. find is one of the methods that we got when we mixed in Puppet::Indirector
    1. find first creates a Puppet::Indirector::Request with the given key.
    2. It then checks the cache terminus, if one has been defined. If the cache terminus finds an instance, that instance is returned
    3. Otherwise find delegates to the registered terminus, by calling terminus.find(request)
    4. If there’s a result, this one is cached in the cache terminus
    5. and the result is returned

Pretty simple, isn’t it? And that’s about the same mechanism for the three other verbs.

It is to be noted that the termini are loaded with the Puppet autoloader. That means it should be possible to add more indirections and/or termini, as long as the paths are respected and they are in the RUBYLIB. I don’t think, though, that those paths are pluginsync’ed.

Conclusion

I know that the indirector can be intimidating at first, but even without completely understanding the internals, it is quite easy to add a new terminus for a given indirection.

On the same subject, I highly recommend this presentation about Extending Puppet by Richard Crowley. This presentation also covers the Indirector.

This article will certainly close the Puppet Extension Points series. The last remaining extension type (Faces) has already been covered thoroughly on the Puppetlabs Docs site.

The next article will I think cover the full picture of a full puppet agent/master run.

sunsolve.espix.org : A new tool for Solaris sysadmins…

When the merger of Oracle and SUN became reality, we lost one of the greatest documentation portals for Solaris: Sun Solve.

As a Solaris sysadmin myself, I needed a tool to manage my daily patching and to ease searching through bugs, patches and dependencies. I also needed something that could track what I was applying to each system. After some thinking, I came up with a solution: We Sun Solve!

Indeed, I've decided not to keep my work to myself, but to share it with every Solaris sysadmin who wants to use it. Check it out! Give me feedback, ideas and anything else that comes to mind about such a portal.

You can also come and discuss with us on IRC #sunsolve @ irc.freenode.org

Generic use of chef providers in mcollective

This post follows my previous one, dealing with the reuse of Chef providers in mcollective. In the comments Adam Jacob made an interesting remark, and when I wrote my second agent, to manage packages, I saw that it would be a piece of cake to write a really generic agent, due to the nature of Chef resources (and the way they are invoked).

So, this is a generic Chef resource mcollective agent, with the associated example client code. It deserves a little explanation anyway: it is not meant to work from a command-line invocation. Why? Because I push quite “complex” data as the resourceactions parameter. The only way I found to make this work from the command line is to use eval on the argument, which is in no way acceptable. Anyway, I hope some people will find this useful.

Reusing chef providers in mcollective

It has been quite calm for a couple of months here. I have switched jobs, which explains why I had less time to post. I now work at Fotolia, and I switched from Puppet to Chef (no troll intended, I still think Puppet is a great tool, please read this).

However, a tool I still have is the awesome mcollective. Unfortunately, the most used agents (package, service) rely on Puppet providers to do their actions. Fortunately, open source is here, so I wrote a (basic) service agent that uses Chef providers to start, stop or restart a service. It still needs some polish for the status part (oh, the ugly hardcoded path), but I was quite excited to share this. Freshly pushed to github!

Thanks to Jordan Sissel for minstrel, an awesome debug tool, the opscode team for the help on the provider and R.I. Pienaar for mcollective (and the support).