Initial Thoughts on Migrating from Amazon EC2 to Rackspace Cloud

Updated: 2011-08-12

Block Devices are Tied to the Instance Type

On EC2, each instance type has a predefined CPU and memory size, but thanks to “Elastic Block Storage” (which is managed independently from the actual instance), you can make your block devices as large or as small as you want. You can also attach additional block devices as needed. This gives you a lot of flexibility to provision the appropriate resources for your specific application and to grow things as you need to. RackSpace Cloud has no EBS equivalent, so the size of your disk seems to be static and tied to the instance type. This means if you start to run out of space, you apparently have no choice but to upgrade to the next instance size, regardless of whether you actually need the additional CPU/memory. Based on a conversation I had with support, I’m guessing this has to do with the fact that all block devices are created locally on the physical VM host, rather than on a SAN. So I can definitely see how this architecture would make it difficult (or even impossible) for RackSpace to implement any of the features made possible by Amazon’s EBS.
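
As a point of comparison, growing storage on EC2 without touching the instance type is just a matter of creating and attaching another EBS volume. A rough sketch using the EC2 API tools (the volume/instance IDs and the availability zone are placeholders):

ec2-create-volume -s 50 -z us-east-1a
ec2-attach-volume vol-12345678 -i i-12345678 -d /dev/sdf

Then, on the instance itself, create a filesystem on /dev/sdf and mount it wherever you need the extra space.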

See The ability to choose amount of RAM and HD space separately on the RackSpace Cloud feedback forum.

Password Logins by Default

On EC2, one of the first things you do is set up an SSH keypair for your account. This saves you from having to set a root password for new instances. You just select the appropriate keypair when creating the new instance and log in with your SSH key. As far as I know, there is no such feature in the RackSpace Cloud. After you request a new instance, you have to wait for a randomized root password to be emailed to you. Let me repeat that in case you missed it. Your root password is emailed to you in plain text over the Internet. Hmm…

See Do not send root password by email on the RackSpace Cloud feedback forum.

Unable to Stop Instances

Yup, it’s just like being in the old days of EC2 before root EBS volumes. Once an instance is started, you can reboot or terminate it, but you can’t actually stop it to save money. At my previous company, part of our continuous deployment process was to automatically spin up a staging environment to test new code before actually deploying it into production. We also had a dedicated testing environment which we would spin up on demand for testing various things. Traditionally, it was very expensive to run duplicate (or triplicate) environments for testing, but EC2 makes this sort of thing trivially inexpensive, since the instances don’t actually have to be running most of the time. I don’t think something like this would be feasible in the RackSpace Cloud, because constantly terminating and rebuilding every instance in every environment would make things a lot slower and more difficult to manage in general. I realize the process could be sped up a bit by creating a bunch of VM images, but I don’t even want to get started on why I hate that idea. Configuration management has made images obsolete as far as I’m concerned.
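
For comparison, parking an EBS-backed environment on EC2 is a one-liner in each direction (the instance IDs are placeholders):

ec2-stop-instances i-12345678 i-87654321
ec2-start-instances i-12345678 i-87654321

The instances keep their root volumes and configuration, so the next test run picks up more or less where the last one left off.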

See Need option to suspend servers to save money on the RackSpace Cloud feedback forum.

No Concept of Security Groups

I guess I just got used to the peace and security of EC2 security groups, because I took it for granted that RackSpace Cloud would have something similar. So boy was I surprised when I discovered that my first new instances were essentially sitting wide open on the Internet! Now if you’re using a configuration management system, it’s not a huge deal to set up a local firewall on all your instances. But it can definitely be scary, because the lack of real console access in the cloud means there’s a very real possibility that you could accidentally lock yourself out of an instance while testing new firewall rules.
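
One trick that takes some of the fear out of testing firewall rules remotely is to schedule an automatic flush before you apply anything, then cancel it once you’ve confirmed you can still get in. A sketch (assumes atd is running; the rules file path is just an example):

echo "/sbin/iptables -F" | at now + 5 minutes
/sbin/iptables-restore < /etc/iptables.rules
# still able to open a new SSH session? then cancel the safety net:
atq
atrm <job_number>

If the new rules do lock you out, the at job flushes them a few minutes later and you’re back in.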

See Create EC2-like security groups, so you don’t have to configure iptables for each instance on the RackSpace Cloud feedback forum.

DNS Annoyances

One of the nice things about the way DNS is configured on EC2 is that when you resolve a public hostname from an instance, you’ll actually get the internal IP address. This means you can use your public hostnames everywhere, and everything will continue to work just fine. Since DNS doesn’t work this way in RackSpace, things just get a bit more complicated, but again, this is mostly just an annoyance to me right now.
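
To illustrate (the hostname and addresses below are made up), resolving an instance’s public DNS name from another instance returns the private address, while the same query from outside EC2 returns the public one:

# from inside EC2
dig +short ec2-203-0-113-10.compute-1.amazonaws.com
10.1.2.3

# from anywhere else
dig +short ec2-203-0-113-10.compute-1.amazonaws.com
203.0.113.10

RackSpace gives you no equivalent split view, so you end up juggling separate internal and external names (or addresses) yourself.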

Unable to Change Filesystem

The default filesystem on Ubuntu is EXT3. Want to convert to EXT4 in order to (for example) run MongoDB according to 10gen’s official recommendations? Oops, too bad.

Cloud Load Balancers Do Not Support SSL Termination

In EC2, it’s possible to upload your SSL certificates to an Elastic Load Balancer (ELB) and have your SSL connections terminate right there (i.e. accept and decrypt SSL traffic on the ELB and forward it in plain text to the back end).

             |         (HTTPS)
       +-----+-----+
       | Amazon ELB|
       +-----+-----+
             |         (HTTP)
      +------+------+
      |             |
+-----+-----+ +-----+-----+
|   app01   | |   app02   |
+-----------+ +-----------+

Offloading that work to the ELB is nice, but SSL termination becomes (almost) a necessity if you have something like HAProxy or Varnish in front of your application servers (HAProxy and Varnish will not be able to read your SSL-encrypted traffic, and therefore will not be able to make decisions based on the requested URL, headers, etc.). This means you’ll have to stick something like stunnel between the RackSpace load balancer and HAProxy/Varnish/whatever to handle the SSL decryption.
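
A minimal stunnel configuration for that setup might look something like this (the certificate path and the local port HAProxy/Varnish listens on are assumptions):

; /etc/stunnel/stunnel.conf
; terminate SSL on 443 and hand the decrypted HTTP to whatever listens on 8080
cert = /etc/stunnel/example.com.pem

[https]
accept = 443
connect = 127.0.0.1:8080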

See Support SSL termination on Cloud Load Balancers on the RackSpace Cloud feedback forum.

Cloud Load Balancers Do Not Support X-Forwarded-For, X-Forwarded-Port, or X-Forwarded-Proto Headers

These are pretty important (especially X-Forwarded-For) if you want to know anything about the clients connecting to your servers. Not having them means all your HTTP requests will appear to come from your load balancer, which is essentially useless. RackSpace support told me X-Forwarded-For would be available in Q3 of this year, and that X-Cluster-Client-Ip can be used in the meantime (though it appears that X-Cluster-Client-Ip still isn’t sent with HTTPS requests!), but there are apparently no plans to support X-Forwarded-Port or X-Forwarded-Proto.
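
If you’re running Apache behind the load balancer, one stopgap is to log the X-Cluster-Client-Ip header in place of the remote address (a sketch; adjust the format and log path to taste):

LogFormat "%{X-Cluster-Client-Ip}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_lb
CustomLog logs/access_log combined_lb

Just remember that, as noted above, the header apparently isn’t sent for HTTPS requests, so those log lines will show “-” for the client.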

See add the x-forwarded-for header to traffic from your cloud load balancer. on the RackSpace Cloud feedback forum.

HTTPS Health Checks on Cloud Load Balancers Occur in Plain Text

How on Earth did this get past QA? Basically what this means is if you set up an HTTPS load balancer (e.g. listening on port 443 and forwarding to 443 on the backend), and you set up an HTTPS health check from the load balancer (i.e. to check the HTTPS version of your site at https://host.example.com/health), you’ll discover that the load balancer essentially makes requests for http://host.example.com:443/health, which will obviously never work, and will result in the load balancer removing all of your instances from rotation. The only workaround is to use the CONNECT health check method, which can only ensure that a port is listening.

I reported this bug to RackSpace support, and a fix is forthcoming, but I haven’t seen an ETA yet.

Conclusion

Based on what I’ve seen so far, I don’t think RackSpace’s Cloud offering even comes close to Amazon’s right now in terms of features and flexibility. EC2 feels to me like something that was designed from the ground up to be essentially “programmable infrastructure,” whereas RackSpace Cloud feels more like a thin wrapper around a Xen or VMware cluster. I fully admit that I’ve only been using it for a couple of weeks at this point, so I could be totally missing things, in which case I would love to get some feedback on some of the issues I’ve raised above.

One thing I think RackSpace does have over Amazon is the ability to mix virtual instances with physical servers. I could definitely see the value in, for example, running some application servers in the cloud for flexibility and running your database on physical hardware for performance (I think the problems with EBS’s IO are pretty well known at this point).

High-Availability Load Balancing with HAProxy and Amazon Elastic Load Balancers on EC2

For a while now, Amazon has offered a simple load balancing solution called Elastic Load Balancing (ELB). For simple sites, this avoids the need to run dedicated load balancer instances. Unfortunately, Amazon’s ELB solution is fairly limited feature-wise, so you may be forced to run your own load balancer instance anyway. Below are some of the limitations I ran into at SocialMedia.com:

  • No ACL feature. This means it’s impossible to make forwarding decisions to back end servers based on URLs/headers/etc.
  • Once the ELB is created, it’s impossible to change the port configuration. If you ever need to do this, you’ll need to create a brand new ELB and update the appropriate DNS records to point to the new ELB. Annoying. (Update: It appears that there may be a way to change the port configuration after all…)
  • More dynamic/automated environments need to be careful when managing instances that are behind an ELB. I found that it’s not a good idea to simply stop your load-balanced instances without de-registering them from the ELB first. It’s also important to ensure that your instances are in a “running” state before registering them with an ELB. You can read why in a thread I started on the AWS Developer Forum.

My biggest issue was the first one. For one of my latest projects at SocialMedia.com, I needed the ability to accept all connections on the same external port and redirect them to different internal ports depending on the request. For example, a request for http://api.example.com/services/foo/v1/something needed to get forwarded to port 123, whereas a request for http://api.example.com/services/bar/v1/something needed to get forwarded to port 456. You can do this easily with HAProxy ACLs.

HAProxy ACLs

First I define a “frontend” section. This frontend listens on external port 80 and contains two ACLs, one for each of the requests in my example above. The first acl line defines an ACL called services_foo_v1 which will match any request with a path that starts with /services/foo/v1/. The use_backend line right after it forwards all requests that match services_foo_v1 to the “backend” of the same name. The default_backend line simply forwards unknown requests to the “website” backend:

frontend http
  bind *:80
 
  acl services_foo_v1 path_reg ^/services/foo/v1/
  use_backend services_foo_v1 if services_foo_v1
 
  acl services_bar_v1 path_reg ^/services/bar/v1/
  use_backend services_bar_v1 if services_bar_v1
 
  default_backend website

Next I define my “backend” sections. Each backend contains a list of servers/ports that are able to handle certain requests. The reqrep lines are a bit of magic that you may or may not need for your environment. They ensure that (for example) the original request for /services/foo/v1/something on the external interface gets rewritten to /something before being passed on to the backend servers.

backend services_foo_v1
  reqrep ^([^\ ]*)\ /services/foo/v1/(.*) \1\ /\2
  server app01 app01.example.com:123
  server app02 app02.example.com:123
  server app03 app03.example.com:123
 
backend services_bar_v1
  reqrep ^([^\ ]*)\ /services/bar/v1/(.*) \1\ /\2
  server app01 app01.example.com:456
  server app02 app02.example.com:456
  server app03 app03.example.com:456
 
backend website
  server webserver www.example.com:80

HAProxy High Availability on EC2

In a normal (non-EC2) environment, high-availability is achieved by running two HAProxy instances with a shared IP address and a heartbeat protocol between the instances. The idea is that if one HAProxy instance goes down, the other will simply take over the shared IP address. Unfortunately, it’s just not possible to share private IP addresses like this in EC2. So what other options are there?

The wrong solution is to use round-robin DNS records to distribute traffic between the two load balancer instances:

api.example.com.	300	IN	A	1.2.3.4
api.example.com.	300	IN	A	5.6.7.8

This will sort-of work while both instances are running, but if one goes down, half of your traffic will be sent to a dead load balancer. Remember kids, round-robin DNS records are not a high availability solution. ;-)

Other people have suggested using an Amazon elastic IP in conjunction with the load balancer instances. The idea is to detect the failure of one of the instances (via your existing monitoring system, etc.) and automatically reassign the elastic IP to the other instance. Although this solution sounds simple enough, uptime is important enough to my company that I don’t really trust myself to make this a totally automated and 100% foolproof process. It’s the kind of thing I just don’t want to have to worry about.

Fortunately, there’s another much simpler solution. Just stick an ELB in front of your HAProxy instances:

             |
       +-----+-----+
       | Amazon ELB|
       +-----+-----+
             |
      +------+------+-------------+---------+---------+
      |             |             |         |         |
+-----+-----+ +-----+-----+   +---+---+ +---+---+ +---+---+
| haproxy01 | | haproxy02 |   | app01 | | app02 | | app03 |
+-----------+ +-----------+   +-------+ +-------+ +-------+

Elastic load balancers already have built-in redundancy (a single ELB instance is actually backed by a pool of several load balancers which automatically grows and shrinks according to current load), so we don’t have to worry much about that. Then we can stick each HAProxy instance in its own EC2 availability zone to guard against internal EC2 network issues. Now, assuming that all HAProxy instances are configured identically (synchronized via Chef, of course ;-), either instance can go down and it won’t matter, because the ELB will simply route traffic to the remaining live instances. Another nice thing about this solution is that both HAProxy instances will be handling requests at the same time (as opposed to having a backup that is only used during emergencies). This means you get a bit of additional capacity in addition to your redundancy. Obviously, though, you’ll want to keep an eye on the total load across all HAProxy instances to ensure that you always have enough spare capacity to survive a failure.
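
If you’re using the ELB command line tools, setting this up is only a couple of commands (the load balancer name, zones, and instance IDs below are placeholders):

elb-create-lb haproxy-lb --listener "lb-port=80,instance-port=80,protocol=http" --availability-zones us-east-1a,us-east-1b
elb-register-instances-with-lb haproxy-lb --instances i-11111111,i-22222222

elb-create-lb returns the DNS name of the new ELB, which is what your site’s CNAME record should point at.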

Building CentOS 5 images for EC2

I had a need to create some CentOS 5 hosts on Amazon’s EC2 platform, and while there’s nothing stopping you from reusing a pre-built AMI, it’s always handy to know how these things are built from scratch.

I had a few basic requirements:

  • I’ll be creating various sizes of EC2 instances, so both i386 and x86_64 AMIs are required.
  • Preferably boot the native CentOS kernel rather than use the generic EC2 kernel as I know the CentOS-provided Xen kernel JFW.

You’ll need the following:

  • An existing Intel/AMD Linux host. This should be running CentOS, Fedora, RHEL, or anything else, as long as it ships a usable yum(8). It should also be an x86_64 host if you’re planning on building for both architectures, and it should have around 6GB of free disk space.
  • Amazon AWS account with working Access Key credentials for S3 and a valid X.509 certificate & private key pair for EC2.
  • The EC2 AMI tools and the EC2 API tools installed and available in your $PATH.
  • A flask of weak lemon drink.

Let’s assume you’re working under /scratch. You’ll need to first create a directory to hold your root filesystem, along with a couple of directories within it, ahead of installing anything:

# mkdir -p /scratch/ami/{dev,etc,proc,sys}

The /dev directory needs a handful of devices creating:

# MAKEDEV -d /scratch/ami/dev -x console
# MAKEDEV -d /scratch/ami/dev -x null
# MAKEDEV -d /scratch/ami/dev -x zero

A minimal /etc/fstab needs to be created:

/dev/sda1               /                       ext3    defaults        1 1
tmpfs                   /dev/shm                tmpfs   defaults        0 0
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
sysfs                   /sys                    sysfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0

There are other partitions that will be available to your instance when it is up and running, but they vary between instance types; this is the bare minimum that is required and should work for all instance types. If you want to add the additional partitions here, refer to the Instance Storage Documentation. I will instead use Puppet to set up any additional partitions after the instance is booted.

/proc and /sys should also be mounted inside your AMI root:

# mount -t proc proc /scratch/ami/proc
# mount -t sysfs sysfs /scratch/ami/sys

Create a custom /scratch/yum.cfg which will look fairly similar to the one your host system uses:

[main]
cachedir=/var/cache/yum
keepcache=0
debuglevel=2
logfile=/var/log/yum.log
distroverpkg=redhat-release
tolerant=1
exactarch=1
obsoletes=1
gpgcheck=0
plugins=1
reposdir=/dev/null
 
# Note: yum-RHN-plugin doesn't honor this.
metadata_expire=1h
 
# Default.
# installonly_limit = 3
 
[centos-5]
name=CentOS 5 - Base
baseurl=http://msync.centos.org/centos-5/5/os/$basearch/
enabled=1
 
[centos-5-updates]
name=CentOS 5 - Updates
baseurl=http://msync.centos.org/centos-5/5/updates/$basearch/
enabled=1
 
[centos-5-epel]
name=Extra Packages for Enterprise Linux 5 - $basearch
baseurl=http://download.fedora.redhat.com/pub/epel/5/$basearch/
enabled=1

Notably, the gpgcheck directive is disabled, and reposdir is pointed somewhere with no .repo files so that no additional repositories are picked up; otherwise you’ll scoop up any repositories configured on your host system. By making use of the $basearch variable in the URLs, this configuration should work for both i386 and x86_64.

If you have local mirrors of the package repositories, alter the file to point at them and be a good netizen. You will need to make sure that your base repository has the correct package groups information available. Feel free to also add any additional repositories.

You’re now ready to install the bulk of the Operating System. If the host architecture and target architecture are the same, you can just do:

# yum -c /scratch/yum.cfg --installroot /scratch/ami -y groupinstall base core

If however you’re creating an i386 AMI on an x86_64 host, you need to use the setarch(8) command to prefix the above command like so:

# setarch i386 yum -c /scratch/yum.cfg --installroot /scratch/ami -y groupinstall base core

This mostly fools yum and any child commands into thinking the host is i386; without it, you’ll just get another x86_64 image. Sadly, you can’t do the reverse to build an x86_64 AMI on an i386 host.

This should give you a fairly minimal yet usable base; however, it won’t have the right kernel installed, so do the following to remedy this:

# yum -c /scratch/yum.cfg --installroot /scratch/ami -y install kernel-xen
# yum -c /scratch/yum.cfg --installroot /scratch/ami -y remove kernel

(Remember to use setarch(8) again if necessary)

You can also use variations of the above commands to add or remove additional packages as you see fit.

All that’s required now is a bit of manual tweaking here and there. Firstly, you need to set up the networking, which on EC2 is simple: one interface using DHCP. Create /etc/sysconfig/network-scripts/ifcfg-eth0:

DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
TYPE=Ethernet
USERCTL=yes
PEERDNS=yes
IPV6INIT=no

And also /etc/sysconfig/network:

NETWORKING=yes

The networking still won’t work without the correct kernel module(s) being loaded, so create /etc/modprobe.conf with the following:

alias eth0 xennet
alias scsi_hostadapter xenblk

The second module lets the instance see the various block devices, just as the first fixes the networking. The ramdisk for the kernel now needs to be updated so it knows to pull in these two modules and load them at boot; for this you need to know the version of the kernel installed. You can find this a number of ways, but the easiest is to just look at the /boot directory:

# ls -1 /scratch/ami/boot
config-2.6.18-164.15.1.el5xen
grub
initrd-2.6.18-164.15.1.el5xen.img
message
symvers-2.6.18-164.15.1.el5xen.gz
System.map-2.6.18-164.15.1.el5xen
vmlinuz-2.6.18-164.15.1.el5xen
xen.gz-2.6.18-164.15.1.el5
xen-syms-2.6.18-164.15.1.el5

In this case the version is “2.6.18-164.15.1.el5xen”. Using this, we need to run mkinitrd(8), but we also need to use chroot(1) to run the command as installed in your new filesystem, using that filesystem as its /; otherwise it will attempt to overwrite bits of your host system. So, something like the following:

# chroot /scratch/ami mkinitrd -f /boot/initrd-2.6.18-164.15.1.el5xen.img 2.6.18-164.15.1.el5xen

No /etc/hosts file is created so it’s probably a good idea to create one of those:

127.0.0.1	localhost.localdomain localhost

SELinux will be enabled by default, and although your instance will boot, you won’t be able to log in, so the easiest thing is to just disable it entirely by editing /etc/selinux/config so it looks like this:

# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#	enforcing - SELinux security policy is enforced.
#	permissive - SELinux prints warnings instead of enforcing.
#	disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of these two values:
#	targeted - Only targeted network daemons are protected.
#	strict - Full SELinux protection.
#	mls - Multi Level Security protection.
SELINUXTYPE=targeted 
# SETLOCALDEFS= Check local definition changes
SETLOCALDEFS=0

You could also disable it with the correct kernel parameter at boot time. There may be a way to allow SELinux to work; it may just be that the filesystem needs relabelling, which you can force on the first boot by creating an empty /scratch/ami/.autorelabel file. I’ll leave that as an exercise for the reader, or for myself when I’m bored enough.
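
If you do want to experiment with the relabelling route, creating the marker file is just:

# touch /scratch/ami/.autorelabel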

Now we need to deal with how to boot the native CentOS kernel. Amazon don’t allow you to upload your own kernels or ramdisks to boot with your instances, so how do you do it? Apart from their own kernels, they now provide a PV-GRUB kernel image which, when it boots, behaves just like the regular GRUB bootloader: it reads your instance filesystem for a grub.conf, uses that to select the kernel, and then loads the kernel and the accompanying ramdisk from your instance filesystem.

We don’t need to install any boot blocks but we will need to create a simple /boot/grub/grub.conf using the same kernel version we used when recreating the ramdisk:

default=0
timeout=5
title CentOS (2.6.18-164.15.1.el5xen)
	root (hd0)
	kernel /boot/vmlinuz-2.6.18-164.15.1.el5xen ro root=/dev/sda1
	initrd /boot/initrd-2.6.18-164.15.1.el5xen.img

If we install any updated kernels, they should automatically manage this file for us and insert their own entries; we just need to create it once.

To match what you normally get on a regular CentOS host, a couple of symlinks should also be created:

# ln -s grub.conf /scratch/ami/boot/grub/menu.lst
# ln -s ../boot/grub/grub.conf /scratch/ami/etc/grub.conf

When you create an EC2 instance you have to specify an existing SSH keypair created within EC2, which you should be able to use to log into the instance. This is accomplished by the usual practice of copying the public part of the key into /root/.ssh/authorized_keys. I initially thought that was magic that Amazon did for you, but they don’t; you need to do it yourself.

When the instance is booted, the public part of the key (as well as various other bits of metadata) is available at the URL http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key, so the easiest thing to do is add the following to /etc/rc.d/rc.local:

#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
 
touch /var/lock/subsys/local
 
if [ ! -d /root/.ssh ] ; then
        mkdir -p /root/.ssh
        chmod 700 /root/.ssh
fi
 
/usr/bin/curl -f http://169.254.169.254/latest/meta-data/public-keys/0/openssh-key > /root/.ssh/authorized_keys
 
chmod 600 /root/.ssh/authorized_keys

You can be more elaborate if you want, but this is enough to allow you to log in with the SSH key. One thing I found was that because the firstboot service started at boot, it sat for a while asking if you wanted to do any firstboot-y things, which delayed the rc.local hack from running until it timed out; Amazon would say your instance was running, but you couldn’t SSH in for a minute or two. The easiest thing is to disable firstboot:

# chroot /scratch/ami chkconfig firstboot off

You can also use this to disable more services; there are a few enabled by default that are arguably useless in an EC2 instance, but they won’t break anything if you leave them enabled. You can also enable other services if you installed any additional packages.
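
For example, something along these lines (these are just suggestions; skip any service that isn’t actually installed in your image):

# chroot /scratch/ami chkconfig cups off
# chroot /scratch/ami chkconfig bluetooth off
# chroot /scratch/ami chkconfig yum-updatesd off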

Finally, if you need to play about with the image you can just do the following:

# chroot /scratch/ami

(Remember to use setarch(8) again if necessary)

This gives you a shell inside your new filesystem if you need to tweak anything else. One thing I found necessary was to set your /etc/passwd file up correctly and optionally set a root password, which you could use instead of the SSH key.

# pwconv
# passwd
Changing password for user root.
New UNIX password: ********
Retype new UNIX password: ********
passwd: all authentication tokens updated successfully.

If you want to use any RPM-based commands while you’re inside this chroot and you created an i386 image on an x86_64 host, you may get the following error:

# rpm -qa
rpmdb: Program version 4.3 doesn't match environment version
error: db4 error(-30974) from dbenv->open: DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages index using db3 -  (-30974)
error: cannot open Packages database in /var/lib/rpm

This is because some of the files are architecture-dependent, and despite using setarch(8) they still get written in x86_64 format. It’s a simple fix:

# rm -f /scratch/ami/var/lib/rpm/__db*

If you query the RPM database inside the chroot and then want to install or remove more packages with yum outside the chroot, you will need to do the above again.

Before you start to package up your new image, there’s one small bit of cleanup. For some reason yum creates some transaction files that you’ll notice are under /scratch/ami/scratch/ami/var/lib/yum; I couldn’t work out how to stop it making those, so you just need to blow that directory away:

# rm -rf /scratch/ami/scratch

You should also unmount the /proc and /sys filesystems you mounted before installing packages:

# umount /scratch/ami/proc
# umount /scratch/ami/sys

Right, you’re now ready to package up your new image.

The first thing is to bundle the filesystem, which will create one big image file, chop it up into small ~10MB pieces, and then create an XML manifest file that ties it all together. You will need your AWS user ID for this part, which you can find in your AWS account:

# ec2-bundle-vol -c <certificate_file> -k <private_keyfile> -v /scratch/ami -p centos5-x86_64 -u <user_id> -d /scratch -r x86_64 --no-inherit

This can take a few minutes to run. Remember to also set the architecture appropriately.

The next step is to upload the manifest and image pieces, either to an existing S3 bucket that you own or to a new bucket that will be created:

# ec2-upload-bundle -m /scratch/centos5-x86_64.manifest.xml -b <bucket> -a <access_key> -s <secret_key>

This part has the longest wait, depending on how fast your internet connection is; you’ll be uploading around 330MB per image. If you seriously made a flask of weak lemon drink, I’d drink most of it now.

Once that finishes, the final step is to register the uploaded files as an AMI, ready to create instances from it. Before we do that, though, we need to find the correct AKI to boot it with. There should be four AKIs available in each location, two for each architecture, which differ in how they try to find the grub.conf on the image. One treats the image as one big filesystem with no partitioning; the other assumes the image is partitioned and that grub.conf is on the first partition.

List all of the available AKIs with the following:

# ec2-describe-images -C <certificate_file> -K <private_keyfile> -o amazon | grep pv-grub
IMAGE	aki-407d9529	ec2-public-images/pv-grub-hd0-V1.01-i386.gz.manifest.xml	amazon	available	public		i386	kernel				instance-store
IMAGE	aki-427d952b	ec2-public-images/pv-grub-hd0-V1.01-x86_64.gz.manifest.xml	amazon	available	public		x86_64	kernel				instance-store
IMAGE	aki-4c7d9525	ec2-public-images/pv-grub-hd00-V1.01-i386.gz.manifest.xml	amazon	available	public		i386	kernel				instance-store
IMAGE	aki-4e7d9527	ec2-public-images/pv-grub-hd00-V1.01-x86_64.gz.manifest.xml	amazon	available	public		x86_64	kernel				instance-store

Assuming this is still an x86_64 image, the AKI we want is aki-427d952b. More documentation about these is available here.

Now we can register the AMI like so:

# ec2-register -C <certificate_file> -K <private_keyfile> -n centos5-x86_64 -d "CentOS 5 x86_64" --kernel aki-427d952b <bucket>/centos5-x86_64.manifest.xml
IMAGE   ami-deadbeef

The output is our AMI id which we’ll use for creating instances. If you haven’t already created an SSH keypair, do that now:

# ec2-add-keypair -C <certificate_file> -K <private_keyfile> <key_id>

This returns the private key portion of the SSH keypair, which you need to save and keep safe; there’s no way of retrieving it if you lose it.

Finally create an instance using the AMI id we got from ec2-register along with the id of a valid SSH keypair:

# ec2-run-instances -C <certificate_file> -K <private_keyfile> -t m1.large -k <key_id> ami-deadbeef

This will return information about your new instance which will initially be in the pending state. Periodically run the following:

# ec2-describe-instances -C <certificate_file> -K <private_keyfile>

Once your instance is in the running state, you should be able to see the hostname and IP that has been allocated and you can now SSH in using the private key you saved from ec2-add-keypair.

When you’ve finished with the instance, you can terminate it as usual with ec2-terminate-instances.

If you no longer need the AMI image or wish to change it in some way, you need to first deregister it with:

# ec2-deregister -C <certificate_file> -K <private_keyfile> ami-deadbeef

Then remove the bundle from the S3 bucket:

# ec2-delete-bundle -b <bucket> -a <access_key> -s <secret_key> -m /scratch/centos5-x86_64.manifest.xml

Then in the case of making changes, repeat the steps from ec2-bundle-vol onwards.

Provision to cloud in 5 minutes using fog

Most recently I have been working on a disaster recovery project where we are assembling documentation, processes, and code to be able to fire up our whole environment in the cloud in case of a major disaster. At Velocity Conference I met Wesley Beary, who is the main developer of fog, a Ruby cloud computing library. What appealed to me about fog is that it has varying support for different clouds, so we are not stuck with one provider due to non-portable code. Now, on to a couple of quick examples to get you going.

To install fog you will need RubyGems. If you have it installed, type

  sudo gem install fog

The install may fail if you don't have the libxslt and libxml2 dev libraries. On my Ubuntu laptop I resolved it by doing

  sudo apt-get install libxslt1-dev libxml2-dev

On CentOS/RHEL 5 I had to do

   yum install libxslt-devel libxml2-devel

Create a file called config.rb which contains your credentials, e.g.

#!/usr/bin/ruby

@aws_access_key_id = "XXXXXXXXXXXXXXXXXX"
@aws_secret_access_key = "AXXZZZZZZZZZZZZZZZZZZ"
@aws_region = "us-east-1"

Let's start with the basics. Let's get our currently running instances and see what images are available:

#!/usr/bin/ruby

require 'rubygems'
require 'fog'

# Import EC2 credentials e.g. @aws_access_key_id and @aws_secret_access_key
require './config.rb'

# Set up a connection
connection = Fog::AWS::EC2.new(
    :aws_access_key_id => @aws_access_key_id,
    :aws_secret_access_key => @aws_secret_access_key )

# Get a list of all the running servers/instances
instance_list = connection.servers.all

num_instances = instance_list.length
puts "We have " + num_instances.to_s()  + " servers"

# Print out a table of instances with choice columns
instance_list.table([:id, :flavor_id, :ip_address, :private_ip_address, :image_id ])

###################################################################
# Get a list of our images
###################################################################
my_images_raw = connection.describe_images('Owner' => 'self')
my_images = my_images_raw.body["imagesSet"]

puts "\n###################################################################################"
puts "Following images are available for deployment"
puts "\nImage ID\tArch\t\tImage Location"

#  List image ID, architecture and location
for key in 0...my_images.length
  print my_images[key]["imageId"], "\t" , my_images[key]["architecture"] , "\t\t" , my_images[key]["imageLocation"],  "\n";
end

Let's spin up an m1.large instance

#!/usr/bin/ruby
require 'rubygems'
require 'fog'
# Import EC2 credentials e.g. @aws_access_key_id and @aws_secret_access_key
require './config.rb'

# Set up a connection
connection = Fog::AWS::EC2.new(
 :aws_access_key_id => @aws_access_key_id,
 :aws_secret_access_key => @aws_secret_access_key )

server = connection.servers.create(:image_id => 'ami-1234567',
 :flavor_id =>  'm1.large')

# wait for it to be ready to do stuff
server.wait_for { print "."; ready? }

puts "Public IP Address: #{server.ip_address}"
puts "Private IP Address: #{server.private_ip_address}"

This may take a while, so please be patient. You could obviously spin up a number of these instances without waiting for any of them to become available, then use connection.servers.all to get a list of running instances.
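
As a sketch of that idea (the AMI id is a placeholder, as before):

# fire off several instances without blocking on any of them
servers = (1..3).map do
  connection.servers.create(:image_id => 'ami-1234567', :flavor_id => 'm1.large')
end

# check back later to see what's running
connection.servers.all.table([:id, :state, :ip_address])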

Now let's destroy a running instance

#!/usr/bin/ruby
require 'rubygems'
require 'fog'
# Import EC2 credentials e.g. @aws_access_key_id and @aws_secret_access_key
require './config.rb'

# Set up a connection
connection = Fog::AWS::EC2.new(
    :aws_access_key_id => @aws_access_key_id,
    :aws_secret_access_key => @aws_secret_access_key )

instance_id = "i-12345678"

server = connection.servers.get(instance_id)

puts "Flavor: #{server.flavor_id}"
puts "Public IP Address: #{server.ip_address}"
puts "Private IP Address: #{server.private_ip_address}"

server.destroy

There is tons more out there, but this gets me going :-). Now off to playing with R.I. Pienaar's ec2-boot-init.

Thanks to Wesley Beary for answering questions about fog and Ian Meyer for pointing out Chef Fog code.

For reference, here is the complete listing script again, with an extra section at the end that prints the available instance flavors:

#!/usr/bin/ruby

require 'rubygems'
require 'fog'
require 'pp'

# Import EC2 credentials e.g. @aws_access_key_id and @aws_secret_access_key
require './config.rb'

# Set up a connection
connection = Fog::AWS::EC2.new(
    :aws_access_key_id => @aws_access_key_id,
    :aws_secret_access_key => @aws_secret_access_key )

# Get a list of all the running servers/instances
instance_list = connection.servers.all

num_instances = instance_list.length
puts "We have " + num_instances.to_s()  + " servers"

# Print out a table of instances with choice columns
instance_list.table([:id, :flavor_id, :ip_address, :private_ip_address, :image_id ])

###################################################################
# Get a list of our images
###################################################################
my_images_raw = connection.describe_images('Owner' => 'self')

my_images = my_images_raw.body["imagesSet"]

puts "\n###################################################################################"
puts "Following images are available for deployment"
puts "\nImage ID\tArch\t\tImage Location"

for key in 0...my_images.length
print my_images[key]["imageId"], "\t" , my_images[key]["architecture"] , "\t\t" , my_images[key]["imageLocation"],  "\n";
end

###################################################################
# Get a list of all instance flavors
###################################################################
flavors = connection.flavors()

print "\n\n============\nFlavors\n============\n"
#flavors.table([:bits, :cores, :disk, :ram, :name])
flavors.table

dynect4r: A Ruby Library and Command Line Client for the Dynect REST API (Version 2)

Well, I should have listened to everyone who warned me about UltraDNS’s obscene prices. But I figured it’s only DNS, so how much more could they be compared to their competition? $50 per month? Maybe $100? Boy was I surprised to find out that UltraDNS’s prices are literally 10-25 times more than everyone else’s! Hilarious…

I’ve actually been a DynDNS customer since the late nineties or so (I have free custom DNS service for life for making a donation to them back when they were a much smaller company), so I had looked at Dyn.com’s products before. I just must have gotten confused with all their different websites and DNS products, because I somehow got the impression that the DynDNS API wasn’t powerful enough to do what I wanted to do. I was absolutely wrong. After having written command line clients for both APIs (see ultradns4r, and now dynect4r), I think I speak from authority when I say the Dynect API is every bit as powerful as UltraDNS’s. And at 1/10th to 1/25th the cost of UltraDNS, going with Dynect is a no-brainer. But I’ve digressed long enough.

I wrote dynect4r for the same reason I wrote ultradns4r; I wanted to be able to manage all my DNS records via the command line. And now that I’ve learned how to package Ruby projects as gems, you can simply…

gem install dynect4r

and then do things like…

dynect4r-client -n test.example.org 1.1.1.1

Since the key feature of this project is the command line client, the actual library behind it is a pretty simple wrapper around rest-client. If you’re looking for something a bit more powerful to use in your own Ruby projects, you may be interested in dynect_rest by Adam Jacob from Opscode. We actually discovered each other’s projects last night in #chef, and realized that it would probably be a good idea to pool our efforts eventually.

ultradns4r: A Ruby Library and Command Line Client for the Neustar UltraDNS SOAP API

I just spent the last week or so learning more about SOAP than I ever wanted to know. ;-) Fortunately, that hard work resulted in something that might benefit the EC2 community.

I am pleased to announce ultradns4r: a Ruby library and command line client for the Neustar UltraDNS SOAP API. I created this tool to alleviate the pain of dealing with EC2’s dynamic hostnames and IP addresses. Since it allows editing of arbitrary DNS records via the command line, it can be used to make EC2 instances update their own DNS records.

Lessons Learned

  • If you need to do SOAP in Ruby, just use Savon. Trust me on this.
  • WSSE authentication is a complete pain in the ass. I was unable to find any SOAP library (Ruby, Python, or Perl) that could authenticate with the UltraDNS API servers out of the box (Savon included). I ended up having to build the entire WSSE header manually in order to generate the exact XML needed.
  • Apparently, element order sometimes matters with SOAP! This is something I never expected, considering that (last I knew) the XML spec does not even allow you to enforce element or attribute order. This was also the cause of a lot of my WSSE problems. I found that for example, if the Password element came after the Nonce and Created elements (which is the case if you use Savon’s built-in WSSE authentication), then authentication would fail. Could this have been the result of having a buggy XML parser on the server side? In any case, this is one of those annoying issues to be aware of.
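
For the curious, the basic shape of a hand-rolled WSSE UsernameToken looks something like the sketch below. This is a simplified illustration of the approach, not the exact XML ultradns4r generates, and the credentials are obviously placeholders. The digest is Base64(SHA1(nonce + created + password)) per the WSSE username token profile, and the Password element is deliberately emitted before Nonce and Created:

require 'base64'
require 'digest/sha1'
require 'time'

username = 'example-user'        # placeholder credentials
password = 'example-password'
nonce    = Digest::SHA1.hexdigest([Time.now.to_f, rand].join)[0, 16]
created  = Time.now.utc.iso8601

# PasswordDigest = Base64(SHA1(nonce + created + password))
digest = Base64.encode64(Digest::SHA1.digest(nonce + created + password)).chomp

wsse = <<-XML
<wsse:Security xmlns:wsse="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd">
  <wsse:UsernameToken>
    <wsse:Username>#{username}</wsse:Username>
    <wsse:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordDigest">#{digest}</wsse:Password>
    <wsse:Nonce>#{Base64.encode64(nonce).chomp}</wsse:Nonce>
    <wsu:Created xmlns:wsu="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd">#{created}</wsu:Created>
  </wsse:UsernameToken>
</wsse:Security>
XML

puts wsse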