BoxGrinder Build 0.9.4 released

The release of BoxGrinder Build 0.9.4 is finally upon us. This release primarily improves the performance and reliability of AWS-related functionality within BoxGrinder Build, and includes a range of miscellaneous bug-fixes and documentation improvements.

What's new in 0.9.4?

Notable changes are summarised below, although the majority of improvements for this release are not directly visible to the end user.

You can update immediately from the Fedora updates-testing repository via:

yum update -y 'rubygem-boxgrinder*' --enablerepo='updates-testing'

S3 and EBS plugin enhancements and bug-fixes

Previously BoxGrinder used a couple of different community AWS libraries to interact with Amazon's API. We have now moved over to the recently released official Ruby AWS library, aws-sdk, which provides a much more comprehensive feature-set and a clearer object-oriented model than was previously available. Furthermore, it is regularly maintained, so newer functionality should be exposed for use far more swiftly than before.

In the course of rewriting the plugins to best utilise the new library, a variety of bugs were eliminated [1], which should hopefully result in a far more consistent experience with the plugins [2].

Building on EC2 Micro instances

A reduction has been made to the amount of memory allocated to libguestfs in order to avoid running out of memory on EC2 Micro instances [BGBUILD-246]. A welcome side-effect of the change is a decrease in build times on larger instances.
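The idea behind [BGBUILD-246] can be sketched roughly as follows. This is only an illustration, not BoxGrinder's actual code: the method names, the two memory sizes and the headroom value are all assumptions chosen for the example.

```ruby
# Illustrative sketch only: the constants below are assumed values,
# not the ones BoxGrinder really uses.
DEFAULT_MEMSIZE_MB = 500 # assumed "standard" libguestfs allocation
REDUCED_MEMSIZE_MB = 300 # assumed fallback for small machines
HEADROOM_MB        = 200 # assumed memory left over for the host itself

# Extract the total system memory (in MB) from /proc/meminfo style text.
def total_memory_mb(meminfo_text)
  match = meminfo_text.match(/^MemTotal:\s+(\d+)\s+kB/)
  raise ArgumentError, "MemTotal not found" unless match
  match[1].to_i / 1024
end

# Pick the standard allocation when it fits, otherwise the reduced one.
def guestfs_memsize_mb(meminfo_text)
  total = total_memory_mb(meminfo_text)
  total >= DEFAULT_MEMSIZE_MB + HEADROOM_MB ? DEFAULT_MEMSIZE_MB : REDUCED_MEMSIZE_MB
end
```

On a real machine you would pass in `File.read('/proc/meminfo')`. A Micro instance reports roughly 613 MB, so under these assumed thresholds it would get the reduced allocation.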

EBS AMI overwriting behaviour

The manner in which overwrite functionality works for EBS-based AMIs has been changed slightly. As outlined at length in this developer mailing list post, the new behaviour should provide a far more useful and stable way of overwriting an existing AMI.

boxgrinder-build my.appl -p ec2 -d ami --delivery-config overwrite:true

Overwrite existing EBS AMI of the same name, version and release, specified in my.appl. If any existing instances are running, the program will terminate with a warning before any irreversible actions occur. The base snapshot will be deleted.

boxgrinder-build my.appl -p ec2 -d ami --delivery-config overwrite:true,terminate_instances:true

As above, but any running instances will automatically be terminated on your behalf. Beware that this will delete any attached EBS volumes. If you wish to preserve any particular EBS volume, simply detach it.

boxgrinder-build my.appl -p ec2 -d ami --delivery-config overwrite:true,preserve_snapshots:true

As in the first example, but the base snapshot will not be deleted. It will be orphaned, so if you wish to remove it later you will need to do so manually.
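The three option combinations above can be summarised as a small decision table. The sketch below is not the plugin's real implementation: the method and the action names are hypothetical, and the ordering of actions is only illustrative.

```ruby
# Hypothetical planner: returns the list of actions implied by the
# delivery options described above. Action names are invented for
# illustration and do not correspond to real plugin methods.
def plan_ebs_overwrite(options, running_instances)
  # Without overwrite, an existing AMI of the same name is a conflict.
  return [:fail_ami_exists] unless options[:overwrite]

  actions = []
  unless running_instances.empty?
    # Stop before anything irreversible unless termination was requested.
    return [:abort_with_warning] unless options[:terminate_instances]
    actions << :terminate_instances
  end

  actions << :deregister_image
  actions << :delete_snapshot unless options[:preserve_snapshots]
  actions << :register_new_image
  actions
end
```

For example, `plan_ebs_overwrite({ :overwrite => true }, ['i-12345678'])` yields only the abort action, mirroring the warning described for the first command.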

Other points of interest

  • Applications that modify the RPM database (such as YUM) can now run in the post section of your appliance without issue.
  • In addition to any functional changes, we've made some improvements to the documentation (plugin, appliance definition).
  • Plugin developers should note that BoxGrinder has updated to RSpec 2 for testing [BGBUILD-273].
  • Many improvements have been made under the covers to the AWS-related plugins. If you are a plugin developer, many new and existing functions are available in AWSHelper, EC2Helper and S3Helper.

Comprehensive Change-log

Bug

  • [BGBUILD-171] Log entries order is wrong
  • [BGBUILD-238] stop annoying AWS gem messages
  • [BGBUILD-249] Warning from S3 AMI plugin that BG is attempting to create a bucket that already exists
  • [BGBUILD-263] NoMethodError: undefined method `item' for nil:NilClass while creating EBS appliance
  • [BGBUILD-265] Resolve concurrency issues in S3 plugin for overwriting
  • [BGBUILD-269] RPM database is recreated after post section execution preventing installing RPM in post section
  • [BGBUILD-275] default_repos setting is not included in schema and is not documented
  • [BGBUILD-290] Small documentation issues on boxgrinder.org

Enhancement

  • [BGBUILD-246] Detect when insufficient system memory is available for standard libguestfs, and reduce allocation.

Task

  • [BGBUILD-271] Make docs clearer about creating appliances for multiple EC2 regions
  • [BGBUILD-272] Move from aws and amazon-ec2 to official aws-sdk gem
  • [BGBUILD-273] Move to RSpec2
  • [BGBUILD-278] Package aws-sdk gem into Fedora

[1] AWS-related bugs resolved: [BGBUILD-238], [BGBUILD-249], [BGBUILD-263], [BGBUILD-265], [BGBUILD-271], [BGBUILD-272].

[2] There appears to be a regression in aws-sdk >1.0.2 that causes S3 overwrite to fail. Soon-to-arrive 0.9.5 will remedy this with a work-around.

BoxGrinder Build around the World

Whilst procrastinating (sorry, perusing the Internet), we happened to encounter some fantastic non-English BoxGrinder Build tutorials that are more than worthy of sharing.

Brazilian Portuguese: Appliances na hora com BoxGrinder

Appliances na hora com BoxGrinder is a beginner's BoxGrinder Build tutorial by Amador Pahim, showing how to define and build a custom httpd appliance from scratch.

Japanese: BoxGrinderで遊ぶ (Playing with BoxGrinder)

Takayoshi Kimura's blog post on playing with BoxGrinder Build is a basic tutorial demonstrating how easily you can link your images into libvirt.

Contribute BoxGrinder tutorials in your local language

If you are able and willing to write BoxGrinder Build tutorials in your local languages, then we're interested in hearing from you. These contributions could be tutorials or articles written on your blog or web-site, or by volunteering to translate existing content.

We are happy to share links to your content. Widening the accessibility of BoxGrinder Build with help from the community is something we're keen on as we grow, so if you can translate then let us know.

How we test BoxGrinder

In this post I would like to highlight our efforts to make BoxGrinder stable. It'll be about testing, of course.

Unit tests

Our first line of defense is unit tests. Because BoxGrinder is written in Ruby, we use the lovely RSpec library to write our specs. RSpec provides a nice-looking and flexible DSL for writing unit tests.

Consider the following code:

it "should add ec2-user account" do
  guestfs = mock("guestfs")

  guestfs.should_receive(:sh).with("useradd ec2-user")
  guestfs.should_receive(:sh).with("echo -e 'ec2-user\tALL=(ALL)\tNOPASSWD: ALL' >> /etc/sudoers")

  @plugin.add_ec2_user(guestfs)
end

The code is self-explanatory and that's what counts!

Currently we have over 270 tests (>83% C0 code coverage) for boxgrinder-build and over 70 tests (>89% C0 code coverage) for boxgrinder-core gems. And we will have more. I promise you this.

TeamCity

As we have a Continuous Integration server, we run our unit tests for every commit, and developers are automatically notified in case of failures so they can respond immediately. Most of our failed unit tests are caused by forgetting to commit something, or similar. So, not too bad.

Integration tests

This is pretty new for us. How to do integration testing for an appliance builder? Well, we can build appliances! But how do we make them easy to write (and maintain)? This is another story. Thankfully BoxGrinder Build has a great feature - you can use BoxGrinder as a library in your Ruby scripts. You don't need to use the command line to run the builds. We described the feature in detail earlier.

We prepared some test appliance definitions.

At this time we have JEOS appliances for Fedora 15 and CentOS 5. Additionally we have a fully modular appliance definition, where we have split the appliance definition sections into separate files and then include them. This gives us the chance to test all sections, plus the inclusion and override functionality.
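To give a feel for what "inclusion and override" means for a modular definition, here is a minimal sketch that merges an included section into a base definition. It is not BoxGrinder's actual merge algorithm: the `deep_merge` helper, the rule of concatenating lists, and the sample keys are all assumptions made for illustration.

```ruby
# Illustrative only: recursively merge an overriding definition into a
# base one. Nested hashes are merged, arrays are combined without
# duplicates, and any other value is replaced by the override.
def deep_merge(base, override)
  base.merge(override) do |_key, old_value, new_value|
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      deep_merge(old_value, new_value)
    elsif old_value.is_a?(Array) && new_value.is_a?(Array)
      (old_value + new_value).uniq
    else
      new_value
    end
  end
end

# Hypothetical definitions, as they might look once loaded from files.
base_definition = {
  'hardware' => { 'cpus' => 1, 'memory' => 256 },
  'packages' => ['bash']
}
hardware_override = {
  'hardware' => { 'cpus' => 2 },
  'packages' => ['httpd']
}

merged = deep_merge(base_definition, hardware_override)
```

Here `merged` keeps the base memory setting, takes the overridden CPU count, and carries both packages.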

Integration tests execution process

We cannot just execute the tests. There is a process. And yes, we use the Cloud (AWS in our case). Let's take a look at all the stages.

1. Build RPMs

Each day we create RPMs for our gems. These are our nightly builds, which are also accessible using YUM. We will use them later.

2. Start new EC2 build instance

After the RPMs are created we start a new instance on EC2. This is a feature of our continuous integration system - TeamCity. If you haven't looked at TeamCity so far - I strongly recommend it. It is very powerful, looks great and it's free!

But back to our instance. After the instance is launched, and the agent installed on the instance connects to our CI server, an additional build machine in the Cloud becomes available for use.

3. Prepare instance

Before the build can be triggered we need to prepare the instance. We need to install BoxGrinder Build using, of course, the nightly builds created for us just a few minutes ago. Additionally we create a BoxGrinder configuration file with the required data.

Now that we have a launched and configured instance, let's start the tests!

4. Execute actual tests

This is the most important step: the agent installed on the instance pulls the latest integration tests and executes them. They look really simple. For example, building Fedora JEOS may look like this:

it "should build Fedora JEOS" do
  @appliance = Appliance.new("./../appliances/jeos-fedora.appl", @config, :log => @log).create
end

Isn't it neat? We have also set some callbacks to make sure the deliverables were created:

after(:each) do
  @appliance.plugin_chain.last[:plugin].deliverables.each_value do |file|
    File.exist?(file).should == true
  end
end

5. Save the artifacts

Every artifact created by our integration tests is saved on CloudFront. This makes it easy to test the image manually later if we need to.

What could make integration tests even more powerful?

Quick answer: libguestfs.

Don't know libguestfs?

If you're not familiar with libguestfs, it's a tool for offline image launching. Sounds weird? Maybe, but only at first.

It uses qemu to start a custom, minimalistic OS (supermin appliance). This makes it very fast to boot; on my machine it takes less than 5 seconds. You can mount disk images in various formats: vmdk, raw, qcow, even ISOs... Mounted disks are available to you, and you can decide whether to mount them read-only or make them fully writable.

Libguestfs exposes a lot of API calls that make it possible to check or modify your appliance. In BoxGrinder we use only a fraction of them, but even this small set makes us happy libguestfs users!

Libguestfs comes with a handy tool called guestfish, which is a command line interface to libguestfs. This makes it super easy to debug your appliances, and we use it a lot when developing BoxGrinder.

$ guestfish -a centos-basic-sda.raw 

Welcome to guestfish, the libguestfs filesystem interactive shell for
editing virtual machine filesystems.

Type: 'help' for help on commands
      'man' to read the manual
      'quit' to quit the shell

><fs> launch 
><fs> mount /dev/vda
/dev/vda   /dev/vda1  /dev/vda2  
><fs> mount /dev/vda1 /
><fs> mount /dev/vda2 /b
/bin   /boot  
><fs> mount /dev/vda2 /boot/
><fs> ls /
bin
boot
dev
etc
home
lib
lib64
lost+found
media
mnt
opt
proc
root
sbin
selinux
srv
sys
tmp
usr
var
><fs> cat /etc/hosts
127.0.0.1       localhost.localdomain localhost
::1     localhost6.localdomain6 localhost6

><fs> 

And, most importantly for us, libguestfs is rock solid. Go, try it. It's already in Fedora and RHEL!

Integration tests and libguestfs

But how can we use libguestfs in our integration tests? So far we have only tested for the existence of deliverables, which can only prove that something was created. But there are still some open questions. Is the artifact really readable? Does it contain the data we wanted? Here libguestfs comes to the rescue.

After building we launch libguestfs, add the disk images and make sure everything is in place. Take a look at this example test:

it "should build modular appliance based on Fedora and convert it to VirtualBox" do
  @config.merge!(:platform => :virtualbox)
  @appliance = Appliance.new("./../appliances/modular.appl", @config, :log => @log).create

  GuestFSHelper.new([@appliance.plugin_chain[1][:plugin].deliverables[:disk]], @appliance.appliance_config, @config, :log => @log ).customize do |guestfs, guestfs_helper|
    guestfs.exists('/fedora-boxgrinder-test').should == 1
    guestfs.exists('/common-test-base-boxgrinder-test').should == 1
    guestfs.exists('/hardware-cpus-boxgrinder-test').should == 1
    guestfs.exists('/repos-boxgrinder-noarch-ephemeral-boxgrinder-test').should == 1
  end
end

The above code is a real test. It'll create the appliance, convert it to VirtualBox format and then make sure that the files that should be created exist. Not bad for 10 lines of code, huh?

Future directions

Testing offline appliances makes a lot of sense - we want to make sure the image is valid. But does the upload process work as expected? Does the image boot correctly on the destination platform, especially on EC2? This is the next step we want to investigate.

We'll introduce a new step in addition to the 5 described above: uploading, launching the appliance and testing on a real instance. It could look similar to this (not working code):

it "should build Fedora MySQL appliance and run it on EC2" do
  @config.merge!(:platform => :ec2, :delivery => :ebs)
  @appliance = Appliance.new("./../appliances/mysql-fedora.appl", @config, :log => @log).create

  @ec2 = AWS::EC2::Base.new(:access_key_id => ACCESS_KEY_ID, :secret_access_key => SECRET_ACCESS_KEY)
  instance = @ec2.run_instances(:image_id =>  @appliance.plugin_chain.last[:plugin].deliverables[:ami]).instancesSet.item.first

  # Not implemented: wait for the instance state (instance.instanceState) to change from pending to running

  Net::SSH.start(instance.dnsName, 'ec2-user', :key_data => 'ASDASDASD' ) do |ssh|
    ssh.exec!('/etc/init.d/mysqld status').should match /is running/
    ssh.exec!('ps ax | grep [m]ysql | wc -l').should_not == '0'
  end

  @ec2.terminate_instances(:instance_id => instance.instanceId)
end

This almost-working Ruby code shows how easily we can extend our integration tests to run on the right platform.

The idea of the above code is to create an appliance, convert it to EC2 format, upload to EC2 and launch. After the instance becomes available we'll connect to it using SSH, and check if the mysql daemon is really running. The instance is terminated afterwards.
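The "Not implemented" step in the code above, waiting for the instance to leave the pending state, could be handled by a small polling helper. The sketch below is one possible approach rather than anything BoxGrinder ships: `wait_for_state` and its parameters are hypothetical, and the block passed to it stands in for whatever call fetches the current instance state.

```ruby
# Hypothetical helper: repeatedly call the given block until it returns
# the wanted state, or raise once the timeout (in seconds) has elapsed.
def wait_for_state(wanted, timeout = 300, interval = 5)
  deadline = Time.now + timeout
  loop do
    state = yield
    return state if state == wanted
    if Time.now >= deadline
      raise "Timed out waiting for #{wanted.inspect}, last state was #{state.inspect}"
    end
    sleep interval
  end
end
```

In a test it might be used as `wait_for_state('running') { current_state_of(instance) }`, where `current_state_of` is a placeholder for a real EC2 describe call. Keeping the EC2 call in the block means the helper itself can be unit tested without touching AWS.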

We'll work on making this as clean as possible, abstracting the interactions with EC2 even more.

If you have any comments or ideas, feel free to leave a note.

BoxGrinder Build 0.9.3 released

Today marks the release of BoxGrinder Build 0.9.3, packed with important bug fixes and improvements. There are also a few interesting new features to explore.

BoxGrinder Meta 1.6

The BoxGrinder Meta appliances provide an automatically-updated and ideally prepared environment to try the latest release of BoxGrinder Build.

An improved release is now available in a variety of virtualized and raw formats, including BoxGrinder Build EBS AMIs for every AWS region. Previously we only published in US East, but in response to demand you can now launch instances in both x86_64 and i386/i686 versions in all regions.

The primary improvement in this release, aside from wider global distribution, is the enabling of ntpd to ensure that the system clock remains accurate, avoiding problems when interacting with remote systems that have sensitivity to time-skew.
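To illustrate why clock accuracy matters: S3 rejects requests whose timestamp differs from its own clock by more than 15 minutes (the RequestTimeTooSkewed error listed under [BGBUILD-253]). A rough check could compare the local clock with the `Date` header of any HTTP response from the remote service. The helper names below are hypothetical and this is only a sketch of the idea.

```ruby
require 'time'

# S3 tolerates at most 15 minutes of difference between clocks.
MAX_SKEW_SECONDS = 15 * 60

# Absolute difference, in seconds, between the local time and the time
# given in an HTTP Date header (for example "Mon, 01 Aug 2011 12:00:00 GMT").
def clock_skew_seconds(http_date_header, local_time = Time.now)
  (local_time - Time.httpdate(http_date_header)).abs
end

# True when a request signed with the local clock would likely be rejected.
def too_skewed?(http_date_header, local_time = Time.now)
  clock_skew_seconds(http_date_header, local_time) > MAX_SKEW_SECONDS
end
```

Running ntpd, as the Meta appliance now does, keeps this difference close to zero so the check should never trip in practice.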

What's new in 0.9.3?

Scientific Linux support

Through the kind contribution of community member Neil Wilson (NeilW), we now have support for Scientific Linux builds. Usage is quintessentially straightforward:

name: my-sl-appliance
os:
  name: sl
  version: 6

Then run the build: boxgrinder-build my-sl-appliance.appl

Neil's contributed plugin extends the base classes present in BoxGrinder for building RHEL-derived operating systems. For those of you with ambitions to create additional plugins, this demonstrates how painless it is to extend existing functionality to provide new features.

Reliably create and upload EBS backed AMIs in all Amazon AWS regions

Typically a bug-fix doesn't get much prominence in a release announcement. However, with the resolution of a ream of interrelated bugs (including [BGBUILD-254], [BGBUILD-231], [BGBUILD-251], [BGBUILD-193] and [BGBUILD-250]), you can now reliably build and deliver EBS backed AMIs to any Amazon region. Previously it was only possible to deliver to us-east-1 without issues, and even then builds would occasionally fail at random due to concurrency issues. This release fixes those bugs, so the build process is now far more reliable and reproducible.

boxgrinder-build my-appliance.appl -p ec2 -d ebs

You must run the build on an instance in the same region that you want the EBS AMI registered in (this is an AWS limitation). For instance, if you wanted to deliver an appliance to Tokyo (ap-northeast-1) then you would need to run your build on an instance running in ap-northeast-1.
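If you script your builds, a quick sanity check on the region can save a wasted build. The sketch below derives the region from an availability zone name and compares it with the region you intend to deliver to. The helper names are hypothetical. The metadata URL is the standard EC2 instance metadata path, but whether it is reachable without extra steps depends on how your instance is configured, so treat that part as an assumption.

```ruby
require 'net/http'
require 'uri'

# "ap-northeast-1a" -> "ap-northeast-1" (drop the trailing zone letter).
def region_from_availability_zone(zone)
  match = zone.strip.match(/\A([a-z]+(?:-[a-z]+)+-\d+)[a-z]\z/)
  raise ArgumentError, "Unrecognised availability zone: #{zone.inspect}" unless match
  match[1]
end

# Only meaningful when run on an EC2 instance: asks the instance
# metadata service which availability zone we are running in.
def current_availability_zone
  uri = URI.parse('http://169.254.169.254/latest/meta-data/placement/availability-zone')
  Net::HTTP.get(uri)
end

# Raise early if the build host is not in the target region.
def check_build_region(target_region, zone = current_availability_zone)
  actual = region_from_availability_zone(zone)
  unless actual == target_region
    raise "Build host is in #{actual}, but the EBS AMI is wanted in #{target_region}"
  end
  actual
end
```

Calling `check_build_region('ap-northeast-1')` on an instance in Tokyo would pass, while the same call from a us-east-1 instance would fail before any build work is done.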

Overwrite support for S3, AMIs and EBS AMIs

A common development idiom is to iteratively change an appliance, upload and test it, then tear it down afterwards. However, it is impossible to upload a new iteration with the same name as the previous one, as this is a conflict that AWS rejects. An obvious work-around is to increment the version and release numbers in your appliance, which will ensure that the generated appliance has a different name to the previous one. Unfortunately this has the disadvantage that it quickly consumes large volumes of storage (S3, EBS) on Amazon which must be paid for. Hence to remedy this we have provided an overwrite attribute that can be set for EBS and S3 delivery options, which will cause the existing version to be torn down, de-registered and deleted, and the new revision with identical version and release numbers to be uploaded in its place.

In this example, we attempt to upload an AMI appliance that is already registered, first without overwrite enabled, then subsequently with it:

# First upload
boxgrinder-build my-appliance.appl -p ec2 -d ami
# Second upload fails, because there is already an image with the same name, version and release.
boxgrinder-build my-appliance.appl -p ec2 -d ami
# Works!
boxgrinder-build my-appliance.appl -p ec2 -d ami --delivery-config=overwrite:true

Overwrite is supported for AMIs, EBS AMIs, and standard S3 packaged delivery. EBS overwriting is slightly more complex than the other plugins, as there are several components that work together to make the instance, the most important of which are: a snapshot, an EBS root volume and the registered image. By setting overwrite, the plugin will locate and delete the EBS volume, its associated snapshot, and then de-register the image. If you wish to preserve the snapshot, you must set preserve_snapshots:true.

If BoxGrinder appliance snapshotting is enabled, then only the very last snapshot will be overwritten.

Plugin configuration is now validated before the build process begins

Previously, if bad configuration values were provided to a plugin, the build process would fail only when the plugin was reached in the build pipeline. BoxGrinder now allows plugins to validate before executing the build pipeline, thus allowing much earlier detection and failure. Plugin developers will need to modify their software to utilise this new feature. A simple example is the Local delivery plugin code; see the after_init and validate methods.
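The "validate everything first, then execute" idea can be shown with a toy pipeline. None of the classes below are BoxGrinder's real plugin API: `SketchPlugin`, `PluginConfigError` and `run_pipeline` are invented names, and the real plugins use their own after_init and validate hooks.

```ruby
# Hypothetical error type for bad plugin configuration.
class PluginConfigError < StandardError; end

# Toy plugin: it only knows its name, its config and which keys it needs.
class SketchPlugin
  def initialize(name, config, required_keys)
    @name = name
    @config = config
    @required_keys = required_keys
  end

  # Fail fast if any required configuration key is missing.
  def validate
    missing = @required_keys.reject { |key| @config.key?(key) }
    unless missing.empty?
      raise PluginConfigError, "#{@name}: missing #{missing.join(', ')}"
    end
  end

  # Stand-in for the real (slow) work; records that the plugin ran.
  def execute(log)
    log << @name
  end
end

# Validate every plugin before any of them executes.
def run_pipeline(plugins, log = [])
  plugins.each { |plugin| plugin.validate }
  plugins.each { |plugin| plugin.execute(log) }
  log
end
```

With this ordering, a misconfigured delivery plugin at the end of the chain stops the run before the operating system plugin has spent any time building.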

Other points of interest

  • BoxGrinder welcomes Japan! Support has been added for ap-northeast-1 region for all AWS-related plugins.
  • We have dropped support for Fedora 13, as it has reached EOL status.
  • The PAE configuration option is now in the operating system configuration options, rather than the appliance definition, for instance: ... --os-config=pae:false.
  • ~/.ssh/authorized_keys no longer fills with duplicate key entries, as a result of our custom /etc/rc.local script.
  • The volume of debugging messages emanating from libguestfs has been significantly decreased in --trace and --debug.
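The authorized_keys fix mentioned in the list above comes down to making the key append idempotent. The actual fix lives in BoxGrinder's /etc/rc.local shell script; the Ruby below is merely a sketch of the same idea, with a hypothetical `add_key_once` helper.

```ruby
# Return the authorized_keys content with new_key present exactly once.
# Blank lines are dropped and surrounding whitespace is ignored.
def add_key_once(existing_text, new_key)
  key = new_key.strip
  lines = existing_text.split("\n").map { |line| line.strip }.reject { |line| line.empty? }
  lines << key unless key.empty? || lines.include?(key)
  lines.empty? ? "" : lines.join("\n") + "\n"
end
```

Because the function is a no-op when the key is already present, running it on every boot leaves the file unchanged after the first time.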

Comprehensive Change-log

Bug

  • [BGBUILD-191] Build fails for EC2/EBS appliance when creating filesystem on new disk for CentOS/RHEL 5
  • [BGBUILD-193] Amazon EBS delivery plugin: 'Not found device for suffix g' error while building image
  • [BGBUILD-220] group names have spaces (to the user), this breaks the schema rules for packages
  • [BGBUILD-223] BoxGrinder hangs because qemu.wrapper does not detect x86_64 properly on CentOS 5.6
  • [BGBUILD-229] boxgrinder meta fedora 15 not updating itself at boot
  • [BGBUILD-231] Cannot register Fedora 15 EC2 AMI with S3 delivery plugin in eu-west-1 availability zone
  • [BGBUILD-232] boxgrinder doesn't validate config early enough
  • [BGBUILD-233] boxgrinder fails to report a missing config file
  • [BGBUILD-237] Tilde characters break creation of yum.conf
  • [BGBUILD-250] EBS plugin incorrectly determines that non-US regions are not EC2 instances
  • [BGBUILD-251] Add ap-northeast-1 (tokyo) region for EBS plugin
  • [BGBUILD-252] rc.local script fills ~/.ssh/authorized_keys with a duplicate key every boot
  • [BGBUILD-253] RequestTimeTooSkewed error when attempting upload to S3 from an EC2 instance
  • [BGBUILD-254] all EBS volumes are delivered to us-east-1 despite setting other regions and buckets
  • [BGBUILD-260] Wrong EC2 discovery causing libguestfs errors on non US regions

Task

  • [BGBUILD-225] Move PAE configuration parameter to operating system configuration

Enhancement

  • [BGBUILD-261] Decrease amount of debug log when downloading or uploading file using guestfs

BoxGrinder weekly planning meetings

In addition to the latest mailing list announcement, we are going to start weekly planning meetings. I would like to invite everyone interested in BoxGrinder development to join us. You'll see what we're currently working on and what next week's plans are. Of course we greatly appreciate any input from you at these meetings, so don't be shy!

Our meetings will be held every Monday at 2pm UTC (4pm CEST, 10am EDT) in the #boxgrinder IRC channel on irc.freenode.net.

Talk to you soon!