BoxGrinder 0.10.3 Released

We're happy to announce the long-awaited BoxGrinder Build 0.10.3 release. Yet again it consists primarily of bug fixes, but we have also resolved some tricky upstream problems, eliminating some of the most annoying persistent issues.

Large numbers of partitions

It is now possible to successfully create appliances with large numbers of partitions, whereas previously this generally failed when more than 4 or 5 partitions were specified. Furthermore, there should be fewer issues with the build process being interrupted by loop device symlinks left hanging around.

Cross-arch building of RHEL and derivatives

On newer versions of Fedora, cross-arch builds (e.g. setarch i686 boxgrinder-build ...) no longer completed successfully; this has now been fixed upstream, and they should work without issue.

Additionally, a separate issue on Fedora 17 has been resolved where our qemu wrapper did not find a valid binary.

Found a bug?

If you have any issues, please open a ticket in our bug-tracker.

Release notes

Bug

  • [BGBUILD-339] - existing rpm package with the name containing '+' considered as an invalid name
  • [BGBUILD-358] - Unable to cross-arch build appliance for CentOS 5
  • [BGBUILD-359] - Unable to create more that 2 partitions in Centos EBS-EC2.
  • [BGBUILD-367] - Build fails with default_repos disabled on centos 5
  • [BGBUILD-368] - /usr/bin/qemu: No such file or directory on Fedora 17 32-bit

BoxGrinder 0.10.2 Released

The bug-fix release BoxGrinder Build 0.10.2 is available. It contains a couple of important fixes for bugs which have been breaking people's S3 AMI appliances on Amazon EC2. You can refer to our previous blog post about it, but put tersely, you should now be able to successfully run appliances on any EC2 instance size without issue!

We've managed to sneak in an extra feature however...

Fedora 17 support

Huzzah, Fedora 17 support is here and working well. You can build an appliance in the usual manner: simply write your appliance definition specifying fedora as your OS name and 17 as the version:

name: my-super-cool-fedora-17-appliance
os:
  name: fedora
  version: 17
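As a rough illustration of how this definition is structured, the sketch below loads the same YAML with Ruby's standard yaml library and reads back the OS fields. It is not BoxGrinder's own parsing or validation code, and the variable names are made up for the example.

```ruby
require 'yaml'

# The appliance definition from above, inlined so the example is self-contained.
definition = <<DEFINITION
name: my-super-cool-fedora-17-appliance
os:
  name: fedora
  version: 17
DEFINITION

# Plain YAML loading only; BoxGrinder's own validation is not reproduced here.
appliance = YAML.load(definition)

os_name    = appliance['os']['name']     # a String
os_version = appliance['os']['version']  # an Integer, because the value is unquoted

puts "#{appliance['name']}: #{os_name} #{os_version}"
```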

If you have any issues with the new OS, please open a ticket in our bug-tracker.

Release notes

Bug

  • [BGBUILD-361] - CentOS 5 ec2 platform plugin issues, yum runs before having a proper /etc/resolv.conf and fails to resolve mirrors (Error: Cannot find a valid baseurl for repo: base)
  • [BGBUILD-353] - Images not bootable on new type of AWS instances
  • [BGBUILD-347] - Add support for Fedora 17

Upcoming EBS and S3 AMI changes

In light of some discussions we've been having internally, and with various community members, it has been proposed that as of our next BoxGrinder release, we shall no longer build images with ephemeral devices[1] pre-attached (for EBS), or pre-mounted in any fashion[2] by default (for EBS or S3)[3].

Instead we will emit a message at build-time about device attachment and mounting, advising users to install a script such as cloud-init to auto-mount ephemeral devices, as well as a pointer to a boxgrinder.org resource page explaining how best to bake in an appropriate cloud configuration file (more on that soon).

Those that want device mappings baked into their image will continue to be able to do so by defining their desired mappings via configuration. Note that block storage mappings (including ephemeral, or otherwise), can be configured or reconfigured at runtime instead of, or in addition to, build time.

Providing block device mappings

For EBS AMIs no ephemeral devices are attached by default, whereas with S3 AMIs there is a default layout (although you can modify it as you see fit).

At run-time

All of the standard methods of providing (or modifying) a block device mapping at run-time still apply. For example:

ec2-run-instances ami-12345678 -b /dev/xvdb=ephemeral0

At build time

At build time you can speculatively map devices that may not be present in every instance size.

# -d ami or -d ebs
boxgrinder-build my.appl -p ec2 -d ebs --delivery-config \
  block_device_mapping:'/dev/xvdb=ephemeral0&/dev/xvdc=ephemeral1'
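To make the format of the mapping string concrete, here is a minimal Ruby sketch that splits it into device and virtual-name pairs. The parse_block_device_mapping helper is hypothetical and is not BoxGrinder's implementation; it simply assumes pairs are separated by & and written as device=name, as in the command above.

```ruby
# Hypothetical helper, for illustration only: turns
# '/dev/xvdb=ephemeral0&/dev/xvdc=ephemeral1' into a Hash of device => name.
def parse_block_device_mapping(mapping)
  mapping.split('&').each_with_object({}) do |pair, devices|
    device, virtual_name = pair.split('=', 2)
    devices[device] = virtual_name
  end
end

mapping = parse_block_device_mapping('/dev/xvdb=ephemeral0&/dev/xvdc=ephemeral1')
mapping.each { |device, name| puts "#{device} -> #{name}" }
```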

Why the changes?

The reasons for this change are multitudinous, but the foremost are:

  • Ephemeral device mappings vary according to instance size: We cannot make any easy assumptions about which ephemeral disks will be present, as it varies depending on which instance size is selected, and is subject to arbitrary change by Amazon. For instance, recent problems with BoxGrinder S3 AMIs on m1.small instances were caused by BoxGrinder assuming a particular ephemeral device would be present; an assumption that fell over when AWS introduced a new device mapping layout[4].

  • We do not want to maintain a script that tracks which devices are provided by which instance sizes and then maps and mounts them, as it would duplicate existing well-established efforts (e.g. cloud-init), in addition to being difficult to maintain, inferior in functionality, and surprising to the user.

  • Confusion: Ephemeral device mappings do not show up in Amazon API queries, or in their web UI (!!). Users do not realise a device is attached until they attempt to attach their own device at the same location (e.g. /dev/xvdb is an ephemeral device mapping, but it does not show in the web UI; if the user attempts to attach a device at xvdb it will fail).

  • Inconsistency: BoxGrinder can build a few different OSes, and some of the best cloud initialisation projects are not necessarily available in the default repositories of all of those OSes. We would rather not force disparate solutions onto our users, as consistency is one of our primary goals.

Therefore we concluded that rather than a prescriptive or inconsistent solution, we shall provide extremely simple default behaviour, and enable the user to choose an approach that is optimal for them.

Thoughts?

If you have any objections, comments or suggestions, leave them in the comments, or send them to us via any of our community channels.


[1] Ephemeral disks are transient/instance storage devices that are available 'free' with most instance sizes. The overall capacity included, and the number of devices the space is subdivided into, vary by instance size. For example, an m1.large instance has 2x420GiB instance storage.

[2] There is an important distinction between attaching and mounting devices on AWS. Attaching is akin to physically plugging a disk into a machine. Mounting is the usual process of making a device available to the machine's file system.

[3] For the sake of clarity, it is worth noting that S3 Backed AMIs always have a pre-defined set of ephemeral device mappings provided by EC2, but with EBS by default there are none.

[4] In this particular case we were expecting /dev/xvdb to exist, but for m1.small the ephemeral device we wanted was mapped to /dev/xvda2.

BoxGrinder 0.10.1 Released

The long-awaited BoxGrinder Build 0.10.1 bugfix release is now available. Thanks to a variety of irritation-eliminating alterations, behaviour should be more consistent and no longer prone to permissions errors.

Permission denied, log shifting errors

You may have seen errors akin to:

FATAL -- : Logger::ShiftingError: Shifting failed. Permission denied - log/boxgrinder.log.2 or log/boxgrinder.log.3

The problem occurred when BoxGrinder switched from root to a local user while the log file was still owned by root. The issue was only apparent on certain systems, and even then often only occasionally.

Ruby 1.9

We've made some changes to ensure BoxGrinder runs correctly under Ruby 1.9, with a particular eye towards the forthcoming release of Fedora 17. You can see our earlier musings on the changes required.

Scientific Linux EBS AMIs

OS constraints on the EBS plugin have been removed, so you can now create Scientific Linux EBS AMIs. As the limitation has been eliminated generally, any community OS plugin is also able to use the EBS plugin.

Bash tab completions

Basic bash tab completions have been sneaked into this release. Give it a try:

    [root@localhost ~]# boxgrinder-build my.appl --
    --backtrace        --delivery-config  --os-config        --plugins
    --debug            --force            --platform         --trace
    --delivery         --help             --platform-config  --version

Other

Snapshots with the S3 plugin are working correctly again, and some simple testing issues were fixed.

Release notes

Bug

  • [BGBUILD-337] - In SL if default repos are disabled, /etc/yum.repos.d folder is not created
  • [BGBUILD-338] - Weed out non-deterministic tests
  • [BGBUILD-344] - Builds on some platforms impossible due to log (and/or other) files still being owned by root after boxgrinder switches to user
  • [BGBUILD-351] - s3 plugin attempts to create bucket with whole pathname during snapshot

Enhancement

  • [BGBUILD-349] - Use RbConfig instead of obsolete and deprecated Config deprecation warning with Ruby 1.9.3

Task

Sub-task

  • [BGBUILD-348] - Simplecov coverage testing for Ruby >=1.9

Preparing for Fedora 17

Looking to the (beefy) future

Those of you who keep an eye on happenings in Fedora-land will undoubtedly be aware that Fedora 17 is due out in the near future. From BoxGrinder's perspective one of the more important changes is the move to Ruby 1.9.x from 1.8.

There are some syntactic changes and a few subtle semantic differences between versions, so it is important for us to ensure equivalent runtime behaviour in both flavours. Fedora 15 and 16 will both remain on 1.8.7, so we must straddle the fence.

Mercifully, the changes required were fairly minor; but for those of a more inquisitive persuasion, let's look at a few examples of alterations that were required.

Coverage Testing

We needed to provide code coverage analysis with both Rubies: via simplecov under 1.9, and via RCov under 1.8. We only want the relevant tool to be loaded and run for the appropriate version of Ruby.

Our tests are run using Rake, and as Rake tasks run in a new process, a bit of ingenuity is required to ensure the code you write is affecting the correct process.

In this instance the simplest solution was to create a couple of helper files that are run in the new process before the specs begin, ensuring everything required is kicked into action.


Below is a snippet from our Rakefile. Under 1.9, we set an environment variable to indicate to spec_helper that simplecov should be run.

When running with 1.8 a bit of path juggling ensures rcov_helper is run before RCov starts. As RCov is initialised before RSpec, we must ensure that some basic dependencies are met.

Rakefile:

RSpec::Core::RakeTask.new('spec:coverage') do |t|
  t.ruby_opts = "-I ../boxgrinder-core/lib"
  t.pattern = "spec/**/*-spec.rb"
  t.rspec_opts = ['-r spec_helper', '-r boxgrinder-core', 
          '-r rubygems', <snip>]
  t.verbose = true

  if RUBY_VERSION =~ /^1\.8/
    t.rcov = true
    t.rcov_opts = ["-Ispec:lib spec/rcov_helper.rb", <snip>]
  else
    ENV['COVERAGE'] = 'true'
  end
end

spec_helper.rb:

if ENV['COVERAGE']
  require 'simplecov'

  FILTER_DIRS = ['spec']

  SimpleCov.start do
    FILTER_DIRS.each{ |f| add_filter f }
  end
end

This might seem fairly circuitous, but if we attempted to start code coverage in the Rakefile itself, we'd simply be analysing Rake, not our code!
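The underlying mechanism can be seen in isolation with the sketch below: an environment variable set in the parent is inherited by a newly spawned Ruby process, which is how a flag set in the Rakefile can reach spec_helper. The coverage_flag_seen_by_child helper is hypothetical, and it spawns a plain Ruby interpreter rather than RSpec.

```ruby
require 'rbconfig'
require 'open3'

# Hypothetical helper: starts a fresh Ruby process, much as the Rake spec task
# does, and returns whatever that process sees in ENV['COVERAGE'].
def coverage_flag_seen_by_child
  child_script = 'print ENV["COVERAGE"].to_s'
  output, _status = Open3.capture2(RbConfig.ruby, '-e', child_script)
  output
end

ENV['COVERAGE'] = 'true'          # what the Rakefile does under 1.9
puts coverage_flag_seen_by_child  # the child process inherits the variable
```

Because the flag travels through the environment, the coverage tool is started inside the process that actually runs the specs, not inside Rake.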

Sycked up

module BoxGrinder
  # Avoid Psych::SyntaxError (<unknown>): couldn't parse YAML in 1.9
  if RUBY_VERSION.split('.')[1] == '9'
    require 'yaml'
    YAML::ENGINE.yamler = 'syck'
  end
end

Psych, the default cRuby 1.9.3 YAML parser, causes problems with our YAML parsing and validation (through Kwalify), but fortunately the only change required was to set the parser back to Syck.
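Note that checking only the minor version would also match any other x.9 release. A slightly stricter variant is sketched below; the ruby_19? helper is hypothetical, and the extra defined? guard is an assumption added so that the snippet does nothing on Rubies where YAML::ENGINE does not exist.

```ruby
require 'yaml'

# Hypothetical helper: true only for the 1.9 series, not for e.g. 2.9.
def ruby_19?(version = RUBY_VERSION)
  major, minor = version.split('.').first(2)
  major == '1' && minor == '9'
end

# Only switch the parser where the engine selector is actually available.
if ruby_19? && defined?(YAML::ENGINE)
  YAML::ENGINE.yamler = 'syck'
end
```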

Syntactical slips

-    when :ec2:
+    when :ec2
       disk_format = :ami
       container_format = :ami
-    when :vmware:
+    when :vmware
       disk_format = :vmdk

This is one example of several slightly unusual case (switch) statements that had crept into the codebase. Ruby 1.8 tolerated a trailing colon after a when clause, but with 1.9's new hash syntax something like :ec2: appears to be a mangled hash key.
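For reference, here is a small sketch of the two when forms that behave the same under both 1.8 and 1.9. The disk_format_for helper is hypothetical; only the :ec2 and :vmware disk formats are taken from the diff above.

```ruby
# Hypothetical helper mirroring the diff above.
def disk_format_for(platform)
  case platform
  when :ec2 then :ami  # one-line form: use 'then', never a trailing colon
  when :vmware
    :vmdk              # multi-line form: a newline separates test and body
  end
end

puts disk_format_for(:ec2)
puts disk_format_for(:vmware)
```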

String it out

-    repoquery_output.each do |line|
+    repoquery_output.each_line do |line|

In Ruby 1.9 String#each is no longer available; in 1.8 it was an alias for #each_line, which iterates over a string line by line, using newline as the separator. The change is probably rather sensible, given that the behaviour is surprising at first encounter.
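A minimal sketch of the portable form follows. The contents of repoquery_output are made-up sample data, not real repoquery output.

```ruby
# Made-up sample data standing in for the output of a repoquery call.
repoquery_output = "foo-1.0\nbar-2.1\n"

packages = []
# each_line yields every line including its trailing newline, hence the chomp.
repoquery_output.each_line do |line|
  packages << line.chomp
end

puts packages.inspect
```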

Regex

-    vmdk_image.scan(/^createType="(.*)"\s?$/).to_s.should == "vmfs"
+    vmdk_image.match(/^createType="(.*)"\s?$/)[1].should == "vmfs"

[1] pry(main)> 'createType="BG"'.scan(/^createType="(.*)"\s?$/).to_s
=> "BG" # Ruby 1.8.7

[1] pry(main)> 'createType="BG"'.scan(/^createType="(.*)"\s?$/).to_s
=> "[[\"BG\"]]" # Ruby 1.9.3

Numerous examples of slightly dodgy regex matching that relied upon to_s in our tests were eliminated from the codebase.
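The difference arises because Array#to_s joins the elements in 1.8 but returns the inspected form in 1.9, so reading the capture group directly avoids the problem. A minimal sketch, in which the create_type helper and the sample descriptor text are made up for illustration:

```ruby
CREATE_TYPE_PATTERN = /^createType="(.*)"\s?$/

# Hypothetical helper: returns the captured createType value, or nil when the
# text has no createType line. It never relies on Array#to_s.
def create_type(vmdk_descriptor)
  match = vmdk_descriptor.match(CREATE_TYPE_PATTERN)
  match && match[1]
end

# Made-up sample descriptor text.
sample = "version=1\ncreateType=\"vmfs\"\n"
puts create_type(sample)
```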

Other bits

Amongst the other changes that bit us were our reliance upon quirky behaviour in Ruby 1.8 bindings (albeit our usage is slightly dubious anyway), and a couple of situations where implicit arrays were assumed.

RSpec 2, when dependencies strike

A few of our newer RSpec tests only work properly with rspec-expectations >= 2.7.0, which is available only on Fedora 17 and above.

- it "should add the path to the path_set" do
+ <snip> :if => RSpec::Expectations::Version::STRING >= '2.7.0' do
    expect{ simple_update }.to change(subject, :path_set).
      from(empty_set).to(mkset('/the/great/escape'))
  end

By using RSpec 2 filters we avoid running tests known to fail spuriously. This is a useful technique if you temporarily need to straddle multiple versions, and refactoring isn't desirable.
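One caveat with comparing version strings directly is that String comparison is lexicographic, so a hypothetical '2.10.0' would sort before '2.7.0'. A sketch of a more robust check using Gem::Version from RubyGems follows; the at_least? helper is hypothetical and not part of BoxGrinder or RSpec.

```ruby
require 'rubygems'

# Hypothetical helper: compares version strings segment by segment.
def at_least?(actual, required)
  Gem::Version.new(actual) >= Gem::Version.new(required)
end

puts at_least?('2.10.0', '2.7.0')  # numeric comparison of each segment
puts('2.10.0' >= '2.7.0')          # plain String comparison, character by character
```

The result of such a helper could be passed to the :if filter in place of the plain string comparison.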

Onwards

Thankfully the process was rather easy, with all but a couple of issues being caught by our tests. It would seem that all of the cases were circumstances where we should have been using better approaches anyway, so the outcome was certainly positive.