Migrating from Chef™ to itamae

The Debian CI platform is composed of 30+ (virtual) machines. Maintaining this many machines, and being able to add new ones with some degree of reliability, requires some sort of configuration management.

Until about a week ago, we were using Chef for our configuration management. I was, for several years, the main maintainer of Chef in Debian, so using it was natural to me, as I had used it before for personal and work projects. But last year I decided to request the removal of Chef from Debian, so that it won't be shipped with Debian 11 (bullseye).

After evaluating a few options, I believed that the path of least resistance was to migrate to itamae. itamae was inspired by Chef, and uses a DSL that is very similar to the Chef one. Even though the itamae team claims it's not compatible with Chef, the changes that I needed to make were relatively limited. The necessary code changes might look like a lot, but a large part of them could be automated or done in bulk, like doing simple search and replace operations, and moving entire directories around.

In the rest of this post, I will describe the migration process, starting with the infrastructure changes, followed by the types of changes I needed to make to the configuration management code, and ending with my conclusions about the process.

Infrastructure changes

The first step was to add support for itamae to chake, a configuration management wrapper tool that I wrote. chake was originally written as a serverless remote executor for Chef, so this involved a bit of a redesign. I thought it was worth doing, because at that point chake had gained several interesting management features that were not directly tied to Chef. This work was done a bit slowly over the course of several months, starting almost exactly one year ago, and was completed 3 months ago. I wasn't in a hurry, and knew I had time before Debian 11 was released and I had to upgrade the platform.

After this was done, I started the work of migrating the existing Chef cookbooks to itamae. The next sections present the main types of changes that were necessary.

During the entire process, I sent a few patches out:

Code changes

These are the main types of changes that were necessary in the configuration code to accomplish the migration to itamae.

Replace cookbook_file with remote_file.

The resource known as cookbook_file in Chef is called remote_file in itamae. Fixing this is just a matter of search and replace, e.g.:

-cookbook_file '/etc/apt/apt.conf.d/00updates' do
+remote_file '/etc/apt/apt.conf.d/00updates' do
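A bulk rename like this can also be scripted. The following is a minimal sketch, not the tool I actually used: the helper names rename_cookbook_file and migrate_recipes, and the "cookbooks" default directory, are assumptions made for illustration.

```ruby
# Rewrite Chef's cookbook_file resource name to itamae's remote_file.
# Only matches the name at the start of a line (after optional indentation),
# so mentions inside comments or strings later on a line are left alone.
def rename_cookbook_file(source)
  source.gsub(/^([ \t]*)cookbook_file\b/, '\1remote_file')
end

# Apply the rename to every Ruby file under root, rewriting only files
# whose content actually changes. The default root is an assumption.
def migrate_recipes(root = "cookbooks")
  Dir.glob(File.join(root, "**", "*.rb")).each do |path|
    original = File.read(path)
    updated = rename_cookbook_file(original)
    File.write(path, updated) unless updated == original
  end
end
```

Running migrate_recipes from the root of the repository, with the tree under version control, makes it easy to review the result as a diff before committing.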

Changed file locations

The file structure assumed by itamae is a lot simpler than the one in Chef. The needed changes were:

  • static files and templates moved from cookbooks/${cookbook}/{files,templates}/default to cookbooks/${cookbook}/{files,templates}
  • recipes moved from cookbooks/${cookbook}/recipes/*.rb to cookbooks/${cookbook}/*.rb
  • host-specific files and templates are not supported directly, but can be implemented just by using an explicit source statement, like this:

    remote_file "/etc/foo.conf" do
      source "files/host-#{node['fqdn']}/foo.conf"
    end
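The two directory moves listed above are mechanical enough to script. This is a rough sketch under the assumption that it runs from the repository root and that all cookbooks live directly under cookbooks/; the helper names itamae_path and relocate_cookbooks are made up for this example.

```ruby
require "fileutils"

# Map a path in the Chef layout to its place in the itamae layout:
#   cookbooks/X/{files,templates}/default/... -> cookbooks/X/{files,templates}/...
#   cookbooks/X/recipes/NAME.rb               -> cookbooks/X/NAME.rb
# Paths that match neither rule are returned unchanged.
def itamae_path(path)
  path
    .sub(%r{\A(cookbooks/[^/]+/(?:files|templates))/default/}, '\1/')
    .sub(%r{\A(cookbooks/[^/]+)/recipes/([^/]+\.rb)\z}, '\1/\2')
end

# Move every regular file under cookbooks/ to its new location.
# Empty directories left behind are not removed.
def relocate_cookbooks
  Dir.glob("cookbooks/**/*").select { |p| File.file?(p) }.each do |old_path|
    new_path = itamae_path(old_path)
    next if new_path == old_path

    FileUtils.mkdir_p(File.dirname(new_path))
    FileUtils.mv(old_path, new_path)
  end
end
```

Keeping the path mapping in its own function makes it possible to check the rules on a few sample paths before moving anything.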

Explicit file ownership and mode

Chef is usually designed to run as root on the nodes, and files it creates are owned by root and have mode 0644 by default. With itamae, files are by default owned by the user that was used to SSH into the machine. Because of this, I had to review all file creation resources and add owner, group and mode explicitly:

-cookbook_file '/etc/apt/apt.conf.d/00updates' do
-  source 'apt.conf'
+remote_file '/etc/apt/apt.conf.d/00updates' do
+  source 'files/apt.conf'
+  owner   'root'
+  group   'root'
+  mode    "0644"
 end

In the end, I guess being explicit makes the configuration code more understandable, so I take that as a win.
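A review like this can be partly assisted by a script that flags file resources missing any of the three attributes. The sketch below is a line-based heuristic and not a Ruby parser: it assumes each resource opens with "<type> ... do" on a single line and closes with an "end" at the same indentation. The name missing_file_attributes and the list of resource types checked are assumptions for illustration.

```ruby
REQUIRED_ATTRIBUTES = %w[owner group mode].freeze
FILE_RESOURCES = %w[file remote_file template].freeze

# Return [resource_header, missing_attributes] pairs for file-like resources
# that do not set every attribute in REQUIRED_ATTRIBUTES.
def missing_file_attributes(source)
  findings = []
  current = nil # the resource block being scanned, if any

  source.each_line do |line|
    if current.nil?
      # Look for the opening line of a file-like resource.
      if (m = line.match(/^([ \t]*)(#{FILE_RESOURCES.join('|')})\s+.*\bdo\s*$/))
        current = { indent: m[1], header: line.strip, seen: [] }
      end
    elsif line =~ /^#{Regexp.escape(current[:indent])}end\b/
      # Closing "end" at the same indentation: report what was not set.
      missing = REQUIRED_ATTRIBUTES - current[:seen]
      findings << [current[:header], missing] unless missing.empty?
      current = nil
    elsif (m = line.match(/^\s*(\w+)\b/))
      # Record the first word of each line inside the block.
      current[:seen] << m[1]
    end
  end

  findings
end
```

Anything it reports still needs a human decision about the right owner, group and mode; the script only points at where to look.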

Different execution context

One of the major differences between Chef and itamae comes down to the execution context of the recipes. In both Chef and itamae, the configuration is written in a DSL embedded in Ruby. This means that the recipes are just Ruby code, and the difference has to do with where that code is executed. With Chef, the recipes are always executed on the machine you are configuring, while with itamae the recipe is executed on the workstation where you run itamae, and is translated into commands that are then executed on the machine being configured.

For example, if you need to configure a service based on how much RAM the machine has, with Chef you could do something like this:

# MemTotal is reported in kB; convert it to an integer before doing arithmetic
total_ram = File.readlines("/proc/meminfo").find do |l|
  l.split.first == "MemTotal:"
end.split[1].to_i

file "/etc/service.conf" do
  # use 20% of the total RAM
  content "cache_size = #{total_ram / 5}KB"
end

With itamae, all that Ruby code will run on the workstation where itamae is invoked, so total_ram will contain the workstation's amount of RAM instead of the target machine's. In the Debian CI case, I worked around that by explicitly declaring the amount of RAM in the static host configuration, and the above construct ended up as something like this:

file "/etc/service.conf" do
  # use 20% of the total RAM
  content "cache_size = #{node['total_ram'] / 5}KB"
end
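One consequence of declaring the value statically is that a host without it should fail loudly rather than produce a bogus configuration. The following is a minimal sketch of that idea: a plain Hash stands in for the node object so the snippet runs by itself, the cache_size_line helper is hypothetical, and the RAM figure is made up.

```ruby
# Stand-in for the per-host attributes; in a real recipe these come from the
# static host configuration. The value is in kB, like MemTotal in /proc/meminfo.
node = { "total_ram" => 8_000_000 }

# Build the config line from declared attributes only.
# Hash#fetch raises KeyError when the host does not declare total_ram.
def cache_size_line(node)
  total_ram = node.fetch("total_ram")
  # use 20% of the total RAM
  "cache_size = #{total_ram / 5}KB"
end

puts cache_size_line(node)
```

With the attribute missing, the call raises instead of silently writing a configuration based on the wrong machine's memory.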

Lessons learned

This migration is now complete, and there are a few points that I take away from it:

  • The migration is definitely viable, and I'm glad I picked itamae after all.
  • Of course, itamae is way simpler than Chef, and has fewer features. On the other hand, this means that it is a simple package to maintain, with fewer dependencies, and keeping it up to date is a lot easier.
  • itamae is considerably slower than Chef. In my local tests, a noop execution (e.g. re-applying the configuration a second time) against local VMs with itamae takes 3x the time it takes with Chef.

All in all, the system is working just fine, and I consider this to have been a successful migration. I'm happy it worked out.