gotcha: using ccache in Debian package builds

Before I upload packages to Debian, I always do a full build from source under sbuild. This ensures that the package can build from source on a clean environment, implying that the set of build dependencies is complete.

But when iterating on a non-trivial package locally, I will usually build the package directly on my Debian testing system, and I want to take advantage of ccache to cache native (C/C++) code compilation to speed things up. In Debian, the easiest way to enable ccache is to add /usr/lib/ccache to your $PATH. I do this by doing something similar to the following in my ~/.bashrc:

export PATH=/usr/lib/ccache:$PATH

I noticed, however, that my Debian package builds were not using the cache. When building the same small package manually using make, the cache was used, but not when the build was wrapped with dpkg-buildpackage.

I tracked it down to the fact that in compatibility level 13+, debhelper will set $HOME to a temporary directory. For what's it worth, I think that's a good thing: you don't want package builds reaching for your home directory as that makes it harder to make builds reproducible, among other things.

This behavior, however, breaks ccache. The default cache directory is $HOME/.ccache, but that only gets resolved when ccache is actually used. So we end up starting with an empty cache on each build, get a 100% cache miss rate, and still pay for the overhead of populating the cache.

The fix is to explicitly set $CCACHE_DIR upfront, so that by the time $HOME gets overriden, it doesn't matter anymore for ccache. I did this in my ~/.bashrc:

export CCACHE_DIR=$HOME/.ccache

This way, $HOME will be expanded right there when the shell starts, and by the time ccache is called, it will use the persistent cache in my home directory even though $HOME will, at that point, refer to a temporary directory.

Debian CI: 10 years later

It was 2013, and I was on a break from work between Christmas and New Year of 2013. I had been working at Linaro for well over a year, on the LAVA project. I was living and breathing automated testing infrastructure, mostly for testing low-level components such as kernels and bootloaders, on real hardware.

At this point I was also a Debian contributor for quite some years, and had become an official project members two years prior. Most of my involvement was in the Ruby team, where we were already consistently running upstream test suites during package builds.

During that break, I put these two contexts together, and came to the conclusion that Debian needed a dedicated service that would test the contents of the Debian archive. I was aware of the existance of autopkgtest, and started working on a very simple service that would later become Debian CI.

In January 2014, debci was initially announced on that month's Misc Developer News, and later uploaded to Debian. It's been continuously developed for the last 10 years, evolved from a single shell script running tests in a loop into a distributed system with 47 geographically-distributed machines as of writing this piece, became part of the official Debian release process gating migrations to testing, had 5 Summer of Code and Outrechy interns working on it, and processed beyond 40 million test runs.

In there years, Debian CI has received contributions from a lot of people, but I would like to give special credits to the following:

  • Ian Jackson - created autopkgtest.
  • Martin Pitt - was the maintainer of autopkgtest when Debian CI launched and helped a lot for some time.
  • Paul Gevers - decided that he wanted Debian CI test runs to control testing migration. While at it, became a member of the Debian Release Team and the other half of the permanent Debian CI team together with me.
  • Lucas Kanashiro - Google Summer of Code intern, 2014.
  • Brandon Fairchild - Google Summer of Code intern, 2014.
  • Candy Tsai - Outreachy intern, 2019.
  • Pavit Kaur - Google Summer of Code intern, 2021
  • Abiola Ajadi - Outreachy intern, December 2021-2022.

Triaging Debian build failure logs with collab-qa-tools

The Ruby team is working now on transitioning to ruby 3.0. Even though most packages will work just fine, there is substantial amount of packages that require some work to adapt. We have been doing test rebuilds for a while during transitions, but usually triaged the problems manually.

This time I decided to try collab-qa-tools, a set of scripts Lucas Nussbaum uses when he does archive-wide rebuilds. I'm really glad that I did, because those tols save a lot of time when processing a large number of build failures. In this post, I will go through how to triage a set of build logs using collab-qa-tools.

I have made some improvements to the code. Given my last merge request is very new and was not merged yet, a few of the things I mention here may apply only to my own ruby3.0 branch.

collab-qa-tools also contains a few tools do perform the builds in the cloud, but since we already had the builds done, I will not be mentioning that part and will write exclusively about the triaging tools.

Installing collab-qa-tools

The first step is to clone the git repository. Make sure you have the dependencies from debian/control installed (a few Ruby libraries).

One of the patches I sent, and was already accepted, is the ability to run it without the need to install:

source /path/to/collab-qa-tools/activate.sh

This will add the tools to your $PATH.

Preparation

The first think you need to do is getting all your build logs in a directory. The tools assume .log file extension, and they can be named ${PACKAGE}_*.log or just ${PACKAGE}.log.

Creating a TODO file

cqa-scanlogs | grep -v OK > todo

todo will contain one line for each log with a summary of the failure, if it's able to find one. collab-qa-tools has a large set of regular expressions for finding errors in the build logs

It's a good idea to split the TODO file in multiple ones. This can easily be done with split(1), and can be used to delimit triaging sessions, and/or to split the triaging between multiple people. For example this will create todo into todo00, todo01, ..., each containing 30 lines:

split --lines=30 --numeric-suffixes todo todo

Triaging

You can now do the triaging. Let's say we split the TODO files, and will start with todo01.

The first step is calling cqa-fetchbugs (it does what it says on the tin):

cqa-fetchbugs --TODO=todo01

Then, cqa-annotate will guide you through the logs and allow you to report bugs:

cqa-annotate --TODO=todo01

I wrote myself a process.sh wrapper script for cqa-fetchbugs and cqa-annotate that looks like this:

#!/bin/sh

set -eu

for todo in $@; do
  # force downloading bugs
  awk '{print(".bugs." $1)}' "${todo}" | xargs rm -f
  cqa-fetchbugs --TODO="${todo}"

  cqa-annotate \
    --template=template.txt.jinja2 \
    --TODO="${todo}"
done

The --template option is a recent contribution of mine. This is a template for the bug reports you will be sending. It uses Liquid templates, which is very similar to Jinja2 for Python. You will notice that I am even pretending it is Jinja2 to trick vim into doing syntax highlighting for me. The template I'm using looks like this:

From: {{ fullname }} <{{ email }}>
To: submit@bugs.debian.org
Subject: {{ package }}: FTBFS with ruby3.0: {{ summary }}

Source: {{ package }}
Version: {{ version | split:'+rebuild' | first }}
Severity: serious
Justification: FTBFS
Tags: bookworm sid ftbfs
User: debian-ruby@lists.debian.org
Usertags: ruby3.0

Hi,

We are about to enable building against ruby3.0 on unstable. During a test
rebuild, {{ package }} was found to fail to build in that situation.

To reproduce this locally, you need to install ruby-all-dev from experimental
on an unstable system or build chroot.

Relevant part (hopefully):
{% for line in extract %}> {{ line }}
{% endfor %}

The full build log is available at
https://people.debian.org/~kanashiro/ruby3.0/round2/builds/3/{{ package }}/{{ filename | replace:".log",".build.txt" }}

The cqa-annotate loop

cqa-annotate will parse each log file, display an extract of what it found as possibly being the relevant part, and wait for your input:

######## ruby-cocaine_0.5.8-1.1+rebuild1633376733_amd64.log ########
--------- Error:
     Failure/Error: undef_method :exitstatus

     FrozenError:
       can't modify frozen object: pid 2351759 exit 0
     # ./spec/support/unsetting_exitstatus.rb:4:in `undef_method'
     # ./spec/support/unsetting_exitstatus.rb:4:in `singleton class'
     # ./spec/support/unsetting_exitstatus.rb:3:in `assuming_no_processes_have_been_run'
     # ./spec/cocaine/errors_spec.rb:55:in `block (2 levels) in <top (required)>'

Deprecation Warnings:

Using `should` from rspec-expectations' old `:should` syntax without explicitly enabling the syntax is deprecated. Use the new `:expect` syntax or explicitly enable `:should` with `config.expect_with(:rspec) { |c| c.syntax = :should }` instead. Called from /<<PKGBUILDDIR>>/spec/cocaine/command_line/runners/backticks_runner_spec.rb:19:in `block (2 levels) in <top (required)>'.


If you need more of the backtrace for any of these deprecations to
identify where to make the necessary changes, you can configure
`config.raise_errors_for_deprecations!`, and it will turn the
deprecation warnings into errors, giving you the full backtrace.

1 deprecation warning total

Finished in 6.87 seconds (files took 2.68 seconds to load)
67 examples, 1 failure

Failed examples:

rspec ./spec/cocaine/errors_spec.rb:54 # When an error happens does not blow up if running the command errored before execution

/usr/bin/ruby3.0 -I/usr/share/rubygems-integration/all/gems/rspec-support-3.9.3/lib:/usr/share/rubygems-integration/all/gems/rspec-core-3.9.2/lib /usr/share/rubygems-integration/all/gems/rspec-core-3.9.2/exe/rspec --pattern ./spec/\*\*/\*_spec.rb --format documentation failed
ERROR: Test "ruby3.0" failed:
----------------
ERROR: Test "ruby3.0" failed:      Failure/Error: undef_method :exitstatus
----------------
package: ruby-cocaine
lines: 30
------------------------------------------------------------------------
s: skip
i: ignore this package permanently
r: report new bug
f: view full log
------------------------------------------------------------------------
Action [s|i|r|f]:

You can then choose one of the options:

  • s - skip this package and do nothing. You can run cqa-annotate again later and come back to it.
  • i - ignore this package completely. New runs of cqa-annotate won't ask about it again.

    This is useful if the package only fails in your rebuilds due to another package, and would just work when that other package gets fixes. In the Ruby transition this happens when A depends on B, while B builds a C extension and failed to build against the new Ruby. So once B is fixed, A should just work (in principle). But even if A would even have problems of its own, we can't really know before B is fixed so we can retry A.

  • r - report a bug. cqa-annotate will expand the template with the data from the current log, and feed it to mutt. This is currently a limitation: you have to use mutt to report bugs.

    After you report the bug, cqa-annotate will ask if it should edit the TODO file. In my opinion it's best to not do this, and annotate the package with a bug number when you have one (see below).

  • f - view the full log. This is useful when the extract displayed doesn't have enough info, or you want to inspect something that happened earlier (or later) during the build.

When there are existing bugs in the package, cqa-annotate will list them among the options. If you choose a bug number, the TODO file will be annotated with that bug number and new runs of cqa-annotate will not ask about that package anymore. For example after I reported a bug for ruby-cocaine for the issue listed above, I aborted with a ctrl-c, and when I run my process.sh script again I then get this prompt:

----------------
ERROR: Test "ruby3.0" failed:      Failure/Error: undef_method :exitstatus
----------------
package: ruby-cocaine
lines: 30
------------------------------------------------------------------------
s: skip
i: ignore this package permanently
1: 996206 serious ruby-cocaine: FTBFS with ruby3.0: ERROR: Test "ruby3.0" failed:      Failure/Error: undef_method :exitstatus ||
r: report new bug
f: view full log
------------------------------------------------------------------------
Action [s|i|1|r|f]:

Chosing 1 will annotate the TODO file with the bug number, and I'm done with this package. Only a few other hundreds to go.

Getting help with autopkgtest for your package

If you have been involved in Debian packaging at all in the last few years, you are probably aware that autopkgtest is now an important piece of the Debian release process. Back in 2018, the automated testing migration process started considering autopkgtest test results as part of its decision making.

Since them, this process has received several improvements. For example, during the bullseye freeze, non-key packages with a non-trivial autopkgtest test suite could migrate automatically to testing without their maintainers needing to open unblock requests, provided there was no regression in theirs autopkgtest (or those from their reverse dependencies).

Since 2014 when ci.debian.net was first introduced, we have seen an amazing increase in the number of packages in Debian that can be automatically tested. We went from around 100 to 15,000 today. This means not only happier maintainers because their packages get to testing faster, but also improved quality assurance for Debian as a whole.

Chart showing the number of packages tested by ci.debian.net. Starts from close to 0 in 2014, up to 15,000 in 2021. The growth tendency seems to slow down in the last year

However, the growth rate seems to be decreasing. Maybe the low hanging fruit have all been picked, or maybe we just need to help more people jump in the automated testing bandwagon.

With that said, we would like to encourage and help more maintainers to add autopkgtest to their packages. To that effect, I just created the autopkgtest-help repository on salsa, where we will take help requests from maintainers working on autopkgtest for their packages.

If you want help, please go ahead and create an issue in there. To quote the repository README:

Valid requests:

  • "I want to add autopkgtest to package X. X is a tool that [...] and it works by [...]. How should I approach testing it?"

    It's OK if you have no idea where to start. But at least try to describe your package, what it does and how it works so we can try to help you.

  • "I started writing autopkgtest for X, here is my current work in progress [link]. But I encountered problem Y. How to I move forward?"

    If you already have an autopkgtest but is having trouble making it work as you think it should, you can also ask here.

Invalid requests:

  • "Please write autopkgtest for my package X for me".

    As with anything else in free software, please show appreciation for other people's time, and do your own research first. If you pose your question with enough details (see above) and make it interesting, it may be that whoever answers will write at least a basic structure for you, but as the maintainer you are still the expert in the package and what tests are relevant.

If you ask your question soon, you might get your answer recorded in video: we are going to have a DebConf21 talk next month, where we I and Paul Gevers (elbrus) will answer a few autopkgtest questions in video for posterity.

Now, if you have experience enabling autopkgtest for you own packages, please consider watching that repository there to help us help our fellow maintainers.

Debian Continuous Integration now using Salsa logins

I have just updated the Debian Continuous Integration platform with debci 3.1.

This update brings a few database performance improvements, courtesy of adding indexes to very important columns that were missing them. And boy, querying a table with 13 million rows without the proper indexes is bad! :-)

Now, the most user visible change in this update is the change from Debian SSO to Salsa Logins, which is part of Pavit Kaur's GSoC work. She has been working with me and Paul Gevers for a few weeks, and this was the first official task in the internship.

For users, this means that you now can only log in via Salsa. If you have an existing session where you logged in with an SSO certificate, it will still be valid. When you log in with Salsa, your username will be changed to match the one in Salsa. This means that if your account on salsa gets renamed, it will automatically be renamed on Debian CI when you log in the next time. Unfortunately we don't have a logout feature yet, but in the meantime you can use the developer toolbar to delete any existing cookies you might have for ci.debian.net.

Migrating to Salsa logins was in my TODO list for a while. I had the impression that it could do it pretty quick to do by using pre-existing libraries that provide gitlab authentication integration for Rack (Ruby's de facto standard web application interface, like uwsgi for Python). But in reality, the devil was in the details.

We went through several rounds of reviews to get it right. During the entire process, Pavit demonstrated an excelent capacity for responding to feedback, and overall I'm very happy with her performance in the internship so far.

While we were discussing the Salsa logins, we noted a limitation in the existing database structure, where we stored usernames directly as the test requestor field, and decided it was better to normalize that relationship with a proper foreign key to the users table, which she also worked on.

This update also include the very first (and non-user visible) step of her next task, which is adding support for having private tests. Those will be useful for implementing testing for embargoed security updates, and other use cases. This was broken up into 7 or 8 seperate steps, so there is still some work to do there. I'm looking forward to the continuation of this work.


For older posts, see the blog archive.