Rebuilding Debian on Arm
Wookey
What
- Rebuild all of Debian on arm64
Nothing clever, but a good example of Debian ecosystem usage in companies
Why?
- Toolchain testing
- Debian is a good approximation to 'everything'
- On arm64 (March 2015):
20000 source packages (75 GB)
42000 binary packages (130 GB)
about 10^9 lines of source code
Requirements
- Internal (errata are sensitive)
- Quickly!
Who?
- Edmund Grimley-Evans - Building
- Thomas Preud'homme - Toolchains
- Wookey - Advice
Hardware
- 14 HP ProLiant m400 (Moonshot) servers
running Ubuntu 14.04.2 LTS
- SSH root access
How?
Code re-use?
- Lucas' cloud-scripts
- for AWS (Amazon) - not private
- written in Ruby
- rebuildd
- debile
- wanna-build
- Polling: slow?
- Email setup faff
- pybit?
- collab-qa-tools log analysis?
How
DIY
- 40-line shell script
- (+ machine setup)
- About 3 days to get 1st build started
Design
- Standard tools:
(sbuild/schroot, apt-cacher, reprepro)
- Avoid skew by building against a fixed Debian snapshot
- Put testtools
- Build longest jobs first
- Package (modified) toolchain
- Keep binaries for scanning
- Rebuild against rebuild to be sure
Toolchain Prep
- Get (fixed) upstream tarball
- Apply latest Debian diff
- Bump version to ensure used
- dpkg-buildpackage
Put in 'overlay' repo
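The toolchain prep steps above can be sketched roughly as follows. This is a minimal sketch, not the script used for the rebuild: the `+rebuild1` suffix and the function names are assumptions, and `dch` comes from the devscripts package.

```shell
#!/bin/sh
# Sketch only: the suffix and function names are hypothetical.

# Append a suffix so the rebuilt toolchain sorts above the archive version
# and is therefore the one apt installs.
bump_version() {
    printf '%s+rebuild1\n' "$1"
}

# Run inside the unpacked upstream source with the Debian diff applied.
build_toolchain() {
    old=$(dpkg-parsechangelog -S Version) || return 1
    new=$(bump_version "$old")
    dch -v "$new" "Rebuild with fixed upstream tarball" || return 1
    dpkg-buildpackage -us -uc
}
```

The resulting .deb files would then go into the 'overlay' repo.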
Package list
wget http://snapshot.debian.org/archive/debian/20150313T152933Z/dists/jessie/main/binary-arm64/Packages.xz
xzcat Packages.xz |
perl -e 'foreach (split(/\n\n/, join("", <>))) { @x = split(/\n/, $_);
if (@t = grep(/^Version:/, @x)) { $v = $t[0]; } else { die; }
$v =~ s/^\S+: //;
if (@t = grep(/^Source:/, @x)) { $p = $t[0]; }
elsif (@t = grep(/^Package:/, @x)) { $p = $t[0]; } else { die; }
if ($p =~ / \((\S+)\)$/) { $v = $1; }
$p =~ s/^\S+: //; $p =~ s/ .*//;
if (@t = grep(/^Filename:/, @x)) { $f = $t[0]; } else { die; }
$f =~ s/^\S+: //;
$f =~ s!^pool/main/!! || die;
$f =~ s!/.*!! || die;
print "$f ${p}_$v\n"; }' | sort -u > packages
Better, something like:
grep-dctrl -s Source -S --pattern="" -n Packages | sort | uniq
Package list
Use buildd layout
4 4g8_1.0-3
6 6tunnel_0.11rc2-8
7 7kaa_2.14.4-1
9 9base_1:6-6
9 9menu_1.8-6
9 9mount_1.3-10
9 9wm_1.2-9
Sort by build-time.
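One way to sort by build time, assuming timings from an earlier run are available. The `times` file (lines of `seconds package_version`) is hypothetical; only the `packages` format is from the slides. Packages with no recorded time are treated as 0 and end up last.

```shell
#!/bin/sh
# sort_by_build_time TIMES PACKAGES
# TIMES:    "seconds package_version" per line (hypothetical file)
# PACKAGES: "dir package_version" per line, as in the list above
sort_by_build_time() {
    awk 'NR == FNR { t[$2] = $1; next }
         { s = ($2 in t) ? t[$2] : 0; print s, $1, $2 }' "$1" "$2" |
        sort -s -k1,1nr |   # longest first, stable for ties
        cut -d' ' -f2-      # drop the time column again
}
```

Usage would be something like `sort_by_build_time times packages > packages.sorted`.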
How it works
- One machine as Master
- Run 'building' script (x2) for each build node
- Each 'building' runs
ssh $buildnode "sbuild -d $dist ${pkg}_$ver"
- Builds packages in list order
- Try builds up to 3 times
- Locking by file tree on master
- Binaries and logs left on build machines
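The "locking by file tree" idea relies on mkdir being atomic: when several 'building' processes race for the same package, exactly one mkdir succeeds. A minimal sketch, with `claim` as a hypothetical helper name:

```shell
#!/bin/sh
# claim CTRL ATTEMPT DIR PACKAGE
# Succeeds only for the first caller; later callers see mkdir fail
# because the directory already exists.
claim() {
    mkdir -p "$1/$2/$3" && mkdir "$1/$2/$3/$4" 2>/dev/null
}
```

A worker would call something like `claim ctrl/something 1 9 9base_1:6-6` and build only if it returns success.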
Running a build
create: ~/ctrl/something/packages
then run:
for x in 01 02 03 04 05 06 07 08 09 10 11 12 13 14 ; do
  h=moonshot-$x
  bin/building `hostname` $h something &
  bin/building `hostname` $h something &
done
shows attempts:
~/ctrl/something/[123]/p/package
shows successes:
~/ctrl/something/built/p/package
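Since attempts and successes are just directories, progress can be read straight off the tree. A small sketch, assuming the layout shown above (`<attempt>/<prefix>/<package>` and `built/<prefix>/<package>`); the function names are hypothetical:

```shell
#!/bin/sh
# count_entries DIR: number of DIR/<prefix>/<package> directories.
count_entries() {
    [ -d "$1" ] || { echo 0; return; }
    find "$1" -mindepth 2 -maxdepth 2 -type d | wc -l | tr -d ' '
}

# summary CTRL: one line per attempt number, plus the built count.
summary() {
    ctrl=$1
    for n in 1 2 3; do
        echo "attempt $n: $(count_entries "$ctrl/$n")"
    done
    echo "built: $(count_entries "$ctrl/built")"
}
```

For example, `summary ~/ctrl/something` while a rebuild is running.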
The script
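# Usage: building MASTER SERVER NAME DIST
master=$1        # host holding the ctrl/ lock tree
server=$2        # build node to drive over ssh
ctrl="ctrl/$3"   # lock and status tree on the master
out="out/$3"     # build output directory on the build node
dist=$4          # distribution passed to sbuild -d

# build DIR PACKAGE_VERSION: run sbuild on the build node; the post-build
# command records the package under built/ on the master.
build() {
    mkdir -p "$ctrl/server-$server/$1/$2"
    ssh -n "$server" \
        "mkdir -p \"$out/$1\" && cd \"$out/$1\" && sbuild -d $dist $2 \
        --post-build-commands='ssh -n $master mkdir -p \"$ctrl\"/built/$1/$2'"
}

# mkdir is atomic, so only one 'building' process claims each package.
while read -r d p ; do
    mkdir -p "$ctrl/1/$d"
    if mkdir "$ctrl/1/$d/$p" 2> /dev/null ; then
        build "$d" "$p"
    fi
done < "$ctrl/packages"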
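The loop as shown only covers the first attempt. One plausible way to get "up to 3 times" is to run a pass per attempt number and skip anything already under built/. This is a guess at the structure, not the original code: `run_pass` and the build callback are hypothetical, and a package still building in an earlier attempt could be picked up again by a later pass.

```shell
#!/bin/sh
# run_pass CTRL N BUILD_FN
# One pass over CTRL/packages for attempt number N. BUILD_FN is called
# as: BUILD_FN DIR PACKAGE_VERSION
run_pass() {
    ctrl=$1 n=$2 build_fn=$3
    while read -r d p; do
        # Skip anything an earlier attempt already built.
        [ -d "$ctrl/built/$d/$p" ] && continue
        mkdir -p "$ctrl/$n/$d"
        # Atomic claim, as in the first-attempt loop.
        if mkdir "$ctrl/$n/$d/$p" 2>/dev/null; then
            "$build_fn" "$d" "$p" </dev/null
        fi
    done < "$ctrl/packages"
}

# Usage (not run here), with build() as defined in the script:
#   for n in 1 2 3; do run_pass "$ctrl" "$n" build; done
```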
Builds
- Jessie Chroot
- sbuild (without -A)
- DEB_BUILD_OPTIONS=parallel=4
Master Setup
- Set up overlay repo
- Install apt-cacher-ng, apache2, gnupg, reprepro
- Generate reprepro archive key
- Configure reprepro 'overlay'
- Expose over http
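A minimal reprepro configuration for the 'overlay' repo might look like the following. The field values, base directory and key ID are assumptions for illustration; only the repo name 'overlay' is from the slides.

```shell
#!/bin/sh
# write_overlay_conf BASEDIR KEYID
# Writes a conf/distributions file for a single 'overlay' distribution.
write_overlay_conf() {
    mkdir -p "$1/conf" || return 1
    cat > "$1/conf/distributions" <<EOF
Origin: overlay
Label: overlay
Codename: overlay
Architectures: arm64 source
Components: main
Description: Locally built toolchain packages
SignWith: $2
EOF
}

# Usage (not run here; path and key ID are placeholders):
#   write_overlay_conf /srv/overlay ABCD1234
#   reprepro -b /srv/overlay includedeb overlay gcc-*.deb
```

The base directory would then be exposed by apache2 so the chroots can fetch from it.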
Chroot Setup
- sbuild-createchroot --make-sbuild-tarball
- inside chroot:
- Set Acquire::http::Proxy to use apt-cacher
- Set Acquire::Check-Valid-Until false
- Add overlay repo
- Add archive signing key
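The apt settings inside the chroot could be written like this. A sketch under assumptions: the file names, the `/overlay` URL path and the host name are made up, and port 3142 is the apt-cacher-ng default.

```shell
#!/bin/sh
# configure_chroot ROOT MASTER
# ROOT is the chroot's root directory, MASTER the host running
# apt-cacher-ng and serving the overlay repo.
configure_chroot() {
    root=$1 master=$2
    mkdir -p "$root/etc/apt/apt.conf.d" "$root/etc/apt/sources.list.d" || return 1
    cat > "$root/etc/apt/apt.conf.d/99rebuild" <<EOF
Acquire::http::Proxy "http://$master:3142";
Acquire::Check-Valid-Until "false";
EOF
    echo "deb http://$master/overlay overlay main" \
        > "$root/etc/apt/sources.list.d/overlay.list"
}

# The archive signing key still has to be added separately, for
# example with apt-key inside the chroot.
```

Check-Valid-Until is disabled because the Release file of an old snapshot has long expired.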
Build Node Setup
- install ssh, sbuild
- copy sshkeys
- add buildd user
- copy chroot tarball
- copy sbuild keys
Results
arm64 (9889 sources)
- 4 missing build-deps
- 15 failed consistently
- 7 build 'sometimes'
- 9863 successful (99.7%)
arch all (9995 packages)
- 175 missing build-deps
- 67 failed consistently
- 9744 skipped
- 9 successful
Results
- Takes ~48 hours on 14 nodes
- 5% failures if arch all built
- Using tmpfs did not increase speed
- Using apt-cacher did
Useful QA
apg FTBFS with two users called 'root' (#783695)
audit FTBFS (tests) if user with uid 890 exists (#783698)
atlas - FTBFS sometimes
Bugs/Issues
- sbuild doesn't create ~/.gnupg
- sbuild-update --keygen really slow
- sbuild -j4 != DEB_BUILD_OPTIONS=parallel=4
Overview
Rebuild code is easy
Machine setup is most of the work
(Important) people very pleased with quick results
Enhancements
- Use LVM?
- Use tmpfs?
- Timeouts and resource constraints
- Test 3 builds per node
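For the timeout idea, coreutils `timeout` around the ssh call would be one simple option. This is a sketch of a possible enhancement, not something the current script does; the wrapper name and the limit are assumptions.

```shell
#!/bin/sh
# build_with_timeout LIMIT CMD...
# Runs CMD under a time limit. GNU timeout exits with status 124 when
# the limit is hit, which distinguishes a hung build from a failed one.
build_with_timeout() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# Usage (not run here; the 12h limit is a placeholder):
#   build_with_timeout 12h ssh -n "$server" "sbuild -d $dist $pkg"
```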
Now what?
Put this code somewhere?
Generalise/code machine setup?
What does everyone else do?