Set up a simple two-domU system: one with the network backend, running
the xendriverdomain service, and one with the frontend, trying to ping
the backend.

Unlike other similar tests, use a disk image instead of an initrd, to
allow a bigger rootfs without adding more RAM (for both dom0 and domU).
But keep using pxelinux as the bootloader, as it's easier to set up than
installing grub on the disk. Theoretically, the system could be started
via direct kernel boot in QEMU, but pxelinux is slightly closer to a
real-world deployment.

Use fakeroot to preserve file owners/permissions. This is especially
important for suid binaries like /bin/mount - without fakeroot, they
would end up suid to a non-root user.
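
To illustrate the mechanism (a sketch, not part of the test script
itself): inside a single fakeroot session, files extracted by cpio
appear owned by root, and an mkfs.ext4 -d run in the same session
records that ownership into the image:

    fakeroot sh -c '
        zcat rootfs.cpio.gz | cpio -imd  # extracted files appear root-owned
        ls -l bin/mount                  # shows -rwsr-xr-x ... root root
        mkfs.ext4 -d . rootfs.img 64M    # that ownership lands in the image
    '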
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
Changes in v3:
- add fakeroot
- run ldconfig at the disk image creation time, to avoid running it at
dom0/domU boot time (which is much slower)
Changes in v2:
- use heredoc
- limit ping loop iterations
- use full "backend" / "frontend" in disk image names
- print domU consoles directly to /dev/console, to avoid systemd-added
messages prefix
- terminate test on failure, don't wait for timeout
---
automation/build/debian/13-x86_64.dockerfile    |   2 +
automation/gitlab-ci/test.yaml                  |   8 ++++++++
automation/scripts/qemu-driverdomains-x86_64.sh | 138 ++++++++++++++++++++++++
3 files changed, 148 insertions(+)
create mode 100755 automation/scripts/qemu-driverdomains-x86_64.sh
diff --git a/automation/build/debian/13-x86_64.dockerfile b/automation/build/debian/13-x86_64.dockerfile
index 2c6c9d4a5098..6382bafbd5bd 100644
--- a/automation/build/debian/13-x86_64.dockerfile
+++ b/automation/build/debian/13-x86_64.dockerfile
@@ -55,7 +55,9 @@ RUN <<EOF

# for test phase, qemu-* jobs
busybox-static
+ e2fsprogs
expect
+ fakeroot
ovmf
qemu-system-x86

diff --git a/automation/gitlab-ci/test.yaml b/automation/gitlab-ci/test.yaml
index 7b36f1e126ca..abc5339a74ab 100644
--- a/automation/gitlab-ci/test.yaml
+++ b/automation/gitlab-ci/test.yaml
@@ -656,6 +656,14 @@ qemu-alpine-x86_64-gcc:
- *x86-64-test-needs
- alpine-3.22-gcc

+qemu-alpine-driverdomains-x86_64-gcc:
+ extends: .qemu-x86-64
+ script:
+ - ./automation/scripts/qemu-driverdomains-x86_64.sh 2>&1 | tee ${LOGFILE}
+ needs:
+ - *x86-64-test-needs
+ - alpine-3.22-gcc
+
qemu-smoke-x86-64-gcc:
extends: .qemu-smoke-x86-64
script:
diff --git a/automation/scripts/qemu-driverdomains-x86_64.sh b/automation/scripts/qemu-driverdomains-x86_64.sh
new file mode 100755
index 000000000000..c0241da54168
--- /dev/null
+++ b/automation/scripts/qemu-driverdomains-x86_64.sh
@@ -0,0 +1,138 @@
+#!/bin/bash
+
+set -ex -o pipefail
+
+dom0_rootfs_extra_comp=()
+dom0_rootfs_extra_uncomp=()
+
+cd binaries
+
+# DomU rootfs
+
+mkdir -p rootfs
+cd rootfs
+mkdir -p etc/local.d
+passed="ping test passed"
+failed="TEST FAILED"
+cat > etc/local.d/xen.start << EOF
+#!/bin/bash
+
+set -x
+
+if grep -q test=backend /proc/cmdline; then
+ brctl addbr xenbr0
+ ip link set xenbr0 up
+ ip addr add 192.168.0.1/24 dev xenbr0
+ bash /etc/init.d/xendriverdomain start
+ # log backend-related logs to the console
+ tail -F /var/log/xen/xldevd.log /var/log/xen/xen-hotplug.log >>/dev/console 2>/dev/null &
+else
+ ip link set eth0 up
+ ip addr add 192.168.0.2/24 dev eth0
+ timeout=6 # 6*10s
+ until ping -c 10 192.168.0.1; do
+ sleep 1
+ if [ \$timeout -le 0 ]; then
+ echo "${failed}"
+ exit 1
+ fi
+ ((timeout--))
+ done
+ echo "${passed}"
+fi
+EOF
+chmod +x etc/local.d/xen.start
+fakeroot sh -c "
+ zcat ../rootfs.cpio.gz | cpio -imd
+ zcat ../xen-tools.cpio.gz | cpio -imd
+ ldconfig -r .
+ touch etc/.updated
+ mkfs.ext4 -d . ../domU-rootfs.img 1024M
+"
+cd ..
+rm -rf rootfs
+
+# Dom0 rootfs
+mkdir -p rootfs
+cd rootfs
+fakeroot -s ../fakeroot-save sh -c "
+ zcat ../rootfs.cpio.gz | cpio -imd
+ zcat ../xen-tools.cpio.gz | cpio -imd
+ ldconfig -r .
+ touch etc/.updated
+"
+mkdir -p root etc/local.d
+cat > root/backend.cfg << EOF
+name="backend"
+memory=512
+vcpus=1
+kernel="/root/bzImage"
+extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=backend"
+disk=[ '/root/domU-rootfs-backend.img,raw,xvda,rw' ]
+EOF
+cat > root/frontend.cfg << EOF
+name="frontend"
+memory=512
+vcpus=1
+kernel="/root/bzImage"
+extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=frontend"
+disk=[ '/root/domU-rootfs-frontend.img,raw,xvda,rw' ]
+vif=[ 'bridge=xenbr0,backend=backend' ]
+EOF
+
+cat > etc/local.d/xen.start << EOF
+#!/bin/bash
+
+set -x
+
+bash /etc/init.d/xencommons start
+
+xl list
+
+tail -F /var/log/xen/console/guest-backend.log 2>/dev/null | sed -e "s/^/(backend) /" >>/dev/console &
+tail -F /var/log/xen/console/guest-frontend.log 2>/dev/null | sed -e "s/^/(frontend) /" >>/dev/console &
+xl -vvv create /root/backend.cfg
+xl -vvv create /root/frontend.cfg
+EOF
+chmod +x etc/local.d/xen.start
+
+cp ../domU-rootfs.img ./root/domU-rootfs-backend.img
+cp ../domU-rootfs.img ./root/domU-rootfs-frontend.img
+cp ../bzImage ./root/
+mkdir -p etc/default
+echo 'XENCONSOLED_TRACE=all' >> etc/default/xencommons
+mkdir -p var/log/xen/console
+fakeroot -i ../fakeroot-save mkfs.ext4 -d . ../dom0-rootfs.img 2048M
+cd ..
+rm -rf rootfs
+
+cd ..
+
+cat >> binaries/pxelinux.0 << EOF
+#!ipxe
+
+kernel xen console=com1 console_timestamps=boot
+module bzImage console=hvc0 root=/dev/sda net.ifnames=0
+boot
+EOF
+
+# Run the test
+rm -f smoke.serial
+export TEST_CMD="qemu-system-x86_64 \
+ -cpu qemu64,+svm \
+ -m 2G -smp 2 \
+ -monitor none -serial stdio \
+ -nographic \
+ -device virtio-net-pci,netdev=n0 \
+ -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 \
+ -drive file=binaries/dom0-rootfs.img,format=raw"
+
+export TEST_LOG="smoke.serial"
+export BOOT_MSG="Latest ChangeSet: "
+export LOG_MSG="Domain-0"
+# exit early on test failure too; check below whether it actually passed
+export PASSED="$passed|$failed"
+
+./automation/scripts/console.exp | sed 's/\r\+$//'
+
+grep "$passed" smoke.serial
--
git-series 0.9.1
On Sat, 6 Dec 2025, Marek Marczykowski-Górecki wrote:
> +fakeroot sh -c "
> + zcat ../rootfs.cpio.gz | cpio -imd
> + zcat ../xen-tools.cpio.gz | cpio -imd
> + ldconfig -r .
> + touch etc/.updated
> + mkfs.ext4 -d . ../domU-rootfs.img 1024M
Do we really need 1GB? I would rather use a smaller size if possible.
I would rather use as few resources as possible on the build server,
as we might run a few of these jobs in parallel one day soon.

Moreover, this script will be run inside a container, which means this
data is probably in RAM.

The underlying rootfs is 25M on both ARM and x86. This should be at most
50M.
> +"
> +cd ..
> +rm -rf rootfs
> +
> +# Dom0 rootfs
> +mkdir -p rootfs
> +cd rootfs
> +fakeroot -s ../fakeroot-save sh -c "
> + zcat ../rootfs.cpio.gz | cpio -imd
> + zcat ../xen-tools.cpio.gz | cpio -imd
> + ldconfig -r .
> + touch etc/.updated
> +"
> +mkdir -p root etc/local.d
> +cat > root/backend.cfg << EOF
> +name="backend"
> +memory=512
> +vcpus=1
> +kernel="/root/bzImage"
> +extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=backend"
> +disk=[ '/root/domU-rootfs-backend.img,raw,xvda,rw' ]
> +EOF
> +cat > root/frontend.cfg << EOF
> +name="frontend"
> +memory=512
> +vcpus=1
> +kernel="/root/bzImage"
> +extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=frontend"
> +disk=[ '/root/domU-rootfs-frontend.img,raw,xvda,rw' ]
> +vif=[ 'bridge=xenbr0,backend=backend' ]
> +EOF
> +
> +cat > etc/local.d/xen.start << EOF
> +#!/bin/bash
> +
> +set -x
> +
> +bash /etc/init.d/xencommons start
> +
> +xl list
> +
> +tail -F /var/log/xen/console/guest-backend.log 2>/dev/null | sed -e "s/^/(backend) /" >>/dev/console &
> +tail -F /var/log/xen/console/guest-frontend.log 2>/dev/null | sed -e "s/^/(frontend) /" >>/dev/console &
> +xl -vvv create /root/backend.cfg
> +xl -vvv create /root/frontend.cfg
> +EOF
> +chmod +x etc/local.d/xen.start
> +
> +cp ../domU-rootfs.img ./root/domU-rootfs-backend.img
> +cp ../domU-rootfs.img ./root/domU-rootfs-frontend.img
> +cp ../bzImage ./root/
> +mkdir -p etc/default
> +echo 'XENCONSOLED_TRACE=all' >> etc/default/xencommons
> +mkdir -p var/log/xen/console
> +fakeroot -i ../fakeroot-save mkfs.ext4 -d . ../dom0-rootfs.img 2048M
Same here. Also 2GB might not be sufficient to contain 2 copies of
domU-rootfs.img, given that domU-rootfs.img is 1GB.
If we bring down domU-rootfs.img to 50M, then this could be 150M.
On Tue, Dec 09, 2025 at 04:02:06PM -0800, Stefano Stabellini wrote:
> > + mkfs.ext4 -d . ../domU-rootfs.img 1024M
>
> Do we really need 1GB? I would rather use a smaller size if possible.
> I would rather use as few resources as possible on the build server,
> as we might run a few of these jobs in parallel one day soon.
This will be a sparse file, so it won't really use all the space. But
this size is the upper bound of what can be put inside.

That said, it's worth checking whether sparse files work properly in
/build on all the runners. AFAIR some older docker versions had issues
with that (was it aufs not supporting sparse files?).
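
A quick way to check that on a given runner (an illustrative snippet,
comparing apparent size against allocated blocks):

    truncate -s 1G /build/sparse-test.img
    ls -lhs /build/sparse-test.img  # apparent size: 1.0G
    du -h /build/sparse-test.img    # allocated size: ~0 if sparse works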
> Moreover this script will be run inside a container which means this
> data is probably in RAM.
Are runners configured to use tmpfs for /build? I don't think it's the
default.
> The underlying rootfs is 25M on both ARM and x86. This should be at most
> 50M.
Rootfs itself is small, but for driver domains it needs to include the
toolstack too, and xen-tools.cpio is over 600MB (for a debug build).
I might be able to pick just the parts needed for the driver domain (xl
with its deps, maybe some startup scripts, probably a few more files),
but it's rather fragile.
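
For illustration, the cherry-picking could look roughly like this (a
hypothetical sketch - the open-ended file list is exactly what makes it
fragile):

    # copy xl plus the shared libraries it links against
    mkdir -p picked/usr/sbin
    cp usr/sbin/xl picked/usr/sbin/
    for lib in $(ldd usr/sbin/xl | awk '/=>/ { print $3 }'); do
        install -D "$lib" "picked$lib"
    done
    # ...plus hotplug scripts, xenstore clients, init scripts - easy to miss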
> > +"
> > +cd ..
> > +rm -rf rootfs
> > +
> > +# Dom0 rootfs
> > +mkdir -p rootfs
> > +cd rootfs
> > +fakeroot -s ../fakeroot-save sh -c "
> > + zcat ../rootfs.cpio.gz | cpio -imd
> > + zcat ../xen-tools.cpio.gz | cpio -imd
> > + ldconfig -r .
> > + touch etc/.updated
> > +"
> > +mkdir -p root etc/local.d
> > +cat > root/backend.cfg << EOF
> > +name="backend"
> > +memory=512
> > +vcpus=1
> > +kernel="/root/bzImage"
> > +extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=backend"
> > +disk=[ '/root/domU-rootfs-backend.img,raw,xvda,rw' ]
> > +EOF
> > +cat > root/frontend.cfg << EOF
> > +name="frontend"
> > +memory=512
> > +vcpus=1
> > +kernel="/root/bzImage"
> > +extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=frontend"
> > +disk=[ '/root/domU-rootfs-frontend.img,raw,xvda,rw' ]
> > +vif=[ 'bridge=xenbr0,backend=backend' ]
> > +EOF
> > +
> > +cat > etc/local.d/xen.start << EOF
> > +#!/bin/bash
> > +
> > +set -x
> > +
> > +bash /etc/init.d/xencommons start
> > +
> > +xl list
> > +
> > +tail -F /var/log/xen/console/guest-backend.log 2>/dev/null | sed -e "s/^/(backend) /" >>/dev/console &
> > +tail -F /var/log/xen/console/guest-frontend.log 2>/dev/null | sed -e "s/^/(frontend) /" >>/dev/console &
> > +xl -vvv create /root/backend.cfg
> > +xl -vvv create /root/frontend.cfg
> > +EOF
> > +chmod +x etc/local.d/xen.start
> > +
> > +cp ../domU-rootfs.img ./root/domU-rootfs-backend.img
> > +cp ../domU-rootfs.img ./root/domU-rootfs-frontend.img
> > +cp ../bzImage ./root/
> > +mkdir -p etc/default
> > +echo 'XENCONSOLED_TRACE=all' >> etc/default/xencommons
> > +mkdir -p var/log/xen/console
> > +fakeroot -i ../fakeroot-save mkfs.ext4 -d . ../dom0-rootfs.img 2048M
>
> Same here. Also 2GB might not be sufficient to contain 2 copies of
> domU-rootfs.img, given that domU-rootfs.img is 1GB.
See the note about sparse files.
> If we bring down domU-rootfs.img to 50M, then this could be 150M.
>
>
> > +cd ..
> > +rm -rf rootfs
> > +
> > +cd ..
> > +
> > +cat >> binaries/pxelinux.0 << EOF
> > +#!ipxe
> > +
> > +kernel xen console=com1 console_timestamps=boot
> > +module bzImage console=hvc0 root=/dev/sda net.ifnames=0
> > +boot
> > +EOF
> > +
> > +# Run the test
> > +rm -f smoke.serial
> > +export TEST_CMD="qemu-system-x86_64 \
> > + -cpu qemu64,+svm \
> > + -m 2G -smp 2 \
> > + -monitor none -serial stdio \
> > + -nographic \
> > + -device virtio-net-pci,netdev=n0 \
> > + -netdev user,id=n0,tftp=binaries,bootfile=/pxelinux.0 \
> > + -drive file=binaries/dom0-rootfs.img,format=raw"
> > +
> > +export TEST_LOG="smoke.serial"
> > +export BOOT_MSG="Latest ChangeSet: "
> > +export LOG_MSG="Domain-0"
> > +# exit early on test failure too, check if it was success below
> > +export PASSED="$passed|$failed"
> > +
> > +./automation/scripts/console.exp | sed 's/\r\+$//'
> > +
> > +grep "$passed" smoke.serial
> > --
> > git-series 0.9.1
> >
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
On Wed, 10 Dec 2025, Marek Marczykowski-Górecki wrote:
> > > + mkfs.ext4 -d . ../domU-rootfs.img 1024M
> >
> > Do we really need 1GB? I would rather use a smaller size if possible.
> > I would rather use as few resources as possible on the build server,
> > as we might run a few of these jobs in parallel one day soon.
>
> This will be a sparse file, so it won't really use all the space. But
> this size is the upper bound of what can be put inside.
>
> That said, it's worth checking whether sparse files work properly in
> /build on all the runners. AFAIR some older docker versions had issues
> with that (was it aufs not supporting sparse files?).
I ran the same command on my local baremetal Ubuntu dev environment
(arm64) and it created a new file of the size passed on the command
line (1GB in this case). It looks like they are not sparse on my end. If
the result depends on versions and configurations, I would rather err on
the side of caution and use the smallest possible number that works.
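
One way to avoid hardcoding a magic number could be to derive the size
from the actual content (a sketch - the headroom for filesystem metadata
is a guess):

    # size the image from the content, with some headroom for fs overhead
    size_mb=$(( $(du -sm . | cut -f1) * 12 / 10 + 16 ))
    mkfs.ext4 -d . ../domU-rootfs.img "${size_mb}M"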
> > Moreover this script will be run inside a container which means this
> > data is probably in RAM.
>
> Are runners configured to use tmpfs for /build? I don't think it's the
> default.
I don't know for sure, they are just using the default. My goal was to
make our solution more reliable as defaults and configurations might
change.
> > The underlying rootfs is 25M on both ARM and x86. This should be at most
> > 50M.
>
> Rootfs itself is small, but for driver domains it needs to include
> toolstack too, and xen-tools.cpio is over 600MB (for debug build).
> I might be able to pick just the parts needed for the driver domain (xl
> with its deps, maybe some startup scripts, probably few more files), but
> it's rather fragile.
My first thought is to avoid creating a 1GB file in all cases when it
might only be needed for certain individual tests. Now, I realize that
this script might end up used only in driver domain tests, but if not, I
would say to use the smallest number depending on the test, especially
as there seems to be a huge difference, e.g. 25MB versus 600MB.

My second thought is that 600MB for just the Xen tools is way too large.
I have Alpine Linux rootfs'es with just the Xen tools installed that are
below 50MB total. I am confused about how we get to 600MB. It might be
due to QEMU and its dependencies, but still, going from 25MB to 600MB is
incredible!
> > > +"
> > > +cd ..
> > > +rm -rf rootfs
> > > +
> > > +# Dom0 rootfs
> > > +mkdir -p rootfs
> > > +cd rootfs
> > > +fakeroot -s ../fakeroot-save sh -c "
> > > + zcat ../rootfs.cpio.gz | cpio -imd
> > > + zcat ../xen-tools.cpio.gz | cpio -imd
> > > + ldconfig -r .
> > > + touch etc/.updated
> > > +"
> > > +mkdir -p root etc/local.d
> > > +cat > root/backend.cfg << EOF
> > > +name="backend"
> > > +memory=512
> > > +vcpus=1
> > > +kernel="/root/bzImage"
> > > +extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=backend"
> > > +disk=[ '/root/domU-rootfs-backend.img,raw,xvda,rw' ]
> > > +EOF
> > > +cat > root/frontend.cfg << EOF
> > > +name="frontend"
> > > +memory=512
> > > +vcpus=1
> > > +kernel="/root/bzImage"
> > > +extra="console=hvc0 root=/dev/xvda net.ifnames=0 test=frontend"
> > > +disk=[ '/root/domU-rootfs-frontend.img,raw,xvda,rw' ]
> > > +vif=[ 'bridge=xenbr0,backend=backend' ]
> > > +EOF
> > > +
> > > +cat > etc/local.d/xen.start << EOF
> > > +#!/bin/bash
> > > +
> > > +set -x
> > > +
> > > +bash /etc/init.d/xencommons start
> > > +
> > > +xl list
> > > +
> > > +tail -F /var/log/xen/console/guest-backend.log 2>/dev/null | sed -e "s/^/(backend) /" >>/dev/console &
> > > +tail -F /var/log/xen/console/guest-frontend.log 2>/dev/null | sed -e "s/^/(frontend) /" >>/dev/console &
> > > +xl -vvv create /root/backend.cfg
> > > +xl -vvv create /root/frontend.cfg
> > > +EOF
> > > +chmod +x etc/local.d/xen.start
> > > +
> > > +cp ../domU-rootfs.img ./root/domU-rootfs-backend.img
> > > +cp ../domU-rootfs.img ./root/domU-rootfs-frontend.img
> > > +cp ../bzImage ./root/
> > > +mkdir -p etc/default
> > > +echo 'XENCONSOLED_TRACE=all' >> etc/default/xencommons
> > > +mkdir -p var/log/xen/console
> > > +fakeroot -i ../fakeroot-save mkfs.ext4 -d . ../dom0-rootfs.img 2048M
> >
> > Same here. Also 2GB might not be sufficient to contain 2 copies of
> > domU-rootfs.img, given that domU-rootfs.img is 1GB.
>
> See the note about sparse files.
I double checked and they don't appear to be sparse on my system.
> > If we bring down domU-rootfs.img to 50M, then this could be 150M.
> >
> >
On Wed, Dec 10, 2025 at 12:28:19PM -0800, Stefano Stabellini wrote:
> > > > + mkfs.ext4 -d . ../domU-rootfs.img 1024M
> > >
> > > Do we really need 1GB? I would rather use a smaller size if possible.
> > > I would rather use as few resources as possible on the build server,
> > > as we might run a few of these jobs in parallel one day soon.
> >
> > This will be a sparse file, so it won't really use all the space. But
> > this size is the upper bound of what can be put inside.
> >
> > That said, it's worth checking whether sparse files work properly in
> > /build on all the runners. AFAIR some older docker versions had issues
> > with that (was it aufs not supporting sparse files?).
>
> I ran the same command on my local baremetal Ubuntu dev environment
> (arm64) and it created a new file of the size passed on the command
> line (1GB in this case). It looks like they are not sparse on my end. If
> the result depends on versions and configurations, I would rather err on
> the side of caution and use the smallest possible number that works.
Hm, interesting. What filesystem is that on?
On my side it's definitely sparse (ext4):
[user@disp8129 Downloads]$ du -sch
12K .
12K total
[user@disp8129 Downloads]$ mkfs.ext4 -d . ../domU-rootfs.img 1024M
mke2fs 1.47.2 (1-Jan-2025)
Creating regular file ../domU-rootfs.img
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: f50a5dfe-4dcf-4f3e-82d0-3dc54a788ab0
Superblock backups stored on blocks:
32768, 98304, 163840, 229376
Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Copying files into the device: done
Writing superblocks and filesystem accounting information: done
[user@disp8129 Downloads]$ ls -lhs ../domU-rootfs.img
33M -rw-r--r--. 1 user user 1.0G Dec 10 21:45 ../domU-rootfs.img
>
> My first thought is to avoid creating a 1GB file in all cases when it
> might only be needed for certain individual tests. Now, I realize that
> this script might end up used only in driver domain tests, but if not,

Indeed, this script is specifically about the driver domains test.

> I would say to use the smallest number depending on the test, especially
> as there seems to be a huge difference, e.g. 25MB versus 600MB.
>
> My second thought is that 600MB for just the Xen tools is way too large.
> I have Alpine Linux rootfs'es with just the Xen tools installed that are
> below 50MB total. I am confused about how we get to 600MB. It might be
> due to QEMU and its dependencies, but still, going from 25MB to 600MB is
> incredible!
Indeed, it's mostly about QEMU (its main binary itself takes 55MB),
including all the bundled firmware etc. (the various flavors of edk2
alone take 270MB). There is also usr/lib/debug, which takes 85MB.
But then, usr/lib/libxen* combined takes almost 50MB.

OTOH, a non-debug xen-tools.cpio takes "just" 130MB.
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
On Wed, 10 Dec 2025, Marek Marczykowski-Górecki wrote:
> On my side it's definitely sparse (ext4):
>
> [user@disp8129 Downloads]$ ls -lhs ../domU-rootfs.img
> 33M -rw-r--r--. 1 user user 1.0G Dec 10 21:45 ../domU-rootfs.img

I went and checked two of the runners, one ARM and one x86, and it looks
like they support sparse files both outside and inside containers. They
should all have the same configuration, so I think we can assume they
support sparse files appropriately.

So it looks like it is OK. But please could you add an in-code comment
to highlight that the created file will be sparse?

> OTOH, a non-debug xen-tools.cpio takes "just" 130MB.

Can we use the non-debug xen-tools.cpio,
and also can we remove all the
bundled firmware? Do we really need EDK2, for instance?

I don't think it is worth doing an in-detail analysis of which binaries
to keep and which to remove, but at least removing the unnecessary
in-guest firmware and ideally choosing a non-debug build sounds
reasonable?
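
The requested comment could be as small as this (a sketch):

    # Note: the image is created as a sparse file - the size given below
    # is an upper bound on the content, not actual disk usage.
    mkfs.ext4 -d . ../domU-rootfs.img 1024M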
On Wed, Dec 10, 2025 at 01:58:44PM -0800, Stefano Stabellini wrote:
> So it looks like it is OK. But please could you add an in-code comment
> to highlight that the created file will be sparse?

Sure.

> Can we use the non-debug xen-tools.cpio,

I can use the non-debug one. While a debug build of the hypervisor
changes quite a lot in terms of test output details, the purpose of this
test is mostly to exercise the toolstack and the frontend drivers - and
there a debug build doesn't change much.

> and also can we remove all the
> bundled firmware? Do we really need EDK2, for instance?
>
> I don't think it is worth doing an in-detail analysis of which binaries
> to keep and which to remove, but at least removing the unnecessary
> in-guest firmware and ideally choosing a non-debug build sounds
> reasonable?

Excluding QEMU _for now_ makes sense. But there might be a day when we'd
like to test QEMU backends in a driver domain and/or a domU booted via
UEFI (IIUC such a configuration has a PV frontend in EDK2 - at least for
the disk - and it makes sense to test whether that works with driver
domains).

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
On Thu, 11 Dec 2025, Marek Marczykowski-Górecki wrote:
> Excluding QEMU _for now_ makes sense. But there might be a day when we'd
> like to test QEMU backends in a driver domain and/or a domU booted via
> UEFI (IIUC such a configuration has a PV frontend in EDK2 - at least for
> the disk - and it makes sense to test whether that works with driver
> domains).

OK, in that case, let's go with excluding QEMU and EDK2. While there
might be cases in the future where one or both are needed, I don't think
it is a good idea to increase the rootfs size for all tests, including
the ones where they are not needed.
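
For what it's worth, a minimal sketch of what the exclusion could look
like in the script (the patterns are assumptions about how QEMU and the
firmware are named inside xen-tools.cpio.gz, and GNU cpio's
--nonmatching option is assumed to be available):

    # extract the toolstack, skipping QEMU binaries and bundled firmware
    zcat ../xen-tools.cpio.gz | cpio -imd --nonmatching '*qemu*' '*edk2*'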