Use LZMA2 options that match the arch-specific alignment of instructions.
This change reduces compressed kernel size by 0-2 % depending on the arch.
On 1-byte-aligned x86 it makes no difference and on 4-byte-aligned archs
it helps the most.
Use the ARM-Thumb filter for ARM-Thumb2 kernels. This reduces compressed
kernel size about 5 %.[1] Previously such kernels were compressed using
the ARM filter which didn't do anything useful with ARM-Thumb2 code.
Add BCJ filter support for ARM64 and RISC-V. On ARM64 the compressed
kernel size is reduced about 5 % and on RISC-V by 7-8 % compared to
unfiltered XZ or plain LZMA. However:
- arch/arm64/boot/Makefile and arch/riscv/boot/Makefile don't include
the build rule (two lines) for XZ support even though they support
six other compressors. It would be trivial to add the rule but boot
loaders would need XZ support too.
- A new enough version of the xz tool is required: 5.4.0 for ARM64 and
5.6.0 for RISC-V. With an old xz version a message is printed to
standard error and the kernel is compressed without the filter.
Update lib/decompress_unxz.c to match the changes to xz_wrap.sh.
Update the CONFIG_KERNEL_XZ help text in init/Kconfig:
- Add the RISC-V and ARM64 filters.
- Clarify that the PowerPC filter is for big endian only.
- Omit IA-64.
Link: https://lore.kernel.org/lkml/1637379771-39449-1-git-send-email-zhongjubin@huawei.com/ [1]
Reviewed-by: Jia Tan <jiat0218@gmail.com>
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
---
init/Kconfig | 5 +-
lib/decompress_unxz.c | 14 ++++-
scripts/xz_wrap.sh | 141 ++++++++++++++++++++++++++++++++++++++++--
3 files changed, 151 insertions(+), 9 deletions(-)
diff --git a/init/Kconfig b/init/Kconfig
index f3ea5dea9c85..785e15aa5395 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -308,8 +308,9 @@ config KERNEL_XZ
BCJ filters which can improve compression ratio of executable
code. The size of the kernel is about 30% smaller with XZ in
comparison to gzip. On architectures for which there is a BCJ
- filter (i386, x86_64, ARM, IA-64, PowerPC, and SPARC), XZ
- will create a few percent smaller kernel than plain LZMA.
+ filter (i386, x86_64, ARM, ARM64, RISC-V, big endian PowerPC,
+ and SPARC), XZ will create a few percent smaller kernel than
+ plain LZMA.
The speed is about the same as with LZMA: The decompression
speed of XZ is better than that of bzip2 but worse than gzip
diff --git a/lib/decompress_unxz.c b/lib/decompress_unxz.c
index 46aa3be13fc5..cae00395d7a6 100644
--- a/lib/decompress_unxz.c
+++ b/lib/decompress_unxz.c
@@ -126,11 +126,21 @@
#ifdef CONFIG_X86
# define XZ_DEC_X86
#endif
-#ifdef CONFIG_PPC
+#if defined(CONFIG_PPC) && defined(CONFIG_CPU_BIG_ENDIAN)
# define XZ_DEC_POWERPC
#endif
#ifdef CONFIG_ARM
-# define XZ_DEC_ARM
+# ifdef CONFIG_THUMB2_KERNEL
+# define XZ_DEC_ARMTHUMB
+# else
+# define XZ_DEC_ARM
+# endif
+#endif
+#ifdef CONFIG_ARM64
+# define XZ_DEC_ARM64
+#endif
+#ifdef CONFIG_RISCV
+# define XZ_DEC_RISCV
#endif
#ifdef CONFIG_SPARC
# define XZ_DEC_SPARC
diff --git a/scripts/xz_wrap.sh b/scripts/xz_wrap.sh
index c8c36441ab70..5bdf0c35cc85 100755
--- a/scripts/xz_wrap.sh
+++ b/scripts/xz_wrap.sh
@@ -6,14 +6,145 @@
#
# Author: Lasse Collin <lasse.collin@tukaani.org>
+# This has specialized settings for the following archs. However,
+# XZ-compressed kernel isn't currently supported on every listed arch.
+#
+# Arch Align Notes
+# arm 2/4 ARM and ARM-Thumb2
+# arm64 4
+# csky 2
+# loongarch 4
+# mips 2/4 MicroMIPS is 2-byte aligned
+# parisc 4
+# powerpc 4 Uses its own wrapper for compressors instead of this.
+# riscv 2/4
+# s390 2
+# sh 2
+# sparc 4
+# x86 1
+
+# A few archs use 2-byte or 4-byte aligned instructions depending on
+# the kernel config. This function is used to check if the relevant
+# config option is set to "y".
+is_enabled()
+{
+ grep -q "^$1=y$" include/config/auto.conf
+}
+
+# Set XZ_VERSION (and LIBLZMA_VERSION). This is needed to disable features
+# that aren't available in old XZ Utils versions.
+eval "$($XZ --robot --version)" || exit
+
+# Assume that no BCJ filter is available.
BCJ=
-LZMA2OPTS=
+# Set the instruction alignment to 1, 2, or 4 bytes.
+#
+# Set the BCJ filter if one is available.
+# It must match the #ifdef usage in lib/decompress_unxz.c.
case $SRCARCH in
- x86) BCJ=--x86 ;;
- powerpc) BCJ=--powerpc ;;
- arm) BCJ=--arm ;;
- sparc) BCJ=--sparc ;;
+ arm)
+ if is_enabled CONFIG_THUMB2_KERNEL; then
+ ALIGN=2
+ BCJ=--armthumb
+ else
+ ALIGN=4
+ BCJ=--arm
+ fi
+ ;;
+
+ arm64)
+ ALIGN=4
+
+ # ARM64 filter was added in XZ Utils 5.4.0.
+ if [ "$XZ_VERSION" -ge 50040002 ]; then
+ BCJ=--arm64
+ else
+ echo "$0: Upgrading to xz >= 5.4.0" \
+ "would enable the ARM64 filter" \
+ "for better compression" >&2
+ fi
+ ;;
+
+ csky)
+ ALIGN=2
+ ;;
+
+ loongarch)
+ ALIGN=4
+ ;;
+
+ mips)
+ if is_enabled CONFIG_CPU_MICROMIPS; then
+ ALIGN=2
+ else
+ ALIGN=4
+ fi
+ ;;
+
+ parisc)
+ ALIGN=4
+ ;;
+
+ powerpc)
+ ALIGN=4
+
+ # The filter is only for big endian instruction encoding.
+ if is_enabled CONFIG_CPU_BIG_ENDIAN; then
+ BCJ=--powerpc
+ fi
+ ;;
+
+ riscv)
+ if is_enabled CONFIG_RISCV_ISA_C; then
+ ALIGN=2
+ else
+ ALIGN=4
+ fi
+
+ # RISC-V filter was added in XZ Utils 5.6.0.
+ if [ "$XZ_VERSION" -ge 50060002 ]; then
+ BCJ=--riscv
+ else
+ echo "$0: Upgrading to xz >= 5.6.0" \
+ "would enable the RISC-V filter" \
+ "for better compression" >&2
+ fi
+ ;;
+
+ s390)
+ ALIGN=2
+ ;;
+
+ sh)
+ ALIGN=2
+ ;;
+
+ sparc)
+ ALIGN=4
+ BCJ=--sparc
+ ;;
+
+ x86)
+ ALIGN=1
+ BCJ=--x86
+ ;;
+
+ *)
+ echo "$0: Arch-specific tuning is missing for '$SRCARCH'" >&2
+
+ # Guess 2-byte-aligned instructions. Guessing too low
+ # should hurt less than guessing too high.
+ ALIGN=2
+ ;;
+esac
+
+# Select the LZMA2 options matching the instruction alignment.
+case $ALIGN in
+ 1) LZMA2OPTS= ;;
+ 2) LZMA2OPTS=lp=1 ;;
+ 4) LZMA2OPTS=lp=2,lc=2 ;;
+ *) echo "$0: ALIGN wrong or missing" >&2; exit 1 ;;
esac
# Use single-threaded mode because it compresses a little better
--
2.44.0
Under the light of the recent xz backdoor, I should note that this
patch (patch 11) does:

> +# Set XZ_VERSION (and LIBLZMA_VERSION). This is needed to disable features
> +# that aren't available in old XZ Utils versions.
> +eval "$($XZ --robot --version)" || exit
> +

in order to do

> + arm64)
> + ALIGN=4
> +
> + # ARM64 filter was added in XZ Utils 5.4.0.
> + if [ "$XZ_VERSION" -ge 50040002 ]; then
> + BCJ=--arm64
> + else
> + echo "$0: Upgrading to xz >= 5.4.0" \
> + "would enable the ARM64 filter" \
> + "for better compression" >&2
> + fi
> + ;;

and

> + # RISC-V filter was added in XZ Utils 5.6.0.
> + if [ "$XZ_VERSION" -ge 50060002 ]; then
> + BCJ=--riscv
> + else
> + echo "$0: Upgrading to xz >= 5.6.0" \
> + "would enable the RISC-V filter" \
> + "for better compression" >&2
> + fi
>

which was noted on Hacker News as a potential gadget of exploitation[1].
Thanks Vegard for bringing it up[2].

A compromised $XZ could modify the build files directly in C, or even
produce a file that decompresses into a kernel with added evil
instructions, at a quite near level to Reflections on Trusting Trust.
Nonetheless, execution of high level shell script would probably be more
useful for an attacker that has to surreptitiously include their
backdoor, as it would only require a few bytes (e.g. a sed call) when
compared to coding that in C.

So, in the spirit of keeping a fair amount of paranoia, and since it
doesn't do any harm, any such code should be failproofed to ensure it
can only import the expected shell variables with the right format[3]:

eval "$($XZ --robot --version | grep '^\(XZ\|LIBLZMA\)_VERSION=[0-9]*$')" || exit

Regards

[1] https://news.ycombinator.com/item?id=39869715
[2] https://www.openwall.com/lists/oss-security/2024/03/30/11
[3] Actually, LIBLZMA_VERSION isn't used, only XZ_VERSION. Being generous
    and accepting that one as well. :)
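
As a rough illustration of what the grep filter suggested above would let
through, the sketch below feeds it made-up output. The fake_xz function
and its injected third line are hypothetical stand-ins for a compromised
$XZ, not anything a real xz prints. Note that \| in a basic regular
expression is a GNU extension, so this assumes GNU grep or a compatible
implementation.

```shell
# Hypothetical stand-in for a compromised "$XZ --robot --version".
fake_xz() {
	printf '%s\n' \
		'XZ_VERSION=50040002' \
		'LIBLZMA_VERSION=50040002' \
		'echo injected >&2'
}

# Only lines of the exact form NAME=digits survive the filter, so the
# injected command never reaches eval.
filtered=$(fake_xz | grep '^\(XZ\|LIBLZMA\)_VERSION=[0-9]*$')
printf '%s\n' "$filtered"

eval "$filtered"
echo "XZ_VERSION is $XZ_VERSION"
```

The filter narrows what eval can see, but it does not by itself detect a
failing $XZ.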
On 2024-03-31 angel.lkml@16bits.net wrote:
> Under the light of the recent xz backdoor, I should note that this
> patch (patch 11) does:
>
> > +# Set XZ_VERSION (and LIBLZMA_VERSION). This is needed to disable features
> > +# that aren't available in old XZ Utils versions.
> > +eval "$($XZ --robot --version)" || exit

The eval method has been on the xz man page for a very long time but I
agree that due to the recent events the above method is not ideal. It
can break also if XZ_OPT or XZ_DEFAULTS contains something that they
usually shouldn't. For example, XZ_OPT=--help would make the above eval
method run the output of $XZ --help.

> So, in the spirit of keeping a fair amount of paranoia, and since it
> doesn't do any harm, any such code should be failproofed to ensure it
> can only import the expected shell variables with the right format[3]:
>
> eval "$($XZ --robot --version | grep '^\(XZ\|LIBLZMA\)_VERSION=[0-9]*$')" || exit

I would rather get rid of eval. I committed the following to the
upstream repository:

XZ_VERSION=$($XZ --robot --version | sed -n 's/^XZ_VERSION=//p') || exit

Thanks!

--
Lasse Collin
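
For readers unfamiliar with the number being compared, here is a minimal
sketch of the sed extraction applied to sample text, followed by a
decoding of the value. The sample string is made up for illustration; the
XYYYZZZS layout (major, minor, patch, stability) is the encoding
described in the xz man page, where a stability digit of 2 means a stable
release.

```shell
# Sample text in the format printed by "xz --robot --version".
sample='XZ_VERSION=50040002
LIBLZMA_VERSION=50040002'

# Keep only the value of XZ_VERSION, as the sed command above does.
XZ_VERSION=$(printf '%s\n' "$sample" | sed -n 's/^XZ_VERSION=//p')

# Decode XYYYZZZS into its parts.
major=$((XZ_VERSION / 10000000))
minor=$((XZ_VERSION / 10000 % 1000))
patch=$((XZ_VERSION / 10 % 1000))
stability=$((XZ_VERSION % 10))
echo "xz $major.$minor.$patch (stability $stability)"
```

This is why the patch compares against 50040002 for 5.4.0 and 50060002
for 5.6.0.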
On 2024-04-03 Lasse Collin wrote:
> On 2024-03-31 angel.lkml@16bits.net wrote:
> > So, in the spirit of keeping a fair amount of paranoia, and since it
> > doesn't do any harm, any such code should be failproofed to ensure
> > it can only import the expected shell variables with the right
> > format[3]:
> >
> > eval "$($XZ --robot --version | grep
> > '^\(XZ\|LIBLZMA\)_VERSION=[0-9]*$')" || exit
>
> I would rather get rid of eval. I committed the following to the
> upstream repository:
>
> XZ_VERSION=$($XZ --robot --version | sed -n 's/^XZ_VERSION=//p') ||
> exit
Both my new version and the suggested eval+grep version have error
detection issues:
- With the eval+grep version, if there are no matches, eval gets an
empty string as an argument in which case eval's exit status is
zero and "exit" won't be run. Exit status from $XZ is ignored.
XZ_VERSION won't be set or it might be inherited from the
environment.
- With $XZ ... | sed ..., the exit status of $XZ is ignored. sed
will exit with 0 and thus "exit" won't be run even if $XZ fails.
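
Both issues can be reproduced with a small sketch. The failing_xz
function is a hypothetical stand-in for an xz binary that exits non-zero
without printing anything, and the branch variables stand in for the
"|| exit" of the real script. This assumes the shell's pipefail option is
off, which is the default.

```shell
# Hypothetical stand-in for an xz binary that fails without output.
failing_xz() {
	return 1
}

# eval+grep: grep matches nothing, eval receives an empty string and
# returns 0, so the error branch is not taken.
eval_branch=not-taken
eval "$(failing_xz | grep '^XZ_VERSION=[0-9]*$')" || eval_branch=taken

# xz | sed: the pipeline's status is that of sed, which exits with 0,
# so the error branch is not taken here either.
sed_branch=not-taken
XZ_VERSION=$(failing_xz | sed -n 's/^XZ_VERSION=//p') || sed_branch=taken

echo "eval+grep: error branch $eval_branch"
echo "xz | sed:  error branch $sed_branch, XZ_VERSION='$XZ_VERSION'"
```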
Upstream I changed to this:
XZ_VERSION=$($XZ --robot --version) || exit
XZ_VERSION=$(printf '%s\n' "$XZ_VERSION" | sed -n 's/^XZ_VERSION=//p')
If output from $XZ is weird, XZ_VERSION might still become weird too.
But the way the variable is used later should at worst result in
"integer expression expected" error message.
I think the above is a good enough balance for a shell script like
this.
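
A minimal sketch of how the two-step version behaves, under stated
assumptions: good_xz and bad_xz are hypothetical stand-ins for $XZ, and
get_xz_version is a name made up here so that the failure can be observed
as a return status instead of terminating the shell with "exit".

```shell
# Hypothetical stand-ins for a working and a failing xz binary.
good_xz() { printf '%s\n' 'XZ_VERSION=50060002' 'LIBLZMA_VERSION=50060002'; }
bad_xz() { return 1; }

# Two-step extraction: the first assignment carries the exit status of
# the command itself, the second only parses the captured text.
get_xz_version() {
	XZ_VERSION=$("$1" --robot --version) || return
	XZ_VERSION=$(printf '%s\n' "$XZ_VERSION" | sed -n 's/^XZ_VERSION=//p')
}

get_xz_version good_xz
good_version=$XZ_VERSION
echo "good: XZ_VERSION=$good_version"

bad_status=0
get_xz_version bad_xz || bad_status=$?
echo "bad: status $bad_status"

# A non-numeric value makes the later comparison fail with an error
# message (hidden here), so the filter is simply left disabled.
XZ_VERSION='weird value'
if [ "$XZ_VERSION" -ge 50060002 ] 2>/dev/null; then
	weird_result=filter-enabled
else
	weird_result=filter-disabled
fi
echo "weird: $weird_result"
```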
--
Lasse Collin