From: James Le Cuirot
To: x86@kernel.org
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar, Borislav
 Petkov, Dave Hansen, "H . Peter Anvin", Ard Biesheuvel, James Le Cuirot
Subject: [PATCH] x86: fix oops caused by old EFI info on kexec boot
Date: Wed, 26 Nov 2025 17:32:10 +0000
Message-ID: <20251126173209.374755-2-chewi@gentoo.org>

kexec on x86 passes initrd details via the boot_params. If no initrd is
supplied, then ramdisk_size is 0. When determining whether to reserve
memory for the initrd on the subsequent boot, ramdisk_size being 0
causes the logic to fall back to phys_initrd_start and phys_initrd_size
set from the EFI tables in efi.c. This is stale information from the
initial boot.

The system continues to boot and has even been seen to function under
heavy load for days, but allocating very large amounts of memory
reliably triggers an oops rather than the OOM killer.

BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: Oops: 0002 [#1] SMP NOPTI

This issue was introduced in f4dc7fffa9873db50ec25624572f8217a6225de8
when the EFI stub initrd loading was unified between architectures.
Avoid the issue by checking that the bootloader is not kexec before
falling back to the EFI table values.

I strongly suspect this also affects other architectures. A different
fix would be required there, and I do have a fix in mind, but I was
unable to reproduce the issue under QEMU's aarch64 virt machine. I think
this is at least partly because it relies on ACPI while kexec passes the
initrd details via the device tree.

Signed-off-by: James Le Cuirot
---
 arch/x86/kernel/setup.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1b2edd07a3e1..8aa65daf121f 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -300,7 +300,8 @@ static u64 __init get_ramdisk_image(void)
 
 	ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
 
-	if (ramdisk_image == 0)
+	/* Don't fall back for kexec as phys_initrd_start will be stale */
+	if (ramdisk_image == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
 		ramdisk_image = phys_initrd_start;
 
 	return ramdisk_image;
@@ -311,7 +312,8 @@ static u64 __init get_ramdisk_size(void)
 
 	ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
 
-	if (ramdisk_size == 0)
+	/* Don't fall back for kexec as phys_initrd_start will be stale */
+	if (ramdisk_size == 0 && (boot_params.hdr.type_of_loader >> 4) != 0xD)
 		ramdisk_size = phys_initrd_size;
 
 	return ramdisk_size;
-- 
2.51.2
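
Illustration only, not part of the patch: a minimal userspace sketch of the
condition the patch adds to get_ramdisk_image(). It is not the kernel code.
The struct model_boot_params, LOADER_ID_KEXEC, loaded_by_kexec() and
model_get_ramdisk_image() names are hypothetical stand-ins, and
phys_initrd_start is passed in as a parameter rather than read from a global.
The sketch assumes, as the patch does, that the loader ID sits in the high
nibble of type_of_loader and that kexec identifies itself as 0xD.

```c
#include <stdint.h>

/* Hypothetical, cut-down stand-in for the boot_params fields the patch reads. */
struct model_boot_params {
	uint8_t  type_of_loader;     /* high nibble: loader ID, low nibble: version */
	uint32_t ramdisk_image;      /* low 32 bits of the initrd address */
	uint32_t ext_ramdisk_image;  /* high 32 bits of the initrd address */
};

/* Loader ID that the patch compares against (assumed to mean kexec). */
#define LOADER_ID_KEXEC 0xD

/* Returns non-zero when the high nibble of type_of_loader names kexec. */
static int loaded_by_kexec(uint8_t type_of_loader)
{
	return (type_of_loader >> 4) == LOADER_ID_KEXEC;
}

/*
 * Models the patched fallback: use the EFI-derived address only when the
 * boot_params carry no initrd address AND the loader was not kexec.
 */
static uint64_t model_get_ramdisk_image(const struct model_boot_params *bp,
					uint64_t phys_initrd_start)
{
	uint64_t ramdisk_image = bp->ramdisk_image;

	ramdisk_image |= (uint64_t)bp->ext_ramdisk_image << 32;

	if (ramdisk_image == 0 && !loaded_by_kexec(bp->type_of_loader))
		ramdisk_image = phys_initrd_start;

	return ramdisk_image;
}
```

With type_of_loader set to 0xD0 and both address fields zero, the model
returns 0 regardless of the phys_initrd_start argument, which is the
behaviour the patch wants for a kexec boot without an initrd. Any other
loader ID with zero address fields still falls back to phys_initrd_start.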