From nobody Thu Apr 2 15:41:18 2026
Date: Tue, 17 Mar 2026 15:15:34 +0100
In-Reply-To: <20260317141534.815634-1-mclapinski@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Mime-Version: 1.0
References: <20260317141534.815634-1-mclapinski@google.com>
X-Mailer: git-send-email 2.53.0.851.ga537e3e6e9-goog
Message-ID: <20260317141534.815634-4-mclapinski@google.com>
Subject: [PATCH v7 3/3] kho: make preserved pages compatible with deferred struct page init
From: Michal Clapinski
To: Evangelos Petrongonas, Pasha Tatashin, Mike Rapoport, Pratyush Yadav, Alexander Graf, Samiullah Khawaja, kexec@lists.infradead.org, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Michal Clapinski
Content-Type: text/plain; charset="utf-8"

From: Evangelos Petrongonas

When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, struct page
initialization is deferred to parallel kthreads that run later in the
boot process. During KHO restoration, kho_preserved_memory_reserve()
writes metadata for each preserved memory region. However, if the
struct page has not been initialized yet, this write targets
uninitialized memory, potentially leading to errors like:

  BUG: unable to handle page fault for address: ...

Fix this by introducing kho_get_preserved_page(), which ensures all
struct pages in a preserved region are initialized by calling
init_deferred_page(), which is a no-op when the struct page is already
initialized.

Signed-off-by: Evangelos Petrongonas
Co-developed-by: Michal Clapinski
Signed-off-by: Michal Clapinski
Reviewed-by: Pratyush Yadav (Google)
Reviewed-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
---

I don't think we can initialize those struct pages in
kho_restore_page(). I encountered this stack:

  page_zone(start_page)
  __pageblock_pfn_to_page
  set_zone_contiguous
  page_alloc_init_late

So, by the end of page_alloc_init_late(), struct pages are expected to
already be initialized. set_zone_contiguous() looks at the first and
last struct page of each pageblock in each populated zone to figure out
whether the zone is contiguous. If a KHO page lands on a pageblock
boundary, this leads to an access of an uninitialized struct page.
There is also page_ext_init, which invokes pfn_to_nid, which calls
page_to_nid for each section-aligned page. There might be other places
that do something similar. Therefore, it's a good idea to initialize all
struct pages by the end of deferred struct page init. That's why I'm
resending Evangelos's patch.

I also tried to implement Pratyush's idea, i.e. iterate over zones, then
get the node from the zone. I didn't notice any performance difference
even with 8GB of KHO.
---
 kernel/liveupdate/Kconfig          |  2 --
 kernel/liveupdate/kexec_handover.c | 27 ++++++++++++++++++++++++++-
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
index 1a8513f16ef7..c13af38ba23a 100644
--- a/kernel/liveupdate/Kconfig
+++ b/kernel/liveupdate/Kconfig
@@ -1,12 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 menu "Live Update and Kexec HandOver"
-	depends on !DEFERRED_STRUCT_PAGE_INIT
 
 config KEXEC_HANDOVER
 	bool "kexec handover"
 	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
-	depends on !DEFERRED_STRUCT_PAGE_INIT
 	select MEMBLOCK_KHO_SCRATCH
 	select KEXEC_FILE
 	select LIBFDT
diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index e511a50fab9c..b49ebcd0b946 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -471,6 +471,31 @@ struct page *kho_restore_pages(phys_addr_t phys, unsigned long nr_pages)
 }
 EXPORT_SYMBOL_GPL(kho_restore_pages);
 
+/*
+ * With CONFIG_DEFERRED_STRUCT_PAGE_INIT, struct pages in higher memory regions
+ * may not be initialized yet at the time KHO deserializes preserved memory.
+ * KHO uses the struct page to store metadata and a later initialization would
+ * overwrite it.
+ * Ensure all the struct pages in the preservation are
+ * initialized. kho_preserved_memory_reserve() marks the reservation as noinit
+ * to make sure they don't get re-initialized later.
+ */
+static struct page *__init kho_get_preserved_page(phys_addr_t phys,
+						  unsigned int order)
+{
+	unsigned long pfn = PHYS_PFN(phys);
+	int nid;
+
+	if (!IS_ENABLED(CONFIG_DEFERRED_STRUCT_PAGE_INIT))
+		return pfn_to_page(pfn);
+
+	nid = early_pfn_to_nid(pfn);
+	for (unsigned long i = 0; i < (1UL << order); i++)
+		init_deferred_page(pfn + i, nid);
+
+	return pfn_to_page(pfn);
+}
+
 static int __init kho_preserved_memory_reserve(phys_addr_t phys,
 					       unsigned int order)
 {
@@ -479,7 +504,7 @@ static int __init kho_preserved_memory_reserve(phys_addr_t phys,
 	u64 sz;
 
 	sz = 1 << (order + PAGE_SHIFT);
-	page = phys_to_page(phys);
+	page = kho_get_preserved_page(phys, order);
 
 	/* Reserve the memory preserved in KHO in memblock */
 	memblock_reserve(phys, sz);
-- 
2.53.0.851.ga537e3e6e9-goog