From nobody Wed Dec 17 08:52:13 2025
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual,
    Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik,
    linux-s390, LKML
Subject: [PATCH v4 1/5] mm/memory_hotplug: introduce MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers
Date: Tue, 28 Nov 2023 12:03:38 +0100
Message-Id: <20231128110342.102096-2-sumanthk@linux.ibm.com>
In-Reply-To: <20231128110342.102096-1-sumanthk@linux.ibm.com>
References: <20231128110342.102096-1-sumanthk@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE memory notifiers to
prepare the transition of memory to and from a physically accessible
state. This enhancement is crucial for implementing the "memmap on
memory" feature for s390 in a subsequent patch.

Platforms such as x86 can support physical memory hotplug via ACPI.
When there is physical memory hotplug, an ACPI event leads to the memory
addition with the following call chain:
acpi_memory_device_add()
  -> acpi_memory_enable_device()
     -> __add_memory()

After this, the hotplugged memory is physically accessible and altmap
support is prepared before the "memmap on memory" initialization in
memory_block_online() is called.

On s390, memory hotplug works in a different way. The available hotplug
memory has to be defined upfront in the hypervisor, but it is made
physically accessible only when the user sets it online via sysfs,
currently in the MEM_GOING_ONLINE notifier. This is too late, because
the "memmap on memory" initialization is performed before the
MEM_GOING_ONLINE notifier is called.

During the memory hotplug addition phase, altmap support is prepared.
During the memory onlining phase, s390 requires the memory to be made
physically accessible first and only then initiates the "memmap on
memory" initialization process.

The memory provider will handle the new MEM_PREPARE_ONLINE /
MEM_FINISH_OFFLINE notifications and make the memory accessible.

The mhp_flag MHP_OFFLINE_INACCESSIBLE is introduced and is relevant when
used along with MHP_MEMMAP_ON_MEMORY, because the altmap cannot be
written (e.g., poisoned) when adding memory -- before it is set online.
This allows for adding memory with an altmap that is not currently made
available by a hypervisor. When onlining that memory, the hypervisor can
be instructed to make that memory accessible via the new notifiers, and
the onlining phase will not require any memory allocations, which is
helpful in low-memory situations.

All architectures ignore unknown memory notifiers. Therefore, the
introduction of these new notifiers does not result in any functional
modifications across architectures.

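As a rough illustration of the sequencing described above, the following userspace C sketch models how a memory provider might react to the two new notifications and how a block is split into an altmap range and a remaining range. It is not kernel code: `provider_notifier()`, `fill_notify()`, `model_block_online()` and the `block_accessible` flag are hypothetical stand-ins invented for this example. Only the two notifier values and the four `memory_notify` fields are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Notifier values as defined by the patch in include/linux/memory.h. */
#define MEM_PREPARE_ONLINE	(1 << 6)
#define MEM_FINISH_OFFLINE	(1 << 7)

/* Reduced copy of struct memory_notify: only the fields used here. */
struct memory_notify {
	unsigned long altmap_start_pfn;
	unsigned long altmap_nr_pages;
	unsigned long start_pfn;
	unsigned long nr_pages;
};

/* Hypothetical provider state: has the hypervisor backed the block? */
static bool block_accessible;

/* Hypothetical provider callback; unknown actions are ignored. */
static int provider_notifier(unsigned long action,
			     const struct memory_notify *arg)
{
	(void)arg;	/* a real provider would use the pfn ranges */

	switch (action) {
	case MEM_PREPARE_ONLINE:
		block_accessible = true;	/* make memory accessible */
		break;
	case MEM_FINISH_OFFLINE:
		block_accessible = false;	/* give memory back */
		break;
	default:
		break;
	}
	return 0;
}

/* Split a block into the altmap range and the remaining page range. */
static void fill_notify(struct memory_notify *arg, unsigned long start_pfn,
			unsigned long nr_pages, unsigned long nr_vmemmap_pages)
{
	assert(nr_vmemmap_pages <= nr_pages);
	arg->altmap_start_pfn = start_pfn;
	arg->altmap_nr_pages = nr_vmemmap_pages;
	arg->start_pfn = start_pfn + nr_vmemmap_pages;
	arg->nr_pages = nr_pages - nr_vmemmap_pages;
}

/*
 * Model of the online path: notify first, then touch the altmap range.
 * Returns 0 on success, or -1 if the memory map would have been written
 * while the block was still inaccessible.
 */
static int model_block_online(unsigned long start_pfn, unsigned long nr_pages,
			      unsigned long nr_vmemmap_pages)
{
	struct memory_notify arg;
	int ret;

	fill_notify(&arg, start_pfn, nr_pages, nr_vmemmap_pages);
	ret = provider_notifier(MEM_PREPARE_ONLINE, &arg);
	if (ret)
		return ret;

	/* The memory map initialization would write the altmap range here. */
	if (nr_vmemmap_pages && !block_accessible) {
		provider_notifier(MEM_FINISH_OFFLINE, &arg);
		return -1;
	}
	return 0;
}
```

In this model the provider is notified before anything writes to the altmap range, which is the ordering that the MEM_GOING_ONLINE notifier cannot provide.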
Suggested-by: Gerald Schaefer
Suggested-by: David Hildenbrand
Signed-off-by: Sumanth Korikkar
Acked-by: David Hildenbrand
---
 drivers/base/memory.c          | 23 ++++++++++++++++++++++-
 include/linux/memory.h         |  9 +++++++++
 include/linux/memory_hotplug.h | 18 +++++++++++++++++-
 include/linux/memremap.h       |  1 +
 mm/memory_hotplug.c            | 13 ++++++++++++-
 mm/sparse.c                    |  3 ++-
 6 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 8a13babd826c..b99bcc70d6e5 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -188,6 +188,7 @@ static int memory_block_online(struct memory_block *mem)
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long nr_vmemmap_pages = 0;
+	struct memory_notify arg;
 	struct zone *zone;
 	int ret;
 
@@ -207,9 +208,19 @@ static int memory_block_online(struct memory_block *mem)
 	if (mem->altmap)
 		nr_vmemmap_pages = mem->altmap->free;
 
+	arg.altmap_start_pfn = start_pfn;
+	arg.altmap_nr_pages = nr_vmemmap_pages;
+	arg.start_pfn = start_pfn + nr_vmemmap_pages;
+	arg.nr_pages = nr_pages - nr_vmemmap_pages;
 	mem_hotplug_begin();
+	ret = memory_notify(MEM_PREPARE_ONLINE, &arg);
+	ret = notifier_to_errno(ret);
+	if (ret)
+		goto out_notifier;
+
 	if (nr_vmemmap_pages) {
-		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone);
+		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages,
+						zone, mem->altmap->inaccessible);
 		if (ret)
 			goto out;
 	}
@@ -231,7 +242,11 @@ static int memory_block_online(struct memory_block *mem)
 					  nr_vmemmap_pages);
 
 	mem->zone = zone;
+	mem_hotplug_done();
+	return ret;
 out:
+	memory_notify(MEM_FINISH_OFFLINE, &arg);
+out_notifier:
 	mem_hotplug_done();
 	return ret;
 }
@@ -244,6 +259,7 @@ static int memory_block_offline(struct memory_block *mem)
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long nr_vmemmap_pages = 0;
+	struct memory_notify arg;
 	int ret;
 
 	if (!mem->zone)
@@ -275,6 +291,11 @@ static int memory_block_offline(struct memory_block *mem)
 		mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
 
 	mem->zone = NULL;
+	arg.altmap_start_pfn = start_pfn;
+	arg.altmap_nr_pages = nr_vmemmap_pages;
+	arg.start_pfn = start_pfn + nr_vmemmap_pages;
+	arg.nr_pages = nr_pages - nr_vmemmap_pages;
+	memory_notify(MEM_FINISH_OFFLINE, &arg);
 out:
 	mem_hotplug_done();
 	return ret;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index f53cfdaaaa41..939a16bd5cea 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -96,8 +96,17 @@ int set_memory_block_size_order(unsigned int order);
 #define	MEM_GOING_ONLINE	(1<<3)
 #define	MEM_CANCEL_ONLINE	(1<<4)
 #define	MEM_CANCEL_OFFLINE	(1<<5)
+#define	MEM_PREPARE_ONLINE	(1<<6)
+#define	MEM_FINISH_OFFLINE	(1<<7)
 
 struct memory_notify {
+	/*
+	 * The altmap_start_pfn and altmap_nr_pages fields are designated for
+	 * specifying the altmap range and are exclusively intended for use in
+	 * MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers.
+	 */
+	unsigned long altmap_start_pfn;
+	unsigned long altmap_nr_pages;
 	unsigned long start_pfn;
 	unsigned long nr_pages;
 	int status_change_nid_normal;
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7d2076583494..ee00015575aa 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -106,6 +106,22 @@ typedef int __bitwise mhp_t;
  * implies the node id (nid).
  */
 #define MHP_NID_IS_MGID		((__force mhp_t)BIT(2))
+/*
+ * The hotplugged memory is completely inaccessible while the memory is
+ * offline. The memory provider will handle MEM_PREPARE_ONLINE /
+ * MEM_FINISH_OFFLINE notifications and make the memory accessible.
+ *
+ * This flag is only relevant when used along with MHP_MEMMAP_ON_MEMORY,
+ * because the altmap cannot be written (e.g., poisoned) when adding
+ * memory -- before it is set online.
+ *
+ * This allows for adding memory with an altmap that is not currently
+ * made available by a hypervisor. When onlining that memory, the
+ * hypervisor can be instructed to make that memory available, and
+ * the onlining phase will not require any memory allocations, which is
+ * helpful in low-memory situations.
+ */
+#define MHP_OFFLINE_INACCESSIBLE	((__force mhp_t)BIT(3))
 
 /*
  * Extended parameters for memory hotplug:
@@ -154,7 +170,7 @@ extern void adjust_present_page_count(struct page *page,
 				      long nr_pages);
 /* VM interface that may be used by firmware interface */
 extern int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
-				     struct zone *zone);
+				     struct zone *zone, bool mhp_off_inaccessible);
 extern void mhp_deinit_memmap_on_memory(unsigned long pfn, unsigned long nr_pages);
 extern int online_pages(unsigned long pfn, unsigned long nr_pages,
 			struct zone *zone, struct memory_group *group);
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 744c830f4b13..9837f3e6fb95 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -25,6 +25,7 @@ struct vmem_altmap {
 	unsigned long free;
 	unsigned long align;
 	unsigned long alloc;
+	bool inaccessible;
 };
 
 /*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 7a5fc89a8652..fdc29701e5c6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1084,7 +1084,7 @@ void adjust_present_page_count(struct page *page, struct memory_group *group,
 }
 
 int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
-			      struct zone *zone)
+			      struct zone *zone, bool mhp_off_inaccessible)
 {
 	unsigned long end_pfn = pfn + nr_pages;
 	int ret, i;
@@ -1093,6 +1093,15 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
 	if (ret)
 		return ret;
 
+	/*
+	 * Memory block is accessible at this stage and hence poison the struct
+	 * pages now. If the memory block is accessible during memory hotplug
+	 * addition phase, then page poisoning is already performed in
+	 * sparse_add_section().
+	 */
+	if (mhp_off_inaccessible)
+		page_init_poison(pfn_to_page(pfn), sizeof(struct page) * nr_pages);
+
 	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_UNMOVABLE);
 
 	for (i = 0; i < nr_pages; i++)
@@ -1439,6 +1448,8 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	if (mhp_flags & MHP_MEMMAP_ON_MEMORY) {
 		if (mhp_supports_memmap_on_memory(size)) {
 			mhp_altmap.free = memory_block_memmap_on_memory_pages();
+			if (mhp_flags & MHP_OFFLINE_INACCESSIBLE)
+				mhp_altmap.inaccessible = true;
 			params.altmap = kmalloc(sizeof(struct vmem_altmap), GFP_KERNEL);
 			if (!params.altmap) {
 				ret = -ENOMEM;
diff --git a/mm/sparse.c b/mm/sparse.c
index 77d91e565045..3991c717b769 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -907,7 +907,8 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 	 * Poison uninitialized struct pages in order to catch invalid flags
 	 * combinations.
 	 */
-	page_init_poison(memmap, sizeof(struct page) * nr_pages);
+	if (!altmap || !altmap->inaccessible)
+		page_init_poison(memmap, sizeof(struct page) * nr_pages);
 
 	ms = __nr_to_section(section_nr);
 	set_section_nid(section_nr, nid);
-- 
2.41.0

From nobody Wed Dec 17 08:52:13 2025
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual,
    Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik,
    linux-s390, LKML
Subject: [PATCH v4 2/5] s390/mm: allocate vmemmap pages from self-contained memory range
Date: Tue, 28 Nov 2023 12:03:39 +0100
Message-Id: <20231128110342.102096-3-sumanthk@linux.ibm.com>
In-Reply-To: <20231128110342.102096-1-sumanthk@linux.ibm.com>
References: <20231128110342.102096-1-sumanthk@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Allocate memory map (struct pages array) from the hotplugged memory
range, rather than using system memory. The change addresses the issue
where standby memory, when configured to be much larger than online
memory, could potentially lead to IPL failure due to memory map
allocation from online memory. For example, 16MB of memory map
allocation is needed for a memory block size of 1GB; when standby memory
is configured to be much larger than online memory, this could lead to
IPL failure.

To address this issue, the solution involves introducing "memmap on
memory" using the vmem_altmap structure on s390. Architectures that want
to implement it should pass the altmap to the vmemmap_populate()
function and its associated call chain. This enhancement is discussed in
the commit 4b94ffdc4163 ("x86, mm: introduce vmem_altmap to augment
vmemmap_populate()").

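The 16MB figure quoted above can be reproduced with a short calculation. The sketch below is an illustration only: it assumes 4 KiB base pages and a 64-byte struct page, neither of which is stated in the patch, and the helper names `memmap_bytes()` and `memmap_pages()` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions, not taken from the patch: 4 KiB pages, 64-byte struct page. */
#define ASSUMED_PAGE_SIZE		4096ULL
#define ASSUMED_STRUCT_PAGE_SIZE	64ULL

/* Bytes of memory map needed to describe a memory block of block_bytes. */
static uint64_t memmap_bytes(uint64_t block_bytes)
{
	assert(block_bytes % ASSUMED_PAGE_SIZE == 0);
	return (block_bytes / ASSUMED_PAGE_SIZE) * ASSUMED_STRUCT_PAGE_SIZE;
}

/* Whole pages the memory map occupies, rounded up to a page boundary. */
static uint64_t memmap_pages(uint64_t block_bytes)
{
	return (memmap_bytes(block_bytes) + ASSUMED_PAGE_SIZE - 1) /
	       ASSUMED_PAGE_SIZE;
}
```

Under these assumptions a 1GB block contains 262144 pages, so its memory map needs 262144 * 64 bytes = 16MB, or 4096 pages. With "memmap on memory" those pages come from the start of the hotplugged block itself instead of from online memory.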
Provide "memmap on memory" support for s390 by passing the altmap in vmemmap_populate() and its callchain. The allocation path is described as follows: * When altmap is NULL in vmemmap_populate(), memory map allocation occurs using the existing vmemmap_alloc_block_buf(). * When altmap is not NULL in vmemmap_populate(), memory map allocation still uses vmemmap_alloc_block_buf(), but this function internally calls altmap_alloc_block_buf(). For deallocation, the process is outlined as follows: * When altmap is NULL in vmemmap_free(), memory map deallocation happens through free_pages(). * When altmap is not NULL in vmemmap_free(), memory map deallocation occurs via vmem_altmap_free(). While memory map allocation is primarily handled through the self-contained memory map range, there might still be a small amount of system memory allocation required for vmemmap pagetables. To mitigate this impact, this feature will be limited to machines with EDAT1 support. Reviewed-by: Gerald Schaefer Signed-off-by: Sumanth Korikkar --- arch/s390/mm/init.c | 3 --- arch/s390/mm/vmem.c | 62 +++++++++++++++++++++++++-------------------- 2 files changed, 35 insertions(+), 30 deletions(-) diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 43e612bc2bcd..8d9a60ccb777 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -281,9 +281,6 @@ int arch_add_memory(int nid, u64 start, u64 size, unsigned long size_pages =3D PFN_DOWN(size); int rc; =20 - if (WARN_ON_ONCE(params->altmap)) - return -EINVAL; - if (WARN_ON_ONCE(params->pgprot.pgprot !=3D PAGE_KERNEL.pgprot)) return -EINVAL; =20 diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c index 186a020857cf..eb100479f7be 100644 --- a/arch/s390/mm/vmem.c +++ b/arch/s390/mm/vmem.c @@ -33,8 +33,12 @@ static void __ref *vmem_alloc_pages(unsigned int order) return memblock_alloc(size, size); } =20 -static void vmem_free_pages(unsigned long addr, int order) +static void vmem_free_pages(unsigned long addr, int order, struct vmem_alt= 
map *altmap) { + if (altmap) { + vmem_altmap_free(altmap, 1 << order); + return; + } /* We don't expect boot memory to be removed ever. */ if (!slab_is_available() || WARN_ON_ONCE(PageReserved(virt_to_page((void *)addr)))) @@ -156,7 +160,8 @@ static bool vmemmap_unuse_sub_pmd(unsigned long start, = unsigned long end) =20 /* __ref: we'll only call vmemmap_alloc_block() via vmemmap_populate() */ static int __ref modify_pte_table(pmd_t *pmd, unsigned long addr, - unsigned long end, bool add, bool direct) + unsigned long end, bool add, bool direct, + struct vmem_altmap *altmap) { unsigned long prot, pages =3D 0; int ret =3D -ENOMEM; @@ -172,11 +177,11 @@ static int __ref modify_pte_table(pmd_t *pmd, unsigne= d long addr, if (pte_none(*pte)) continue; if (!direct) - vmem_free_pages((unsigned long) pfn_to_virt(pte_pfn(*pte)), 0); + vmem_free_pages((unsigned long)pfn_to_virt(pte_pfn(*pte)), get_order(P= AGE_SIZE), altmap); pte_clear(&init_mm, addr, pte); } else if (pte_none(*pte)) { if (!direct) { - void *new_page =3D vmemmap_alloc_block(PAGE_SIZE, NUMA_NO_NODE); + void *new_page =3D vmemmap_alloc_block_buf(PAGE_SIZE, NUMA_NO_NODE, al= tmap); =20 if (!new_page) goto out; @@ -213,7 +218,8 @@ static void try_free_pte_table(pmd_t *pmd, unsigned lon= g start) =20 /* __ref: we'll only call vmemmap_alloc_block() via vmemmap_populate() */ static int __ref modify_pmd_table(pud_t *pud, unsigned long addr, - unsigned long end, bool add, bool direct) + unsigned long end, bool add, bool direct, + struct vmem_altmap *altmap) { unsigned long next, prot, pages =3D 0; int ret =3D -ENOMEM; @@ -234,11 +240,11 @@ static int __ref modify_pmd_table(pud_t *pud, unsigne= d long addr, if (IS_ALIGNED(addr, PMD_SIZE) && IS_ALIGNED(next, PMD_SIZE)) { if (!direct) - vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE)); + vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE), altmap); pmd_clear(pmd); pages++; } else if (!direct && vmemmap_unuse_sub_pmd(addr, next)) { - 
vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE)); + vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE), altmap); pmd_clear(pmd); } continue; @@ -261,7 +267,7 @@ static int __ref modify_pmd_table(pud_t *pud, unsigned = long addr, * page tables since vmemmap_populate gets * called for each section separately. */ - new_page =3D vmemmap_alloc_block(PMD_SIZE, NUMA_NO_NODE); + new_page =3D vmemmap_alloc_block_buf(PMD_SIZE, NUMA_NO_NODE, altmap); if (new_page) { set_pmd(pmd, __pmd(__pa(new_page) | prot)); if (!IS_ALIGNED(addr, PMD_SIZE) || @@ -280,7 +286,7 @@ static int __ref modify_pmd_table(pud_t *pud, unsigned = long addr, vmemmap_use_sub_pmd(addr, next); continue; } - ret =3D modify_pte_table(pmd, addr, next, add, direct); + ret =3D modify_pte_table(pmd, addr, next, add, direct, altmap); if (ret) goto out; if (!add) @@ -302,12 +308,12 @@ static void try_free_pmd_table(pud_t *pud, unsigned l= ong start) for (i =3D 0; i < PTRS_PER_PMD; i++, pmd++) if (!pmd_none(*pmd)) return; - vmem_free_pages(pud_deref(*pud), CRST_ALLOC_ORDER); + vmem_free_pages(pud_deref(*pud), CRST_ALLOC_ORDER, NULL); pud_clear(pud); } =20 static int modify_pud_table(p4d_t *p4d, unsigned long addr, unsigned long = end, - bool add, bool direct) + bool add, bool direct, struct vmem_altmap *altmap) { unsigned long next, prot, pages =3D 0; int ret =3D -ENOMEM; @@ -347,7 +353,7 @@ static int modify_pud_table(p4d_t *p4d, unsigned long a= ddr, unsigned long end, } else if (pud_large(*pud)) { continue; } - ret =3D modify_pmd_table(pud, addr, next, add, direct); + ret =3D modify_pmd_table(pud, addr, next, add, direct, altmap); if (ret) goto out; if (!add) @@ -370,12 +376,12 @@ static void try_free_pud_table(p4d_t *p4d, unsigned l= ong start) if (!pud_none(*pud)) return; } - vmem_free_pages(p4d_deref(*p4d), CRST_ALLOC_ORDER); + vmem_free_pages(p4d_deref(*p4d), CRST_ALLOC_ORDER, NULL); p4d_clear(p4d); } =20 static int modify_p4d_table(pgd_t *pgd, unsigned long addr, unsigned long = end, - bool add, bool 
direct) + bool add, bool direct, struct vmem_altmap *altmap) { unsigned long next; int ret =3D -ENOMEM; @@ -394,7 +400,7 @@ static int modify_p4d_table(pgd_t *pgd, unsigned long a= ddr, unsigned long end, goto out; p4d_populate(&init_mm, p4d, pud); } - ret =3D modify_pud_table(p4d, addr, next, add, direct); + ret =3D modify_pud_table(p4d, addr, next, add, direct, altmap); if (ret) goto out; if (!add) @@ -415,12 +421,12 @@ static void try_free_p4d_table(pgd_t *pgd, unsigned l= ong start) if (!p4d_none(*p4d)) return; } - vmem_free_pages(pgd_deref(*pgd), CRST_ALLOC_ORDER); + vmem_free_pages(pgd_deref(*pgd), CRST_ALLOC_ORDER, NULL); pgd_clear(pgd); } =20 static int modify_pagetable(unsigned long start, unsigned long end, bool a= dd, - bool direct) + bool direct, struct vmem_altmap *altmap) { unsigned long addr, next; int ret =3D -ENOMEM; @@ -445,7 +451,7 @@ static int modify_pagetable(unsigned long start, unsign= ed long end, bool add, goto out; pgd_populate(&init_mm, pgd, p4d); } - ret =3D modify_p4d_table(pgd, addr, next, add, direct); + ret =3D modify_p4d_table(pgd, addr, next, add, direct, altmap); if (ret) goto out; if (!add) @@ -458,14 +464,16 @@ static int modify_pagetable(unsigned long start, unsi= gned long end, bool add, return ret; } =20 -static int add_pagetable(unsigned long start, unsigned long end, bool dire= ct) +static int add_pagetable(unsigned long start, unsigned long end, bool dire= ct, + struct vmem_altmap *altmap) { - return modify_pagetable(start, end, true, direct); + return modify_pagetable(start, end, true, direct, altmap); } =20 -static int remove_pagetable(unsigned long start, unsigned long end, bool d= irect) +static int remove_pagetable(unsigned long start, unsigned long end, bool d= irect, + struct vmem_altmap *altmap) { - return modify_pagetable(start, end, false, direct); + return modify_pagetable(start, end, false, direct, altmap); } =20 /* @@ -474,7 +482,7 @@ static int remove_pagetable(unsigned long start, unsign= ed long end, bool 
direct) static int vmem_add_range(unsigned long start, unsigned long size) { start =3D (unsigned long)__va(start); - return add_pagetable(start, start + size, true); + return add_pagetable(start, start + size, true, NULL); } =20 /* @@ -483,7 +491,7 @@ static int vmem_add_range(unsigned long start, unsigned= long size) static void vmem_remove_range(unsigned long start, unsigned long size) { start =3D (unsigned long)__va(start); - remove_pagetable(start, start + size, true); + remove_pagetable(start, start + size, true, NULL); } =20 /* @@ -496,9 +504,9 @@ int __meminit vmemmap_populate(unsigned long start, uns= igned long end, int node, =20 mutex_lock(&vmem_mutex); /* We don't care about the node, just use NUMA_NO_NODE on allocations */ - ret =3D add_pagetable(start, end, false); + ret =3D add_pagetable(start, end, false, altmap); if (ret) - remove_pagetable(start, end, false); + remove_pagetable(start, end, false, altmap); mutex_unlock(&vmem_mutex); return ret; } @@ -509,7 +517,7 @@ void vmemmap_free(unsigned long start, unsigned long en= d, struct vmem_altmap *altmap) { mutex_lock(&vmem_mutex); - remove_pagetable(start, end, false); + remove_pagetable(start, end, false, altmap); mutex_unlock(&vmem_mutex); } =20 --=20 2.41.0 From nobody Wed Dec 17 08:52:13 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E7DF6C4167B for ; Tue, 28 Nov 2023 11:04:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344351AbjK1LD6 (ORCPT ); Tue, 28 Nov 2023 06:03:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46252 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344334AbjK1LDz (ORCPT ); Tue, 28 Nov 2023 06:03:55 -0500 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) 
by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C2A31B4; Tue, 28 Nov 2023 03:04:01 -0800 (PST) Received: from pps.filterd (m0353723.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASAvjNN029674; Tue, 28 Nov 2023 11:03:50 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=YospTRq+FJ5wYWTZvshEgL1LbpK874Hy9OrFFGUNsPc=; b=NS4mx8AdeSyhxcSIxiz7TCFTcKZMZM0malpgQGgmK3aZy5lYb+FCQWb7VePQJKn9+5q2 scFyyINH6dOgjj5X5LX5B1tcoDGhZBUOeOwR0WY6RgiC1LB2ZcOrUv51VqzzARCNjJww +IWme8RX2aAPph4AvQTW/2f6L4HuIrZxTubbpsl5r52nuDdO87i6N9AuBHHxE5iDxz3h rXbiVnbVlRIaghYWQMz8rpUJTyobzbMjXT/5Q5mmjtK8RvXtao+Yno6kIp5bzfHH7AtJ Gt7wnBOm5xxOw8jjQlc4Pixp0ld9/xWyVHyK9Q1zHiLP4l6+HK0Qpf7G0vJGdvzEzKns zQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3unewm86fw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:50 +0000 Received: from m0353723.ppops.net (m0353723.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 3ASAxA7J001203; Tue, 28 Nov 2023 11:03:49 GMT Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com [50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3unewm86fb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:49 +0000 Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASAQrI5004908; Tue, 28 Nov 2023 11:03:48 GMT Received: from smtprelay06.fra02v.mail.ibm.com ([9.218.2.230]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 3ukwfjxtcr-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:48 +0000 Received: from smtpav03.fra02v.mail.ibm.com 
(smtpav03.fra02v.mail.ibm.com [10.20.54.102]) by smtprelay06.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 3ASB3jNt43057706 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 28 Nov 2023 11:03:46 GMT Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id DD7B220040; Tue, 28 Nov 2023 11:03:45 +0000 (GMT) Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 85E362004B; Tue, 28 Nov 2023 11:03:45 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by smtpav03.fra02v.mail.ibm.com (Postfix) with ESMTP; Tue, 28 Nov 2023 11:03:45 +0000 (GMT) From: Sumanth Korikkar To: linux-mm , Andrew Morton , David Hildenbrand Cc: Oscar Salvador , Michal Hocko , "Aneesh Kumar K.V" , Anshuman Khandual , Gerald Schaefer , Alexander Gordeev , Heiko Carstens , Vasily Gorbik , linux-s390 , LKML Subject: [PATCH v4 3/5] s390/sclp: remove unhandled memory notifier type Date: Tue, 28 Nov 2023 12:03:40 +0100 Message-Id: <20231128110342.102096-4-sumanthk@linux.ibm.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231128110342.102096-1-sumanthk@linux.ibm.com> References: <20231128110342.102096-1-sumanthk@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Zq7rP1hGBS9rV2E0rZkhNPhV9jXLv0e8 X-Proofpoint-GUID: XDZLI_YG-4DTKboc2b-WvRet7xqhQcdf X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-28_10,2023-11-27_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 adultscore=0 lowpriorityscore=0 malwarescore=0 mlxscore=0 mlxlogscore=790 spamscore=0 suspectscore=0 phishscore=0 priorityscore=1501 bulkscore=0 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2311060000 definitions=main-2311280087 Precedence: bulk 
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Remove memory notifier types which are unhandled by s390. Unhandled memory notifier types are covered by the default case. Suggested-by: Alexander Gordeev Reviewed-by: David Hildenbrand Signed-off-by: Sumanth Korikkar --- drivers/s390/char/sclp_cmd.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c index 11c428f4c7cf..355e63e44e95 100644 --- a/drivers/s390/char/sclp_cmd.c +++ b/drivers/s390/char/sclp_cmd.c @@ -340,9 +340,6 @@ static int sclp_mem_notifier(struct notifier_block *nb, if (contains_standby_increment(start, start + size)) rc =3D -EPERM; break; - case MEM_ONLINE: - case MEM_CANCEL_OFFLINE: - break; case MEM_GOING_ONLINE: rc =3D sclp_mem_change_state(start, size, 1); break; --=20 2.41.0 From nobody Wed Dec 17 08:52:13 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D9914C4167B for ; Tue, 28 Nov 2023 11:04:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344509AbjK1LET (ORCPT ); Tue, 28 Nov 2023 06:04:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46244 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344388AbjK1LD5 (ORCPT ); Tue, 28 Nov 2023 06:03:57 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E02A2D62; Tue, 28 Nov 2023 03:04:03 -0800 (PST) Received: from pps.filterd (m0353729.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASB254f025913; Tue, 28 Nov 2023 11:03:51 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : 
message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=shUyYZ2Z3ut3Lwca1JzgJ1p3wYsUGEfvmN4QGWlAmEo=; b=Ii2DDpGlAY78NEZPp39Mb2qf2zQBdPJWr/Cma2NLVgMqYXpQU5bW69GK1dkk858fi84u 5Sml7RPJg2gnM84fcQJ5yK6AGHPIHE2yavobp7cUdHVsUPSHZUvEF1bIOmENQOob7rgM mgCF49OSqcsCV3I8ac4TaCjTh2XUhKwLpONy3C3/3BizeMMi8iOyo1WKehzLggSEBN+F CGSevQShpRkyem7LDMYqRN83u7GpxlVaGIdu9rotKWDixtNHgKaNLgXcJFdtt3XnIMeR GndqL1dSwWV0PzkzM08UIsUP4jL++QXzTbjKGWrZCgCxYVi1eCgfQdMs1c7ofBf5xwpP 7Q== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3uneyp818k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:51 +0000 Received: from m0353729.ppops.net (m0353729.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 3ASB3oaK031307; Tue, 28 Nov 2023 11:03:50 GMT Received: from ppma23.wdc07v.mail.ibm.com (5d.69.3da9.ip4.static.sl-reverse.com [169.61.105.93]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3uneyp817s-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:50 +0000 Received: from pps.filterd (ppma23.wdc07v.mail.ibm.com [127.0.0.1]) by ppma23.wdc07v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASAYt6s012206; Tue, 28 Nov 2023 11:03:49 GMT Received: from smtprelay03.fra02v.mail.ibm.com ([9.218.2.224]) by ppma23.wdc07v.mail.ibm.com (PPS) with ESMTPS id 3ukvrkf1me-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:49 +0000 Received: from smtpav03.fra02v.mail.ibm.com (smtpav03.fra02v.mail.ibm.com [10.20.54.102]) by smtprelay03.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 3ASB3kZP11403954 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 28 Nov 2023 11:03:46 GMT Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 40FAA20040; Tue, 28 Nov 2023 11:03:46 +0000 (GMT) 
Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E708820043; Tue, 28 Nov 2023 11:03:45 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by smtpav03.fra02v.mail.ibm.com (Postfix) with ESMTP; Tue, 28 Nov 2023 11:03:45 +0000 (GMT) From: Sumanth Korikkar To: linux-mm , Andrew Morton , David Hildenbrand Cc: Oscar Salvador , Michal Hocko , "Aneesh Kumar K.V" , Anshuman Khandual , Gerald Schaefer , Alexander Gordeev , Heiko Carstens , Vasily Gorbik , linux-s390 , LKML Subject: [PATCH v4 4/5] s390/mm: implement MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers Date: Tue, 28 Nov 2023 12:03:41 +0100 Message-Id: <20231128110342.102096-5-sumanthk@linux.ibm.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231128110342.102096-1-sumanthk@linux.ibm.com> References: <20231128110342.102096-1-sumanthk@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 98U7NOVK_bxOH5XumKU7eprbvIZmuzMp X-Proofpoint-GUID: iSk767hBzyOipeNYXEJmkIyF_QDcPhhc X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-28_10,2023-11-27_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=999 impostorscore=0 lowpriorityscore=0 malwarescore=0 priorityscore=1501 mlxscore=0 spamscore=0 adultscore=0 clxscore=1015 suspectscore=0 phishscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2311060000 definitions=main-2311280087 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The MEM_PREPARE_ONLINE memory notifier makes the memory block physically accessible via the sclp assign command. The notifier ensures that self-contained memory maps are accessible, thereby enabling "memmap on memory" on s390. 
The MEM_FINISH_OFFLINE memory notifier shifts the memory block to an inaccessible state via the sclp unassign command. Implementation considerations: * When MHP_MEMMAP_ON_MEMORY is disabled, the system retains the old behavior. This means the memory map is allocated from default memory. * If MACHINE_HAS_EDAT1 is unavailable, MHP_MEMMAP_ON_MEMORY is automatically disabled. This ensures that vmemmap pagetables do not consume additional memory from the default memory allocator. * The MEM_GOING_ONLINE notifier has been modified to perform no operation, as MEM_PREPARE_ONLINE already executes the sclp assign command. * The MEM_CANCEL_ONLINE/MEM_OFFLINE notifiers now perform no operation, as MEM_FINISH_OFFLINE already executes the sclp unassign command. Reviewed-by: Gerald Schaefer Signed-off-by: Sumanth Korikkar --- drivers/s390/char/sclp_cmd.c | 47 +++++++++++++++++++++++++++++++----- 1 file changed, 41 insertions(+), 6 deletions(-) diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c index 355e63e44e95..c551dc6734fe 100644 --- a/drivers/s390/char/sclp_cmd.c +++ b/drivers/s390/char/sclp_cmd.c @@ -18,6 +18,7 @@ #include #include #include +#include #include #include #include @@ -26,6 +27,7 @@ #include #include #include +#include =20 #include "sclp.h" =20 @@ -319,6 +321,7 @@ static bool contains_standby_increment(unsigned long start, unsigned long end) static int sclp_mem_notifier(struct notifier_block *nb, unsigned long action, void *data) { + unsigned long altmap_start, altmap_size; unsigned long start, size; struct memory_notify *arg; unsigned char id; @@ -340,13 +343,43 @@ static int sclp_mem_notifier(struct notifier_block *nb, if (contains_standby_increment(start, start + size)) rc =3D -EPERM; break; - case MEM_GOING_ONLINE: + case MEM_PREPARE_ONLINE: + /* + * Access the altmap_start_pfn and altmap_nr_pages fields + * within the struct memory_notify only when handling the + * MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers. 
+ */ + altmap_start =3D arg->altmap_start_pfn << PAGE_SHIFT; + altmap_size =3D arg->altmap_nr_pages << PAGE_SHIFT; + /* + * When altmap is in use, take the specified memory range + * online, which includes the altmap. + */ + if (altmap_size) { + start =3D altmap_start; + size +=3D altmap_size; + } rc =3D sclp_mem_change_state(start, size, 1); + if (rc || !altmap_size) + break; + /* + * Set CMMA state to nodat here, since the struct page memory + * at the beginning of the memory block will not go through the + * buddy allocator later. + */ + __arch_set_page_nodat((void *)__va(start), arg->altmap_nr_pages); break; - case MEM_CANCEL_ONLINE: - sclp_mem_change_state(start, size, 0); - break; - case MEM_OFFLINE: + case MEM_FINISH_OFFLINE: + altmap_start =3D arg->altmap_start_pfn << PAGE_SHIFT; + altmap_size =3D arg->altmap_nr_pages << PAGE_SHIFT; + /* + * When altmap is in use, take the specified memory range + * offline, which includes the altmap. + */ + if (altmap_size) { + start =3D altmap_start; + size +=3D altmap_size; + } sclp_mem_change_state(start, size, 0); break; default: @@ -397,7 +430,9 @@ static void __init add_memory_merged(u16 rn) if (!size) goto skip_add; for (addr =3D start; addr < start + size; addr +=3D block_size) - add_memory(0, addr, block_size, MHP_NONE); + add_memory(0, addr, block_size, + MACHINE_HAS_EDAT1 ? 
+ MHP_MEMMAP_ON_MEMORY | MHP_OFFLINE_INACCESSIBLE : MHP_NONE); skip_add: first_rn =3D rn; num =3D 1; --=20 2.41.0 From nobody Wed Dec 17 08:52:13 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1263C4167B for ; Tue, 28 Nov 2023 11:04:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344464AbjK1LEH (ORCPT ); Tue, 28 Nov 2023 06:04:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53186 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344333AbjK1LD4 (ORCPT ); Tue, 28 Nov 2023 06:03:56 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17DC9D5B; Tue, 28 Nov 2023 03:04:03 -0800 (PST) Received: from pps.filterd (m0353729.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASB25u3025897; Tue, 28 Nov 2023 11:03:51 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=Onulgm7nCUdcpaVdPXWthCgk2585qRsTxwp5zRlqqMg=; b=Kcfb2+Aa7rlmP84dTnj3jhupFlOeHseXbO+vquUWiKN4INv1JManUQr4do24L8Jz70n0 TJAX6IT8lEFwPViTOf8EhTgwHFB/yn1AnX5dG0imga14s6JU1UwvKt7o/lT9z6JPQp58 7qaydGILnT0iQ7QwIuqmfK8XRPgHcQvEZVN172LbdzbgVZL43IvPmbP01lOKTQ7akGQj 1JVMEyQnUbR/Os+tzrcrEl9gRm9KbBS7xvAZPLqwM93Fy9XDkMm0wTewKzqh5O2Upfk8 R64TM8nuWmQJsQOUiYzdHqi5ia4zFm2kN8B3QlsV9ecD1Ln/GYyyjuHgnU07xGSj9gYh Cw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3uneyp818a-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:50 +0000 Received: from m0353729.ppops.net 
(m0353729.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 3ASB38B7029061; Tue, 28 Nov 2023 11:03:50 GMT Received: from ppma11.dal12v.mail.ibm.com (db.9e.1632.ip4.static.sl-reverse.com [50.22.158.219]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3uneyp817p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:50 +0000 Received: from pps.filterd (ppma11.dal12v.mail.ibm.com [127.0.0.1]) by ppma11.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASAacF5018258; Tue, 28 Nov 2023 11:03:49 GMT Received: from smtprelay03.fra02v.mail.ibm.com ([9.218.2.224]) by ppma11.dal12v.mail.ibm.com (PPS) with ESMTPS id 3ukwy1pn6x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 28 Nov 2023 11:03:49 +0000 Received: from smtpav03.fra02v.mail.ibm.com (smtpav03.fra02v.mail.ibm.com [10.20.54.102]) by smtprelay03.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 3ASB3kDQ8913522 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 28 Nov 2023 11:03:46 GMT Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A071B20040; Tue, 28 Nov 2023 11:03:46 +0000 (GMT) Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4A14A2004B; Tue, 28 Nov 2023 11:03:46 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by smtpav03.fra02v.mail.ibm.com (Postfix) with ESMTP; Tue, 28 Nov 2023 11:03:46 +0000 (GMT) From: Sumanth Korikkar To: linux-mm , Andrew Morton , David Hildenbrand Cc: Oscar Salvador , Michal Hocko , "Aneesh Kumar K.V" , Anshuman Khandual , Gerald Schaefer , Alexander Gordeev , Heiko Carstens , Vasily Gorbik , linux-s390 , LKML Subject: [PATCH v4 5/5] s390: enable MHP_MEMMAP_ON_MEMORY Date: Tue, 28 Nov 2023 12:03:42 +0100 Message-Id: <20231128110342.102096-6-sumanthk@linux.ibm.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: 
<20231128110342.102096-1-sumanthk@linux.ibm.com> References: <20231128110342.102096-1-sumanthk@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: cKTj3XpeWC2jy5IUBIMUhDQUBwt3xMBX X-Proofpoint-GUID: bYXWX24IszSnuTl2yc3haCcd-ORcO-Mi X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-28_10,2023-11-27_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 mlxlogscore=798 impostorscore=0 lowpriorityscore=0 malwarescore=0 priorityscore=1501 mlxscore=0 spamscore=0 adultscore=0 clxscore=1015 suspectscore=0 phishscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2311060000 definitions=main-2311280087 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Enable MHP_MEMMAP_ON_MEMORY to support "memmap on memory". To enable the feature, set memory_hotplug.memmap_on_memory=3Dtrue on the kernel command line. Reviewed-by: Gerald Schaefer Signed-off-by: Sumanth Korikkar --- arch/s390/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 3bec98d20283..4b9b0f947ddb 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -113,6 +113,7 @@ config S390 select ARCH_INLINE_WRITE_UNLOCK_BH select ARCH_INLINE_WRITE_UNLOCK_IRQ select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE + select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE select ARCH_STACKWALK select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_DEBUG_PAGEALLOC --=20 2.41.0
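Note on patch 4/5: the range handling in the MEM_PREPARE_ONLINE and MEM_FINISH_OFFLINE cases can be illustrated with a small userspace sketch. This is not kernel code. The struct and function names (notify_sketch, range_sketch, sclp_range_sketch) are hypothetical, the 4 KiB page size is an assumption, and it is also assumed that start and size are derived from start_pfn and nr_pages before the switch statement. The sketch only shows how the range handed to sclp_mem_change_state() grows to cover the altmap when one is present.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the struct memory_notify fields used by the notifier. */
struct notify_sketch {
	uint64_t start_pfn;		/* first pfn after the altmap, if any */
	uint64_t nr_pages;		/* pages excluding the altmap */
	uint64_t altmap_start_pfn;	/* first pfn of the altmap */
	uint64_t altmap_nr_pages;	/* 0 when no altmap is in use */
};

/* Byte range that would be assigned or unassigned. */
struct range_sketch {
	uint64_t start;
	uint64_t size;
};

/*
 * Mirror the arithmetic of the patch: without an altmap the range is
 * the notified range itself; with an altmap the range starts at the
 * altmap and is extended by the altmap size.
 */
struct range_sketch sclp_range_sketch(const struct notify_sketch *arg)
{
	struct range_sketch r;
	uint64_t altmap_start = arg->altmap_start_pfn << SKETCH_PAGE_SHIFT;
	uint64_t altmap_size = arg->altmap_nr_pages << SKETCH_PAGE_SHIFT;

	r.start = arg->start_pfn << SKETCH_PAGE_SHIFT;
	r.size = arg->nr_pages << SKETCH_PAGE_SHIFT;
	if (altmap_size) {
		r.start = altmap_start;
		r.size += altmap_size;
	}
	return r;
}
```

With a hypothetical block of 32768 pages whose first 512 pages hold the memory map, the resulting range covers the whole block, so the memory map itself is made accessible before it is touched.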