From nobody Tue Dec 30 14:49:32 2025
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual,
    Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik,
    linux-s390, LKML
Subject: [PATCH 1/8] mm/memory_hotplug: fix memory hotplug locking order
Date: Tue, 14 Nov 2023 19:02:31 +0100
Message-Id: <20231114180238.1522782-2-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
From Documentation/core-api/memory-hotplug.rst:
When adding/removing/onlining/offlining memory or adding/removing
heterogeneous/device memory, we should always hold the mem_hotplug_lock
in write mode to serialise memory hotplug (e.g. access to global/zone
variables).

The mhp_(de)init_memmap_on_memory() functions can change zone stats and
struct page content, but they are currently called without the
mem_hotplug_lock.

When a memory block is being offlined and kmemleak goes through each
populated zone, the following theoretical race condition could occur:

CPU 0:                                       | CPU 1:
memory_offline()                             |
-> offline_pages()                           |
   -> mem_hotplug_begin()                    |
      ...                                    |
   -> mem_hotplug_done()                     |
                                             | kmemleak_scan()
                                             | -> get_online_mems()
                                             |    ...
-> mhp_deinit_memmap_on_memory()             |
  [not protected by mem_hotplug_begin/done()]|
  Marks memory section as offline,           | Retrieves zone_start_pfn
  poisons vmemmap struct pages and updates   | and struct page members.
  the zone related data                      |
                                             |    ...
                                             | -> put_online_mems()

Fix this by ensuring mem_hotplug_lock is taken before performing
mhp_init_memmap_on_memory(). Also ensure that
mhp_deinit_memmap_on_memory() holds the lock.
online/offline_pages() are currently only called from
memory_block_online/offline(), so it is safe to move the locking there.

Fixes: a08a2ae34613 ("mm,memory_hotplug: allocate memmap from the added memory range")
Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
---
 drivers/base/memory.c | 12 +++++++++---
 mm/memory_hotplug.c   | 13 ++++++-------
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index f3b9a4d0fa3b..1e9f6a1749b9 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -204,10 +204,11 @@ static int memory_block_online(struct memory_block *mem)
 	if (mem->altmap)
 		nr_vmemmap_pages = mem->altmap->free;
 
+	mem_hotplug_begin();
 	if (nr_vmemmap_pages) {
 		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
 	ret = online_pages(start_pfn + nr_vmemmap_pages,
@@ -215,7 +216,7 @@ static int memory_block_online(struct memory_block *mem)
 	if (ret) {
 		if (nr_vmemmap_pages)
 			mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
-		return ret;
+		goto out;
 	}
 
 	/*
@@ -227,6 +228,8 @@ static int memory_block_online(struct memory_block *mem)
 					  nr_vmemmap_pages);
 
 	mem->zone = zone;
+out:
+	mem_hotplug_done();
 	return ret;
 }
 
@@ -247,6 +250,7 @@ static int memory_block_offline(struct memory_block *mem)
 	if (mem->altmap)
 		nr_vmemmap_pages = mem->altmap->free;
 
+	mem_hotplug_begin();
 	if (nr_vmemmap_pages)
 		adjust_present_page_count(pfn_to_page(start_pfn), mem->group,
 					  -nr_vmemmap_pages);
@@ -258,13 +262,15 @@ static int memory_block_offline(struct memory_block *mem)
 		if (nr_vmemmap_pages)
 			adjust_present_page_count(pfn_to_page(start_pfn), mem->group,
 						  nr_vmemmap_pages);
-		return ret;
+		goto out;
 	}
 
 	if (nr_vmemmap_pages)
 		mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
 
 	mem->zone = NULL;
+out:
+	mem_hotplug_done();
 	return ret;
 }
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1b03f4ec6fd2..c8238fc5edcb 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1129,6 +1129,9 @@ void mhp_deinit_memmap_on_memory(unsigned long pfn, unsigned long nr_pages)
 	kasan_remove_zero_shadow(__va(PFN_PHYS(pfn)), PFN_PHYS(nr_pages));
 }
 
+/*
+ * Must be called with mem_hotplug_lock in write mode.
+ */
 int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		       struct zone *zone, struct memory_group *group)
 {
@@ -1149,7 +1152,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 			 !IS_ALIGNED(pfn + nr_pages, PAGES_PER_SECTION)))
 		return -EINVAL;
 
-	mem_hotplug_begin();
 
 	/* associate pfn range with the zone */
 	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
@@ -1208,7 +1210,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 	writeback_set_ratelimit();
 
 	memory_notify(MEM_ONLINE, &arg);
-	mem_hotplug_done();
 	return 0;
 
 failed_addition:
@@ -1217,7 +1218,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages,
 		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
 	memory_notify(MEM_CANCEL_ONLINE, &arg);
 	remove_pfn_range_from_zone(zone, pfn, nr_pages);
-	mem_hotplug_done();
 	return ret;
 }
 
@@ -1863,6 +1863,9 @@ static int count_system_ram_pages_cb(unsigned long start_pfn,
 	return 0;
 }
 
+/*
+ * Must be called with mem_hotplug_lock in write mode.
+ */
 int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 			struct zone *zone, struct memory_group *group)
 {
@@ -1885,8 +1888,6 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 			 !IS_ALIGNED(start_pfn + nr_pages, PAGES_PER_SECTION)))
 		return -EINVAL;
 
-	mem_hotplug_begin();
-
 	/*
	 * Don't allow to offline memory blocks that contain holes.
	 * Consequently, memory blocks with holes can never get onlined
@@ -2027,7 +2028,6 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 
 	memory_notify(MEM_OFFLINE, &arg);
 	remove_pfn_range_from_zone(zone, start_pfn, nr_pages);
-	mem_hotplug_done();
 	return 0;
 
 failed_removal_isolated:
@@ -2042,7 +2042,6 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 		 (unsigned long long) start_pfn << PAGE_SHIFT,
 		 ((unsigned long long) end_pfn << PAGE_SHIFT) - 1,
 		 reason);
-	mem_hotplug_done();
 	return ret;
 }
-- 
2.41.0
From nobody Tue Dec 30 14:49:32 2025
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual,
    Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik,
    linux-s390, LKML
Subject: [PATCH 2/8] mm/memory_hotplug: fix error handling in add_memory_resource()
Date: Tue, 14 Nov 2023 19:02:32 +0100
Message-Id: <20231114180238.1522782-3-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>

In add_memory_resource(), creation of memory block devices occurs after a
successful call to arch_add_memory(). However, creation of memory block
devices could fail. In that case, arch_remove_memory() is called to
perform the necessary cleanup.

Currently, with or without altmap support, arch_remove_memory() is always
called with altmap set to NULL during error handling. This leads to
freeing of the struct pages using free_pages(), even though the
allocation might have been performed with altmap support via
altmap_alloc_block_buf().
Fix the error handling by passing altmap in arch_remove_memory(). This
ensures the following:
* When altmap is disabled, deallocation of the struct pages array occurs
  via free_pages().
* When altmap is enabled, deallocation occurs via vmem_altmap_free().

Fixes: db051a0dac13 ("mm/memory_hotplug: create memory block devices after arch_add_memory()")
Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
---
 mm/memory_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c8238fc5edcb..4f476a970e84 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1458,7 +1458,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	/* create memory block devices after memory was added */
 	ret = create_memory_block_devices(start, size, params.altmap, group);
 	if (ret) {
-		arch_remove_memory(start, size, NULL);
+		arch_remove_memory(start, size, params.altmap);
 		goto error_free;
 	}
 
-- 
2.41.0
From nobody Tue Dec 30 14:49:32 2025
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual,
    Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik,
    linux-s390, LKML
Subject: [PATCH 3/8] mm: use vmem_altmap code without CONFIG_ZONE_DEVICE
Date: Tue, 14 Nov 2023 19:02:33 +0100
Message-Id: <20231114180238.1522782-4-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>

vmem_altmap_free() and vmem_altmap_offset() could be utilized without
CONFIG_ZONE_DEVICE enabled.
Hence, move it to sparse-vmemmap.c.

Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
Reviewed-by: David Hildenbrand
---
 include/linux/memremap.h | 12 ------------
 include/linux/mm.h       |  2 ++
 mm/memremap.c            | 14 +-------------
 mm/sparse-vmemmap.c      | 13 +++++++++++++
 4 files changed, 16 insertions(+), 25 deletions(-)

diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 1314d9c5f05b..744c830f4b13 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -196,8 +196,6 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
 		struct dev_pagemap *pgmap);
 bool pgmap_pfn_valid(struct dev_pagemap *pgmap, unsigned long pfn);
 
-unsigned long vmem_altmap_offset(struct vmem_altmap *altmap);
-void vmem_altmap_free(struct vmem_altmap *altmap, unsigned long nr_pfns);
 unsigned long memremap_compat_align(void);
 #else
 static inline void *devm_memremap_pages(struct device *dev,
@@ -228,16 +226,6 @@ static inline bool pgmap_pfn_valid(struct dev_pagemap *pgmap, unsigned long pfn)
 	return false;
 }
 
-static inline unsigned long vmem_altmap_offset(struct vmem_altmap *altmap)
-{
-	return 0;
-}
-
-static inline void vmem_altmap_free(struct vmem_altmap *altmap,
-		unsigned long nr_pfns)
-{
-}
-
 /* when memremap_pages() is disabled all archs can remap a single page */
 static inline unsigned long memremap_compat_align(void)
 {
diff --git a/include/linux/mm.h b/include/linux/mm.h
index bf5d0b1b16f4..5edb0dfd2d01 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3765,6 +3765,8 @@ pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
 pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
 pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
 			    struct vmem_altmap *altmap, struct page *reuse);
+unsigned long vmem_altmap_offset(struct vmem_altmap *altmap);
+void vmem_altmap_free(struct vmem_altmap *altmap, unsigned long nr_pfns);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
 void *vmemmap_alloc_block_buf(unsigned long size, int node,
diff --git a/mm/memremap.c b/mm/memremap.c
index bee85560a243..9531faa92a7c 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -422,19 +423,6 @@ void devm_memunmap_pages(struct device *dev, struct dev_pagemap *pgmap)
 }
 EXPORT_SYMBOL_GPL(devm_memunmap_pages);
 
-unsigned long vmem_altmap_offset(struct vmem_altmap *altmap)
-{
-	/* number of pfns from base where pfn_to_page() is valid */
-	if (altmap)
-		return altmap->reserve + altmap->free;
-	return 0;
-}
-
-void vmem_altmap_free(struct vmem_altmap *altmap, unsigned long nr_pfns)
-{
-	altmap->alloc -= nr_pfns;
-}
-
 /**
  * get_dev_pagemap() - take a new live reference on the dev_pagemap for @pfn
  * @pfn: page frame number to lookup page_map
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index a2cbe44c48e1..bd1b9a137f93 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -46,6 +46,19 @@ static void * __ref __earlyonly_bootmem_alloc(int node,
 					MEMBLOCK_ALLOC_ACCESSIBLE, node);
 }
 
+unsigned long vmem_altmap_offset(struct vmem_altmap *altmap)
+{
+	/* number of pfns from base where pfn_to_page() is valid */
+	if (altmap)
+		return altmap->reserve + altmap->free;
+	return 0;
+}
+
+void vmem_altmap_free(struct vmem_altmap *altmap, unsigned long nr_pfns)
+{
+	altmap->alloc -= nr_pfns;
+}
+
 void * __meminit vmemmap_alloc_block(unsigned long size, int node)
 {
 	/* If the main allocator is up use that, fallback to bootmem. */
-- 
2.41.0
From nobody Tue Dec 30 14:49:32 2025
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual,
    Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik,
    linux-s390, LKML
Subject: [PATCH 4/8] mm/memory_hotplug: introduce MEM_PHYS_ONLINE/OFFLINE memory notifiers
Date: Tue, 14 Nov 2023 19:02:34 +0100
Message-Id: <20231114180238.1522782-5-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>

Add new memory notifiers to mimic the dynamic ACPI-event-triggered logic
for memory hotplug on platforms that do not generate such events. This
will be used to implement the "memmap on memory" feature for s390 in a
later patch.

Platforms such as x86 can support physical memory hotplug via ACPI. When
physical memory is hotplugged, the ACPI event leads to the memory
addition with the following callchain:

acpi_memory_device_add()
  -> acpi_memory_enable_device()
    -> __add_memory()

After this, the hotplugged memory is physically accessible, and altmap
support is prepared, before the "memmap on memory" initialization in
memory_block_online() is called.

On s390, memory hotplug works in a different way. The available hotplug
memory has to be defined upfront in the hypervisor, but it is made
physically accessible only when the user sets it online via sysfs,
currently in the MEM_GOING_ONLINE notifier. This requires calling
add_memory() during early memory detection, in order to get the sysfs
representation, but we cannot use "memmap on memory" altmap support at
this stage, without having it physically accessible.
Since no ACPI or similar events are generated, there is no way to set up
altmap support, or even to make the memory physically accessible at all,
before the "memmap on memory" initialization in memory_block_online().

The new MEM_PHYS_ONLINE notifier works around this by providing a hook
to make the memory physically accessible, and also to call __add_pages()
with altmap support, early in memory_block_online(). Similarly, the
MEM_PHYS_OFFLINE notifier allows making the memory inaccessible and
calling __remove_pages(), at the end of memory_block_offline().

Calling __add/remove_pages() requires mem_hotplug_lock, so move
mem_hotplug_begin/done() to include the new notifiers.

All architectures ignore unknown memory notifiers, so this patch should
not introduce any functional changes.

Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
---
 drivers/base/memory.c  | 18 +++++++++++++++++-
 include/linux/memory.h |  2 ++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 1e9f6a1749b9..604940f62246 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -185,6 +185,7 @@ static int memory_block_online(struct memory_block *mem)
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long nr_vmemmap_pages = 0;
+	struct memory_notify arg;
 	struct zone *zone;
 	int ret;
 
@@ -194,6 +195,14 @@ static int memory_block_online(struct memory_block *mem)
 	zone = zone_for_pfn_range(mem->online_type, mem->nid, mem->group,
 				  start_pfn, nr_pages);
 
+	arg.start_pfn = start_pfn;
+	arg.nr_pages = nr_pages;
+	mem_hotplug_begin();
+	ret = memory_notify(MEM_PHYS_ONLINE, &arg);
+	ret = notifier_to_errno(ret);
+	if (ret)
+		goto out_notifier;
+
 	/*
 	 * Although vmemmap pages have a different lifecycle than the pages
 	 * they describe (they remain until the memory is unplugged), doing
@@ -204,7 +213,6 @@ static int memory_block_online(struct memory_block *mem)
 	if (mem->altmap)
 		nr_vmemmap_pages = mem->altmap->free;
 
-	mem_hotplug_begin();
 	if (nr_vmemmap_pages) {
 		ret = mhp_init_memmap_on_memory(start_pfn, nr_vmemmap_pages, zone);
 		if (ret)
@@ -228,7 +236,11 @@ static int memory_block_online(struct memory_block *mem)
 					  nr_vmemmap_pages);
 
 	mem->zone = zone;
+	mem_hotplug_done();
+	return ret;
 out:
+	memory_notify(MEM_PHYS_OFFLINE, &arg);
+out_notifier:
 	mem_hotplug_done();
 	return ret;
 }
@@ -238,6 +250,7 @@ static int memory_block_offline(struct memory_block *mem)
 	unsigned long start_pfn = section_nr_to_pfn(mem->start_section_nr);
 	unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
 	unsigned long nr_vmemmap_pages = 0;
+	struct memory_notify arg;
 	int ret;
 
 	if (!mem->zone)
@@ -269,6 +282,9 @@ static int memory_block_offline(struct memory_block *mem)
 		mhp_deinit_memmap_on_memory(start_pfn, nr_vmemmap_pages);
 
 	mem->zone = NULL;
+	arg.start_pfn = start_pfn;
+	arg.nr_pages = nr_pages;
+	memory_notify(MEM_PHYS_OFFLINE, &arg);
 out:
 	mem_hotplug_done();
 	return ret;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index f53cfdaaaa41..5d8b962b8fa1 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -96,6 +96,8 @@ int set_memory_block_size_order(unsigned int order);
 #define MEM_GOING_ONLINE	(1<<3)
 #define MEM_CANCEL_ONLINE	(1<<4)
 #define MEM_CANCEL_OFFLINE	(1<<5)
+#define MEM_PHYS_ONLINE		(1<<6)
+#define MEM_PHYS_OFFLINE	(1<<7)
 
 struct memory_notify {
 	unsigned long start_pfn;
-- 
2.41.0
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual, Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik, linux-s390, LKML
Subject: [PATCH 5/8] s390/mm: allocate vmemmap pages from self-contained memory range
Date: Tue, 14 Nov 2023 19:02:35 +0100
Message-Id: <20231114180238.1522782-6-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
Content-Type: text/plain; charset="utf-8"

Allocate the memory map (the struct page array) from the hotplugged
memory range, rather than from system memory. This addresses the problem
that standby memory, when configured to be much larger than online
memory, can lead to IPL failure, because the memory map is allocated
from online memory: a memory block size of 1 GB requires 16 MB of memory
map, so with much more standby than online memory, the online memory can
be exhausted by memory map allocations alone.

To address this, introduce "memmap on memory" on s390, using the
vmem_altmap structure. Architectures that want to support it pass the
altmap to vmemmap_populate() and its associated call chain. This
mechanism is described in commit 4b94ffdc4163 ("x86, mm: introduce
vmem_altmap to augment vmemmap_populate()").

Provide "memmap on memory" support for s390 by passing the altmap in
vmemmap_populate() and its call chain. The allocation path is:

* When altmap is NULL in vmemmap_populate(), the memory map is
  allocated via the existing vmemmap_alloc_block_buf().
* When altmap is not NULL in vmemmap_populate(), the memory map is
  still allocated via vmemmap_alloc_block_buf(), which internally calls
  altmap_alloc_block_buf().

The deallocation path is:

* When altmap is NULL in vmemmap_free(), the memory map is freed via
  free_pages().
* When altmap is not NULL in vmemmap_free(), the memory map is freed
  via vmem_altmap_free().
While the memory map allocation is primarily served from the
self-contained memory range, a small amount of system memory may still
be required for the vmemmap page tables. To limit this impact, the
feature is restricted to machines with EDAT1 support.

Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
---
 arch/s390/mm/init.c |  3 ---
 arch/s390/mm/vmem.c | 62 +++++++++++++++++++++++++--------------------
 2 files changed, 35 insertions(+), 30 deletions(-)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 43e612bc2bcd..8d9a60ccb777 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -281,9 +281,6 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	unsigned long size_pages = PFN_DOWN(size);
 	int rc;
 
-	if (WARN_ON_ONCE(params->altmap))
-		return -EINVAL;
-
 	if (WARN_ON_ONCE(params->pgprot.pgprot != PAGE_KERNEL.pgprot))
 		return -EINVAL;
 
diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index 186a020857cf..eb100479f7be 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -33,8 +33,12 @@ static void __ref *vmem_alloc_pages(unsigned int order)
 	return memblock_alloc(size, size);
 }
 
-static void vmem_free_pages(unsigned long addr, int order)
+static void vmem_free_pages(unsigned long addr, int order, struct vmem_altmap *altmap)
 {
+	if (altmap) {
+		vmem_altmap_free(altmap, 1 << order);
+		return;
+	}
 	/* We don't expect boot memory to be removed ever. */
 	if (!slab_is_available() ||
 	    WARN_ON_ONCE(PageReserved(virt_to_page((void *)addr))))
@@ -156,7 +160,8 @@ static bool vmemmap_unuse_sub_pmd(unsigned long start, unsigned long end)
 
 /* __ref: we'll only call vmemmap_alloc_block() via vmemmap_populate() */
 static int __ref modify_pte_table(pmd_t *pmd, unsigned long addr,
-				  unsigned long end, bool add, bool direct)
+				  unsigned long end, bool add, bool direct,
+				  struct vmem_altmap *altmap)
 {
 	unsigned long prot, pages = 0;
 	int ret = -ENOMEM;
@@ -172,11 +177,11 @@ static int __ref modify_pte_table(pmd_t *pmd, unsigned long addr,
 			if (pte_none(*pte))
 				continue;
 			if (!direct)
-				vmem_free_pages((unsigned long) pfn_to_virt(pte_pfn(*pte)), 0);
+				vmem_free_pages((unsigned long)pfn_to_virt(pte_pfn(*pte)), get_order(PAGE_SIZE), altmap);
 			pte_clear(&init_mm, addr, pte);
 		} else if (pte_none(*pte)) {
 			if (!direct) {
-				void *new_page = vmemmap_alloc_block(PAGE_SIZE, NUMA_NO_NODE);
+				void *new_page = vmemmap_alloc_block_buf(PAGE_SIZE, NUMA_NO_NODE, altmap);
 
 				if (!new_page)
 					goto out;
@@ -213,7 +218,8 @@ static void try_free_pte_table(pmd_t *pmd, unsigned long start)
 
 /* __ref: we'll only call vmemmap_alloc_block() via vmemmap_populate() */
 static int __ref modify_pmd_table(pud_t *pud, unsigned long addr,
-				  unsigned long end, bool add, bool direct)
+				  unsigned long end, bool add, bool direct,
+				  struct vmem_altmap *altmap)
 {
 	unsigned long next, prot, pages = 0;
 	int ret = -ENOMEM;
@@ -234,11 +240,11 @@ static int __ref modify_pmd_table(pud_t *pud, unsigned long addr,
 			if (IS_ALIGNED(addr, PMD_SIZE) &&
 			    IS_ALIGNED(next, PMD_SIZE)) {
 				if (!direct)
-					vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE));
+					vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE), altmap);
 				pmd_clear(pmd);
 				pages++;
 			} else if (!direct && vmemmap_unuse_sub_pmd(addr, next)) {
-				vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE));
+				vmem_free_pages(pmd_deref(*pmd), get_order(PMD_SIZE), altmap);
 				pmd_clear(pmd);
 			}
 			continue;
@@ -261,7 +267,7 @@ static int __ref modify_pmd_table(pud_t *pud, unsigned long addr,
 				 * page tables since vmemmap_populate gets
 				 * called for each section separately.
 				 */
-				new_page = vmemmap_alloc_block(PMD_SIZE, NUMA_NO_NODE);
+				new_page = vmemmap_alloc_block_buf(PMD_SIZE, NUMA_NO_NODE, altmap);
 				if (new_page) {
 					set_pmd(pmd, __pmd(__pa(new_page) | prot));
 					if (!IS_ALIGNED(addr, PMD_SIZE) ||
@@ -280,7 +286,7 @@ static int __ref modify_pmd_table(pud_t *pud, unsigned long addr,
 			vmemmap_use_sub_pmd(addr, next);
 			continue;
 		}
-		ret = modify_pte_table(pmd, addr, next, add, direct);
+		ret = modify_pte_table(pmd, addr, next, add, direct, altmap);
 		if (ret)
 			goto out;
 		if (!add)
@@ -302,12 +308,12 @@ static void try_free_pmd_table(pud_t *pud, unsigned long start)
 	for (i = 0; i < PTRS_PER_PMD; i++, pmd++)
 		if (!pmd_none(*pmd))
 			return;
-	vmem_free_pages(pud_deref(*pud), CRST_ALLOC_ORDER);
+	vmem_free_pages(pud_deref(*pud), CRST_ALLOC_ORDER, NULL);
 	pud_clear(pud);
 }
 
 static int modify_pud_table(p4d_t *p4d, unsigned long addr, unsigned long end,
-			    bool add, bool direct)
+			    bool add, bool direct, struct vmem_altmap *altmap)
 {
 	unsigned long next, prot, pages = 0;
 	int ret = -ENOMEM;
@@ -347,7 +353,7 @@ static int modify_pud_table(p4d_t *p4d, unsigned long addr, unsigned long end,
 		} else if (pud_large(*pud)) {
 			continue;
 		}
-		ret = modify_pmd_table(pud, addr, next, add, direct);
+		ret = modify_pmd_table(pud, addr, next, add, direct, altmap);
 		if (ret)
 			goto out;
 		if (!add)
@@ -370,12 +376,12 @@ static void try_free_pud_table(p4d_t *p4d, unsigned long start)
 		if (!pud_none(*pud))
 			return;
 	}
-	vmem_free_pages(p4d_deref(*p4d), CRST_ALLOC_ORDER);
+	vmem_free_pages(p4d_deref(*p4d), CRST_ALLOC_ORDER, NULL);
 	p4d_clear(p4d);
 }
 
 static int modify_p4d_table(pgd_t *pgd, unsigned long addr, unsigned long end,
-			    bool add, bool direct)
+			    bool add, bool direct, struct vmem_altmap *altmap)
 {
 	unsigned long next;
 	int ret = -ENOMEM;
@@ -394,7 +400,7 @@ static int modify_p4d_table(pgd_t *pgd, unsigned long addr, unsigned long end,
 			goto out;
 		p4d_populate(&init_mm, p4d, pud);
 	}
-	ret = modify_pud_table(p4d, addr, next, add, direct);
+	ret = modify_pud_table(p4d, addr, next, add, direct, altmap);
 	if (ret)
 		goto out;
 	if (!add)
@@ -415,12 +421,12 @@ static void try_free_p4d_table(pgd_t *pgd, unsigned long start)
 		if (!p4d_none(*p4d))
 			return;
 	}
-	vmem_free_pages(pgd_deref(*pgd), CRST_ALLOC_ORDER);
+	vmem_free_pages(pgd_deref(*pgd), CRST_ALLOC_ORDER, NULL);
 	pgd_clear(pgd);
 }
 
 static int modify_pagetable(unsigned long start, unsigned long end, bool add,
-			    bool direct)
+			    bool direct, struct vmem_altmap *altmap)
 {
 	unsigned long addr, next;
 	int ret = -ENOMEM;
@@ -445,7 +451,7 @@ static int modify_pagetable(unsigned long start, unsigned long end, bool add,
 			goto out;
 		pgd_populate(&init_mm, pgd, p4d);
 	}
-	ret = modify_p4d_table(pgd, addr, next, add, direct);
+	ret = modify_p4d_table(pgd, addr, next, add, direct, altmap);
 	if (ret)
 		goto out;
 	if (!add)
@@ -458,14 +464,16 @@ static int modify_pagetable(unsigned long start, unsigned long end, bool add,
 	return ret;
 }
 
-static int add_pagetable(unsigned long start, unsigned long end, bool direct)
+static int add_pagetable(unsigned long start, unsigned long end, bool direct,
+			 struct vmem_altmap *altmap)
 {
-	return modify_pagetable(start, end, true, direct);
+	return modify_pagetable(start, end, true, direct, altmap);
 }
 
-static int remove_pagetable(unsigned long start, unsigned long end, bool direct)
+static int remove_pagetable(unsigned long start, unsigned long end, bool direct,
+			    struct vmem_altmap *altmap)
 {
-	return modify_pagetable(start, end, false, direct);
+	return modify_pagetable(start, end, false, direct, altmap);
 }
 
 /*
@@ -474,7 +482,7 @@ static int remove_pagetable(unsigned long start, unsigned long end, bool direct)
 static int vmem_add_range(unsigned long start, unsigned long size)
 {
 	start = (unsigned long)__va(start);
-	return add_pagetable(start, start + size, true);
+	return add_pagetable(start, start + size, true, NULL);
 }
 
 /*
@@ -483,7 +491,7 @@ static int vmem_add_range(unsigned long start, unsigned long size)
 static void vmem_remove_range(unsigned long start, unsigned long size)
 {
 	start = (unsigned long)__va(start);
-	remove_pagetable(start, start + size, true);
+	remove_pagetable(start, start + size, true, NULL);
 }
 
 /*
@@ -496,9 +504,9 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
 
 	mutex_lock(&vmem_mutex);
 	/* We don't care about the node, just use NUMA_NO_NODE on allocations */
-	ret = add_pagetable(start, end, false);
+	ret = add_pagetable(start, end, false, altmap);
 	if (ret)
-		remove_pagetable(start, end, false);
+		remove_pagetable(start, end, false, altmap);
 	mutex_unlock(&vmem_mutex);
 	return ret;
 }
@@ -509,7 +517,7 @@ void vmemmap_free(unsigned long start, unsigned long end,
 		  struct vmem_altmap *altmap)
 {
 	mutex_lock(&vmem_mutex);
-	remove_pagetable(start, end, false);
+	remove_pagetable(start, end, false, altmap);
 	mutex_unlock(&vmem_mutex);
 }
-- 
2.41.0
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual, Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik, linux-s390, LKML
Subject: [PATCH 6/8] s390/mm: implement MEM_PHYS_ONLINE MEM_PHYS_OFFLINE memory notifiers
Date: Tue, 14 Nov 2023 19:02:36 +0100
Message-Id: <20231114180238.1522782-7-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
Content-Type: text/plain; charset="utf-8"

Implement MEM_PHYS_ONLINE and MEM_PHYS_OFFLINE
memory notifiers on s390.

Implementation of the MEM_PHYS_ONLINE memory notifier:
* Transition the memory block to an accessible/online state using the
  sclp assign command.
* Execute __add_pages() for the memory block, enabling a self-contained
  memory map range. For boot-time memory, the vmemmap mapping is carried
  out through sparse_init().

Implementation of the MEM_PHYS_OFFLINE memory notifier:
* Execute __remove_pages() exclusively for the memory block (applicable
  where a self-contained memory map was possible before).
* Shift the memory block to an inaccessible/offline state using the
  sclp unassign command.

Additional implementation considerations:
* When MHP_MEMMAP_ON_MEMORY is disabled, the system retains the old
  behavior: the memory map is allocated from default memory, and struct
  vmemmap pages are populated during the standby memory detection phase.
* With MHP_MEMMAP_ON_MEMORY enabled (allowing a self-contained memory
  map), the memory map is allocated from the self-contained memory map
  range, and struct vmemmap pages are populated during the memory
  hotplug phase.
* If MACHINE_HAS_EDAT1 is unavailable, MHP_MEMMAP_ON_MEMORY is
  automatically disabled. This ensures that vmemmap page tables do not
  consume additional memory from the default memory allocator.
* The MEM_GOING_ONLINE notifier has been modified to perform no
  operation, as MEM_PHYS_ONLINE already executes the sclp assign
  command.
* The MEM_CANCEL_ONLINE notifier now performs no operation, as
  MEM_PHYS_OFFLINE already executes the sclp unassign command.
* The call to __add_pages() in arch_add_memory() with altmap support is
  skipped. This operation is deferred and performed later in the
  MEM_PHYS_ONLINE notifier.
Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
---
 arch/s390/mm/init.c          | 16 +++++++++++++++-
 drivers/s390/char/sclp_cmd.c | 33 ++++++++++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index 8d9a60ccb777..db505ed590b2 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -288,6 +288,12 @@ int arch_add_memory(int nid, u64 start, u64 size,
 	rc = vmem_add_mapping(start, size);
 	if (rc)
 		return rc;
+	/*
+	 * If MHP_MEMMAP_ON_MEMORY is enabled, perform __add_pages() during
+	 * the memory onlining phase.
+	 */
+	if (params->altmap)
+		return 0;
 
 	rc = __add_pages(nid, start_pfn, size_pages, params);
 	if (rc)
@@ -300,7 +306,15 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap)
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 
-	__remove_pages(start_pfn, nr_pages, altmap);
+	/*
+	 * On s390, currently arch_remove_memory() will be called during error
+	 * handling of add_memory_resource(). When MHP_MEMMAP_ON_MEMORY is
+	 * enabled, __add_pages() is performed later during the memory onlining
+	 * phase. Hence, __remove_pages() should not be called here in that
+	 * case, but only later during the memory offline phase.
+	 */
+	if (!altmap)
+		__remove_pages(start_pfn, nr_pages, NULL);
 	vmem_remove_mapping(start, size);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c
index 11c428f4c7cf..12f3d4af7e4e 100644
--- a/drivers/s390/char/sclp_cmd.c
+++ b/drivers/s390/char/sclp_cmd.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -26,6 +27,7 @@
 #include
 #include
 #include
+#include
 
 #include "sclp.h"
 
@@ -319,6 +321,8 @@ static bool contains_standby_increment(unsigned long start, unsigned long end)
 static int sclp_mem_notifier(struct notifier_block *nb,
 			     unsigned long action, void *data)
 {
+	struct mhp_params params = { .pgprot = pgprot_mhp(PAGE_KERNEL) };
+	struct memory_block *memory_block;
 	unsigned long start, size;
 	struct memory_notify *arg;
 	unsigned char id;
@@ -330,6 +334,11 @@ static int sclp_mem_notifier(struct notifier_block *nb,
 	mutex_lock(&sclp_mem_mutex);
 	for_each_clear_bit(id, sclp_storage_ids, sclp_max_storage_id + 1)
 		sclp_attach_storage(id);
+	memory_block = find_memory_block(pfn_to_section_nr(arg->start_pfn));
+	if (!memory_block) {
+		rc = -EINVAL;
+		goto out;
+	}
 	switch (action) {
 	case MEM_GOING_OFFLINE:
 		/*
@@ -344,17 +353,34 @@ static int sclp_mem_notifier(struct notifier_block *nb,
 	case MEM_CANCEL_OFFLINE:
 		break;
 	case MEM_GOING_ONLINE:
+		break;
+	case MEM_PHYS_ONLINE:
 		rc = sclp_mem_change_state(start, size, 1);
+		if (rc || !memory_block->altmap)
+			goto out;
+		params.altmap = memory_block->altmap;
+		rc = __add_pages(0, arg->start_pfn, arg->nr_pages, &params);
+		if (rc)
+			sclp_mem_change_state(start, size, 0);
+		/*
+		 * Set CMMA state to nodat here, since the struct page memory
+		 * at the beginning of the memory block will not go through the
+		 * buddy allocator later.
+		 */
+		__arch_set_page_nodat((void *)start, memory_block->altmap->free);
 		break;
 	case MEM_CANCEL_ONLINE:
-		sclp_mem_change_state(start, size, 0);
-		break;
 	case MEM_OFFLINE:
+		break;
+	case MEM_PHYS_OFFLINE:
+		if (memory_block->altmap)
+			__remove_pages(arg->start_pfn, arg->nr_pages, memory_block->altmap);
+		sclp_mem_change_state(start, size, 0);
 		break;
 	default:
 		break;
 	}
+out:
 	mutex_unlock(&sclp_mem_mutex);
 	return rc ? NOTIFY_BAD : NOTIFY_OK;
 }
@@ -400,7 +426,8 @@ static void __init add_memory_merged(u16 rn)
 	if (!size)
 		goto skip_add;
 	for (addr = start; addr < start + size; addr += block_size)
-		add_memory(0, addr, block_size, MHP_NONE);
+		add_memory(0, addr, block_size,
+			   MACHINE_HAS_EDAT1 ? MHP_MEMMAP_ON_MEMORY : MHP_NONE);
 skip_add:
 	first_rn = rn;
 	num = 1;
-- 
2.41.0
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual, Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik, linux-s390, LKML
Subject: [PATCH 7/8] s390/sclp: remove unhandled memory notifier type
Date: Tue, 14 Nov 2023 19:02:37 +0100
Message-Id: <20231114180238.1522782-8-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
Content-Type: text/plain; charset="utf-8"

Remove memory notifier types that are not handled by s390. Unhandled
memory notifier types are covered by the default case.
Suggested-by: Alexander Gordeev
Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
---
 drivers/s390/char/sclp_cmd.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/s390/char/sclp_cmd.c b/drivers/s390/char/sclp_cmd.c
index 12f3d4af7e4e..428f8a583e8f 100644
--- a/drivers/s390/char/sclp_cmd.c
+++ b/drivers/s390/char/sclp_cmd.c
@@ -349,11 +349,6 @@ static int sclp_mem_notifier(struct notifier_block *nb,
 		if (contains_standby_increment(start, start + size))
 			rc = -EPERM;
 		break;
-	case MEM_ONLINE:
-	case MEM_CANCEL_OFFLINE:
-		break;
-	case MEM_GOING_ONLINE:
-		break;
 	case MEM_PHYS_ONLINE:
 		rc = sclp_mem_change_state(start, size, 1);
 		if (rc || !memory_block->altmap)
@@ -369,9 +364,6 @@ static int sclp_mem_notifier(struct notifier_block *nb,
 		 */
 		__arch_set_page_nodat((void *)start, memory_block->altmap->free);
 		break;
-	case MEM_CANCEL_ONLINE:
-	case MEM_OFFLINE:
-		break;
 	case MEM_PHYS_OFFLINE:
 		if (memory_block->altmap)
 			__remove_pages(arg->start_pfn, arg->nr_pages, memory_block->altmap);
-- 
2.41.0
From: Sumanth Korikkar
To: linux-mm, Andrew Morton, David Hildenbrand
Cc: Oscar Salvador, Michal Hocko, "Aneesh Kumar K.V", Anshuman Khandual, Gerald Schaefer, Alexander Gordeev, Heiko Carstens, Vasily Gorbik, linux-s390, LKML
Subject: [PATCH 8/8] s390: enable MHP_MEMMAP_ON_MEMORY
Date: Tue, 14 Nov 2023 19:02:38 +0100
Message-Id: <20231114180238.1522782-9-sumanthk@linux.ibm.com>
In-Reply-To: <20231114180238.1522782-1-sumanthk@linux.ibm.com>
References: <20231114180238.1522782-1-sumanthk@linux.ibm.com>

Enable MHP_MEMMAP_ON_MEMORY to support "memmap on memory".
The memory_hotplug.memmap_on_memory=true kernel parameter must be set on the kernel command line to enable the feature.

Reviewed-by: Gerald Schaefer
Signed-off-by: Sumanth Korikkar
---
 arch/s390/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 3bec98d20283..4b9b0f947ddb 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -113,6 +113,7 @@ config S390
 	select ARCH_INLINE_WRITE_UNLOCK_BH
 	select ARCH_INLINE_WRITE_UNLOCK_IRQ
 	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
+	select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
-- 
2.41.0
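[Editorial usage note, not part of the patch series: a minimal boot-configuration sketch for exercising the feature on a kernel built with this series. The sysfs path follows the standard module-parameter layout; treat the exact path as an assumption to verify on the target kernel.]

    # Append to the kernel command line (the parameter is consumed at
    # boot and is not meant to be toggled at runtime):
    memory_hotplug.memmap_on_memory=true

    # After boot, the effective value can be inspected via sysfs:
    #   cat /sys/module/memory_hotplug/parameters/memmap_on_memory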