From: Gregory Price
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, david@redhat.com,
    osalvador@suse.de, gregkh@linuxfoundation.org, rafael@kernel.org,
    dakr@kernel.org, akpm@linux-foundation.org, lorenzo.stoakes@oracle.com,
    Liam.Howlett@oracle.com, vbabka@suse.cz, rppt@kernel.org,
    surenb@google.com, mhocko@suse.com, hare@suse.de
Subject: [RFC PATCH] memory,memory_hotplug: allow restricting memory blocks to zone movable
Date: Mon, 5 Jan 2026 15:36:11 -0500
Message-ID: <20260105203611.4079743-1-gourry@gourry.net>

It was reported (LPC 2025) that userland services which monitor memory
blocks can cause hot-unplug to fail permanently.

This can occur when drivers attempt to hot-remove memory in two phases
(offline, remove), while a userland service detects the memory offline
and re-onlines the memory into a zone which may prevent removal.

This patch allows a driver to specify that a given memory block is
intended as ZONE_MOVABLE memory only (i.e. the system should try to
protect its hot-unpluggability).

This is done via an MHP flag and a new "movable_only" bool in
`struct memory_block`.

Attempts to online a memory block with movable_only=true with any
value other than MMOP_ONLINE_MOVABLE will fail with -EINVAL.

It is hard to catch all possible ways to implement offline/remove
process, so a race condition here can clearly still occur if the
userland service onlines the memory back into ZONE_MOVABLE, but it
at least will not prevent the removal of a block at a later time.

Suggested-by: Hannes Reinecke
Signed-off-by: Gregory Price
---
 drivers/base/memory.c          | 15 +++++++++++----
 include/linux/memory.h         |  4 +++-
 include/linux/memory_hotplug.h | 13 +++++++++++++
 mm/memory_hotplug.c            | 12 +++++++++---
 4 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 6d84a02cfa5d..59512e4b8d62 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -374,6 +374,8 @@ static int memory_block_change_state(struct memory_block *mem,
 
 	if (to_state == MEM_OFFLINE)
 		mem->state = MEM_GOING_OFFLINE;
+	else if (mem->movable_only && to_state != MMOP_ONLINE_MOVABLE)
+		return -EINVAL;
 
 	ret = memory_block_action(mem, to_state);
 	mem->state = ret ? from_state_req : to_state;
@@ -811,7 +813,8 @@ void memory_block_add_nid_early(struct memory_block *mem, int nid)
 
 static int add_memory_block(unsigned long block_id, int nid, unsigned long state,
 			    struct vmem_altmap *altmap,
-			    struct memory_group *group)
+			    struct memory_group *group,
+			    bool movable_only)
 {
 	struct memory_block *mem;
 	int ret = 0;
@@ -829,6 +832,7 @@ static int add_memory_block(unsigned long block_id, int nid, unsigned long state
 	mem->state = state;
 	mem->nid = nid;
 	mem->altmap = altmap;
+	mem->movable_only = movable_only;
 	INIT_LIST_HEAD(&mem->group_next);
 
 #ifndef CONFIG_NUMA
@@ -880,7 +884,8 @@ static void remove_memory_block(struct memory_block *memory)
  */
 int create_memory_block_devices(unsigned long start, unsigned long size,
 				int nid, struct vmem_altmap *altmap,
-				struct memory_group *group)
+				struct memory_group *group,
+				bool movable_only)
 {
 	const unsigned long start_block_id = pfn_to_block_id(PFN_DOWN(start));
 	unsigned long end_block_id = pfn_to_block_id(PFN_DOWN(start + size));
@@ -893,7 +898,8 @@ int create_memory_block_devices(unsigned long start, unsigned long size,
 		return -EINVAL;
 
 	for (block_id = start_block_id; block_id != end_block_id; block_id++) {
-		ret = add_memory_block(block_id, nid, MEM_OFFLINE, altmap, group);
+		ret = add_memory_block(block_id, nid, MEM_OFFLINE, altmap, group,
+				       movable_only);
 		if (ret)
 			break;
 	}
@@ -998,7 +1004,8 @@ void __init memory_dev_init(void)
 			continue;
 
 		block_id = memory_block_id(nr);
-		ret = add_memory_block(block_id, NUMA_NO_NODE, MEM_ONLINE, NULL, NULL);
+		ret = add_memory_block(block_id, NUMA_NO_NODE, MEM_ONLINE, NULL, NULL,
+				       false);
 		if (ret) {
 			panic("%s() failed to add memory block: %d\n",
 			      __func__, ret);
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 43d378038ce2..bab24f796d3d 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -80,6 +80,7 @@ struct memory_block {
 	struct vmem_altmap *altmap;
 	struct memory_group *group;	/* group (if any) for this block */
 	struct list_head group_next;	/* next block inside memory group */
+	bool movable_only;		/* If set, only ZONE_MOVABLE is valid */
 #if defined(CONFIG_MEMORY_FAILURE) && defined(CONFIG_MEMORY_HOTPLUG)
 	atomic_long_t nr_hwpoison;
 #endif
@@ -160,7 +161,8 @@ extern int register_memory_notifier(struct notifier_block *nb);
 extern void unregister_memory_notifier(struct notifier_block *nb);
 int create_memory_block_devices(unsigned long start, unsigned long size,
 				int nid, struct vmem_altmap *altmap,
-				struct memory_group *group);
+				struct memory_group *group,
+				bool movable_only);
 void remove_memory_block_devices(unsigned long start, unsigned long size);
 extern void memory_dev_init(void);
 extern int memory_notify(unsigned long val, void *v);
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 23f038a16231..ca51ef2ad0cf 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -75,6 +75,19 @@ typedef int __bitwise mhp_t;
  */
 #define MHP_OFFLINE_INACCESSIBLE	((__force mhp_t)BIT(3))
 
+/*
+ * Restrict hotplugged memory blocks to ZONE_MOVABLE only.
+ *
+ * During offlining of hotplugged memory which was originally onlined
+ * as ZONE_MOVABLE, userland services may detect blocks going offline
+ * and automatically re-online them into ZONE_NORMAL or lower. When
+ * this happens it may become permanently incapable of being removed.
+ *
+ * Allow driver-managed memory sources to restrict memory blocks to
+ * ZONE_MOVABLE only, so that the truly degenerate case can be mitigated.
+ */
+#define MHP_MOVABLE_ONLY	((__force mhp_t)BIT(4))
+
 /*
  * Extended parameters for memory hotplug:
  * altmap: alternative allocator for memmap array (optional)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 81ba5b019926..1a184bfd87f6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1346,7 +1346,9 @@ static int check_hotplug_memory_range(u64 start, u64 size)
 
 static int online_memory_block(struct memory_block *mem, void *arg)
 {
-	mem->online_type = mhp_get_default_online_type();
+	mem->online_type = mem->movable_only ?
+				MMOP_ONLINE_MOVABLE :
+				mhp_get_default_online_type();
 	return device_online(&mem->dev);
 }
 
@@ -1449,6 +1451,7 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
 	unsigned long memblock_size = memory_block_size_bytes();
 	u64 cur_start;
 	int ret;
+	bool movable_only = mhp_flags & MHP_MOVABLE_ONLY;
 
 	for (cur_start = start; cur_start < start + size;
 	     cur_start += memblock_size) {
@@ -1478,7 +1481,8 @@ static int create_altmaps_and_memory_blocks(int nid, struct memory_group *group,
 
 		/* create memory block devices after memory was added */
 		ret = create_memory_block_devices(cur_start, memblock_size, nid,
-						  params.altmap, group);
+						  params.altmap, group,
+						  movable_only);
 		if (ret) {
 			arch_remove_memory(cur_start, memblock_size, NULL);
 			kfree(params.altmap);
@@ -1506,6 +1510,7 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 	struct memory_group *group = NULL;
 	u64 start, size;
 	bool new_node = false;
+	bool movable_only = mhp_flags & MHP_MOVABLE_ONLY;
 	int ret;
 
 	start = res->start;
@@ -1564,7 +1569,8 @@ int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
 		goto error;
 
 	/* create memory block devices after memory was added */
-	ret = create_memory_block_devices(start, size, nid, NULL, group);
+	ret = create_memory_block_devices(start, size, nid, NULL, group,
+					  movable_only);
 	if (ret) {
 		arch_remove_memory(start, size, params.altmap);
 		goto error;
-- 
2.52.0
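
For readers who want to try the intended behaviour outside the kernel, the
following is a minimal userspace sketch of the rule described in the commit
message: a block created with the movable-only flag rejects every online
request except MMOP_ONLINE_MOVABLE. It is not kernel code. `struct
model_block`, `MODEL_MHP_MOVABLE_ONLY`, `model_add_block()`,
`model_online()` and `model_offline()` are hypothetical names invented for
this illustration, and the enum only mirrors the order of the kernel's
MMOP_* values. The sketch compares the requested online type against
MMOP_ONLINE_MOVABLE, which is how the commit message words the check; that
reading is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Online types, listed in the same order as the kernel's MMOP_* values. */
enum mmop {
	MMOP_OFFLINE = 0,
	MMOP_ONLINE,
	MMOP_ONLINE_KERNEL,
	MMOP_ONLINE_MOVABLE,
};

/* Hypothetical stand-in for the mhp_t flag; only bit 4 is modelled. */
#define MODEL_MHP_MOVABLE_ONLY (1u << 4)

/* Hypothetical, stripped-down model of struct memory_block. */
struct model_block {
	enum mmop online_type;	/* MMOP_OFFLINE while the block is offline */
	bool movable_only;	/* latched once, when the block is created */
};

/* Model of block creation: copy the hotplug flag into the block. */
struct model_block model_add_block(unsigned int mhp_flags)
{
	struct model_block blk = {
		.online_type = MMOP_OFFLINE,
		.movable_only = (mhp_flags & MODEL_MHP_MOVABLE_ONLY) != 0,
	};

	return blk;
}

/*
 * Model of an online request, e.g. from a userland service.
 * Returns 0 on success, or -EINVAL if the request is not an online type
 * or if a movable-only block is asked to online into any other zone.
 */
int model_online(struct model_block *blk, enum mmop requested)
{
	assert(blk != NULL);

	if (requested == MMOP_OFFLINE)
		return -EINVAL;
	if (blk->movable_only && requested != MMOP_ONLINE_MOVABLE)
		return -EINVAL;

	blk->online_type = requested;
	return 0;
}

/* Model of the driver's first phase of hot-remove: offline the block. */
void model_offline(struct model_block *blk)
{
	assert(blk != NULL);
	blk->online_type = MMOP_OFFLINE;
}
```

In this model, a service that reacts to the offline event by requesting
MMOP_ONLINE_KERNEL on a movable-only block gets -EINVAL and the block stays
offline, while a request for MMOP_ONLINE_MOVABLE still succeeds. That
matches the limitation noted in the commit message: the race remains, but
the block can no longer land in a zone that blocks later removal.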