From nobody Sat Apr 4 00:24:01 2026
From: Gregory Price
To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com, akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de
Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 1/8] mm/memory-tiers: consolidate memory type dedup into mt_get_memory_type()
Date: Sat, 21 Mar 2026 11:03:57 -0400
Message-ID: <20260321150404.3288786-2-gourry@gourry.net>
In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net>
References: <20260321150404.3288786-1-gourry@gourry.net>

Replace per-driver memory type list infrastructure with a single
mt_get_memory_type(adist) that deduplicates against the global
default_memory_types list under memory_tier_lock.

The per-driver lists (mutex + list_head + find/put wrappers) provided
dedup within a single driver, but not across drivers or with the core.
Since the number of distinct adist values is bounded and types on
default_memory_types are never freed anyway, the per-driver cleanup
on module unload was not useful.

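For readers less familiar with the pattern, the find-or-allocate dedup
described above can be sketched in userspace roughly as follows. This is
only an illustration under stated assumptions: struct mem_type,
type_lock, type_list and get_mem_type() are hypothetical stand-ins (not
kernel symbols), a pthread mutex stands in for memory_tier_lock, and
nothing is ever freed, mirroring the "never freed anyway" observation.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct memory_dev_type. */
struct mem_type {
	int adistance;
	struct mem_type *next;
};

/* One global list guarded by one lock (assumed names, not kernel symbols). */
static pthread_mutex_t type_lock = PTHREAD_MUTEX_INITIALIZER;
static struct mem_type *type_list;

/* Return the existing type for adist, or allocate and publish a new one. */
struct mem_type *get_mem_type(int adist)
{
	struct mem_type *t;

	pthread_mutex_lock(&type_lock);
	for (t = type_list; t; t = t->next)
		if (t->adistance == adist)
			goto out;

	t = malloc(sizeof(*t));
	if (t) {
		t->adistance = adist;
		t->next = type_list;
		type_list = t;
	}
out:
	pthread_mutex_unlock(&type_lock);
	return t;	/* NULL only if the allocation failed */
}
```

Two callers asking for the same adist get the same object back, no
matter which "driver" asked first, which is the cross-driver property
the per-driver lists could not provide.
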
Add MEMTIER_DEFAULT_LOWTIER_ADISTANCE to replace the default DAX
adistance, since it was really used as a stand-in for all kmem
hotplugged memory. This at least makes the default tier relationship
clearer to other drivers and they can see where to put their memory
in relation to the default lower tier.

Core changes:
- Add mt_get_memory_type() as the single exported entry point
- Drop most other interfaces
- clear_node_memory_type() is now the appropriate put function.
- export MEMTIER_DEFAULT_LOWTIER_ADISTANCE

dax/kmem changes:
- Remove MEMTIER_DEFAULT_DAX_ADISTANCE, use MEMTIER_DEFAULT_LOWTIER_ADISTANCE
- Remove per-driver kmem_memory_type_lock/kmem_memory_types/wrappers
- Store mtype per-device in dax_kmem_data
- Pass data->mtype to clear_node_memory_type() instead of NULL

Signed-off-by: Gregory Price
---
 drivers/dax/kmem.c           | 32 +++++---------------------------
 include/linux/memory-tiers.h | 34 ++++++++++------------------------
 mm/memory-tiers.c            | 29 +++++++++++++----------------
 3 files changed, 28 insertions(+), 67 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 2cc8749bc871..eb693a581961 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -16,13 +16,6 @@
 #include "dax-private.h"
 #include "bus.h"
 
-/*
- * Default abstract distance assigned to the NUMA node onlined
- * by DAX/kmem if the low level platform driver didn't initialize
- * one for this NUMA node.
- */
-#define MEMTIER_DEFAULT_DAX_ADISTANCE	(MEMTIER_ADISTANCE_DRAM * 5)
-
 /* Memory resource name used for add_memory_driver_managed(). */
 static const char *kmem_name;
 /* Set if any memory will remain added when the driver will be unloaded. */
@@ -47,24 +40,10 @@ static int dax_kmem_range(struct dev_dax *dev_dax, int i, struct range *r)
 struct dax_kmem_data {
 	const char *res_name;
 	int mgid;
+	struct memory_dev_type *mtype;
 	struct resource *res[];
 };
 
-static DEFINE_MUTEX(kmem_memory_type_lock);
-static LIST_HEAD(kmem_memory_types);
-
-static struct memory_dev_type *kmem_find_alloc_memory_type(int adist)
-{
-	guard(mutex)(&kmem_memory_type_lock);
-	return mt_find_alloc_memory_type(adist, &kmem_memory_types);
-}
-
-static void kmem_put_memory_types(void)
-{
-	guard(mutex)(&kmem_memory_type_lock);
-	mt_put_memory_types(&kmem_memory_types);
-}
-
 static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 {
 	struct device *dev = &dev_dax->dev;
@@ -74,7 +53,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 	int i, rc, mapped = 0;
 	mhp_t mhp_flags;
 	int numa_node;
-	int adist = MEMTIER_DEFAULT_DAX_ADISTANCE;
+	int adist = MEMTIER_DEFAULT_LOWTIER_ADISTANCE;
 
 	/*
 	 * Ensure good NUMA information for the persistent memory.
@@ -90,7 +69,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 	}
 
 	mt_calc_adistance(numa_node, &adist);
-	mtype = kmem_find_alloc_memory_type(adist);
+	mtype = mt_get_memory_type(adist);
 	if (IS_ERR(mtype))
 		return PTR_ERR(mtype);
 
@@ -189,6 +168,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 		}
 		mapped++;
 	}
+	data->mtype = mtype;
 
 	dev_set_drvdata(dev, data);
 
@@ -253,7 +233,7 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 		 * for that. This implies this reference will be around
 		 * till next reboot.
 		 */
-		clear_node_memory_type(node, NULL);
+		clear_node_memory_type(node, data->mtype);
 	}
 }
 #else
@@ -292,7 +272,6 @@ static int __init dax_kmem_init(void)
 	return rc;
 
 error_dax_driver:
-	kmem_put_memory_types();
 	kfree_const(kmem_name);
 	return rc;
 }
@@ -302,7 +281,6 @@ static void __exit dax_kmem_exit(void)
 	dax_driver_unregister(&device_dax_kmem_driver);
 	if (!any_hotremove_failed)
 		kfree_const(kmem_name);
-	kmem_put_memory_types();
 }
 
 MODULE_AUTHOR("Intel Corporation");
diff --git a/include/linux/memory-tiers.h b/include/linux/memory-tiers.h
index 96987d9d95a8..70fbd3ad577f 100644
--- a/include/linux/memory-tiers.h
+++ b/include/linux/memory-tiers.h
@@ -20,11 +20,17 @@
  */
 #define MEMTIER_ADISTANCE_DRAM	((4L * MEMTIER_CHUNK_SIZE) + (MEMTIER_CHUNK_SIZE >> 1))
 
+/*
+ * Default abstract distance assigned to non-DRAM memory if the platform
+ * driver didn't initialize one for this NUMA node.
+ */
+#define MEMTIER_DEFAULT_LOWTIER_ADISTANCE	(MEMTIER_ADISTANCE_DRAM * 5)
+
 struct memory_tier;
 struct memory_dev_type {
 	/* list of memory types that are part of same tier as this type */
 	struct list_head tier_sibling;
-	/* list of memory types that are managed by one driver */
+	/* memory types on global list */
 	struct list_head list;
 	/* abstract distance for this specific memory type */
 	int adistance;
@@ -39,8 +45,6 @@ struct access_coordinate;
 extern bool numa_demotion_enabled;
 extern struct memory_dev_type *default_dram_type;
 extern nodemask_t default_dram_nodes;
-struct memory_dev_type *alloc_memory_type(int adistance);
-void put_memory_type(struct memory_dev_type *memtype);
 void init_node_memory_type(int node, struct memory_dev_type *default_type);
 void clear_node_memory_type(int node, struct memory_dev_type *memtype);
 int register_mt_adistance_algorithm(struct notifier_block *nb);
@@ -49,9 +53,7 @@ int mt_calc_adistance(int node, int *adist);
 int mt_set_default_dram_perf(int nid, struct access_coordinate *perf,
 			     const char *source);
 int mt_perf_to_adistance(struct access_coordinate *perf, int *adist);
-struct memory_dev_type *mt_find_alloc_memory_type(int adist,
-						  struct list_head *memory_types);
-void mt_put_memory_types(struct list_head *memory_types);
+struct memory_dev_type *mt_get_memory_type(int adist);
 #ifdef CONFIG_MIGRATION
 int next_demotion_node(int node, const nodemask_t *allowed_mask);
 void node_get_allowed_targets(pg_data_t *pgdat, nodemask_t *targets);
@@ -78,18 +80,6 @@ static inline bool node_is_toptier(int node)
 #define numa_demotion_enabled	false
 #define default_dram_type	NULL
 #define default_dram_nodes	NODE_MASK_NONE
-/*
- * CONFIG_NUMA implementation returns non NULL error.
- */
-static inline struct memory_dev_type *alloc_memory_type(int adistance)
-{
-	return NULL;
-}
-
-static inline void put_memory_type(struct memory_dev_type *memtype)
-{
-
-}
 
 static inline void init_node_memory_type(int node, struct memory_dev_type *default_type)
 {
@@ -142,14 +132,10 @@ static inline int mt_perf_to_adistance(struct access_coordinate *perf, int *adis
 	return -EIO;
 }
 
-static inline struct memory_dev_type *mt_find_alloc_memory_type(int adist,
-					struct list_head *memory_types)
+static inline struct memory_dev_type *mt_get_memory_type(int adist)
 {
 	return NULL;
 }
-
-static inline void mt_put_memory_types(struct list_head *memory_types)
-{
-}
 #endif	/* CONFIG_NUMA */
+
 #endif	/* _LINUX_MEMORY_TIERS_H */
diff --git a/mm/memory-tiers.c b/mm/memory-tiers.c
index 986f809376eb..c8f032a75249 100644
--- a/mm/memory-tiers.c
+++ b/mm/memory-tiers.c
@@ -38,14 +38,17 @@ struct node_memory_type_map {
 static DEFINE_MUTEX(memory_tier_lock);
 static LIST_HEAD(memory_tiers);
 /*
- * The list is used to store all memory types that are not created
- * by a device driver.
+ * The list is used to store all memory types, both auto-initialized
+ * and driver-requested. Drivers obtain types via mt_get_memory_type().
  */
 static LIST_HEAD(default_memory_types);
 static struct node_memory_type_map node_memory_types[MAX_NUMNODES];
 struct memory_dev_type *default_dram_type;
 nodemask_t default_dram_nodes __initdata = NODE_MASK_NONE;
 
+static struct memory_dev_type *mt_find_alloc_memory_type(int adist,
+					struct list_head *memory_types);
+
 static const struct bus_type memory_tier_subsys = {
 	.name = "memory_tiering",
 	.dev_name = "memory_tier",
@@ -621,7 +624,7 @@ static void release_memtype(struct kref *kref)
 	kfree(memtype);
 }
 
-struct memory_dev_type *alloc_memory_type(int adistance)
+static struct memory_dev_type *alloc_memory_type(int adistance)
 {
 	struct memory_dev_type *memtype;
 
@@ -635,13 +638,11 @@ struct memory_dev_type *alloc_memory_type(int adistance)
 	kref_init(&memtype->kref);
 	return memtype;
 }
-EXPORT_SYMBOL_GPL(alloc_memory_type);
 
-void put_memory_type(struct memory_dev_type *memtype)
+static void put_memory_type(struct memory_dev_type *memtype)
 {
 	kref_put(&memtype->kref, release_memtype);
 }
-EXPORT_SYMBOL_GPL(put_memory_type);
 
 void init_node_memory_type(int node, struct memory_dev_type *memtype)
 {
@@ -670,7 +671,8 @@ void clear_node_memory_type(int node, struct memory_dev_type *memtype)
 }
 EXPORT_SYMBOL_GPL(clear_node_memory_type);
 
-struct memory_dev_type *mt_find_alloc_memory_type(int adist, struct list_head *memory_types)
+static struct memory_dev_type *mt_find_alloc_memory_type(int adist,
+					struct list_head *memory_types)
 {
 	struct memory_dev_type *mtype;
 
@@ -686,18 +688,13 @@ struct memory_dev_type *mt_find_alloc_memory_type(int adist, struct list_head *m
 
 	return mtype;
 }
-EXPORT_SYMBOL_GPL(mt_find_alloc_memory_type);
 
-void mt_put_memory_types(struct list_head *memory_types)
+struct memory_dev_type *mt_get_memory_type(int adist)
 {
-	struct memory_dev_type *mtype, *mtn;
-
-	list_for_each_entry_safe(mtype, mtn, memory_types, list) {
-		list_del(&mtype->list);
-		put_memory_type(mtype);
-	}
+	guard(mutex)(&memory_tier_lock);
+	return mt_find_alloc_memory_type(adist, &default_memory_types);
 }
-EXPORT_SYMBOL_GPL(mt_put_memory_types);
+EXPORT_SYMBOL_GPL(mt_get_memory_type);
 
 /*
  * This is invoked via `late_initcall()` to initialize memory tiers for
-- 
2.53.0

From nobody Sat Apr 4 00:24:01 2026
From: Gregory Price
To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com, akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de
Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 2/8] mm/memory: add memory_block_align_range() helper
Date: Sat, 21 Mar 2026 11:03:58 -0400
Message-ID: <20260321150404.3288786-3-gourry@gourry.net>
In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net>
References: <20260321150404.3288786-1-gourry@gourry.net>

Memory hotplug operations require ranges aligned to memory block
boundaries. This is a generic operation for hotplug.

Add memory_block_align_range() as a common helper in <linux/memory.h>
that aligns the start address up and end address down to memory block
boundaries.

Update dax/kmem to use this helper.

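As a worked illustration of the alignment arithmetic, here is a small
userspace sketch. It assumes a fixed 128 MiB block size purely for the
example (the kernel asks memory_block_size_bytes() at runtime), and
BLOCK_SIZE, align_up(), align_down() and block_align_range() are
hypothetical names standing in for the kernel's ALIGN(), ALIGN_DOWN()
and the new helper. The range end is inclusive, as in struct range.

```c
#include <stdint.h>

/* Assumed block size for illustration only; must be a power of two. */
#define BLOCK_SIZE (128ULL << 20)

struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Round x up / down to a multiple of the power-of-two a. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

static uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/* Shrink r inward so both edges land on block boundaries. */
struct range block_align_range(const struct range *r)
{
	struct range aligned;

	aligned.start = align_up(r->start, BLOCK_SIZE);
	/* end is inclusive: convert to exclusive, round down, convert back */
	aligned.end = align_down(r->end + 1, BLOCK_SIZE) - 1;

	return aligned;
}
```

With these assumptions, a range covering 64 MiB up to (but excluding)
448 MiB shrinks to 128 MiB..384 MiB, while a range covering 130 MiB to
200 MiB contains no whole block and comes back with start >= end, which
is the condition the caller is expected to check.
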
Signed-off-by: Gregory Price
---
 drivers/dax/kmem.c     |  4 +---
 include/linux/memory.h | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index eb693a581961..798f389df992 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -26,9 +26,7 @@ static int dax_kmem_range(struct dev_dax *dev_dax, int i, struct range *r)
 	struct dev_dax_range *dax_range = &dev_dax->ranges[i];
 	struct range *range = &dax_range->range;
 
-	/* memory-block align the hotplug range */
-	r->start = ALIGN(range->start, memory_block_size_bytes());
-	r->end = ALIGN_DOWN(range->end + 1, memory_block_size_bytes()) - 1;
+	*r = memory_block_align_range(range);
 	if (r->start >= r->end) {
 		r->start = range->start;
 		r->end = range->end;
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 5bb5599c6b2b..17cdf6ba3823 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 #define MIN_MEMORY_BLOCK_SIZE (1UL << SECTION_SIZE_BITS)
 
@@ -100,6 +101,27 @@ int arch_get_memory_phys_device(unsigned long start_pfn);
 unsigned long memory_block_size_bytes(void);
 int set_memory_block_size_order(unsigned int order);
 
+/**
+ * memory_block_align_range - align a physical address range to memory blocks
+ * @range: the input range to align
+ *
+ * Aligns the start address up and the end address down to memory block
+ * boundaries. This is required for memory hotplug operations which must
+ * operate on memory-block aligned ranges.
+ *
+ * Returns the aligned range. Callers should check that the returned
+ * range is valid (aligned.start < aligned.end) before using it.
+ */
+static inline struct range memory_block_align_range(const struct range *range)
+{
+	struct range aligned;
+
+	aligned.start = ALIGN(range->start, memory_block_size_bytes());
+	aligned.end = ALIGN_DOWN(range->end + 1, memory_block_size_bytes()) - 1;
+
+	return aligned;
+}
+
 struct memory_notify {
 	unsigned long start_pfn;
 	unsigned long nr_pages;
-- 
2.53.0

From nobody Sat Apr 4 00:24:01 2026
From: Gregory Price
To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com, akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de
Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 3/8] mm/memory_hotplug: pass online_type to online_memory_block() via arg
Date: Sat, 21 Mar 2026 11:03:59 -0400
Message-ID: <20260321150404.3288786-4-gourry@gourry.net>
In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net>
References: <20260321150404.3288786-1-gourry@gourry.net>

Modify online_memory_block() to accept the online type through its arg
parameter rather than calling mhp_get_default_online_type() internally.
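
The general shape of passing a value to a walker callback through an
opaque void pointer can be sketched in userspace as below. This is a
simplified model, not the kernel code: struct block, walk_blocks() and
online_block() are hypothetical names, and the enum only mirrors the
MMOP_ constants that appear in this series for readability.

```c
#include <stddef.h>

/* Illustrative mirror of the online types; values assumed for the sketch. */
enum mmop { MMOP_OFFLINE = 0, MMOP_ONLINE, MMOP_ONLINE_KERNEL, MMOP_ONLINE_MOVABLE };

/* Hypothetical stand-in for struct memory_block. */
struct block {
	enum mmop online_type;
};

typedef int (*walk_fn)(struct block *blk, void *arg);

/* Call fn on every block, stopping at the first nonzero return value. */
int walk_blocks(struct block *blks, size_t n, void *arg, walk_fn fn)
{
	for (size_t i = 0; i < n; i++) {
		int rc = fn(&blks[i], arg);

		if (rc)
			return rc;
	}
	return 0;
}

/* The callback reads the caller's choice from arg instead of a global. */
int online_block(struct block *blk, void *arg)
{
	enum mmop *online_type = arg;

	blk->online_type = *online_type;
	return 0;
}
```

The caller keeps the chosen type in a local variable and passes its
address, so the pointer only needs to stay valid for the duration of
the walk.
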
This prepares for allowing callers to specify explicit online types. Update the caller in add_memory_resource() to pass the default online type via a local variable. No functional change. Cc: Oscar Salvador Cc: Andrew Morton Acked-by: David Hildenbrand (Red Hat) Signed-off-by: Gregory Price --- mm/memory_hotplug.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 86d3faf50453..282bf3d89613 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1338,7 +1338,9 @@ static int check_hotplug_memory_range(u64 start, u64 = size) =20 static int online_memory_block(struct memory_block *mem, void *arg) { - mem->online_type =3D mhp_get_default_online_type(); + enum mmop *online_type =3D arg; + + mem->online_type =3D *online_type; return device_online(&mem->dev); } =20 @@ -1492,6 +1494,7 @@ static int create_altmaps_and_memory_blocks(int nid, = struct memory_group *group, int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) { struct mhp_params params =3D { .pgprot =3D pgprot_mhp(PAGE_KERNEL) }; + enum mmop online_type =3D mhp_get_default_online_type(); enum memblock_flags memblock_flags =3D MEMBLOCK_NONE; struct memory_group *group =3D NULL; u64 start, size; @@ -1580,7 +1583,8 @@ int add_memory_resource(int nid, struct resource *res= , mhp_t mhp_flags) =20 /* online pages if requested */ if (mhp_get_default_online_type() !=3D MMOP_OFFLINE) - walk_memory_blocks(start, size, NULL, online_memory_block); + walk_memory_blocks(start, size, &online_type, + online_memory_block); =20 return ret; error: --=20 2.53.0 From nobody Sat Apr 4 00:24:01 2026 Received: from mail-vs1-f50.google.com (mail-vs1-f50.google.com [209.85.217.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 144FD391E5E for ; Sat, 21 Mar 2026 15:04:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none smtp.client-ip=209.85.217.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774105458; cv=none; b=E0+wetfIsbIS9a2t20Io5j/6fweTasUBKmbQY4BHnqjaNcyUTiVDIYY7fWbj4WVrybMBw9Sv9Ehc1OR2oVPhBt/YC38ur2AhL0p3YRaBlMCUtXAHg0z9tAVnPwz/nIL5KNT2F1SQ2EBV8z8eioV7Wu7BQ8i8PtLBOc4wdIf9mgc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774105458; c=relaxed/simple; bh=cwLq4fYvCtZxySyHqWgZl6tJPwolVZsJKYMQbZ2gyMA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JLv5s94x/NtD6m94+m6Bdgi8CloWhHEajR4xVJ04GtP/9qzDz669pjvrnqEly/6QW2nbv89RRC03aSPz8i65g//itRnoCEiGCxslIkc1EINg6f4DmhvtpJzyJiqlR0W2EDfHObvOZUj0DmcREgyu8SD2vX0VMKm70Y3y4nlB120= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=gourry.net; spf=pass smtp.mailfrom=gourry.net; dkim=pass (2048-bit key) header.d=gourry.net header.i=@gourry.net header.b=k1w30U+/; arc=none smtp.client-ip=209.85.217.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=gourry.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gourry.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gourry.net header.i=@gourry.net header.b="k1w30U+/" Received: by mail-vs1-f50.google.com with SMTP id ada2fe7eead31-5ffe6887e29so1892203137.1 for ; Sat, 21 Mar 2026 08:04:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gourry.net; s=google; t=1774105456; x=1774710256; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8aQsutcLH43Y9NxwFlNMe1GVmyWLzH9mSMGyjSQ6t5E=; b=k1w30U+/9SOiGAzhJFbIo+EURteHnRFqZgB6yRO7ST4LnAcfMpcG340ALNvH7lAUjr b1Lx7llOdek3Vj5zpQzCRFvAfACeLtNNEQlS1qExj/BQfvolO4ctmg6UEm6bHqBdx++E jx4Pw00kS2xh1GfvlCgl/q8i8lEz92xzKNhdng3rpZ7Ob8Jal6WTq8pIX0xVcHGK3tAB 
rAFhVBB23DQWmhp27Rl668ELAIK04pA0VbaPBKF8boSByQxm3mVuRVUIr5DX/qekW2sZ Nyg+LhVGSFhwzTTV1/CHReYUf7QZdSMIzSDILSTj+FcM/zRN/7Whxavrgv81sLUfVwwg Mgjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774105456; x=1774710256; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=8aQsutcLH43Y9NxwFlNMe1GVmyWLzH9mSMGyjSQ6t5E=; b=C1eRNH2OGsYx2X5rtaSMj0argkXxzSLZfjaIAgOib1MEneAO/WsIb5BqerY8Z3FT8S gFbdOl2Q5h9nrQUDoNJ3msPyq3cEYDPNVtYeQ5xDivWZwmVRGkFK/oSTbzLGfjj1BTSk mjDc53pEPNB0mpmrahvhhGviJ0+tBWljCcV1bMCGlYcaxqvkS/BCE/Uus+zqI8k9BJCM 9ieHJHBUS11+FsSOudbsN7mc8aOzZLRcTKgQM9ahGN0zXA8xfcOchz6wcV+t7gK62Fna iCSUcLCBoQ7K4JvgYBggP3U9Krdsx/wbRkfvdAz0Ma5KAtlBPsbvumN1JIcfxIe1e0pG tLXA== X-Forwarded-Encrypted: i=1; AJvYcCWEFUlWu+dV/3rqzx07IP72YLIjA9vSby6FVeKP0LGPX6kz8yx5D1maqaJZ93gDIc3yKNoyjJ98y2trGJs=@vger.kernel.org X-Gm-Message-State: AOJu0YzAIYtMz7YD0HKH1SEj6TxJJeruDe4o4PwxpgbGCDXUTh+sVShb QpC1Si6DAzk22NpxLwocsk0eTlj2KeonRSn3xCIi03ekncEj1NYAHO2NiPxjTiiql6E= X-Gm-Gg: ATEYQzxnpY7LHalbY/qIZ6nEidkVcQ7syCc9amgZyVLeQrq+NywkzTxRh+EXOE8rEp6 xvqxNRDPgFktUShPNXdCX9tWNjizUUiw7EpCat0QlTJ9M5mraitb6zAzspBbzlsJnVdtlDBB4V3 ss7xUi0FPSXSiTM08ajAsmhd7eyg6yNA1e9rkzSB00/eIZsmaK24gq4SzhgbJajBDsBGx1rLyMo cQoj4+NqWVTb58XU+u2+8ixH4XuJedfJkvH+05DO7B4zvevmTdAPi/Q3+KZRsEfswBO1z7QT/BN 3ugwGJWJQYxT1vVnFZEVskCYj11yDwBPLyXgvaIGyQwHSae+oM899yPJEBR4zyF70VjvXIdB/yz vyUCutwNYw+2P+17LiVZp+ZrjC9Ll8r/8TyIQo7acgK1xJFOjeRU8w1eA35f44nQQNVavvErqTT qNl3wgHUmK8nz5pDPlH3+xjwsjS9ggadGyzEPWgloQxg6nV5ZhjwBYR3WGtUDem9Z4zRy6Cb1dH VUDunVazXzBVmgCmQmLA0ZVRw== X-Received: by 2002:a05:6102:5a94:b0:5ff:fbe4:89c with SMTP id ada2fe7eead31-602aed31766mr3273670137.26.1774105456040; Sat, 21 Mar 2026 08:04:16 -0700 (PDT) Received: from gourry-fedora-PF4VCD3F.lan (pool-96-255-20-138.washdc.ftas.verizon.net. 
[96.255.20.138]) by smtp.gmail.com with ESMTPSA id af79cd13be357-8cfc90ba89fsm391979885a.40.2026.03.21.08.04.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 21 Mar 2026 08:04:15 -0700 (PDT) From: Gregory Price To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com, akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, kernel-team@meta.com Subject: [PATCH 4/8] mm/memory_hotplug: export mhp_get_default_online_type Date: Sat, 21 Mar 2026 11:04:00 -0400 Message-ID: <20260321150404.3288786-5-gourry@gourry.net> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net> References: <20260321150404.3288786-1-gourry@gourry.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Drivers that pass a hotplug policy down to DAX need the MMOP_* symbols and mhp_get_default_online_type(). Some drivers (cxl) mix their hotplug and devdax use cases in the same driver code and chose the dax_kmem path as the default driver path, which makes it difficult to require hotplug as a prerequisite for building the overall driver (doing so may break non-hotplug use cases). Export mhp_get_default_online_type() and provide a stub so that these drivers still build, and can still use the DAX use case, when hotplug is disabled. When CONFIG_MEMORY_HOTPLUG is not set, the stub simply returns MMOP_OFFLINE, as it is non-destructive. The real function never returns -1 either, so using MMOP_OFFLINE instead of an error value lets the stub be declared as returning 'enum mmop'.
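The stub pattern described above can be sketched in plain user-space C. This is only an illustration, not the kernel code: DEMO_MEMORY_HOTPLUG is a hypothetical macro standing in for CONFIG_MEMORY_HOTPLUG, the demo_* names are invented for the example, and the enum is a local stand-in whose names mirror the MMOP_* values used in this series.

```c
#include <stdbool.h>

/* Local stand-in for the kernel's enum mmop; illustrative only. */
enum mmop {
	MMOP_OFFLINE = 0,
	MMOP_ONLINE,
	MMOP_ONLINE_KERNEL,
	MMOP_ONLINE_MOVABLE,
};

/*
 * DEMO_MEMORY_HOTPLUG is a hypothetical macro standing in for
 * CONFIG_MEMORY_HOTPLUG. Build with -DDEMO_MEMORY_HOTPLUG to model a
 * hotplug-enabled configuration.
 */
#ifdef DEMO_MEMORY_HOTPLUG
static enum mmop demo_default_online_type = MMOP_ONLINE_MOVABLE;

static enum mmop demo_get_default_online_type(void)
{
	return demo_default_online_type;
}
#else
/* Hotplug compiled out: report the non-destructive policy. */
static inline enum mmop demo_get_default_online_type(void)
{
	return MMOP_OFFLINE;
}
#endif

/* Hypothetical driver-side caller; it compiles in both configurations. */
static bool demo_driver_should_online(void)
{
	return demo_get_default_online_type() != MMOP_OFFLINE;
}
```

Because the stub keeps the same 'enum mmop' return type, the caller needs no #ifdef of its own; with hotplug compiled out it simply sees the offline policy.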
Signed-off-by: Gregory Price --- include/linux/memory_hotplug.h | 29 +++++++++++++++++++++++++++++ mm/memory_hotplug.c | 1 + 2 files changed, 30 insertions(+) diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index e77ef3d7ff73..a8bcb36f93b8 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -6,6 +6,7 @@ #include #include #include +#include =20 struct page; struct zone; @@ -28,6 +29,27 @@ enum mmop { MMOP_ONLINE_MOVABLE, }; =20 +/** + * mmop_to_str - convert memory online type to string + * @online_type: the MMOP_* value to convert + * + * Returns a string representation of the memory online type, + * suitable for sysfs output (includes trailing newline). + */ +static inline const char *mmop_to_str(enum mmop online_type) +{ + switch (online_type) { + case MMOP_ONLINE: + return "online\n"; + case MMOP_ONLINE_KERNEL: + return "online_kernel\n"; + case MMOP_ONLINE_MOVABLE: + return "online_movable\n"; + default: + return "offline\n"; + } +} + #ifdef CONFIG_MEMORY_HOTPLUG struct page *pfn_to_online_page(unsigned long pfn); =20 @@ -221,6 +243,11 @@ static inline bool mhp_supports_memmap_on_memory(void) static inline void pgdat_kswapd_lock(pg_data_t *pgdat) {} static inline void pgdat_kswapd_unlock(pg_data_t *pgdat) {} static inline void pgdat_kswapd_lock_init(pg_data_t *pgdat) {} + +static inline int mhp_online_type_from_str(const char *str) +{ + return -EOPNOTSUPP; +} #endif /* ! 
CONFIG_MEMORY_HOTPLUG */ =20 /* @@ -316,6 +343,8 @@ extern struct zone *zone_for_pfn_range(enum mmop online= _type, extern int arch_create_linear_mapping(int nid, u64 start, u64 size, struct mhp_params *params); void arch_remove_linear_mapping(u64 start, u64 size); +#else +static inline enum mmop mhp_get_default_online_type(void) { return MMOP_OF= FLINE; } #endif /* CONFIG_MEMORY_HOTPLUG */ =20 #endif /* __LINUX_MEMORY_HOTPLUG_H */ diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 282bf3d89613..af9a6cb5a2f9 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -240,6 +240,7 @@ enum mmop mhp_get_default_online_type(void) =20 return mhp_default_online_type; } +EXPORT_SYMBOL_GPL(mhp_get_default_online_type); =20 void mhp_set_default_online_type(enum mmop online_type) { --=20 2.53.0 From nobody Sat Apr 4 00:24:01 2026 Received: from mail-qk1-f175.google.com (mail-qk1-f175.google.com [209.85.222.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B717C391E63 for ; Sat, 21 Mar 2026 15:04:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774105465; cv=none; b=GrQELkKiTpFFENHwZc/fTxqL5fzpXlik9R2c/HNnOXEFl3iESi364ENwojbEkjRh6LYAERDEsnKbLD3rzJuXEujqttQm+xhYKJ/MXuhk39py5AtXKtKWJREzxZpirqpbrxeyLAiG4JGyw1sk+WOIxLN8z/RKASu8aGOp6zU4BNc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774105465; c=relaxed/simple; bh=rF+DX+JfgUHRP6PxAjzMWPOUKYEy9HGwrNkxn2i6BO4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KmtKL3mI5gdxuhoqUKquGGFyITgLorf5QzcGrAo9YRfHWDS4bIpDEZE2vLldUvZszUSggb4BRS60t9k2oIhI8JmYpzNzmC8rOiq9SlZt+Wk0qoF8PeczwbOi3QrsE/71wE9XcxEa8NIZE7uL99zKwcAa9ixsTuZTtVY6NLAwyFA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) 
header.from=gourry.net; spf=pass smtp.mailfrom=gourry.net; dkim=pass (2048-bit key) header.d=gourry.net header.i=@gourry.net header.b=qWNQs8ox; arc=none smtp.client-ip=209.85.222.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=gourry.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gourry.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gourry.net header.i=@gourry.net header.b="qWNQs8ox" Received: by mail-qk1-f175.google.com with SMTP id af79cd13be357-8cfbbdbaf3cso280479785a.0 for ; Sat, 21 Mar 2026 08:04:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gourry.net; s=google; t=1774105458; x=1774710258; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=9vvnYDBDYSd91JS5VlgcZ1pjC7MUK2Lw1jakydB3PHI=; b=qWNQs8oxsOzKWa5EarGQo2j3wjBilO/R0/OFgHYAIpgz/2ZAgARYiLfJx1587OCLm9 XSOf8SiejkbndGbn8ArkHqtFSYE85O4/sZ9ikG88Mt30e9g/jJ3gAHYZn1nwpfcMD4h8 7FNyqyR4vT8va25+nq92GvCnn99WuMB61DRcWpx8CX4rQj9IE4OidBrOg3kapWX61xpW GOyswXcdtR6EjaE2V1fzmdDzNQ291LGkRFo5MmG7wHcW2dVBR2Twsd/XCFpF6H1dbCCO 1owcX7Oq5aRzWJ9N6Y67x5Rocu1ATQoefJzqUaQjpd9h8IFvnwkfyH3REK5ez0tpIIDU u+gw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774105458; x=1774710258; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=9vvnYDBDYSd91JS5VlgcZ1pjC7MUK2Lw1jakydB3PHI=; b=hziN0l+ndeQSz9LrwbuGAl2SHYZBBQ5l9wjXMUKTAtGexzXNfTEx53xahLaM/Uqm1u g/5aCrg26x+4WH4NDM3B2X5uLXttWuUHErdc4z15kUtLH4eihNept85sxWDiBu9h9Bdu 1juK5fXw18OJ/iThf3ftEyU2m+Ar8fVm01Q0CQQB0NHU6iaRWIvfInOx0T5l04Ou1xcq 43bDC8V/90tWmcjegKw5A+zizJL+epSHRLopTc+T4zwSkfjKey+QRlcYWzz4oJ0lJQCi 4fwg9r7QNYsMxTTmhpaDnd40vNVnIs4F7RPsurw1bydBoJn7KP3QJWj4ieXT0MfyDSJw 
O3eQ== X-Forwarded-Encrypted: i=1; AJvYcCWa+VKGHJ+FNIxjxIxmb4cdefuDFNGo0Q6mUP3ZldsXyIZQQT89nr3Wv54jgyWDbD/E4J+agh2NyC7c8Mk=@vger.kernel.org X-Gm-Message-State: AOJu0YzgpNP7DFVIJ5+4YdFaI6wRmxq7o1l4dHUV5ibPjqpP6vHi09PT +PzMzOdfDucXOaktZNpj2lft0oS9KxfZaudYvcEoUCmxdLcs4Ig7ZhdXgMGsK3DU49w= X-Gm-Gg: ATEYQzzyhDj1T5fveWO7rzN1l6ZT9M4oAjfDdfpSmFRC36N0HdIeVyTY9ieWUFyzjRC F1K7xp8Ghk6N02R3UeS+HT4TpsBGcGxQ74LsVEJSQ9vWkWOsPeZ2IkHsMAmSz6I87/w5SmJWnkN 6dbCE5tY2CMHEjN9dT0zXarin4zi7d6GfAJRfuaaoDamSSZC3Ug02wbAUpKDiq4FNfNxTyP1VFO vWfOeEr7OELX3nZhjaqm+UJYLmmaSxYAJ8Ccixwr6vTLfW9u7IlP901zKnd30pXQOJanvhUYrVA 2rvoH/7fR4MRj5FLlixUODVvKl1qm7w38iQHlmJIh1WNTQHr39SS2Rh5gH6mvxTnYuWqvQxcTPe o9/hHObtcdpYQiPnolvGddiIHINNCfceRXOZgxWE6Pxz47banNVcoumd0uaK9Vux4y50A1VTIzP 5tHuYZfi3GTDbTp/kjd8MR7j88U+PjiIBJ0dgLgceelOfW208TGO/bjZ/N7wtcOMhBL0zdkze7M LrXluZtXI2tciATy5kL+HbBaQ== X-Received: by 2002:a05:620a:410c:b0:8a6:92d1:2dae with SMTP id af79cd13be357-8cfc795a1famr877339485a.5.1774105457588; Sat, 21 Mar 2026 08:04:17 -0700 (PDT) Received: from gourry-fedora-PF4VCD3F.lan (pool-96-255-20-138.washdc.ftas.verizon.net. 
[96.255.20.138]) by smtp.gmail.com with ESMTPSA id af79cd13be357-8cfc90ba89fsm391979885a.40.2026.03.21.08.04.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 21 Mar 2026 08:04:17 -0700 (PDT) From: Gregory Price To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com, akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, kernel-team@meta.com Subject: [PATCH 5/8] mm/memory_hotplug: add __add_memory_driver_managed() with online_type arg Date: Sat, 21 Mar 2026 11:04:01 -0400 Message-ID: <20260321150404.3288786-6-gourry@gourry.net> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net> References: <20260321150404.3288786-1-gourry@gourry.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Existing callers of add_memory_driver_managed cannot select the preferred online type (ZONE_NORMAL vs ZONE_MOVABLE), requiring it to hot-add memory as offline blocks, and then follow up by onlining each memory block individually. Most drivers prefer the system default, but the CXL driver wants to plumb a preferred policy through the dax kmem driver. Refactor APIs to add a new interface which allows the dax kmem and cxl_core modules to select a preferred policy. Only expose this interface to those modules to avoid confusion among existing API users and to limit usage in out-of-tree modules. Refactor add_memory_driver_managed, extract __add_memory_driver_managed - Add proper kernel-doc for add_memory_driver_managed while refactoring - New helper accepts an explicit online_type. 
- New helper validates online_type is between OFFLINE and ONLINE_MOVABLE Refactor: add_memory_resource, extract __add_memory_resource - new helper accepts an explicit online_type Original APIs now explicitly pass the system-default to new helpers. No functional change for existing users. Cc: David Hildenbrand Cc: Oscar Salvador Cc: Andrew Morton Signed-off-by: Gregory Price --- include/linux/memory_hotplug.h | 3 ++ mm/memory_hotplug.c | 60 +++++++++++++++++++++++++++++----- 2 files changed, 55 insertions(+), 8 deletions(-) diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index a8bcb36f93b8..1f19f08552ea 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -320,6 +320,9 @@ extern int __add_memory(int nid, u64 start, u64 size, m= hp_t mhp_flags); extern int add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags); extern int add_memory_resource(int nid, struct resource *resource, mhp_t mhp_flags); +int __add_memory_driver_managed(int nid, u64 start, u64 size, + const char *resource_name, mhp_t mhp_flags, + enum mmop online_type); extern int add_memory_driver_managed(int nid, u64 start, u64 size, const char *resource_name, mhp_t mhp_flags); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index af9a6cb5a2f9..9081aad5078f 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1492,10 +1492,10 @@ static int create_altmaps_and_memory_blocks(int nid= , struct memory_group *group, * * we are OK calling __meminit stuff here - we have CONFIG_MEMORY_HOTPLUG */ -int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) +static int __add_memory_resource(int nid, struct resource *res, mhp_t mhp_= flags, + enum mmop online_type) { struct mhp_params params =3D { .pgprot =3D pgprot_mhp(PAGE_KERNEL) }; - enum mmop online_type =3D mhp_get_default_online_type(); enum memblock_flags memblock_flags =3D MEMBLOCK_NONE; struct memory_group *group =3D NULL; u64 start, size; @@ -1583,7 +1583,7 @@ int 
add_memory_resource(int nid, struct resource *res= , mhp_t mhp_flags) merge_system_ram_resource(res); =20 /* online pages if requested */ - if (mhp_get_default_online_type() !=3D MMOP_OFFLINE) + if (online_type !=3D MMOP_OFFLINE) walk_memory_blocks(start, size, &online_type, online_memory_block); =20 @@ -1601,7 +1601,13 @@ int add_memory_resource(int nid, struct resource *re= s, mhp_t mhp_flags) return ret; } =20 -/* requires device_hotplug_lock, see add_memory_resource() */ +int add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags) +{ + return __add_memory_resource(nid, res, mhp_flags, + mhp_get_default_online_type()); +} + +/* requires device_hotplug_lock, see __add_memory_resource() */ int __add_memory(int nid, u64 start, u64 size, mhp_t mhp_flags) { struct resource *res; @@ -1629,7 +1635,15 @@ int add_memory(int nid, u64 start, u64 size, mhp_t m= hp_flags) } EXPORT_SYMBOL_GPL(add_memory); =20 -/* +/** + * __add_memory_driver_managed - add driver-managed memory with explicit o= nline_type + * @nid: NUMA node ID where the memory will be added + * @start: Start physical address of the memory range + * @size: Size of the memory range in bytes + * @resource_name: Resource name in format "System RAM ($DRIVER)" + * @mhp_flags: Memory hotplug flags + * @online_type: Auto-Online behavior (offline, online, kernel, movable) + * * Add special, driver-managed memory to the system as system RAM. Such * memory is not exposed via the raw firmware-provided memmap as system * RAM, instead, it is detected and added by a driver - during cold boot, @@ -1649,9 +1663,12 @@ EXPORT_SYMBOL_GPL(add_memory); * * The resource_name (visible via /proc/iomem) has to have the format * "System RAM ($DRIVER)". + * + * Return: 0 on success, negative error code on failure. 
*/ -int add_memory_driver_managed(int nid, u64 start, u64 size, - const char *resource_name, mhp_t mhp_flags) +int __add_memory_driver_managed(int nid, u64 start, u64 size, + const char *resource_name, mhp_t mhp_flags, + enum mmop online_type) { struct resource *res; int rc; @@ -1661,6 +1678,9 @@ int add_memory_driver_managed(int nid, u64 start, u64= size, resource_name[strlen(resource_name) - 1] !=3D ')') return -EINVAL; =20 + if (online_type < MMOP_OFFLINE || online_type > MMOP_ONLINE_MOVABLE) + return -EINVAL; + lock_device_hotplug(); =20 res =3D register_memory_resource(start, size, resource_name); @@ -1669,7 +1689,7 @@ int add_memory_driver_managed(int nid, u64 start, u64= size, goto out_unlock; } =20 - rc =3D add_memory_resource(nid, res, mhp_flags); + rc =3D __add_memory_resource(nid, res, mhp_flags, online_type); if (rc < 0) release_memory_resource(res); =20 @@ -1677,6 +1697,30 @@ int add_memory_driver_managed(int nid, u64 start, u6= 4 size, unlock_device_hotplug(); return rc; } +EXPORT_SYMBOL_FOR_MODULES(__add_memory_driver_managed, "kmem,cxl_core"); + +/** + * add_memory_driver_managed - add driver-managed memory + * @nid: NUMA node ID where the memory will be added + * @start: Start physical address of the memory range + * @size: Size of the memory range in bytes + * @resource_name: Resource name in format "System RAM ($DRIVER)" + * @mhp_flags: Memory hotplug flags + * + * Add driver-managed memory with the system default online type set by + * build config or kernel boot parameter. + * + * See __add_memory_driver_managed for more details. + * + * Return: 0 on success, negative error code on failure. 
+ */ +int add_memory_driver_managed(int nid, u64 start, u64 size, + const char *resource_name, mhp_t mhp_flags) +{ + return __add_memory_driver_managed(nid, start, size, resource_name, + mhp_flags, + mhp_get_default_online_type()); +} EXPORT_SYMBOL_GPL(add_memory_driver_managed); =20 /* --=20 2.53.0 From nobody Sat Apr 4 00:24:01 2026 Received: from mail-qk1-f171.google.com (mail-qk1-f171.google.com [209.85.222.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 427F5392824 for ; Sat, 21 Mar 2026 15:04:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774105464; cv=none; b=CmaiY4rFW4BIUrw44rqmo8h5DOiODCB3KAUhYpR26yIkwsUByHc0EbBNK9YU9AHtNTDe8fWdvJF0pOrrmTZ0WUgm3DyDQgwkzbRoJuwCC4BAF8AsOqkgMqpmTqgpcYretwVpVaVGHHUUmExHh/A5Xb0g74NnNhJDAQid9mbmMJc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774105464; c=relaxed/simple; bh=Wowz7DV1aUnJYA8F9t35yrzNxk7X9sC3o+wv4gNWZWA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=G3HXWQH9eobNbEIR/iJNGpqZBgLxCo9cS1wLhXnfF7OpAq+Fc0yEyfw3gomxo1LDyUYc7sEeg8JV+ZP76rRFu28+/BC5cjEamDpmqzAlLp7Nd2pPNvM+tjXFAQfqWP9GMKTGLex8tqvvL4j06QXn8cR01HKErb4Dfg7zhCGF6hs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=gourry.net; spf=pass smtp.mailfrom=gourry.net; dkim=pass (2048-bit key) header.d=gourry.net header.i=@gourry.net header.b=irS2qXWx; arc=none smtp.client-ip=209.85.222.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=gourry.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gourry.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gourry.net header.i=@gourry.net header.b="irS2qXWx" 
Received: by mail-qk1-f171.google.com with SMTP id af79cd13be357-8cd71fb9f06so186830585a.2 for ; Sat, 21 Mar 2026 08:04:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gourry.net; s=google; t=1774105460; x=1774710260; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DpXtFQ1E8PxzzJ3jH0sASsMe/zJtVJUABHrep2C4jPI=; b=irS2qXWxWFeexAZMY+IiOHXE4lVvRhIKgIfbqdPOgR3M+D5Pr7sfFiF7DA6Y3jxSFj dLYH4jt0Or2o8Ci+d7sCnulb6ziFtZfsXjpCiYysZ4IUEr1nFoFY2ltSMhyL/HpkqlPm 4KzB67t+QzA8kJJZsm54xUetG5BIVv9SNjJ8TaB4x42xiSDf3pWT1I/BtdfxXck+lRJy CO+O0jUqrH+rEfrPiqzYAjgbmwdx7KAs84xJxpcNm+4jGWS/B9vFKibU99hXqDkrf0vW gjmkRUGF0eYJPql4RwmTmUDEy5cEFgHh6Yq4at3JxketATqN6r3GOPXmdKHyXX0jtsHy n0cg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1774105460; x=1774710260; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=DpXtFQ1E8PxzzJ3jH0sASsMe/zJtVJUABHrep2C4jPI=; b=aW+94o7LQ2wkccDv5TZ7tXOBVZqjbVv1aDwHm72RqfaDdiN2014MrwjFob+I+dcUF2 X+WsFV8We2mO//pmvDbYApw5PtheHzzbopvE1G7CiZXPTNHkoeFM/yPoUO+1NNRufhqK UyI0VgDhnBzUrgQmzTUO0Uw2ki7U8A1EDKX0cZLHoOuCfe560y2VqSGmmN4ftPOF+Epb dEbSVz/uZ3WP4RT3MHm5zG3BYV17+u0O2DeL+Q6wtINa1JOkBBmajQAdHgptnpiIwYN7 vldYUaRsSbXhPXhXuFGuPLO5PCqd7LnVlGpivH1g2h6slnWQOh6I1Ot1mK7gv4KbiPk2 AXbg== X-Forwarded-Encrypted: i=1; AJvYcCUy0bPedWFQLl3MIKiFo+vhzYmpEBxr2/WGTT43qEhYSu5MDHP0kVdfkp98lIBqQQ6zkKF93sFZFip4AlY=@vger.kernel.org X-Gm-Message-State: AOJu0YyJQwaTp28iRD3EYiCnJtsjGGWJ0yI/N9sPTYV/Ey5kYVwtvZNG 8eg3V6NZLXoqZffAKgSYdVU1jJs7A0+6xFGhX2OOJWB9eZGP82Fn4nqjIclMHikY5bQ= X-Gm-Gg: ATEYQzzDstjZ2zpvY+7qdVHal4ScCzTd9J0jg3tCQLsHJXobF6d/CNgiO3yB8OtO+Y0 HZF9TXMJr6j2gQMCp0iR718g/kvQEbMsRJCqOkwCl/VGxZvC2tJGA4QWtiCHo8vPf8mOuVQdO1h 
lE17F47BMOtiXPvxAsvf+WSYSSci/9QzfgNiMbFibEVWezRlO7x6dpdT1V9lBXKspZfqDYwY7SD 4ur6wXwqZCY8chIzwpI31HLNjj95JRYhlOx7m/wZkAM1Zfj9Beny2lb+mTp+rAo2sx/LzdYz477 IX2k1MQyFDz6jVkFzkx8N6nlViGOYovfnMNX1elU81Enso38X615E7k8Bz/RJS0Zxyk1iNehP/Q eTFzpg7Cb2733DMmle4ZEN9C68iooRk2s1aboYu/aKyrTB5CMjdKNBC8Y5JGjo/2izcxEYUZnIH LJ3d/baCHTgK6/YOiI5OdvPzQbVi5BuWypUGwS9L9i7gK/kzeWRvAYNaoJ+Oms7xYN+NzKgSe+6 lc0uudf7JpmUS4= X-Received: by 2002:a05:620a:4055:b0:8cf:dd4b:8a53 with SMTP id af79cd13be357-8cfdd4b9250mr167236585a.30.1774105459722; Sat, 21 Mar 2026 08:04:19 -0700 (PDT) Received: from gourry-fedora-PF4VCD3F.lan (pool-96-255-20-138.washdc.ftas.verizon.net. [96.255.20.138]) by smtp.gmail.com with ESMTPSA id af79cd13be357-8cfc90ba89fsm391979885a.40.2026.03.21.08.04.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 21 Mar 2026 08:04:18 -0700 (PDT) From: Gregory Price To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com, akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com, vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, kernel-team@meta.com Subject: [PATCH 6/8] dax: plumb hotplug online_type through dax Date: Sat, 21 Mar 2026 11:04:02 -0400 Message-ID: <20260321150404.3288786-7-gourry@gourry.net> X-Mailer: git-send-email 2.53.0 In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net> References: <20260321150404.3288786-1-gourry@gourry.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" There is no way for drivers leveraging dax_kmem to plumb through a preferred auto-online policy - the system default policy is forced. 
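As a rough illustration of the plumbing this patch introduces, the sketch below models it in plain user-space C: a creation-time data struct carries an online type, the device object copies it, and the probe path hands it to a validating add routine. All demo_* names, the simplified structs, and the hard-coded system default are assumptions made for the example; only the MMOP_* names and the range check mirror what the series itself shows.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Local stand-in for the kernel's enum mmop; illustrative only. */
enum mmop {
	MMOP_OFFLINE = 0,
	MMOP_ONLINE,
	MMOP_ONLINE_KERNEL,
	MMOP_ONLINE_MOVABLE,
};

/* Simplified stand-in for the data a driver fills in at creation time. */
struct demo_dax_data {
	bool memmap_on_memory;
	enum mmop online_type;
};

/* Simplified stand-in for the created device. */
struct demo_dev_dax {
	bool memmap_on_memory;
	enum mmop online_type;
};

/* Hypothetical system default; the kernel derives it from config/boot args. */
static enum mmop demo_get_default_online_type(void)
{
	return MMOP_ONLINE_MOVABLE;
}

/* Creation path: copy the requested policy onto the device. */
static void demo_create_dev_dax(struct demo_dev_dax *dev,
				const struct demo_dax_data *data)
{
	assert(dev && data);
	dev->memmap_on_memory = data->memmap_on_memory;
	dev->online_type = data->online_type;
}

/* Models the explicit-online_type add routine: validate, then apply. */
static int demo_add_memory(enum mmop online_type, enum mmop *applied)
{
	if ((int)online_type < (int)MMOP_OFFLINE ||
	    (int)online_type > (int)MMOP_ONLINE_MOVABLE)
		return -EINVAL;
	*applied = online_type;
	return 0;
}

/* Probe path: pass the per-device policy instead of the global default. */
static int demo_kmem_probe(const struct demo_dev_dax *dev, enum mmop *applied)
{
	return demo_add_memory(dev->online_type, applied);
}
```

A caller that wants today's behaviour would set online_type from demo_get_default_online_type(), while a driver with its own policy could set, for example, MMOP_ONLINE_KERNEL; out-of-range values are rejected with -EINVAL before anything is added.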
Add an 'enum mmop' field to the DAX device creation path to allow drivers to specify an auto-online policy when using the kmem driver. Current callers initialize online_type to mhp_get_default_online_type() to retain backward compatibility and to make explicit to the drivers what is actually happening underneath. No functional changes to existing callers. Cc: David Hildenbrand Signed-off-by: Gregory Price --- drivers/dax/bus.c | 3 +++ drivers/dax/bus.h | 2 ++ drivers/dax/cxl.c | 1 + drivers/dax/dax-private.h | 3 +++ drivers/dax/hmem/hmem.c | 1 + drivers/dax/kmem.c | 13 +++++++++++-- 6 files changed, 21 insertions(+), 2 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index c94c09622516..2c6140dc9382 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 /* Copyright(c) 2017-2018 Intel Corporation. All rights reserved. */ #include +#include #include #include #include @@ -395,6 +396,7 @@ static ssize_t create_store(struct device *dev, struct = device_attribute *attr, .size =3D 0, .id =3D -1, .memmap_on_memory =3D false, + .online_type =3D mhp_get_default_online_type(), }; struct dev_dax *dev_dax =3D __devm_create_dev_dax(&data); =20 @@ -1494,6 +1496,7 @@ static struct dev_dax *__devm_create_dev_dax(struct d= ev_dax_data *data) ida_init(&dev_dax->ida); =20 dev_dax->memmap_on_memory =3D data->memmap_on_memory; + dev_dax->online_type =3D data->online_type; =20 inode =3D dax_inode(dax_dev); dev->devt =3D inode->i_rdev; diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h index cbbf64443098..f037cd8a2d51 100644 --- a/drivers/dax/bus.h +++ b/drivers/dax/bus.h @@ -3,6 +3,7 @@ #ifndef __DAX_BUS_H__ #define __DAX_BUS_H__ #include +#include #include =20 struct dev_dax; @@ -24,6 +25,7 @@ struct dev_dax_data { resource_size_t size; int id; bool memmap_on_memory; + enum mmop online_type; }; =20 struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data); diff --git a/drivers/dax/cxl.c b/drivers/dax/cxl.c index 
13cd94d32ff7..d6fbec863361 100644 --- a/drivers/dax/cxl.c +++ b/drivers/dax/cxl.c @@ -27,6 +27,7 @@ static int cxl_dax_region_probe(struct device *dev) .id =3D -1, .size =3D range_len(&cxlr_dax->hpa_range), .memmap_on_memory =3D true, + .online_type =3D mhp_get_default_online_type(), }; =20 return PTR_ERR_OR_ZERO(devm_create_dev_dax(&data)); diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index c6ae27c982f4..734fb83f5eb4 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -8,6 +8,7 @@ #include #include #include +#include =20 /* private routines between core files */ struct dax_device; @@ -77,6 +78,7 @@ struct dev_dax_range { * @dev: device core * @pgmap: pgmap for memmap setup / lifetime (driver owned) * @memmap_on_memory: allow kmem to put the memmap in the memory + * @online_type: MMOP_* online type for memory hotplug * @nr_range: size of @ranges * @ranges: range tuples of memory used */ @@ -91,6 +93,7 @@ struct dev_dax { struct device dev; struct dev_pagemap *pgmap; bool memmap_on_memory; + enum mmop online_type; int nr_range; struct dev_dax_range *ranges; }; diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c index 1cf7c2a0ee1c..acbc574ced93 100644 --- a/drivers/dax/hmem/hmem.c +++ b/drivers/dax/hmem/hmem.c @@ -36,6 +36,7 @@ static int dax_hmem_probe(struct platform_device *pdev) .id =3D -1, .size =3D region_idle ? 
0 : range_len(&mri->range), .memmap_on_memory =3D false, + .online_type =3D mhp_get_default_online_type(), }; =20 return PTR_ERR_OR_ZERO(devm_create_dev_dax(&data)); diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c index 798f389df992..d4c34b2e3766 100644 --- a/drivers/dax/kmem.c +++ b/drivers/dax/kmem.c @@ -16,6 +16,11 @@ #include "dax-private.h" #include "bus.h" =20 +/* Internal function exported only to kmem module */ +extern int __add_memory_driver_managed(int nid, u64 start, u64 size, + const char *resource_name, + mhp_t mhp_flags, enum mmop online_type); + /* Memory resource name used for add_memory_driver_managed(). */ static const char *kmem_name; /* Set if any memory will remain added when the driver will be unloaded. */ @@ -49,6 +54,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax) struct dax_kmem_data *data; struct memory_dev_type *mtype; int i, rc, mapped =3D 0; + enum mmop online_type; mhp_t mhp_flags; int numa_node; int adist =3D MEMTIER_DEFAULT_LOWTIER_ADISTANCE; @@ -111,6 +117,8 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax) goto err_reg_mgid; data->mgid =3D rc; =20 + online_type =3D dev_dax->online_type; + for (i =3D 0; i < dev_dax->nr_range; i++) { struct resource *res; struct range range; @@ -151,8 +159,9 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax) * Ensure that future kexec'd kernels will not treat * this as RAM automatically. 
 		 */
-		rc = add_memory_driver_managed(data->mgid, range.start,
-				range_len(&range), kmem_name, mhp_flags);
+		rc = __add_memory_driver_managed(data->mgid, range.start,
+				range_len(&range), kmem_name, mhp_flags,
+				online_type);
 
 		if (rc) {
 			dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n",
-- 
2.53.0

From nobody Sat Apr 4 00:24:01 2026
From: Gregory Price
To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com,
	akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de
Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com,
	vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-cxl@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH 7/8] dax/kmem: extract hotplug/hotremove helper functions
Date: Sat, 21 Mar 2026 11:04:03 -0400
Message-ID: <20260321150404.3288786-8-gourry@gourry.net>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net>
References: <20260321150404.3288786-1-gourry@gourry.net>
X-Mailing-List: linux-kernel@vger.kernel.org

Refactor kmem _probe() and _remove() by extracting init, cleanup,
hotplug, and hot-remove logic into separate helper functions:

  - dax_kmem_init_resources: reserves the I/O resources with
    request_mem_region()
  - dax_kmem_cleanup_resources: releases the reserved I/O resources
  - dax_kmem_do_hotplug: adds the reserved ranges as system memory
  - dax_kmem_do_hotremove: handles memory removal and resource cleanup

This is a pure refactoring with no functional change. The helpers will
enable future extensions to support more granular control over memory
hotplug operations.

We need to split hotplug/remove and init/cleanup in order to have the
resources available for hot-add. Otherwise, when probe occurs, the dax
devices are never added to sysfs because the resources are never
registered.

Signed-off-by: Gregory Price
---
 drivers/dax/kmem.c | 308 ++++++++++++++++++++++++++++++---------------
 1 file changed, 210 insertions(+), 98 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index d4c34b2e3766..8be9286f0ea3 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -47,15 +47,189 @@ struct dax_kmem_data {
 	struct resource *res[];
 };
 
+/**
+ * dax_kmem_do_hotplug - hotplug memory for dax kmem device
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Hotplugs all ranges in the dev_dax region as system memory.
+ *
+ * Returns the number of successfully mapped ranges, or negative error.
+ */
+static int dax_kmem_do_hotplug(struct dev_dax *dev_dax,
+			       struct dax_kmem_data *data,
+			       int online_type)
+{
+	struct device *dev = &dev_dax->dev;
+	int i, rc, onlined = 0;
+	mhp_t mhp_flags;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct range range;
+
+		rc = dax_kmem_range(dev_dax, i, &range);
+		if (rc)
+			continue;
+
+		mhp_flags = MHP_NID_IS_MGID;
+		if (dev_dax->memmap_on_memory)
+			mhp_flags |= MHP_MEMMAP_ON_MEMORY;
+
+		/*
+		 * Ensure that future kexec'd kernels will not treat
+		 * this as RAM automatically.
+		 */
+		rc = __add_memory_driver_managed(data->mgid, range.start,
+				range_len(&range), kmem_name, mhp_flags,
+				online_type);
+
+		if (rc) {
+			dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n",
+				 i, range.start, range.end);
+			if (onlined)
+				continue;
+			return rc;
+		}
+		onlined++;
+	}
+
+	return onlined;
+}
+
+/**
+ * dax_kmem_init_resources - create memory regions for dax kmem
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Initializes all the resources for the DAX device.
+ *
+ * Returns the number of successfully mapped ranges, or negative error.
+ */
+static int dax_kmem_init_resources(struct dev_dax *dev_dax,
+				   struct dax_kmem_data *data)
+{
+	struct device *dev = &dev_dax->dev;
+	int i, rc, mapped = 0;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct resource *res;
+		struct range range;
+
+		rc = dax_kmem_range(dev_dax, i, &range);
+		if (rc)
+			continue;
+
+		/* Skip ranges already added */
+		if (data->res[i])
+			continue;
+
+		/* Region is permanently reserved if hotremove fails. */
+		res = request_mem_region(range.start, range_len(&range),
+					 data->res_name);
+		if (!res) {
+			dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve region\n",
+				 i, range.start, range.end);
+			/*
+			 * Once some memory has been onlined we can't
+			 * assume that it can be un-onlined safely.
+			 */
+			if (mapped)
+				continue;
+			return -EBUSY;
+		}
+		data->res[i] = res;
+		/*
+		 * Set flags appropriate for System RAM. Leave ..._BUSY clear
+		 * so that add_memory() can add a child resource. Do not
+		 * inherit flags from the parent since it may set new flags
+		 * unknown to us that will break add_memory() below.
+		 */
+		res->flags = IORESOURCE_SYSTEM_RAM;
+		mapped++;
+	}
+	return mapped;
+}
+
+#ifdef CONFIG_MEMORY_HOTREMOVE
+/**
+ * dax_kmem_do_hotremove - hot-remove memory for dax kmem device
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Removes all ranges in the dev_dax region.
+ *
+ * Returns the number of successfully removed ranges.
+ */
+static int dax_kmem_do_hotremove(struct dev_dax *dev_dax,
+				 struct dax_kmem_data *data)
+{
+	struct device *dev = &dev_dax->dev;
+	int i, success = 0;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct range range;
+		int rc;
+
+		rc = dax_kmem_range(dev_dax, i, &range);
+		if (rc)
+			continue;
+
+		/* Skip ranges not currently added */
+		if (!data->res[i])
+			continue;
+
+		rc = remove_memory(range.start, range_len(&range));
+		if (rc == 0) {
+			/* Release the resource for the successfully removed range */
+			remove_resource(data->res[i]);
+			kfree(data->res[i]);
+			data->res[i] = NULL;
+			success++;
+			continue;
+		}
+		any_hotremove_failed = true;
+		dev_err(dev, "mapping%d: %#llx-%#llx hotremove failed\n",
+			i, range.start, range.end);
+	}
+
+	return success;
+}
+#else
+static int dax_kmem_do_hotremove(struct dev_dax *dev_dax,
+				 struct dax_kmem_data *data)
+{
+	return -EBUSY;
+}
+#endif /* CONFIG_MEMORY_HOTREMOVE */
+
+/**
+ * dax_kmem_cleanup_resources - remove the dax memory resources
+ * @dev_dax: the dev_dax instance
+ * @data: the dax_kmem_data structure with resource tracking
+ *
+ * Removes all resources in the dev_dax region.
+ */
+static void dax_kmem_cleanup_resources(struct dev_dax *dev_dax,
+				       struct dax_kmem_data *data)
+{
+	int i;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		if (!data->res[i])
+			continue;
+		remove_resource(data->res[i]);
+		kfree(data->res[i]);
+		data->res[i] = NULL;
+	}
+}
+
 static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 {
 	struct device *dev = &dev_dax->dev;
 	unsigned long total_len = 0, orig_len = 0;
 	struct dax_kmem_data *data;
 	struct memory_dev_type *mtype;
-	int i, rc, mapped = 0;
-	enum mmop online_type;
-	mhp_t mhp_flags;
+	int i, rc;
 	int numa_node;
 	int adist = MEMTIER_DEFAULT_LOWTIER_ADISTANCE;
 
@@ -116,72 +290,27 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 	if (rc < 0)
 		goto err_reg_mgid;
 	data->mgid = rc;
-
-	online_type = dev_dax->online_type;
-
-	for (i = 0; i < dev_dax->nr_range; i++) {
-		struct resource *res;
-		struct range range;
-
-		rc = dax_kmem_range(dev_dax, i, &range);
-		if (rc)
-			continue;
-
-		/* Region is permanently reserved if hotremove fails. */
-		res = request_mem_region(range.start, range_len(&range), data->res_name);
-		if (!res) {
-			dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve region\n",
-					i, range.start, range.end);
-			/*
-			 * Once some memory has been onlined we can't
-			 * assume that it can be un-onlined safely.
-			 */
-			if (mapped)
-				continue;
-			rc = -EBUSY;
-			goto err_request_mem;
-		}
-		data->res[i] = res;
-
-		/*
-		 * Set flags appropriate for System RAM. Leave ..._BUSY clear
-		 * so that add_memory() can add a child resource. Do not
-		 * inherit flags from the parent since it may set new flags
-		 * unknown to us that will break add_memory() below.
-		 */
-		res->flags = IORESOURCE_SYSTEM_RAM;
-
-		mhp_flags = MHP_NID_IS_MGID;
-		if (dev_dax->memmap_on_memory)
-			mhp_flags |= MHP_MEMMAP_ON_MEMORY;
-
-		/*
-		 * Ensure that future kexec'd kernels will not treat
-		 * this as RAM automatically.
-		 */
-		rc = __add_memory_driver_managed(data->mgid, range.start,
-				range_len(&range), kmem_name, mhp_flags,
-				online_type);
-
-		if (rc) {
-			dev_warn(dev, "mapping%d: %#llx-%#llx memory add failed\n",
-					i, range.start, range.end);
-			remove_resource(res);
-			kfree(res);
-			data->res[i] = NULL;
-			if (mapped)
-				continue;
-			goto err_request_mem;
-		}
-		mapped++;
-	}
 	data->mtype = mtype;
 
 	dev_set_drvdata(dev, data);
 
+	rc = dax_kmem_init_resources(dev_dax, data);
+	if (rc < 0)
+		goto err_resources;
+
+	/*
+	 * Hotplug using the configured online type for this device.
+	 */
+	rc = dax_kmem_do_hotplug(dev_dax, data, dev_dax->online_type);
+	if (rc < 0)
+		goto err_hotplug;
+
 	return 0;
 
-err_request_mem:
+err_hotplug:
+	dax_kmem_cleanup_resources(dev_dax, data);
+err_resources:
+	dev_set_drvdata(dev, NULL);
 	memory_group_unregister(data->mgid);
 err_reg_mgid:
 	kfree(data->res_name);
@@ -195,7 +324,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 #ifdef CONFIG_MEMORY_HOTREMOVE
 static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 {
-	int i, success = 0;
+	int success;
 	int node = dev_dax->target_node;
 	struct device *dev = &dev_dax->dev;
 	struct dax_kmem_data *data = dev_get_drvdata(dev);
@@ -206,42 +335,25 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 	 * there is no way to hotremove this memory until reboot because device
 	 * unbind will succeed even if we return failure.
 	 */
-	for (i = 0; i < dev_dax->nr_range; i++) {
-		struct range range;
-		int rc;
-
-		rc = dax_kmem_range(dev_dax, i, &range);
-		if (rc)
-			continue;
-
-		rc = remove_memory(range.start, range_len(&range));
-		if (rc == 0) {
-			remove_resource(data->res[i]);
-			kfree(data->res[i]);
-			data->res[i] = NULL;
-			success++;
-			continue;
-		}
-		any_hotremove_failed = true;
-		dev_err(dev,
-			"mapping%d: %#llx-%#llx cannot be hotremoved until the next reboot\n",
-				i, range.start, range.end);
+	success = dax_kmem_do_hotremove(dev_dax, data);
+	if (success < dev_dax->nr_range) {
+		dev_err(dev, "Hotplug regions stuck online until reboot\n");
+		return;
 	}
 
-	if (success >= dev_dax->nr_range) {
-		memory_group_unregister(data->mgid);
-		kfree(data->res_name);
-		kfree(data);
-		dev_set_drvdata(dev, NULL);
-		/*
-		 * Clear the memtype association on successful unplug.
-		 * If not, we have memory blocks left which can be
-		 * offlined/onlined later. We need to keep memory_dev_type
-		 * for that. This implies this reference will be around
-		 * till next reboot.
-		 */
-		clear_node_memory_type(node, data->mtype);
-	}
+	dax_kmem_cleanup_resources(dev_dax, data);
+	memory_group_unregister(data->mgid);
+	kfree(data->res_name);
+	kfree(data);
+	dev_set_drvdata(dev, NULL);
+	/*
+	 * Clear the memtype association on successful unplug.
+	 * If not, we have memory blocks left which can be
+	 * offlined/onlined later. We need to keep memory_dev_type
+	 * for that. This implies this reference will be around
+	 * till next reboot.
+	 */
+	clear_node_memory_type(node, data->mtype);
 }
 #else
 static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
-- 
2.53.0

From nobody Sat Apr 4 00:24:01 2026
From: Gregory Price
To: linux-mm@kvack.org, vishal.l.verma@intel.com, dave.jiang@intel.com,
	akpm@linux-foundation.org, david@kernel.org, osalvador@suse.de
Cc: dan.j.williams@intel.com, ljs@kernel.org, Liam.Howlett@oracle.com,
	vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com,
	linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
	linux-cxl@vger.kernel.org, kernel-team@meta.com, Hannes Reinecke
Subject: [PATCH 8/8] dax/kmem: add sysfs interface for atomic whole-device hotplug
Date: Sat, 21 Mar 2026 11:04:04 -0400
Message-ID: <20260321150404.3288786-9-gourry@gourry.net>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260321150404.3288786-1-gourry@gourry.net>
References: <20260321150404.3288786-1-gourry@gourry.net>
X-Mailing-List: linux-kernel@vger.kernel.org

The dax kmem driver currently onlines memory automatically during probe
using the system's default online policy but provides no way to control
or query the entire region state at runtime.

Additionally, there is no atomic mechanism to offline and remove the
entire set of memory blocks together. Instead, this is presently done
in two steps: (offline all, remove all).
This creates a race condition where external entities can operate
directly on the blocks and cause hot-unplug to fail.

Add a new 'hotplug' sysfs attribute that allows userspace to control
and query the entire memory region state.

The interface accepts the following values:
  - "unplug": memory is offline and blocks are not present
    (reported as "unplugged" when the attribute is read)
  - "online": memory is online as normal system RAM
  - "online_movable": memory is online in ZONE_MOVABLE

Valid transitions:
  - unplugged -> online
  - unplugged -> online_movable
  - online -> unplugged
  - online_movable -> unplugged

"offline" (memory blocks exist but are offline by default) is not
supported because it's functionally equivalent to "unplugged" and
invites races between offlining and unplugging.

Probe currently checks whether online_type matches
mhp_get_default_online_type() and, if so, calls dax_kmem_do_hotplug().
This creates memory blocks even though the device should nominally
start in the unplugged state. Doing so preserves userland backward
compatibility for existing tools that expect the memory blocks to be
present after kmem probe, and can be deprecated over time.

As with any hot-remove mechanism, removal can fail, and if rollback
also fails the system can be left in an inconsistent state.

Unbind Note:
  We used to call remove_memory() during unbind, which would fire a
  BUG() if any of the memory blocks were online at that time. We lift
  this into a WARN in the cleanup routine and skip the cleanup, without
  attempting hotremove, if ->state is neither DAX_KMEM_UNPLUGGED nor
  MMOP_OFFLINE.

  The resources are still leaked but this prevents deadlock on unbind
  if a memory region happens to be impossible to hotremove.
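
The valid transitions listed above can be modelled in a few lines of
userspace C. This is only an illustrative sketch under stated
assumptions, not the driver code: the names kmem_state, kmem_parse()
and kmem_transition() are hypothetical, strcmp() stands in for the
kernel's sysfs_streq(), and the model assumes every hotplug or
hotremove attempt succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical userspace model of the 'hotplug' attribute states. */
enum kmem_state {
	KMEM_UNPLUGGED,
	KMEM_ONLINE,
	KMEM_ONLINE_MOVABLE,
	KMEM_NSTATES,
};

/* Map a written string to the requested state, or -EINVAL if unknown. */
static int kmem_parse(const char *s)
{
	if (strcmp(s, "unplug") == 0)
		return KMEM_UNPLUGGED;
	if (strcmp(s, "online") == 0)
		return KMEM_ONLINE;
	if (strcmp(s, "online_movable") == 0)
		return KMEM_ONLINE_MOVABLE;
	return -EINVAL;
}

/*
 * Apply one write to the current state.
 * Returns 0 on success (a write of the current state is a no-op),
 * -EINVAL for an unknown string, and -EBUSY when asked to switch
 * between the two online states without unplugging first.
 */
static int kmem_transition(enum kmem_state *cur, const char *req)
{
	int next = kmem_parse(req);

	assert(*cur < KMEM_NSTATES);
	if (next < 0)
		return next;
	if ((int)*cur == next)
		return 0;
	if (*cur != KMEM_UNPLUGGED && next != KMEM_UNPLUGGED)
		return -EBUSY;
	*cur = (enum kmem_state)next;
	return 0;
}
```

In this model, writing "online" and then "online_movable" fails with
-EBUSY on the second write; the device has to pass through "unplug"
first, which mirrors the -EBUSY check in hotplug_store().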

Suggested-by: Hannes Reinecke
Suggested-by: David Hildenbrand
Signed-off-by: Gregory Price
---
 Documentation/ABI/testing/sysfs-bus-dax |  17 +++
 drivers/dax/kmem.c                      | 164 +++++++++++++++++++++---
 2 files changed, 161 insertions(+), 20 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-dax b/Documentation/ABI/testing/sysfs-bus-dax
index b34266bfae49..faf6f63a368c 100644
--- a/Documentation/ABI/testing/sysfs-bus-dax
+++ b/Documentation/ABI/testing/sysfs-bus-dax
@@ -151,3 +151,20 @@ Description:
 		memmap_on_memory parameter for memory_hotplug. This is
 		typically set on the kernel command line -
 		memory_hotplug.memmap_on_memory set to 'true' or 'force'."
+
+What:		/sys/bus/dax/devices/daxX.Y/hotplug
+Date:		January, 2026
+KernelVersion:	v6.21
+Contact:	nvdimm@lists.linux.dev
+Description:
+		(RW) Controls the hotplug state of the memory region.
+		Applies to all memory blocks associated with the device.
+		Only applies to dax_kmem devices.
+
+		States: [unplugged, online, online_movable]
+		Arguments:
+		  "unplug": memory is offline and blocks are not present
+		  "online": memory is online as normal system RAM
+		  "online_movable": memory is online in ZONE_MOVABLE
+
+		A device must be unplugged before it is onlined in another state.
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 8be9286f0ea3..5dbd5b7862fd 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -40,10 +40,16 @@ static int dax_kmem_range(struct dev_dax *dev_dax, int i, struct range *r)
 	return 0;
 }
 
+#define DAX_KMEM_UNPLUGGED (-1)
+
 struct dax_kmem_data {
 	const char *res_name;
 	int mgid;
 	struct memory_dev_type *mtype;
+	int numa_node;
+	struct dev_dax *dev_dax;
+	int state;
+	struct mutex lock; /* protects hotplug state transitions */
 	struct resource *res[];
 };
 
@@ -51,8 +57,10 @@ struct dax_kmem_data {
  * dax_kmem_do_hotplug - hotplug memory for dax kmem device
  * @dev_dax: the dev_dax instance
  * @data: the dax_kmem_data structure with resource tracking
+ * @online_type: MMOP_ONLINE or MMOP_ONLINE_MOVABLE
  *
- * Hotplugs all ranges in the dev_dax region as system memory.
+ * Hotplugs all ranges in the dev_dax region as system memory using
+ * the specified online type.
  *
  * Returns the number of successfully mapped ranges, or negative error.
  */
@@ -64,6 +72,12 @@ static int dax_kmem_do_hotplug(struct dev_dax *dev_dax,
 	int i, rc, onlined = 0;
 	mhp_t mhp_flags;
 
+	if (data->state == MMOP_ONLINE || data->state == MMOP_ONLINE_MOVABLE)
+		return -EINVAL;
+
+	if (online_type != MMOP_ONLINE && online_type != MMOP_ONLINE_MOVABLE)
+		return -EINVAL;
+
 	for (i = 0; i < dev_dax->nr_range; i++) {
 		struct range range;
 
@@ -156,9 +170,9 @@ static int dax_kmem_init_resources(struct dev_dax *dev_dax,
  * @dev_dax: the dev_dax instance
  * @data: the dax_kmem_data structure with resource tracking
  *
- * Removes all ranges in the dev_dax region.
+ * Offlines and removes all ranges in the dev_dax region.
  *
- * Returns the number of successfully removed ranges.
+ * Returns the number of successfully removed ranges, or negative error.
  */
 static int dax_kmem_do_hotremove(struct dev_dax *dev_dax,
 				 struct dax_kmem_data *data)
@@ -178,7 +192,7 @@ static int dax_kmem_do_hotremove(struct dev_dax *dev_dax,
 		if (!data->res[i])
 			continue;
 
-		rc = remove_memory(range.start, range_len(&range));
+		rc = offline_and_remove_memory(range.start, range_len(&range));
 		if (rc == 0) {
 			/* Release the resource for the successfully removed range */
 			remove_resource(data->res[i]);
@@ -214,6 +228,20 @@ static void dax_kmem_cleanup_resources(struct dev_dax *dev_dax,
 {
 	int i;
 
+	/*
+	 * If the device unbind occurs before memory is hotremoved, we can never
+	 * remove the memory (requires reboot). Attempting an offline operation
+	 * here may cause deadlock and a failure to finish the unbind.
+	 *
+	 * This WARN used to be a BUG called by remove_memory().
+	 *
+	 * Note: This leaks the resources.
+	 */
+	if (WARN(((data->state != DAX_KMEM_UNPLUGGED) &&
+		  (data->state != MMOP_OFFLINE)),
+		 "Hotplug memory regions stuck online until reboot"))
+		return;
+
 	for (i = 0; i < dev_dax->nr_range; i++) {
 		if (!data->res[i])
 			continue;
@@ -223,6 +251,98 @@ static void dax_kmem_cleanup_resources(struct dev_dax *dev_dax,
 	}
 }
 
+static int dax_kmem_parse_state(const char *buf)
+{
+	if (sysfs_streq(buf, "unplug"))
+		return DAX_KMEM_UNPLUGGED;
+	if (sysfs_streq(buf, "online"))
+		return MMOP_ONLINE;
+	if (sysfs_streq(buf, "online_movable"))
+		return MMOP_ONLINE_MOVABLE;
+	return -EINVAL;
+}
+
+static ssize_t hotplug_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	struct dax_kmem_data *data = dev_get_drvdata(dev);
+	const char *state_str;
+
+	if (!data)
+		return -ENXIO;
+
+	switch (data->state) {
+	case DAX_KMEM_UNPLUGGED:
+		state_str = "unplugged";
+		break;
+	case MMOP_OFFLINE:
+		state_str = "offline";
+		break;
+	case MMOP_ONLINE:
+		state_str = "online";
+		break;
+	case MMOP_ONLINE_MOVABLE:
+		state_str = "online_movable";
+		break;
+	default:
+		state_str = "unknown";
+		break;
+	}
+
+	return sysfs_emit(buf, "%s\n", state_str);
+}
+
+static ssize_t hotplug_store(struct device *dev, struct device_attribute *attr,
+			     const char *buf, size_t len)
+{
+	struct dev_dax *dev_dax = to_dev_dax(dev);
+	struct dax_kmem_data *data = dev_get_drvdata(dev);
+	int online_type;
+	int rc;
+
+	if (!data)
+		return -ENXIO;
+
+	online_type = dax_kmem_parse_state(buf);
+	if (online_type < DAX_KMEM_UNPLUGGED)
+		return online_type;
+
+	guard(mutex)(&data->lock);
+
+	/* Already in requested state */
+	if (data->state == online_type)
+		return len;
+
+	if (online_type == DAX_KMEM_UNPLUGGED) {
+		rc = dax_kmem_do_hotremove(dev_dax, data);
+		if (rc < 0) {
+			dev_warn(dev, "hotplug state is inconsistent\n");
+			return rc;
+		}
+		if (rc < dev_dax->nr_range)
+			dev_warn(dev, "partial hotremove: %d of %d ranges removed\n",
+				 rc, dev_dax->nr_range);
+		else
+			data->state = DAX_KMEM_UNPLUGGED;
+		return len;
+	}
+
+	/*
+	 * online_type is MMOP_ONLINE or MMOP_ONLINE_MOVABLE
+	 * Cannot switch between online types without unplugging first
+	 */
+	if (data->state == MMOP_ONLINE || data->state == MMOP_ONLINE_MOVABLE)
+		return -EBUSY;
+
+	rc = dax_kmem_do_hotplug(dev_dax, data, online_type);
+	if (rc < 0)
+		return rc;
+
+	data->state = online_type;
+	return len;
+}
+static DEVICE_ATTR_RW(hotplug);
+
 static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 {
 	struct device *dev = &dev_dax->dev;
@@ -291,6 +411,10 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 		goto err_reg_mgid;
 	data->mgid = rc;
 	data->mtype = mtype;
+	data->numa_node = numa_node;
+	data->dev_dax = dev_dax;
+	data->state = DAX_KMEM_UNPLUGGED;
+	mutex_init(&data->lock);
 
 	dev_set_drvdata(dev, data);
 
@@ -301,9 +425,17 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 	/*
 	 * Hotplug using the configured online type for this device.
 	 */
-	rc = dax_kmem_do_hotplug(dev_dax, data, dev_dax->online_type);
-	if (rc < 0)
-		goto err_hotplug;
+	if (dev_dax->online_type != MMOP_OFFLINE ||
+	    dev_dax->online_type == mhp_get_default_online_type()) {
+		rc = dax_kmem_do_hotplug(dev_dax, data, dev_dax->online_type);
+		if (rc < 0)
+			goto err_hotplug;
+		data->state = dev_dax->online_type;
+	}
+
+	rc = device_create_file(dev, &dev_attr_hotplug);
+	if (rc)
+		dev_warn(dev, "failed to create hotplug sysfs entry\n");
 
 	return 0;
 
@@ -324,23 +456,11 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 #ifdef CONFIG_MEMORY_HOTREMOVE
 static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 {
-	int success;
 	int node = dev_dax->target_node;
 	struct device *dev = &dev_dax->dev;
 	struct dax_kmem_data *data = dev_get_drvdata(dev);
 
-	/*
-	 * We have one shot for removing memory, if some memory blocks were not
-	 * offline prior to calling this function remove_memory() will fail, and
-	 * there is no way to hotremove this memory until reboot because device
-	 * unbind will succeed even if we return failure.
-	 */
-	success = dax_kmem_do_hotremove(dev_dax, data);
-	if (success < dev_dax->nr_range) {
-		dev_err(dev, "Hotplug regions stuck online until reboot\n");
-		return;
-	}
-
+	device_remove_file(dev, &dev_attr_hotplug);
 	dax_kmem_cleanup_resources(dev_dax, data);
 	memory_group_unregister(data->mgid);
 	kfree(data->res_name);
@@ -358,6 +478,10 @@ static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 #else
 static void dev_dax_kmem_remove(struct dev_dax *dev_dax)
 {
+	struct device *dev = &dev_dax->dev;
+
+	device_remove_file(dev, &dev_attr_hotplug);
+
 	/*
 	 * Without hotremove purposely leak the request_mem_region() for the
 	 * device-dax range and return '0' to ->remove() attempts. The removal
-- 
2.53.0