From nobody Mon Apr 6 15:50:40 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Shuah Khan, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
 Christian Brauner, "Darrick J . Wong", Randy Dunlap, Jeff Layton,
 Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
 Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves, Ira Weiny
Subject: [PATCH V8 1/8] dax: move dax_pgoff_to_phys from [drivers/dax/] device.c to bus.c
Date: Wed, 18 Mar 2026 20:28:02 -0500
Message-ID: <20260319012802.4392-1-john@groves.net>
In-Reply-To: <20260318202737.4344.dax@groves.net>
References: <20260318202737.4344.dax@groves.net>

This function will be used by both device.c and fsdev.c, but both are
loadable modules. Moving it to bus.c puts it in the dax core and makes it
available to both.

No functional changes: the function is only relocated, and exported with
EXPORT_SYMBOL_GPL() so that both modules can call it.

Reviewed-by: Ira Weiny
Reviewed-by: Dave Jiang
Signed-off-by: John Groves
---
 drivers/dax/bus.c    | 24 ++++++++++++++++++++++++
 drivers/dax/device.c | 23 -----------------------
 2 files changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index c94c09622516..e4bd5c9f006c 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1417,6 +1417,30 @@ static const struct device_type dev_dax_type = {
 	.groups = dax_attribute_groups,
 };
 
+/* see "strong" declaration in tools/testing/nvdimm/dax-dev.c */
+__weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
+			      unsigned long size)
+{
+	int i;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct dev_dax_range *dax_range = &dev_dax->ranges[i];
+		struct range *range = &dax_range->range;
+		unsigned long long pgoff_end;
+		phys_addr_t phys;
+
+		pgoff_end = dax_range->pgoff + PHYS_PFN(range_len(range)) - 1;
+		if (pgoff < dax_range->pgoff || pgoff > pgoff_end)
+			continue;
+		phys = PFN_PHYS(pgoff - dax_range->pgoff) + range->start;
+		if (phys + size - 1 <= range->end)
+			return phys;
+		break;
+	}
+	return -1;
+}
+EXPORT_SYMBOL_GPL(dax_pgoff_to_phys);
+
 static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
 {
 	struct dax_region *dax_region = data->dax_region;
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 528e81240c4d..2d2dbfd35e94 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -57,29 +57,6 @@ static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 			vma->vm_file, func);
 }
 
-/* see "strong" declaration in tools/testing/nvdimm/dax-dev.c */
-__weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
-			      unsigned long size)
-{
-	int i;
-
-	for (i = 0; i < dev_dax->nr_range; i++) {
-		struct dev_dax_range *dax_range = &dev_dax->ranges[i];
-		struct range *range = &dax_range->range;
-		unsigned long long pgoff_end;
-		phys_addr_t phys;
-
-		pgoff_end = dax_range->pgoff + PHYS_PFN(range_len(range)) - 1;
-		if (pgoff < dax_range->pgoff || pgoff > pgoff_end)
-			continue;
-		phys = PFN_PHYS(pgoff - dax_range->pgoff) + range->start;
-		if (phys + size - 1 <= range->end)
-			return phys;
-		break;
-	}
-	return -1;
-}
-
 static void dax_set_mapping(struct vm_fault *vmf, unsigned long pfn,
 			      unsigned long fault_size)
 {
-- 
2.53.0

From nobody Mon Apr 6 15:50:40 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Shuah Khan, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
 Christian Brauner, "Darrick J . Wong", Randy Dunlap, Jeff Layton,
 Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
 Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Jonathan Cameron, Ira Weiny, John Groves
Subject: [PATCH V8 2/8] dax: Factor out dax_folio_reset_order() helper
Date: Wed, 18 Mar 2026 20:28:20 -0500
Message-ID: <20260319012820.4420-1-john@groves.net>
In-Reply-To: <20260318202737.4344.dax@groves.net>
References: <20260318202737.4344.dax@groves.net>

From: John Groves

Both fs/dax.c:dax_folio_put() and drivers/dax/fsdev.c:
fsdev_clear_folio_state() (the latter coming in the next commit after this
one) contain nearly identical code to reset a compound DAX folio back to
order-0 pages. Factor this out into a shared helper function.

The new dax_folio_reset_order() function:
- Clears the folio's mapping and share count
- Resets compound folio state via folio_reset_order()
- Clears PageHead and compound_head for each sub-page
- Restores the pgmap pointer for each resulting order-0 folio
- Returns the original folio order (for callers that need to advance by
  that many pages)

This simplifies fsdev_clear_folio_state() from ~50 lines to ~15 lines while
maintaining the same functionality in both call sites.

Suggested-by: Jonathan Cameron
Reviewed-by: Ira Weiny
Reviewed-by: Dave Jiang
Signed-off-by: John Groves
---
 fs/dax.c | 60 +++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 18 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 289e6254aa30..7d7bbfb32c41 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -378,6 +378,45 @@ static void dax_folio_make_shared(struct folio *folio)
 	folio->share = 1;
 }
 
+/**
+ * dax_folio_reset_order - Reset a compound DAX folio to order-0 pages
+ * @folio: The folio to reset
+ *
+ * Splits a compound folio back into individual order-0 pages,
+ * clearing compound state and restoring pgmap pointers.
+ *
+ * Returns: the original folio order (0 if already order-0)
+ */
+int dax_folio_reset_order(struct folio *folio)
+{
+	struct dev_pagemap *pgmap = page_pgmap(&folio->page);
+	int order = folio_order(folio);
+	int i;
+
+	folio->mapping = NULL;
+	folio->share = 0;
+
+	if (!order) {
+		folio->pgmap = pgmap;
+		return 0;
+	}
+
+	folio_reset_order(folio);
+
+	for (i = 0; i < (1UL << order); i++) {
+		struct page *page = folio_page(folio, i);
+		struct folio *f = (struct folio *)page;
+
+		ClearPageHead(page);
+		clear_compound_head(page);
+		f->mapping = NULL;
+		f->share = 0;
+		f->pgmap = pgmap;
+	}
+
+	return order;
+}
+
 static inline unsigned long dax_folio_put(struct folio *folio)
 {
 	unsigned long ref;
@@ -391,28 +430,13 @@ static inline unsigned long dax_folio_put(struct folio *folio)
 	if (ref)
 		return ref;
 
-	folio->mapping = NULL;
-	order = folio_order(folio);
-	if (!order)
-		return 0;
-	folio_reset_order(folio);
+	order = dax_folio_reset_order(folio);
 
+	/* Debug check: verify refcounts are zero for all sub-folios */
 	for (i = 0; i < (1UL << order); i++) {
-		struct dev_pagemap *pgmap = page_pgmap(&folio->page);
 		struct page *page = folio_page(folio, i);
-		struct folio *new_folio = (struct folio *)page;
 
-		ClearPageHead(page);
-		clear_compound_head(page);
-
-		new_folio->mapping = NULL;
-		/*
-		 * Reset pgmap which was over-written by
-		 * prep_compound_page().
-		 */
-		new_folio->pgmap = pgmap;
-		new_folio->share = 0;
-		WARN_ON_ONCE(folio_ref_count(new_folio));
+		WARN_ON_ONCE(folio_ref_count((struct folio *)page));
 	}
 
 	return ref;
-- 
2.53.0

From nobody Mon Apr 6 15:50:40 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Shuah Khan, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
 Christian Brauner, "Darrick J . Wong", Randy Dunlap, Jeff Layton,
 Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
 Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V8 3/8] dax: add fsdev.c driver for fs-dax on character dax
Date: Wed, 18 Mar 2026 20:28:37 -0500
Message-ID: <20260319012837.4443-1-john@groves.net>
In-Reply-To: <20260318202737.4344.dax@groves.net>
References: <20260318202737.4344.dax@groves.net>

The new fsdev driver provides pages/folios initialized compatibly with
fsdax - normal rather than devdax-style refcounting, and starting out
with order-0 folios.

When fsdev binds to a daxdev, it is usually (always?) switching from the
devdax mode (device.c), which pre-initializes compound folios according
to its alignment. Fsdev uses fsdev_clear_folio_state() to switch the
folios into an fsdax-compatible state.

A side effect of this is that raw mmap doesn't (can't?) work on an fsdev
dax instance. Accordingly, the fsdev driver does not provide raw mmap -
devices must be put in 'devdax' mode (drivers/dax/device.c) to get raw
mmap capability.

This commit contains just the framework, which remaps pages/folios
compatibly with fsdax.

Enabling dax changes:

- bus.h: add DAXDRV_FSDEV_TYPE driver type
- bus.c: allow DAXDRV_FSDEV_TYPE drivers to bind to daxdevs
- dax.h: prototype inode_dax(), which fsdev needs

Suggested-by: Dan Williams
Suggested-by: Gregory Price
Signed-off-by: John Groves
---
 MAINTAINERS          |   8 ++
 drivers/dax/Makefile |   6 +
 drivers/dax/bus.c    |   4 +
 drivers/dax/bus.h    |   1 +
 drivers/dax/fsdev.c  | 253 +++++++++++++++++++++++++++++++++++++++++++
 fs/dax.c             |   1 +
 include/linux/dax.h  |   3 +
 7 files changed, 276 insertions(+)
 create mode 100644 drivers/dax/fsdev.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 96ea84948d76..e83cfcf7e932 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7298,6 +7298,14 @@ L:	linux-cxl@vger.kernel.org
 S:	Supported
 F:	drivers/dax/
 
+DEVICE DIRECT ACCESS (DAX) [fsdev_dax]
+M:	John Groves
+M:	John Groves
+L:	nvdimm@lists.linux.dev
+L:	linux-cxl@vger.kernel.org
+S:	Supported
+F:	drivers/dax/fsdev.c
+
 DEVICE FREQUENCY (DEVFREQ)
 M:	MyungJoo Ham
 M:	Kyungmin Park
diff --git a/drivers/dax/Makefile b/drivers/dax/Makefile
index 5ed5c39857c8..3bae252fd1bf 100644
--- a/drivers/dax/Makefile
+++ b/drivers/dax/Makefile
@@ -5,10 +5,16 @@ obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
 obj-$(CONFIG_DEV_DAX_CXL) += dax_cxl.o
 
+# fsdev_dax: fs-dax compatible devdax driver (needs DEV_DAX and FS_DAX)
+ifeq ($(CONFIG_FS_DAX),y)
+obj-$(CONFIG_DEV_DAX) += fsdev_dax.o
+endif
+
 dax-y := super.o
 dax-y += bus.o
 device_dax-y := device.o
 dax_pmem-y := pmem.o
 dax_cxl-y := cxl.o
+fsdev_dax-y := fsdev.o
 
 obj-y += hmem/
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index e4bd5c9f006c..562e2b06f61a 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -81,6 +81,10 @@ static int dax_match_type(const struct dax_device_driver *dax_drv, struct device
 	    !IS_ENABLED(CONFIG_DEV_DAX_KMEM))
 		return 1;
 
+	/* fsdev driver can also bind to device-type dax devices */
+	if (dax_drv->type == DAXDRV_FSDEV_TYPE && type == DAXDRV_DEVICE_TYPE)
+		return 1;
+
 	return 0;
 }
 
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index cbbf64443098..880bdf7e72d7 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -31,6 +31,7 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data);
 enum dax_driver_type {
 	DAXDRV_KMEM_TYPE,
 	DAXDRV_DEVICE_TYPE,
+	DAXDRV_FSDEV_TYPE,
 };
 
 struct dax_device_driver {
diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c
new file mode 100644
index 000000000000..e5b4396ce401
--- /dev/null
+++ b/drivers/dax/fsdev.c
@@ -0,0 +1,253 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2026 Micron Technology, Inc. */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "dax-private.h"
+#include "bus.h"
+
+/*
+ * FS-DAX compatible devdax driver
+ *
+ * Unlike drivers/dax/device.c which pre-initializes compound folios based
+ * on device alignment (via vmemmap_shift), this driver leaves folios
+ * uninitialized similar to pmem. This allows fs-dax filesystems like famfs
+ * to work without needing special handling for pre-initialized folios.
+ *
+ * Key differences from device.c:
+ * - pgmap type is MEMORY_DEVICE_FS_DAX (not MEMORY_DEVICE_GENERIC)
+ * - vmemmap_shift is NOT set (folios remain order-0)
+ * - fs-dax can dynamically create compound folios as needed
+ * - No mmap support - all access is through fs-dax/iomap
+ */
+
+
+static void fsdev_cdev_del(void *cdev)
+{
+	cdev_del(cdev);
+}
+
+static void fsdev_kill(void *dev_dax)
+{
+	kill_dev_dax(dev_dax);
+}
+
+/*
+ * Page map operations for FS-DAX mode
+ * Similar to fsdax_pagemap_ops in drivers/nvdimm/pmem.c
+ *
+ * Note: folio_free callback is not needed for MEMORY_DEVICE_FS_DAX.
+ * The core mm code in free_zone_device_folio() handles the wake_up_var()
+ * directly for this memory type.
+ */
+static int fsdev_pagemap_memory_failure(struct dev_pagemap *pgmap,
+		unsigned long pfn, unsigned long nr_pages, int mf_flags)
+{
+	struct dev_dax *dev_dax = pgmap->owner;
+	u64 offset = PFN_PHYS(pfn) - dev_dax->ranges[0].range.start;
+	u64 len = nr_pages << PAGE_SHIFT;
+
+	return dax_holder_notify_failure(dev_dax->dax_dev, offset,
+					 len, mf_flags);
+}
+
+static const struct dev_pagemap_ops fsdev_pagemap_ops = {
+	.memory_failure = fsdev_pagemap_memory_failure,
+};
+
+/*
+ * Clear any stale folio state from pages in the given range.
+ * This is necessary because device_dax pre-initializes compound folios
+ * based on vmemmap_shift, and that state may persist after driver unbind.
+ * Since fsdev_dax uses MEMORY_DEVICE_FS_DAX without vmemmap_shift, fs-dax
+ * expects to find clean order-0 folios that it can build into compound
+ * folios on demand.
+ *
+ * At probe time, no filesystem should be mounted yet, so all mappings
+ * are stale and must be cleared along with compound state.
+ */
+static void fsdev_clear_folio_state(struct dev_dax *dev_dax)
+{
+	for (int i = 0; i < dev_dax->nr_range; i++) {
+		struct range *range = &dev_dax->ranges[i].range;
+		unsigned long pfn = PHYS_PFN(range->start);
+		unsigned long end_pfn = PHYS_PFN(range->end) + 1;
+
+		while (pfn < end_pfn) {
+			struct folio *folio = pfn_folio(pfn);
+			int order = dax_folio_reset_order(folio);
+
+			pfn += 1UL << order;
+		}
+	}
+}
+
+static void fsdev_clear_folio_state_action(void *data)
+{
+	fsdev_clear_folio_state(data);
+}
+
+static int fsdev_open(struct inode *inode, struct file *filp)
+{
+	struct dax_device *dax_dev = inode_dax(inode);
+	struct dev_dax *dev_dax = dax_get_private(dax_dev);
+
+	filp->private_data = dev_dax;
+
+	return 0;
+}
+
+static int fsdev_release(struct inode *inode, struct file *filp)
+{
+	return 0;
+}
+
+static const struct file_operations fsdev_fops = {
+	.llseek = noop_llseek,
+	.owner = THIS_MODULE,
+	.open = fsdev_open,
+	.release = fsdev_release,
+};
+
+static int fsdev_dax_probe(struct dev_dax *dev_dax)
+{
+	struct dax_device *dax_dev = dev_dax->dax_dev;
+	struct device *dev = &dev_dax->dev;
+	struct dev_pagemap *pgmap;
+	u64 data_offset = 0;
+	struct inode *inode;
+	struct cdev *cdev;
+	void *addr;
+	int rc, i;
+
+	if (static_dev_dax(dev_dax)) {
+		if (dev_dax->nr_range > 1) {
+			dev_warn(dev, "static pgmap / multi-range device conflict\n");
+			return -EINVAL;
+		}
+
+		pgmap = dev_dax->pgmap;
+	} else {
+		size_t pgmap_size;
+
+		if (dev_dax->pgmap) {
+			dev_warn(dev, "dynamic-dax with pre-populated page map\n");
+			return -EINVAL;
+		}
+
+		pgmap_size = struct_size(pgmap, ranges, dev_dax->nr_range - 1);
+		pgmap = devm_kzalloc(dev, pgmap_size, GFP_KERNEL);
+		if (!pgmap)
+			return -ENOMEM;
+
+		pgmap->nr_range = dev_dax->nr_range;
+		dev_dax->pgmap = pgmap;
+
+		for (i = 0; i < dev_dax->nr_range; i++) {
+			struct range *range = &dev_dax->ranges[i].range;
+
+			pgmap->ranges[i] = *range;
+		}
+	}
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct range *range = &dev_dax->ranges[i].range;
+
+		if (!devm_request_mem_region(dev, range->start,
+					range_len(range), dev_name(dev))) {
+			dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve range\n",
+				 i, range->start, range->end);
+			return -EBUSY;
+		}
+	}
+
+	/*
+	 * FS-DAX compatible mode: Use MEMORY_DEVICE_FS_DAX type and
+	 * do NOT set vmemmap_shift. This leaves folios at order-0,
+	 * allowing fs-dax to dynamically create compound folios as needed
+	 * (similar to pmem behavior).
+	 */
+	pgmap->type = MEMORY_DEVICE_FS_DAX;
+	pgmap->ops = &fsdev_pagemap_ops;
+	pgmap->owner = dev_dax;
+
+	/*
+	 * CRITICAL DIFFERENCE from device.c:
+	 * We do NOT set vmemmap_shift here, even if align > PAGE_SIZE.
+	 * This ensures folios remain order-0 and are compatible with
+	 * fs-dax's folio management.
+	 */
+
+	addr = devm_memremap_pages(dev, pgmap);
+	if (IS_ERR(addr))
+		return PTR_ERR(addr);
+
+	/*
+	 * Clear any stale compound folio state left over from a previous
+	 * driver (e.g., device_dax with vmemmap_shift). Also register this
+	 * as a devm action so folio state is cleared on unbind, ensuring
+	 * clean pages for subsequent drivers (e.g., kmem for system-ram).
+	 */
+	fsdev_clear_folio_state(dev_dax);
+	rc = devm_add_action_or_reset(dev, fsdev_clear_folio_state_action,
+				      dev_dax);
+	if (rc)
+		return rc;
+
+	/* Detect whether the data is at a non-zero offset into the memory */
+	if (pgmap->range.start != dev_dax->ranges[0].range.start) {
+		u64 phys = dev_dax->ranges[0].range.start;
+		u64 pgmap_phys = dev_dax->pgmap[0].range.start;
+
+		if (!WARN_ON(pgmap_phys > phys))
+			data_offset = phys - pgmap_phys;
+
+		pr_debug("%s: offset detected phys=%llx pgmap_phys=%llx offset=%llx\n",
+			 __func__, phys, pgmap_phys, data_offset);
+	}
+
+	inode = dax_inode(dax_dev);
+	cdev = inode->i_cdev;
+	cdev_init(cdev, &fsdev_fops);
+	cdev->owner = dev->driver->owner;
+	cdev_set_parent(cdev, &dev->kobj);
+	rc = cdev_add(cdev, dev->devt, 1);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action_or_reset(dev, fsdev_cdev_del, cdev);
+	if (rc)
+		return rc;
+
+	run_dax(dax_dev);
+	return devm_add_action_or_reset(dev, fsdev_kill, dev_dax);
+}
+
+static struct dax_device_driver fsdev_dax_driver = {
+	.probe = fsdev_dax_probe,
+	.type = DAXDRV_FSDEV_TYPE,
+};
+
+static int __init dax_init(void)
+{
+	return dax_driver_register(&fsdev_dax_driver);
+}
+
+static void __exit dax_exit(void)
+{
+	dax_driver_unregister(&fsdev_dax_driver);
+}
+
+MODULE_AUTHOR("John Groves");
+MODULE_DESCRIPTION("FS-DAX Device: fs-dax compatible devdax driver");
+MODULE_LICENSE("GPL");
+module_init(dax_init);
+module_exit(dax_exit);
+MODULE_ALIAS_DAX_DEVICE(0);
diff --git a/fs/dax.c b/fs/dax.c
index 7d7bbfb32c41..85a4b428e72b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -416,6 +416,7 @@ int dax_folio_reset_order(struct folio *folio)
 
 	return order;
 }
+EXPORT_SYMBOL_GPL(dax_folio_reset_order);
 
 static inline unsigned long dax_folio_put(struct folio *folio)
 {
diff --git a/include/linux/dax.h b/include/linux/dax.h
index bf103f317cac..996493f5c538 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -51,6 +51,7 @@ struct dax_holder_operations {
 
 #if IS_ENABLED(CONFIG_DAX)
 struct dax_device *alloc_dax(void *private, const struct dax_operations *ops);
+
 void *dax_holder(struct dax_device *dax_dev);
 void put_dax(struct dax_device *dax_dev);
 void kill_dax(struct dax_device *dax_dev);
@@ -151,8 +152,10 @@ static inline void fs_put_dax(struct dax_device *dax_dev, void *holder)
 #endif /* CONFIG_BLOCK && CONFIG_FS_DAX */
 
 #if IS_ENABLED(CONFIG_FS_DAX)
+struct dax_device *inode_dax(struct inode *inode);
 int dax_writeback_mapping_range(struct address_space *mapping,
 		struct dax_device *dax_dev, struct writeback_control *wbc);
+int dax_folio_reset_order(struct folio *folio);
 
 struct page *dax_layout_busy_page(struct address_space *mapping);
 struct page *dax_layout_busy_page_range(struct address_space *mapping, loff_t start, loff_t end);
-- 
2.53.0

From nobody Mon Apr 6 15:50:40 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Shuah Khan, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
 Christian Brauner, "Darrick J . Wong", Randy Dunlap, Jeff Layton,
 Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
 Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves, Ira Weiny
Subject: [PATCH V8 4/8] dax: Save the kva from memremap
Date: Wed, 18 Mar 2026 20:29:28 -0500
Message-ID: <20260319012928.4475-1-john@groves.net>
In-Reply-To: <20260318202737.4344.dax@groves.net>
References: <20260318202737.4344.dax@groves.net>

Save the kva from memremap because we need it for iomap rw support.

Prior to famfs, there were no iomap users of /dev/dax - so the virtual
address from memremap was not needed.

Reviewed-by: Ira Weiny
Reviewed-by: Dave Jiang
Signed-off-by: John Groves
---
 drivers/dax/dax-private.h | 2 ++
 drivers/dax/fsdev.c       | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index c6ae27c982f4..7a3727d76a68 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -69,6 +69,7 @@ struct dev_dax_range {
  * data while the device is activated in the driver.
* @region: parent region * @dax_dev: core dax functionality + * @virt_addr: kva from memremap; used by fsdev_dax * @align: alignment of this instance * @target_node: effective numa node if dev_dax memory range is onlined * @dyn_id: is this a dynamic or statically created instance @@ -83,6 +84,7 @@ struct dev_dax_range { struct dev_dax { struct dax_region *region; struct dax_device *dax_dev; + void *virt_addr; unsigned int align; int target_node; bool dyn_id; diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c index e5b4396ce401..d2f6c0341c24 100644 --- a/drivers/dax/fsdev.c +++ b/drivers/dax/fsdev.c @@ -212,6 +212,7 @@ static int fsdev_dax_probe(struct dev_dax *dev_dax) pr_debug("%s: offset detected phys=3D%llx pgmap_phys=3D%llx offset=3D%ll= x\n", __func__, phys, pgmap_phys, data_offset); } + dev_dax->virt_addr =3D addr + data_offset; =20 inode =3D dax_inode(dax_dev); cdev =3D inode->i_cdev; --=20 2.53.0 From nobody Mon Apr 6 15:50:40 2026 Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA7822C21FF; Thu, 19 Mar 2026 01:30:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=216.40.44.14 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883812; cv=none; b=su878hufurMPxEAqp/xa1YcK7SzQ4DFWtJSXwCqTKwS4+E0rw5DXtpt+n7k8ywh7QzeoLBSBMJNkuEohwhvwSyBxZXW8FdCn56vFeS2GYRp+Jhw45QApU0iWHIF35GeDQ6w569PwQnxG+QQ+NRHf/3LfxG7yKWv65ZCpjlj5Xxs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883812; c=relaxed/simple; bh=Du28CXsaM4ajzOZ3yy8fwl2y4q9y12TZj0ITFOq6wz0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=uV6PoF2wRFF2G2LxPZc1bF5ZkrXKOMt6rr9Q8C7NEGtIJspyVkWbak28l6oqYkPzp7lOIl6I94X3TxNgTXQ3lSJ2FLvOlJuh+zr1Qe0VQowHdsTRhSMHnRKv8KgzmN/qUb5OUOM/7gdpFil/bJI2CAuT6pd1kiex+xyL8IXW2Xw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net; spf=pass smtp.mailfrom=groves.net; arc=none smtp.client-ip=216.40.44.14 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=groves.net Received: from omf03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 249D3C1385; Thu, 19 Mar 2026 01:30:01 +0000 (UTC) Received: from [HIDDEN] (Authenticated sender: john@groves.net) by omf03.hostedemail.com (Postfix) with ESMTPA id 89AFA6000D; Thu, 19 Mar 2026 01:29:50 +0000 (UTC) From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Shuah Khan , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . 
Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V8 5/8] dax: Add dax_operations for use by fs-dax on fsdev dax Date: Wed, 18 Mar 2026 20:29:48 -0500 Message-ID: <20260319012948.4493-1-john@groves.net> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260318202737.4344.dax@groves.net> References: <20260318202737.4344.dax@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Stat-Signature: jrhx3gfgpz8obw3b7hy8z4rdgn3a3omk X-Rspamd-Server: rspamout08 X-Rspamd-Queue-Id: 89AFA6000D X-Session-Marker: 6A6F686E4067726F7665732E6E6574 X-Session-ID: U2FsdGVkX1989RsKLzlgHtrTYrZxTEcn3dEJDAesSl0= X-HE-Tag: 1773883790-425132 X-HE-Meta: U2FsdGVkX19G9L0DFHWVe4xBTXUyQaBvrVCK6oT42OxNqzJwIFIpIUWKuXWSoRYtaCA/gXQEoIEC5j+Z1Y0/kflz6bxBRv52jm+u7V+1c4Os2ibljIqimVHkChgH2JiqMWNxjonEKrJubidae2PWqMdTPz5glA8sehPYDyDGuIOx5KSxzhLcntBRh1L48O8cfTQmXSmr6+/O2C+g+2/ed2ZpgbkxwBv1ERC51ta15WvnH7q/S068HtUqp5UBBZR/6449y0lrg7VfA+hjiItwr1k6BNeJL+eWHSbIlNYl3njVjTPkzokPCXb2xgvMfbQ0Gvqo2pqSJBc= Content-Type: text/plain; charset="utf-8" From: John Groves fsdev: Add dax_operations for use by famfs. This replicates the functionality from drivers/nvdimm/pmem.c that conventional fs-dax file systems (e.g. xfs) use to support dax read/write/mmap to a daxdev - without which famfs can't sit atop a daxdev. - These methods are based on pmem_dax_ops from drivers/nvdimm/pmem.c - fsdev_dax_direct_access() returns the hpa, pfn and kva. 
The kva was newly stored as dev_dax->virt_addr by fsdev_dax_probe(). - The hpa/pfn are used for mmap (dax_iomap_fault()), and the kva is used for read/write (dax_iomap_rw()) - fsdev_dax_recovery_write() and fsdev_dax_zero_page_range() have not been tested yet. I'm looking for suggestions as to how to test those. - dax-private.h: add dev_dax->cached_size, which fsdev needs to remember. The dev_dax size cannot change while a driver is bound (dev_dax_resize returns -EBUSY if dev->driver is set). Caching the size at probe time lets fsdev's direct_access path use it without acquiring dax_dev_rwsem (which isn't exported anyway). Signed-off-by: John Groves --- drivers/dax/dax-private.h | 1 + drivers/dax/fsdev.c | 83 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 84 insertions(+) diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index 7a3727d76a68..ee8f3af8387f 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -85,6 +85,7 @@ struct dev_dax { struct dax_region *region; struct dax_device *dax_dev; void *virt_addr; + u64 cached_size; unsigned int align; int target_node; bool dyn_id; diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c index d2f6c0341c24..5a1e504c9281 100644 --- a/drivers/dax/fsdev.c +++ b/drivers/dax/fsdev.c @@ -28,6 +28,84 @@ * - No mmap support - all access is through fs-dax/iomap */ =20 +static void fsdev_write_dax(void *pmem_addr, struct page *page, + unsigned int off, unsigned int len) +{ + while (len) { + void *mem =3D kmap_local_page(page); + unsigned int chunk =3D min_t(unsigned int, len, PAGE_SIZE - off); + + memcpy_flushcache(pmem_addr, mem + off, chunk); + kunmap_local(mem); + len -=3D chunk; + off =3D 0; + page++; + pmem_addr +=3D chunk; + } +} + +static long __fsdev_dax_direct_access(struct dax_device *dax_dev, pgoff_t = pgoff, + long nr_pages, enum dax_access_mode mode, void **kaddr, + unsigned long *pfn) +{ + struct dev_dax *dev_dax =3D dax_get_private(dax_dev); + size_t size =3D nr_pages 
<< PAGE_SHIFT; + size_t offset =3D pgoff << PAGE_SHIFT; + void *virt_addr =3D dev_dax->virt_addr + offset; + phys_addr_t phys; + unsigned long local_pfn; + + phys =3D dax_pgoff_to_phys(dev_dax, pgoff, nr_pages << PAGE_SHIFT); + if (phys =3D=3D -1) { + dev_dbg(&dev_dax->dev, + "pgoff (%#lx) out of range\n", pgoff); + return -EFAULT; + } + + if (kaddr) + *kaddr =3D virt_addr; + + local_pfn =3D PHYS_PFN(phys); + if (pfn) + *pfn =3D local_pfn; + + /* + * Use cached_size which was computed at probe time. The size cannot + * change while the driver is bound (resize returns -EBUSY). + */ + return PHYS_PFN(min(size, dev_dax->cached_size - offset)); +} + +static int fsdev_dax_zero_page_range(struct dax_device *dax_dev, + pgoff_t pgoff, size_t nr_pages) +{ + void *kaddr; + + WARN_ONCE(nr_pages > 1, "%s: nr_pages > 1\n", __func__); + __fsdev_dax_direct_access(dax_dev, pgoff, 1, DAX_ACCESS, &kaddr, NULL); + fsdev_write_dax(kaddr, ZERO_PAGE(0), 0, PAGE_SIZE); + return 0; +} + +static long fsdev_dax_direct_access(struct dax_device *dax_dev, + pgoff_t pgoff, long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) +{ + return __fsdev_dax_direct_access(dax_dev, pgoff, nr_pages, mode, + kaddr, pfn); +} + +static size_t fsdev_dax_recovery_write(struct dax_device *dax_dev, pgoff_t= pgoff, + void *addr, size_t bytes, struct iov_iter *i) +{ + return _copy_from_iter_flushcache(addr, bytes, i); +} + +static const struct dax_operations dev_dax_ops =3D { + .direct_access =3D fsdev_dax_direct_access, + .zero_page_range =3D fsdev_dax_zero_page_range, + .recovery_write =3D fsdev_dax_recovery_write, +}; =20 static void fsdev_cdev_del(void *cdev) { @@ -168,6 +246,11 @@ static int fsdev_dax_probe(struct dev_dax *dev_dax) } } =20 + /* Cache size now; it cannot change while driver is bound */ + dev_dax->cached_size =3D 0; + for (i =3D 0; i < dev_dax->nr_range; i++) + dev_dax->cached_size +=3D range_len(&dev_dax->ranges[i].range); + /* * FS-DAX compatible mode: Use 
MEMORY_DEVICE_FS_DAX type and * do NOT set vmemmap_shift. This leaves folios at order-0, --=20 2.53.0 From nobody Mon Apr 6 15:50:40 2026 Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A6CE52DB7B7; Thu, 19 Mar 2026 01:30:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=216.40.44.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883824; cv=none; b=kg0aRuL8apvx/XMeANu2kk5vQxSirkqx8lj2R7YBCaH7VbL5rand93NUccSt2XQSouo+C8iJtG4PDLVXxHQPEIP3PlSlKSzxQaQWAqtduP/maypsYvYt+UMlNkxqjovM/+KnvBuFVtnPTEJnvaRGahZibXaYbhU5KWXz3yLeRD4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883824; c=relaxed/simple; bh=EYgs9oY8SZc2u+RfPuZ/4kEb1/RHVm54MSMI2GbEowI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QtoJ5ffVbxELqwMraGw3yYrtE3yJucGiqyJMJFaLQNe33q2wsvDxsjC1uEuhDDhLU4WDScgznKuKzug9sV2AH9pkrEQiKiBONQC27kT/AOeR/ZcmCNXu2Hq3cbkImEDzrMNo1BDY2iWStJ6dI4cmTnlZorH8Y338CYEZkHU6+jo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net; spf=pass smtp.mailfrom=groves.net; arc=none smtp.client-ip=216.40.44.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=groves.net Received: from omf11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 48A76140277; Thu, 19 Mar 2026 01:30:18 +0000 (UTC) Received: from [HIDDEN] (Authenticated sender: john@groves.net) by omf11.hostedemail.com (Postfix) with ESMTPA id 569052002F; Thu, 19 Mar 2026 01:30:07 +0000 (UTC) From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison 
Schofield Cc: John Groves , Jonathan Corbet , Shuah Khan , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V8 6/8] dax: Add dax_set_ops() for setting dax_operations at bind time Date: Wed, 18 Mar 2026 20:30:05 -0500 Message-ID: <20260319013005.4511-1-john@groves.net> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260318202737.4344.dax@groves.net> References: <20260318202737.4344.dax@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 569052002F X-Stat-Signature: jyk5zjweg8k3w8k6fu9z93qowpue78g6 X-Rspamd-Server: rspamout03 X-Session-Marker: 6A6F686E4067726F7665732E6E6574 X-Session-ID: U2FsdGVkX1/cqMgvZOv/sKnlWCv+t+cNQ5jUdmhhhsA= X-HE-Tag: 1773883807-52540 X-HE-Meta: U2FsdGVkX18qSRGOV56IRDCpo8K8tA4/v9vAl3DGm2Qdbzh8uap6q4HGpmUslfRpaBd/8Y/EvlY2AXgM2DdlS6fxk9InURsUJVWRifvGqBa4+Z+34bJYsafLOszfcm458qr8PqufFZ7578fPFEHZuyeVqEOzl9J36NcLfnf5i3wAV1wwmMEurmIfJU/DKVWGtebwd5GCOXdBFk78N8tzqnDrrDx1FMV7Lcotw21t2+t4HcYRIAszhMIGLOICA5tDhgCCwGjzrPEa8/DMz4UFMJ67N33ywnIdUovEiDOT2NR/CGtfEbpTCCn0VrfVGh03CYuuNYL/z+SBSt30q4MT88Sh+3SIrI/Z Content-Type: text/plain; charset="utf-8" From: John Groves Add a new dax_set_ops() function that allows drivers to set the dax_operations after the dax_device has been allocated. 
This is needed for fsdev_dax where the operations need to be set during probe and cleared during unbind. The fsdev driver uses devm_add_action_or_reset() for cleanup consistency, avoiding the complexity of mixing devm-managed resources with manual cleanup in a remove() callback. This ensures cleanup happens automatically in the correct reverse order when the device is unbound. Reviewed-by: Dave Jiang Signed-off-by: John Groves --- drivers/dax/fsdev.c | 16 ++++++++++++++++ drivers/dax/super.c | 38 +++++++++++++++++++++++++++++++++++++- include/linux/dax.h | 1 + 3 files changed, 54 insertions(+), 1 deletion(-) diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c index 5a1e504c9281..36d39f3ef135 100644 --- a/drivers/dax/fsdev.c +++ b/drivers/dax/fsdev.c @@ -117,6 +117,13 @@ static void fsdev_kill(void *dev_dax) kill_dev_dax(dev_dax); } =20 +static void fsdev_clear_ops(void *data) +{ + struct dev_dax *dev_dax =3D data; + + dax_set_ops(dev_dax->dax_dev, NULL); +} + /* * Page map operations for FS-DAX mode * Similar to fsdax_pagemap_ops in drivers/nvdimm/pmem.c @@ -310,6 +317,15 @@ static int fsdev_dax_probe(struct dev_dax *dev_dax) if (rc) return rc; =20 + /* Set the dax operations for fs-dax access path */ + rc =3D dax_set_ops(dax_dev, &dev_dax_ops); + if (rc) + return rc; + + rc =3D devm_add_action_or_reset(dev, fsdev_clear_ops, dev_dax); + if (rc) + return rc; + run_dax(dax_dev); return devm_add_action_or_reset(dev, fsdev_kill, dev_dax); } diff --git a/drivers/dax/super.c b/drivers/dax/super.c index c00b9dff4a06..ba0b4cd18a77 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -157,6 +157,9 @@ long dax_direct_access(struct dax_device *dax_dev, pgof= f_t pgoff, long nr_pages, if (!dax_alive(dax_dev)) return -ENXIO; =20 + if (!dax_dev->ops) + return -EOPNOTSUPP; + if (nr_pages < 0) return -EINVAL; =20 @@ -207,6 +210,10 @@ int dax_zero_page_range(struct dax_device *dax_dev, pg= off_t pgoff, =20 if (!dax_alive(dax_dev)) return -ENXIO; + + if (!dax_dev->ops) + 
return -EOPNOTSUPP; + /* * There are no callers that want to zero more than one page as of now. * Once users are there, this check can be removed after the @@ -223,7 +230,7 @@ EXPORT_SYMBOL_GPL(dax_zero_page_range); size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, void *addr, size_t bytes, struct iov_iter *iter) { - if (!dax_dev->ops->recovery_write) + if (!dax_dev->ops || !dax_dev->ops->recovery_write) return 0; return dax_dev->ops->recovery_write(dax_dev, pgoff, addr, bytes, iter); } @@ -307,6 +314,35 @@ void set_dax_nomc(struct dax_device *dax_dev) } EXPORT_SYMBOL_GPL(set_dax_nomc); =20 +/** + * dax_set_ops - set the dax_operations for a dax_device + * @dax_dev: the dax_device to configure + * @ops: the operations to set (may be NULL to clear) + * + * This allows drivers to set the dax_operations after the dax_device + * has been allocated. This is needed when the device is created before + * the driver that needs specific ops is bound (e.g., fsdev_dax binding + * to a dev_dax created by hmem). + * + * When setting non-NULL ops, fails if ops are already set (returns -EBUSY= ). + * When clearing ops (NULL), always succeeds. 
+ * + * Return: 0 on success, -EBUSY if ops already set + */ +int dax_set_ops(struct dax_device *dax_dev, const struct dax_operations *o= ps) +{ + if (ops) { + /* Setting ops: fail if already set */ + if (cmpxchg(&dax_dev->ops, NULL, ops) !=3D NULL) + return -EBUSY; + } else { + /* Clearing ops: always allowed */ + dax_dev->ops =3D NULL; + } + return 0; +} +EXPORT_SYMBOL_GPL(dax_set_ops); + bool dax_alive(struct dax_device *dax_dev) { lockdep_assert_held(&dax_srcu); diff --git a/include/linux/dax.h b/include/linux/dax.h index 996493f5c538..8d469a23c485 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -245,6 +245,7 @@ static inline void dax_break_layout_final(struct inode = *inode) =20 bool dax_alive(struct dax_device *dax_dev); void *dax_get_private(struct dax_device *dax_dev); +int dax_set_ops(struct dax_device *dax_dev, const struct dax_operations *o= ps); long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_= pages, enum dax_access_mode mode, void **kaddr, unsigned long *pfn); size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void = *addr, --=20 2.53.0 From nobody Mon Apr 6 15:50:40 2026 Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 24CAE2E11A6; Thu, 19 Mar 2026 01:30:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=216.40.44.13 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883842; cv=none; b=pMCFIgS30zjsGbFfsWdPfm5Wt51jgQI87/k1jHqOj/GNSJiH2Ib1QMSruvOpKTYtKRMhwgCv9fGDDsYL2Hxy4zjFuLJczUd9SJ89dTpcCSMPJNV1dt+uyjS1ZSb9abA7ZGfgG6Z8n89Q3+aVsBtLYE51WCptEhdaOe7mgkX3xFs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883842; c=relaxed/simple; bh=mu7F6EbKH3FQO2cw+iEfmXw0RIlWeJ7w3ZCalUplYig=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bjimHQlIO93VmhJLvlifcvCJ3PNZYW0wSwnpErceH3LgZ5IDEhbs/Ah2sfcK9U/CXHcZsEtzH5hgeoT0KvLQvhGG6aWn6z+Slqo52U69f+fc2zeqluKisEcHggQmzpD2jxRyU3tTjv0F9qk8xOSOq8mviCHYWvjzmzCzhR4iapI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net; spf=pass smtp.mailfrom=groves.net; arc=none smtp.client-ip=216.40.44.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=groves.net Received: from omf09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 53E7D1B72E8; Thu, 19 Mar 2026 01:30:34 +0000 (UTC) Received: from [HIDDEN] (Authenticated sender: john@groves.net) by omf09.hostedemail.com (Postfix) with ESMTPA id EE64E2002A; Thu, 19 Mar 2026 01:30:23 +0000 (UTC) From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Shuah Khan , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . 
Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V8 7/8] dax: Add fs_dax_get() func to prepare dax for fs-dax usage Date: Wed, 18 Mar 2026 20:30:22 -0500 Message-ID: <20260319013022.4531-1-john@groves.net> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260318202737.4344.dax@groves.net> References: <20260318202737.4344.dax@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: EE64E2002A X-Stat-Signature: gigd84f1a8scdd4pjso1om9e5abp3egz X-Rspamd-Server: rspamout03 X-Session-Marker: 6A6F686E4067726F7665732E6E6574 X-Session-ID: U2FsdGVkX18W7ZaX2fgubPZ7Q2KPHmP9OrS/mINysaU= X-HE-Tag: 1773883823-871223 X-HE-Meta: U2FsdGVkX19W1sakYRyevFB0WDJ857L+jtQD1a7ycQIGNV3ki62iK9SuRsfk/96KSoJ+U8xsPYGumJkQ+y+6LYcun+5u/nM1/AUdQqonakPdkJNYYoPr5nlew62r92JjU4mGXj9ka3hLCAzHu5N1X8HLmTIjMXknTgCuEiMDfHoMllNVjpJfg7SyPcJoEk5crffZTqyjwK63Swyuz0aPuk3wqMVnyNP2OlH83joyGT+O4jczRZ+wfDPfVZU2Cfqm93QLzcCNBTIXlyCZ7tH0t/0ombLUIWeO3QDBVt2fktG2VhmLS9q4urfDSMz7LLDCBRCx6kV6pC0= Content-Type: text/plain; charset="utf-8" The fs_dax_get() function should be called by fs-dax file systems after opening a fsdev dax device. This adds holder_operations, which provides a memory failure callback path and effects exclusivity between callers of fs_dax_get(). fs_dax_get() is specific to fsdev_dax, so it checks the driver type (which required touching bus.[ch]). fs_dax_get() fails if fsdev_dax is not bound to the memory. 
This function serves the same role as fs_dax_get_by_bdev(), which dax file systems call after opening the pmem block device. This can't be located in fsdev.c because struct dax_device is opaque there. This will be called by fs/fuse/famfs.c in a subsequent commit. Signed-off-by: John Groves --- drivers/dax/bus.c | 2 -- drivers/dax/bus.h | 2 ++ drivers/dax/super.c | 66 ++++++++++++++++++++++++++++++++++++++++++++- include/linux/dax.h | 16 ++++++++--- 4 files changed, 79 insertions(+), 7 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 562e2b06f61a..8a8710a8234e 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -39,8 +39,6 @@ static int dax_bus_uevent(const struct device *dev, struc= t kobj_uevent_env *env) return add_uevent_var(env, "MODALIAS=3D" DAX_DEVICE_MODALIAS_FMT, 0); } =20 -#define to_dax_drv(__drv) container_of_const(__drv, struct dax_device_driv= er, drv) - static struct dax_id *__dax_match_id(const struct dax_device_driver *dax_d= rv, const char *dev_name) { diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h index 880bdf7e72d7..dc6f112ac4a4 100644 --- a/drivers/dax/bus.h +++ b/drivers/dax/bus.h @@ -42,6 +42,8 @@ struct dax_device_driver { void (*remove)(struct dev_dax *dev); }; =20 +#define to_dax_drv(__drv) container_of_const(__drv, struct dax_device_driv= er, drv) + int __dax_driver_register(struct dax_device_driver *dax_drv, struct module *module, const char *mod_name); #define dax_driver_register(driver) \ diff --git a/drivers/dax/super.c b/drivers/dax/super.c index ba0b4cd18a77..d4ab60c406bf 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -14,6 +14,7 @@ #include #include #include "dax-private.h" +#include "bus.h" =20 /** * struct dax_device - anchor object for dax services @@ -111,6 +112,10 @@ struct dax_device *fs_dax_get_by_bdev(struct block_dev= ice *bdev, u64 *start_off, } EXPORT_SYMBOL_GPL(fs_dax_get_by_bdev); =20 +#endif /* CONFIG_BLOCK && CONFIG_FS_DAX */ + +#if IS_ENABLED(CONFIG_FS_DAX) + void 
fs_put_dax(struct dax_device *dax_dev, void *holder) { if (dax_dev && holder && @@ -119,7 +124,66 @@ void fs_put_dax(struct dax_device *dax_dev, void *hold= er) put_dax(dax_dev); } EXPORT_SYMBOL_GPL(fs_put_dax); -#endif /* CONFIG_BLOCK && CONFIG_FS_DAX */ + +/** + * fs_dax_get() - get ownership of a devdax via holder/holder_ops + * + * fs-dax file systems call this function to prepare to use a devdax devic= e for + * fsdax. This is like fs_dax_get_by_bdev(), but the caller already has st= ruct + * dev_dax (and there is no bdev). The holder makes this exclusive. + * + * @dax_dev: dev to be prepared for fs-dax usage + * @holder: filesystem or mapped device inside the dax_device + * @hops: operations for the inner holder + * + * Returns: 0 on success, <0 on failure + */ +int fs_dax_get(struct dax_device *dax_dev, void *holder, + const struct dax_holder_operations *hops) +{ + struct dev_dax *dev_dax; + struct dax_device_driver *dax_drv; + int id; + + id =3D dax_read_lock(); + if (!dax_dev || !dax_alive(dax_dev) || !igrab(&dax_dev->inode)) { + dax_read_unlock(id); + return -ENODEV; + } + dax_read_unlock(id); + + /* Verify the device is bound to fsdev_dax driver */ + dev_dax =3D dax_get_private(dax_dev); + if (!dev_dax) { + iput(&dax_dev->inode); + return -ENODEV; + } + + device_lock(&dev_dax->dev); + if (!dev_dax->dev.driver) { + device_unlock(&dev_dax->dev); + iput(&dax_dev->inode); + return -ENODEV; + } + dax_drv =3D to_dax_drv(dev_dax->dev.driver); + if (dax_drv->type !=3D DAXDRV_FSDEV_TYPE) { + device_unlock(&dev_dax->dev); + iput(&dax_dev->inode); + return -EOPNOTSUPP; + } + device_unlock(&dev_dax->dev); + + if (cmpxchg(&dax_dev->holder_data, NULL, holder)) { + iput(&dax_dev->inode); + return -EBUSY; + } + + dax_dev->holder_ops =3D hops; + + return 0; +} +EXPORT_SYMBOL_GPL(fs_dax_get); +#endif /* CONFIG_FS_DAX */ =20 enum dax_device_flags { /* !alive + rcu grace period =3D=3D no new operations / mappings */ diff --git a/include/linux/dax.h b/include/linux/dax.h 
index 8d469a23c485..f14fa2147175 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -131,7 +131,6 @@ int dax_add_host(struct dax_device *dax_dev, struct gen= disk *disk); void dax_remove_host(struct gendisk *disk); struct dax_device *fs_dax_get_by_bdev(struct block_device *bdev, u64 *star= t_off, void *holder, const struct dax_holder_operations *ops); -void fs_put_dax(struct dax_device *dax_dev, void *holder); #else static inline int dax_add_host(struct dax_device *dax_dev, struct gendisk = *disk) { @@ -146,12 +145,12 @@ static inline struct dax_device *fs_dax_get_by_bdev(s= truct block_device *bdev, { return NULL; } -static inline void fs_put_dax(struct dax_device *dax_dev, void *holder) -{ -} #endif /* CONFIG_BLOCK && CONFIG_FS_DAX */ =20 #if IS_ENABLED(CONFIG_FS_DAX) +void fs_put_dax(struct dax_device *dax_dev, void *holder); +int fs_dax_get(struct dax_device *dax_dev, void *holder, + const struct dax_holder_operations *hops); struct dax_device *inode_dax(struct inode *inode); int dax_writeback_mapping_range(struct address_space *mapping, struct dax_device *dax_dev, struct writeback_control *wbc); @@ -166,6 +165,15 @@ dax_entry_t dax_lock_mapping_entry(struct address_spac= e *mapping, void dax_unlock_mapping_entry(struct address_space *mapping, unsigned long index, dax_entry_t cookie); #else +static inline void fs_put_dax(struct dax_device *dax_dev, void *holder) +{ +} + +static inline int fs_dax_get(struct dax_device *dax_dev, void *holder, + const struct dax_holder_operations *hops) +{ + return -EOPNOTSUPP; +} static inline struct page *dax_layout_busy_page(struct address_space *mapp= ing) { return NULL; --=20 2.53.0 From nobody Mon Apr 6 15:50:40 2026 Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 935F82C11E4; Thu, 19 Mar 2026 01:30:55 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=216.40.44.11 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883859; cv=none; b=MbXqEL7dgQJupdc3vshKBqWlOcCYsPIlEdl3exc1DlvizBPTgspcFbv3mydYfgfPFDunnJlhZK1ev5z06ho7TFDLmrDgh8+ddH9lSMqNzQg9Y2fjVzdvnVjEvWfyobvgF3lomUtjh94TP8EiXnvbxrtq2vp/+8/2BaoGBw8W+YE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1773883859; c=relaxed/simple; bh=sfA+3bZvB69CzyWGHddi43wMVOfaqDDDp4VXMHtxteE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ZBLoea7rQ6NnV4u0xl9xFmRL8mdE8dAs65P2LbQPOKxHB/7hs0fbaBfrXiNoXLu/lvxX8mFRFJo7jn085KUhjk+whWp9kqAmQalupcW9JJIct2e5ZNyVEB18yksfwoCo7Kkz+JyDc7sll05zgyYOy6A8NzGqMsZr41rb3zJVjMA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net; spf=pass smtp.mailfrom=groves.net; arc=none smtp.client-ip=216.40.44.11 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=groves.net Received: from omf17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id DE9F01C3C9; Thu, 19 Mar 2026 01:30:50 +0000 (UTC) Received: from [HIDDEN] (Authenticated sender: john@groves.net) by omf17.hostedemail.com (Postfix) with ESMTPA id 4CF5A17; Thu, 19 Mar 2026 01:30:40 +0000 (UTC) From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Shuah Khan , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . 
Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V8 8/8] dax: export dax_dev_get() Date: Wed, 18 Mar 2026 20:30:38 -0500 Message-ID: <20260319013038.4549-1-john@groves.net> X-Mailer: git-send-email 2.52.0 In-Reply-To: <20260318202737.4344.dax@groves.net> References: <20260318202737.4344.dax@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Server: rspamout05 X-Rspamd-Queue-Id: 4CF5A17 X-Stat-Signature: j193qn9epbsx3jmggzmtcj51pzrybqki X-Session-Marker: 6A6F686E4067726F7665732E6E6574 X-Session-ID: U2FsdGVkX191OAqzPXQRJKjpTo6o4MsYujJdCFxnfJQ= X-HE-Tag: 1773883840-433547 X-HE-Meta: U2FsdGVkX1+5CnNg1C5ZPYEOZ3Q1aVuXASLCeSLcwrIAfnA8vN/LETLXkZgQ1nTKYAOhT1/3y2pwjY2qLC6UfLMeH30OiVHKZ2BDqs9Hi/h6cw3fm126Rv+3td5TDEYGlqOqTARr/bGHqdaa4fyr3VhusGctg+9X05cv8ImdglvUlN+ecwLMvCtF9EOGeSF29SEP9F3m/aLMDOrO7Jtxf4Ujru7EAatRjqHeCeUnKZjGGp8lRHntGMOzr5ISCwdWq77q1jOEMEUIgGNEtlJ5k4Qe30PA8xMqh5HbVLfpFHOe9kQTmYo0wvInw0NeA7lO3Tbi7kq2Eb0ykgBWc0H8Gz3I8IvnyINl Content-Type: text/plain; charset="utf-8" famfs needs to look up a dax_device by dev_t when resolving fmap entries that reference character dax devices. 
Reviewed-by: Dave Jiang Signed-off-by: John Groves --- drivers/dax/super.c | 3 ++- include/linux/dax.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/dax/super.c b/drivers/dax/super.c index d4ab60c406bf..25cf99dd9360 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -521,7 +521,7 @@ static int dax_set(struct inode *inode, void *data) return 0; } =20 -static struct dax_device *dax_dev_get(dev_t devt) +struct dax_device *dax_dev_get(dev_t devt) { struct dax_device *dax_dev; struct inode *inode; @@ -544,6 +544,7 @@ static struct dax_device *dax_dev_get(dev_t devt) =20 return dax_dev; } +EXPORT_SYMBOL_GPL(dax_dev_get); =20 struct dax_device *alloc_dax(void *private, const struct dax_operations *o= ps) { diff --git a/include/linux/dax.h b/include/linux/dax.h index f14fa2147175..2c7aba26a9ad 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -55,6 +55,7 @@ struct dax_device *alloc_dax(void *private, const struct = dax_operations *ops); void *dax_holder(struct dax_device *dax_dev); void put_dax(struct dax_device *dax_dev); void kill_dax(struct dax_device *dax_dev); +struct dax_device *dax_dev_get(dev_t devt); void dax_write_cache(struct dax_device *dax_dev, bool wc); bool dax_write_cache_enabled(struct dax_device *dax_dev); bool dax_synchronous(struct dax_device *dax_dev); --=20 2.53.0
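
For readers following the series: dax_set_ops() (patch 6/8) and fs_dax_get() (patch 7/8) both rely on the same claim pattern, where a pointer slot is installed with cmpxchg only if it is currently NULL and a second claimant gets -EBUSY. What follows is a minimal userspace sketch of that pattern using C11 atomics. It is not kernel code and not part of the series; struct model_ops, struct model_dev, model_dev_init() and model_set_ops() are hypothetical stand-ins invented for illustration, and atomic_compare_exchange_strong() is assumed here as a rough userspace analogue of the kernel's cmpxchg().

```c
/*
 * Userspace model (NOT kernel code) of the "set once, clear always"
 * semantics described for dax_set_ops(). All names are hypothetical.
 */
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dax_operations. */
struct model_ops {
	int id;
};

/* Hypothetical stand-in for the ops slot inside struct dax_device. */
struct model_dev {
	_Atomic(const struct model_ops *) ops;
};

/* Start with an empty slot, as if no driver had bound yet. */
static void model_dev_init(struct model_dev *dev)
{
	atomic_init(&dev->ops, (const struct model_ops *)NULL);
}

/*
 * Install ops only if the slot is empty; clearing (ops == NULL) always
 * succeeds. Returns 0 on success, -EBUSY if ops are already installed.
 */
static int model_set_ops(struct model_dev *dev, const struct model_ops *ops)
{
	const struct model_ops *expected = NULL;

	if (!ops) {
		/* Clearing: unconditional store, mirrors the NULL branch. */
		atomic_store(&dev->ops, (const struct model_ops *)NULL);
		return 0;
	}

	/* Setting: succeeds only when the slot still holds NULL. */
	if (!atomic_compare_exchange_strong(&dev->ops, &expected, ops))
		return -EBUSY;

	return 0;
}
```

In this model, a second model_set_ops() call with non-NULL ops fails with -EBUSY until the slot is cleared, which is the behavior the dax_set_ops() kernel-doc describes. fs_dax_get() applies the same compare-and-swap to holder_data to make the holder exclusive.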