From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang, Matthew Wilcox,
 Jan Kara, Alexander Viro, David Hildenbrand, Christian Brauner,
 "Darrick J . Wong", Randy Dunlap, Jeff Layton, Amir Goldstein,
 Jonathan Cameron, Stefan Hajnoczi, Joanne Koong, Josef Bacik,
 Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 01/21] dax: move dax_pgoff_to_phys from [drivers/dax/] device.c to bus.c
Date: Wed, 7 Jan 2026 09:33:10 -0600
Message-ID: <20260107153332.64727-2-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

This function will be used by both device.c and fsdev.c, but both
are loadable modules. Moving to bus.c puts it in core and makes it available
to both.

No code changes - just relocated.

Signed-off-by: John Groves
---
 drivers/dax/bus.c    | 27 +++++++++++++++++++++++++++
 drivers/dax/device.c | 23 -----------------------
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index fde29e0ad68b..a2f9a3cc30a5 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -7,6 +7,9 @@
 #include
 #include
 #include
+#include
+#include
+#include
 #include "dax-private.h"
 #include "bus.h"
 
@@ -1417,6 +1420,30 @@ static const struct device_type dev_dax_type = {
 	.groups = dax_attribute_groups,
 };
 
+/* see "strong" declaration in tools/testing/nvdimm/dax-dev.c */
+__weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
+		unsigned long size)
+{
+	int i;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct dev_dax_range *dax_range = &dev_dax->ranges[i];
+		struct range *range = &dax_range->range;
+		unsigned long long pgoff_end;
+		phys_addr_t phys;
+
+		pgoff_end = dax_range->pgoff + PHYS_PFN(range_len(range)) - 1;
+		if (pgoff < dax_range->pgoff || pgoff > pgoff_end)
+			continue;
+		phys = PFN_PHYS(pgoff - dax_range->pgoff) + range->start;
+		if (phys + size - 1 <= range->end)
+			return phys;
+		break;
+	}
+	return -1;
+}
+EXPORT_SYMBOL_GPL(dax_pgoff_to_phys);
+
 static struct dev_dax *__devm_create_dev_dax(struct dev_dax_data *data)
 {
 	struct dax_region *dax_region = data->dax_region;
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 22999a402e02..132c1d03fd07 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -57,29 +57,6 @@ static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
 			vma->vm_file, func);
 }
 
-/* see "strong" declaration in tools/testing/nvdimm/dax-dev.c */
-__weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
-		unsigned long size)
-{
-	int i;
-
-	for (i = 0; i < dev_dax->nr_range; i++) {
-		struct dev_dax_range *dax_range = &dev_dax->ranges[i];
-		struct range *range = &dax_range->range;
-		unsigned long long pgoff_end;
-		phys_addr_t phys;
-
-		pgoff_end = dax_range->pgoff + PHYS_PFN(range_len(range)) - 1;
-		if (pgoff < dax_range->pgoff || pgoff > pgoff_end)
-			continue;
-		phys = PFN_PHYS(pgoff - dax_range->pgoff) + range->start;
-		if (phys + size - 1 <= range->end)
-			return phys;
-		break;
-	}
-	return -1;
-}
-
 static void dax_set_mapping(struct vm_fault *vmf, unsigned long pfn,
 			    unsigned long fault_size)
 {
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang, Matthew Wilcox,
 Jan Kara, Alexander Viro, David Hildenbrand, Christian Brauner,
 "Darrick J . Wong", Randy Dunlap, Jeff Layton, Amir Goldstein,
 Jonathan Cameron, Stefan Hajnoczi, Joanne Koong, Josef Bacik,
 Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 02/21] dax: add fsdev.c driver for fs-dax on character dax
Date: Wed, 7 Jan 2026 09:33:11 -0600
Message-ID: <20260107153332.64727-3-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

The new fsdev driver provides pages/folios initialized compatibly with
fsdax - normal rather than devdax-style refcounting, and starting out
with order-0 folios.

When fsdev binds to a daxdev, it is usually (always?) switching from the
devdax mode (device.c), which pre-initializes compound folios according
to its alignment. Fsdev uses fsdev_clear_folio_state() to switch the
folios into a fsdax-compatible state.

A side effect of this is that raw mmap doesn't (can't?) work on an fsdev
dax instance. Accordingly, the fsdev driver does not provide raw mmap -
devices must be put in 'devdax' mode (drivers/dax/device.c) to get raw
mmap capability.

This commit contains just the framework, which remaps pages/folios
compatibly with fsdax.
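For illustration only, and not part of this patch: the folio-state switch described above can be modeled in userspace C. The toy_page structure and both toy_* functions are hypothetical stand-ins invented for this sketch; they are not the kernel's struct page or struct folio, and they do not use any DAX API. The model captures only the bookkeeping idea: devdax-style setup groups pages into compound units of (1 << order) pages, and an fsdev-style reset walks from head to head and returns every page to a standalone order-0 state with no stale mapping.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-page metadata for this sketch; NOT the kernel's struct page. */
struct toy_page {
	unsigned int order;   /* meaningful on the head page only */
	size_t head_idx;      /* index of the head page of this page's group */
	const void *mapping;  /* owner pointer that may be stale */
};

/* Model devdax-style setup: group pages into compound units of (1 << order). */
void toy_init_compound(struct toy_page *pages, size_t nr, unsigned int order)
{
	size_t step;

	assert(order < sizeof(size_t) * CHAR_BIT);
	step = (size_t)1 << order;

	for (size_t i = 0; i < nr; i++) {
		size_t head = i - (i % step);

		pages[i].order = (i == head) ? order : 0;
		pages[i].head_idx = head;
		pages[i].mapping = NULL;
	}
}

/* Model the fsdev-style reset: every page becomes its own order-0 unit. */
void toy_clear_state(struct toy_page *pages, size_t nr)
{
	size_t i = 0;

	while (i < nr) {
		/* Read the group size before the head's order is reset. */
		size_t step = (size_t)1 << pages[i].order;

		for (size_t j = 0; j < step && i + j < nr; j++) {
			pages[i + j].order = 0;
			pages[i + j].head_idx = i + j;
			pages[i + j].mapping = NULL;
		}
		i += step;
	}
}
```

As in the driver's loop, the sketch advances by the group size read from the head page, so each compound unit is visited once rather than once per page.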

Enabling dax changes:

* bus.h: add DAXDRV_FSDEV_TYPE driver type
* bus.c: allow DAXDRV_FSDEV_TYPE drivers to bind to daxdevs
* dax.h: prototype inode_dax(), which fsdev needs

Suggested-by: Dan Williams
Suggested-by: Gregory Price
Signed-off-by: John Groves
---
 MAINTAINERS          |   8 ++
 drivers/dax/Kconfig  |  17 +++
 drivers/dax/Makefile |   2 +
 drivers/dax/bus.c    |   4 +
 drivers/dax/bus.h    |   1 +
 drivers/dax/fsdev.c  | 276 +++++++++++++++++++++++++++++++++++++++++++
 include/linux/dax.h  |   4 +
 7 files changed, 312 insertions(+)
 create mode 100644 drivers/dax/fsdev.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 765ad2daa218..90429cb06090 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7184,6 +7184,14 @@ L:	linux-cxl@vger.kernel.org
 S:	Supported
 F:	drivers/dax/
 
+DEVICE DIRECT ACCESS (DAX) [fsdev_dax]
+M:	John Groves
+M:	John Groves
+L:	nvdimm@lists.linux.dev
+L:	linux-cxl@vger.kernel.org
+S:	Supported
+F:	drivers/dax/fsdev.c
+
 DEVICE FREQUENCY (DEVFREQ)
 M:	MyungJoo Ham
 M:	Kyungmin Park
diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index d656e4c0eb84..491325d914a8 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -78,4 +78,21 @@ config DEV_DAX_KMEM
 
 	  Say N if unsure.
 
+config DEV_DAX_FS
+	tristate "FSDEV DAX: fs-dax compatible device driver"
+	depends on DEV_DAX
+	default DEV_DAX
+	help
+	  Support a device-dax driver mode that is compatible with fs-dax
+	  filesystems. Unlike the standard device-dax driver which
+	  pre-initializes compound folios based on device alignment, this
+	  driver leaves folios uninitialized (similar to pmem) allowing
+	  fs-dax to manage folio lifecycles dynamically.
+
+	  This driver uses MEMORY_DEVICE_FS_DAX type and does not set
+	  vmemmap_shift, making it compatible with filesystems like famfs
+	  that use the iomap-based fs-dax infrastructure.
+
+	  Say M if you plan to use fs-dax filesystems on /dev/dax devices.
+	  Say N if you only need raw character device access to DAX memory.
 endif
diff --git a/drivers/dax/Makefile b/drivers/dax/Makefile
index 5ed5c39857c8..77aa3df3285c 100644
--- a/drivers/dax/Makefile
+++ b/drivers/dax/Makefile
@@ -4,11 +4,13 @@ obj-$(CONFIG_DEV_DAX) += device_dax.o
 obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
 obj-$(CONFIG_DEV_DAX_PMEM) += dax_pmem.o
 obj-$(CONFIG_DEV_DAX_CXL) += dax_cxl.o
+obj-$(CONFIG_DEV_DAX_FS) += fsdev_dax.o
 
 dax-y := super.o
 dax-y += bus.o
 device_dax-y := device.o
 dax_pmem-y := pmem.o
 dax_cxl-y := cxl.o
+fsdev_dax-y := fsdev.o
 
 obj-y += hmem/
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index a2f9a3cc30a5..0d7228acb913 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -84,6 +84,10 @@ static int dax_match_type(const struct dax_device_driver *dax_drv, struct device
 	    !IS_ENABLED(CONFIG_DEV_DAX_KMEM))
 		return 1;
 
+	/* fsdev driver can also bind to device-type dax devices */
+	if (dax_drv->type == DAXDRV_FSDEV_TYPE && type == DAXDRV_DEVICE_TYPE)
+		return 1;
+
 	return 0;
 }
 
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index cbbf64443098..880bdf7e72d7 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -31,6 +31,7 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data);
 enum dax_driver_type {
 	DAXDRV_KMEM_TYPE,
 	DAXDRV_DEVICE_TYPE,
+	DAXDRV_FSDEV_TYPE,
 };
 
 struct dax_device_driver {
diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c
new file mode 100644
index 000000000000..2a3249d1529c
--- /dev/null
+++ b/drivers/dax/fsdev.c
@@ -0,0 +1,276 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2026 Micron Technology, Inc. */
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "dax-private.h"
+#include "bus.h"
+
+/*
+ * FS-DAX compatible devdax driver
+ *
+ * Unlike drivers/dax/device.c which pre-initializes compound folios based
+ * on device alignment (via vmemmap_shift), this driver leaves folios
+ * uninitialized similar to pmem. This allows fs-dax filesystems like famfs
+ * to work without needing special handling for pre-initialized folios.
+ *
+ * Key differences from device.c:
+ * - pgmap type is MEMORY_DEVICE_FS_DAX (not MEMORY_DEVICE_GENERIC)
+ * - vmemmap_shift is NOT set (folios remain order-0)
+ * - fs-dax can dynamically create compound folios as needed
+ * - No mmap support - all access is through fs-dax/iomap
+ */
+
+
+static void fsdev_cdev_del(void *cdev)
+{
+	cdev_del(cdev);
+}
+
+static void fsdev_kill(void *dev_dax)
+{
+	kill_dev_dax(dev_dax);
+}
+
+/*
+ * Page map operations for FS-DAX mode
+ * Similar to fsdax_pagemap_ops in drivers/nvdimm/pmem.c
+ *
+ * Note: folio_free callback is not needed for MEMORY_DEVICE_FS_DAX.
+ * The core mm code in free_zone_device_folio() handles the wake_up_var()
+ * directly for this memory type.
+ */
+static int fsdev_pagemap_memory_failure(struct dev_pagemap *pgmap,
+		unsigned long pfn, unsigned long nr_pages, int mf_flags)
+{
+	struct dev_dax *dev_dax = pgmap->owner;
+	u64 offset = PFN_PHYS(pfn) - dev_dax->ranges[0].range.start;
+	u64 len = nr_pages << PAGE_SHIFT;
+
+	return dax_holder_notify_failure(dev_dax->dax_dev, offset,
+					 len, mf_flags);
+}
+
+static const struct dev_pagemap_ops fsdev_pagemap_ops = {
+	.memory_failure = fsdev_pagemap_memory_failure,
+};
+
+/*
+ * Clear any stale folio state from pages in the given range.
+ * This is necessary because device_dax pre-initializes compound folios
+ * based on vmemmap_shift, and that state may persist after driver unbind.
+ * Since fsdev_dax uses MEMORY_DEVICE_FS_DAX without vmemmap_shift, fs-dax
+ * expects to find clean order-0 folios that it can build into compound
+ * folios on demand.
+ *
+ * At probe time, no filesystem should be mounted yet, so all mappings
+ * are stale and must be cleared along with compound state.
+ */
+static void fsdev_clear_folio_state(struct dev_dax *dev_dax)
+{
+	int i;
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct range *range = &dev_dax->ranges[i].range;
+		unsigned long pfn, end_pfn;
+
+		pfn = PHYS_PFN(range->start);
+		end_pfn = PHYS_PFN(range->end) + 1;
+
+		while (pfn < end_pfn) {
+			struct page *page = pfn_to_page(pfn);
+			struct folio *folio = (struct folio *)page;
+			struct dev_pagemap *pgmap = page_pgmap(page);
+			int order = folio_order(folio);
+
+			/*
+			 * Clear any stale mapping pointer. At probe time,
+			 * no filesystem is mounted, so any mapping is stale.
+			 */
+			folio->mapping = NULL;
+			folio->share = 0;
+
+			if (order > 0) {
+				int j;
+
+				folio_reset_order(folio);
+				for (j = 0; j < (1UL << order); j++) {
+					struct page *p = page + j;
+
+					ClearPageHead(p);
+					clear_compound_head(p);
+					((struct folio *)p)->mapping = NULL;
+					((struct folio *)p)->share = 0;
+					((struct folio *)p)->pgmap = pgmap;
+				}
+				pfn += (1UL << order);
+			} else {
+				folio->pgmap = pgmap;
+				pfn++;
+			}
+		}
+	}
+}
+
+static int fsdev_open(struct inode *inode, struct file *filp)
+{
+	struct dax_device *dax_dev = inode_dax(inode);
+	struct dev_dax *dev_dax = dax_get_private(dax_dev);
+
+	dev_dbg(&dev_dax->dev, "trace\n");
+	filp->private_data = dev_dax;
+
+	return 0;
+}
+
+static int fsdev_release(struct inode *inode, struct file *filp)
+{
+	struct dev_dax *dev_dax = filp->private_data;
+
+	dev_dbg(&dev_dax->dev, "trace\n");
+	return 0;
+}
+
+static const struct file_operations fsdev_fops = {
+	.llseek = noop_llseek,
+	.owner = THIS_MODULE,
+	.open = fsdev_open,
+	.release = fsdev_release,
+};
+
+static int fsdev_dax_probe(struct dev_dax *dev_dax)
+{
+	struct dax_device *dax_dev = dev_dax->dax_dev;
+	struct device *dev = &dev_dax->dev;
+	struct dev_pagemap *pgmap;
+	u64 data_offset = 0;
+	struct inode *inode;
+	struct cdev *cdev;
+	void *addr;
+	int rc, i;
+
+	if (static_dev_dax(dev_dax)) {
+		if (dev_dax->nr_range > 1) {
+			dev_warn(dev,
+				"static pgmap / multi-range device conflict\n");
+			return -EINVAL;
+		}
+
+		pgmap = dev_dax->pgmap;
+	} else {
+		if (dev_dax->pgmap) {
+			dev_warn(dev,
+				"dynamic-dax with pre-populated page map\n");
+			return -EINVAL;
+		}
+
+		pgmap = devm_kzalloc(dev,
+			struct_size(pgmap, ranges, dev_dax->nr_range - 1),
+			GFP_KERNEL);
+		if (!pgmap)
+			return -ENOMEM;
+
+		pgmap->nr_range = dev_dax->nr_range;
+		dev_dax->pgmap = pgmap;
+
+		for (i = 0; i < dev_dax->nr_range; i++) {
+			struct range *range = &dev_dax->ranges[i].range;
+
+			pgmap->ranges[i] = *range;
+		}
+	}
+
+	for (i = 0; i < dev_dax->nr_range; i++) {
+		struct range *range = &dev_dax->ranges[i].range;
+
+		if (!devm_request_mem_region(dev, range->start,
+					range_len(range), dev_name(dev))) {
+			dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve range\n",
+				 i, range->start, range->end);
+			return -EBUSY;
+		}
+	}
+
+	/*
+	 * FS-DAX compatible mode: Use MEMORY_DEVICE_FS_DAX type and
+	 * do NOT set vmemmap_shift. This leaves folios at order-0,
+	 * allowing fs-dax to dynamically create compound folios as needed
+	 * (similar to pmem behavior).
+	 */
+	pgmap->type = MEMORY_DEVICE_FS_DAX;
+	pgmap->ops = &fsdev_pagemap_ops;
+	pgmap->owner = dev_dax;
+
+	/*
+	 * CRITICAL DIFFERENCE from device.c:
+	 * We do NOT set vmemmap_shift here, even if align > PAGE_SIZE.
+	 * This ensures folios remain order-0 and are compatible with
+	 * fs-dax's folio management.
+	 */
+
+	addr = devm_memremap_pages(dev, pgmap);
+	if (IS_ERR(addr))
+		return PTR_ERR(addr);
+
+	/*
+	 * Clear any stale compound folio state left over from a previous
+	 * driver (e.g., device_dax with vmemmap_shift).
+	 */
+	fsdev_clear_folio_state(dev_dax);
+
+	/* Detect whether the data is at a non-zero offset into the memory */
+	if (pgmap->range.start != dev_dax->ranges[0].range.start) {
+		u64 phys = dev_dax->ranges[0].range.start;
+		u64 pgmap_phys = dev_dax->pgmap[0].range.start;
+
+		if (!WARN_ON(pgmap_phys > phys))
+			data_offset = phys - pgmap_phys;
+
+		pr_debug("%s: offset detected phys=%llx pgmap_phys=%llx offset=%llx\n",
+		       __func__, phys, pgmap_phys, data_offset);
+	}
+
+	inode = dax_inode(dax_dev);
+	cdev = inode->i_cdev;
+	cdev_init(cdev, &fsdev_fops);
+	cdev->owner = dev->driver->owner;
+	cdev_set_parent(cdev, &dev->kobj);
+	rc = cdev_add(cdev, dev->devt, 1);
+	if (rc)
+		return rc;
+
+	rc = devm_add_action_or_reset(dev, fsdev_cdev_del, cdev);
+	if (rc)
+		return rc;
+
+	run_dax(dax_dev);
+	return devm_add_action_or_reset(dev, fsdev_kill, dev_dax);
+}
+
+static struct dax_device_driver fsdev_dax_driver = {
+	.probe = fsdev_dax_probe,
+	.type = DAXDRV_FSDEV_TYPE,
+};
+
+static int __init dax_init(void)
+{
+	return dax_driver_register(&fsdev_dax_driver);
+}
+
+static void __exit dax_exit(void)
+{
+	dax_driver_unregister(&fsdev_dax_driver);
+}
+
+MODULE_AUTHOR("John Groves");
+MODULE_DESCRIPTION("FS-DAX Device: fs-dax compatible devdax driver");
+MODULE_LICENSE("GPL");
+module_init(dax_init);
+module_exit(dax_exit);
+MODULE_ALIAS_DAX_DEVICE(0);
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 9d624f4d9df6..74e098010016 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -51,6 +51,10 @@ struct dax_holder_operations {
 
 #if IS_ENABLED(CONFIG_DAX)
 struct dax_device *alloc_dax(void *private, const struct dax_operations *ops);
+
+#if IS_ENABLED(CONFIG_DEV_DAX_FS)
+struct dax_device *inode_dax(struct inode *inode);
+#endif
 void *dax_holder(struct dax_device *dax_dev);
 void put_dax(struct dax_device *dax_dev);
 void kill_dax(struct dax_device *dax_dev);
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang, Matthew Wilcox,
 Jan Kara, Alexander Viro, David Hildenbrand, Christian Brauner,
 "Darrick J . Wong", Randy Dunlap, Jeff Layton, Amir Goldstein,
 Jonathan Cameron, Stefan Hajnoczi, Joanne Koong, Josef Bacik,
 Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 03/21] dax: Save the kva from memremap
Date: Wed, 7 Jan 2026 09:33:12 -0600
Message-ID: <20260107153332.64727-4-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

Save the kva from memremap because we need it for iomap rw support.
Prior to famfs, there were no iomap users of /dev/dax - so the virtual address from memremap was not needed. (also fill in missing kerneldoc comment fields for struct dev_dax) Signed-off-by: John Groves --- drivers/dax/dax-private.h | 4 ++++ drivers/dax/fsdev.c | 1 + 2 files changed, 5 insertions(+) diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index 0867115aeef2..1bb1631af485 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -69,18 +69,22 @@ struct dev_dax_range { * data while the device is activated in the driver. * @region - parent region * @dax_dev - core dax functionality + * @virt_addr - kva from memremap; used by fsdev_dax + * @align - alignment of this instance * @target_node: effective numa node if dev_dax memory range is onlined * @dyn_id: is this a dynamic or statically created instance * @id: ida allocated id when the dax_region is not static * @ida: mapping id allocator * @dev - device core * @pgmap - pgmap for memmap setup / lifetime (driver owned) + * @memmap_on_memory - allow kmem to put the memmap in the memory * @nr_range: size of @ranges * @ranges: range tuples of memory used */ struct dev_dax { struct dax_region *region; struct dax_device *dax_dev; + void *virt_addr; unsigned int align; int target_node; bool dyn_id; diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c index 2a3249d1529c..c5c660b193e5 100644 --- a/drivers/dax/fsdev.c +++ b/drivers/dax/fsdev.c @@ -235,6 +235,7 @@ static int fsdev_dax_probe(struct dev_dax *dev_dax) pr_debug("%s: offset detected phys=3D%llx pgmap_phys=3D%llx offset=3D%ll= x\n", __func__, phys, pgmap_phys, data_offset); } + dev_dax->virt_addr =3D addr + data_offset; =20 inode =3D dax_inode(dax_dev); cdev =3D inode->i_cdev; --=20 2.49.0 From nobody Sat Feb 7 07:10:19 2026 Received: from mail-oi1-f172.google.com (mail-oi1-f172.google.com [209.85.167.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by 
smtp.subspace.kernel.org (Postfix) with ESMTPS id 1668B3AA1AF for ; Wed, 7 Jan 2026 15:33:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800038; cv=none; b=YQrWqDADJpRx9MBzApokmZKpztM16YStwL3BlCCzEayrvKTDyOC6vl0yplTnQEdta4SY+z75er4m6MyRKvAc5RsB3Z0KwVAx8QFqAzurlNExMKXoHMpTt0wEhxTj6JwXH5xjfLn9ujddAen63kv/OlToBy4a8yype6DluIqRDMY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800038; c=relaxed/simple; bh=/zbQHEHtq57A2Q6JHU6gAg/bO8p74zRbMnNSIB3dT5M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Z1jjxAYCtnbGubDpuvoZz/exPmC6gJ3Jol4FH2ejTDIXRoqoIXjQlQeFFP+4lPVtpgyP18N0mJ5nqy2yCAdT+AYwahquF2dLQHTy/UP+gH1TbECSou3nAD6IQ2WztvxaRE8jdwMoWF0TKpb8kiM1KoUDBR3gzebCoKMRDk6lKro= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=lt5Lfk3l; arc=none smtp.client-ip=209.85.167.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="lt5Lfk3l" Received: by mail-oi1-f172.google.com with SMTP id 5614622812f47-44fe903c1d6so619664b6e.0 for ; Wed, 07 Jan 2026 07:33:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767800034; x=1768404834; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=5ATdgisM3Tik/NiyOwZFhtzJ8q/T+7QdescJPqQtpWw=; b=lt5Lfk3lcoB8+ss+HcELQvjRr1WblEz5sVwhQy11HLnS8reSuHpu0lTxPLH+rM+JeV 
VCrxplhNlRYnGmkPBk5t2OCtFD8QuOYBC3nCIMFMYCLWBC7u2L8m/Z5WXVcYs2OiGWQ4 jCeMHvlgM2J4ou0Hv4pkcYCUIE5lcvatGoStX+xMLk8dcL4jzq0TXM2RCVMaeEssfJGe RWoQkyat8cKsyzvQOEZTK1qLBo6gSLWNki5iOgEDxQyuhgsycQYgygg6A3yPgi1+jIEb Wv2vTHskl7GeXZBFWG26D9CKMogBltlOXc6LkbgDYdhtTLzC/AtlDn30DvSdrZz5MeHL pJbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767800034; x=1768404834; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=5ATdgisM3Tik/NiyOwZFhtzJ8q/T+7QdescJPqQtpWw=; b=bRLHxbIb/dOLxQBwaCzHguXjA2xrI+cack/Uf0d7qHKHoRoJLR1axLVndYhDFInFtp Qm2ZSDj+fuGF8E61cH5tIfCNB68aLPA7/Pw8h0mYs8ixmzhm8acuWxPPzjUf5u3RBxXj 6efY5kR3xKjnJ5XVi1zXPRRpRFg/Si6Aqk1v5FHJcNkxU+nz/4kNjZCAxwWX5eaookp8 3ynAfckBfnf41KxJ7itBF8c6Pk/tuDge9p7xaDmjv5C3RwOcrIov4bLips+uDAsbWzbh SBVbIecV1SFyZOemsv+eef2slAIK7n4sQIopdn/N8t6vco1cjV/NxrUzxiR0OjGYsik5 eyfA== X-Forwarded-Encrypted: i=1; AJvYcCXCHci9GCTmBvjc1V2qkzyxs11B322TRYmUclXkUJGR3arC9fPs2tJqjRbq9iO3UAUBpc1sq5+l9G6mRGA=@vger.kernel.org X-Gm-Message-State: AOJu0Yy1AVuInFDqOHNbshamjgL1J+I3sFUW4r46l1zwnLmrRGbrFiPG 6kTHC45ip9hi5N+w+3WzJJ9F/QMIQhhhE+VrL326Hv/ERDMVaUagbBNC X-Gm-Gg: AY/fxX7yf4CrAAxEqm3fjuaKKJGsHCGKDsJKu8wWkKdHpw5YwuoK9KT/UAKtYHj/Ss7 3qkIFBYXyK2lYhm3wk4twbb93IUWFdSo1Tme3BoUtTVG4PWNEJ6ubZ+R2VjF/iwu6KtGenbc31m 8cv3AwZrtfX6JeOS7HbGw6uV9xrHtcrGsuMcT/TWOsWT+pT5yM2JjSAydxlXJ1BQ4h40EY83Vhc 8WZBuolY5zEgTgBBENffZt2xoSMq+0vv1Ko2ZtLaynwoeeFo0oauyYNqziM4WRVH5K3ghkG4BHH OJuzoU8YW7B/bVM+OnC6ihA321klf32rbgdDeWESKGEZBW4RStSnVFe46wjeItV6KA5i+Lgp4Jk pffTD23X2sQL/AZWPwIGCx7bgyOYUfhytfELx86BxyDrQXH7FZJ7QqsSQhhITnu5+YpvcYArwon 5JjteH9O0RnXyetFmmtQTjF4cVuIoA2zkdR+xqvYAUiEW7DjY0Xgkx5zM= X-Google-Smtp-Source: AGHT+IHve8YogzpLKDi2h5kk3r5qd0JvNPCUalMGA+1cdHL4kOqm5K/1cEfDtLFJl4ZqKfKbsphkmg== X-Received: by 2002:a05:6808:179e:b0:450:bbed:7a75 with SMTP id 5614622812f47-45a6bde282cmr1183058b6e.28.1767800033754; Wed, 07 Jan 
2026 07:33:53 -0800 (PST) Received: from localhost.localdomain ([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.33.51 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:33:53 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 04/21] dax: Add dax_operations for use by fs-dax on fsdev dax Date: Wed, 7 Jan 2026 09:33:13 -0600 Message-ID: <20260107153332.64727-5-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: John Groves * These methods are based on pmem_dax_ops from drivers/nvdimm/pmem.c * fsdev_dax_direct_access() returns the hpa, pfn and kva. The kva is stored as dev_dax->virt_addr by fsdev_dax_probe() (added in the previous patch).
* The hpa/pfn are used for mmap (dax_iomap_fault()), and the kva is used for read/write (dax_iomap_rw()) * fsdev_dax_recovery_write() and fsdev_dax_zero_page_range() have not been tested yet. I'm looking for suggestions as to how to test those. * dax-private.h: add dev_dax->cached_size, which fsdev needs to remember. The dev_dax size cannot change while a driver is bound (dev_dax_resize returns -EBUSY if dev->driver is set). Caching the size at probe time allows fsdev's direct_access path to use it without acquiring dax_dev_rwsem (which isn't exported anyway). Signed-off-by: John Groves --- drivers/dax/dax-private.h | 1 + drivers/dax/fsdev.c | 80 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 81 insertions(+) diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h index 1bb1631af485..fbd8348cc71c 100644 --- a/drivers/dax/dax-private.h +++ b/drivers/dax/dax-private.h @@ -85,6 +85,7 @@ struct dev_dax { struct dax_region *region; struct dax_device *dax_dev; void *virt_addr; + u64 cached_size; unsigned int align; int target_node; bool dyn_id; diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c index c5c660b193e5..9e2f83aa2584 100644 --- a/drivers/dax/fsdev.c +++ b/drivers/dax/fsdev.c @@ -27,6 +27,81 @@ * - No mmap support - all access is through fs-dax/iomap */ =20 +static void fsdev_write_dax(void *pmem_addr, struct page *page, + unsigned int off, unsigned int len) +{ + while (len) { + void *mem =3D kmap_local_page(page); + unsigned int chunk =3D min_t(unsigned int, len, PAGE_SIZE - off); + + memcpy_flushcache(pmem_addr, mem + off, chunk); + kunmap_local(mem); + len -=3D chunk; + off =3D 0; + page++; + pmem_addr +=3D chunk; + } +} + +static long __fsdev_dax_direct_access(struct dax_device *dax_dev, pgoff_t = pgoff, + long nr_pages, enum dax_access_mode mode, void **kaddr, + unsigned long *pfn) +{ + struct dev_dax *dev_dax =3D dax_get_private(dax_dev); + size_t size =3D nr_pages << PAGE_SHIFT; + size_t offset =3D pgoff << PAGE_SHIFT; + void
*virt_addr =3D dev_dax->virt_addr + offset; + phys_addr_t phys; + unsigned long local_pfn; + + WARN_ON(!dev_dax->virt_addr); + + phys =3D dax_pgoff_to_phys(dev_dax, pgoff, nr_pages << PAGE_SHIFT); + + if (kaddr) + *kaddr =3D virt_addr; + + local_pfn =3D PHYS_PFN(phys); + if (pfn) + *pfn =3D local_pfn; + + /* + * Use cached_size which was computed at probe time. The size cannot + * change while the driver is bound (resize returns -EBUSY). + */ + return PHYS_PFN(min_t(size_t, size, dev_dax->cached_size - offset)); +} + +static int fsdev_dax_zero_page_range(struct dax_device *dax_dev, + pgoff_t pgoff, size_t nr_pages) +{ + void *kaddr; + + WARN_ONCE(nr_pages > 1, "%s: nr_pages > 1\n", __func__); + __fsdev_dax_direct_access(dax_dev, pgoff, 1, DAX_ACCESS, &kaddr, NULL); + fsdev_write_dax(kaddr, ZERO_PAGE(0), 0, PAGE_SIZE); + return 0; +} + +static long fsdev_dax_direct_access(struct dax_device *dax_dev, + pgoff_t pgoff, long nr_pages, enum dax_access_mode mode, + void **kaddr, unsigned long *pfn) +{ + return __fsdev_dax_direct_access(dax_dev, pgoff, nr_pages, mode, + kaddr, pfn); +} + +static size_t fsdev_dax_recovery_write(struct dax_device *dax_dev, pgoff_t= pgoff, + void *addr, size_t bytes, struct iov_iter *i) +{ + return _copy_from_iter_flushcache(addr, bytes, i); +} + +static const struct dax_operations dev_dax_ops =3D { + .direct_access =3D fsdev_dax_direct_access, + .zero_page_range =3D fsdev_dax_zero_page_range, + .recovery_write =3D fsdev_dax_recovery_write, +}; =20 static void fsdev_cdev_del(void *cdev) { @@ -197,6 +272,11 @@ static int fsdev_dax_probe(struct dev_dax *dev_dax) } } =20 + /* Cache size now; it cannot change while driver is bound */ + dev_dax->cached_size =3D 0; + for (i =3D 0; i < dev_dax->nr_range; i++) + dev_dax->cached_size +=3D range_len(&dev_dax->ranges[i].range); + /* * FS-DAX compatible mode: Use MEMORY_DEVICE_FS_DAX type and * do NOT set vmemmap_shift. 
This leaves folios at order-0, --=20 2.49.0 From nobody Sat Feb 7 07:10:19 2026 Received: from mail-oi1-f175.google.com (mail-oi1-f175.google.com [209.85.167.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A560B3B8D58 for ; Wed, 7 Jan 2026 15:33:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800040; cv=none; b=GqJs4oJp2qbbtP+VcrobFpOjns8hJgB6XDdxaxAzcsSdxDtK39H5U3cN5MXBOmFDu/WH+W6EE7vpQ/7V0Wa4zrBOxiwEqwd6/AnGfJqAG2WKmT16kxQm9RF1qElJaJSD5x11CbJZintUYzh/a71KmY7ko58BTtZ8QuXgxJ6KjRY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800040; c=relaxed/simple; bh=Z9XPB5cChIo/t/9LU2WCPFfPnMA+oTk5cMf0x2oZ3bQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QMPFfovgymNrXwmcyTKxSU1NUU7tLGnTuKwnFGxxx/E4Lf0NLMI7JRqrJSfA58rgBkxBQVBT/ZGcHRVrA04j7xBDff/E5W4/gZEAYpVlpqMR+J32tiaj+fl7PvZKAfkjHGRSU5DqDKW0j5Re84WX10AHXgMW3vuN+jKKLr6PSXs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=jYZfh877; arc=none smtp.client-ip=209.85.167.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jYZfh877" Received: by mail-oi1-f175.google.com with SMTP id 5614622812f47-450b3f60c31so1049240b6e.3 for ; Wed, 07 Jan 2026 07:33:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767800036; x=1768404836; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=OlFXJUauQcVO7MpBSFt3XLVu48darYU11J/SZXzaKDg=; b=jYZfh877RReMmfFRD6GWP3bQzt1ANjBRZGVIUOLvyUfx9OnJZG/H1IeyRu+MhrLzRQ 23u/9qutVr5C00CfshrsMxHls3MQCfG1ReI2Gst4aqyPyo9sQ7Um/VmitCMnFh+QXkR3 FfY+GEqtWmwwLT8/zv/57whvbIV4yGUyJ60GtUA0//hbmft6k4TzTWzOxboL/aT2u18f +jc9RgK9PyGsaygo3mSg61TGLX1gvq/mTe5UhBk3cK4HcBWCYOjhS9CQ7ccWC5rR3tKH KPGAI3jaBPT14jjTxP8bwsB6RQwX/3Y9Ont8y49DbY1RD0VA5BGNqnLzgDzT08BNWTPx 2oNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767800036; x=1768404836; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=OlFXJUauQcVO7MpBSFt3XLVu48darYU11J/SZXzaKDg=; b=mczOpHU8Yy+YO6j1Fo7KfO+VZvBqng9TnNUMyAJK8pEvNe0cQ5u49JXkBNXGOtedEx jjbknD2jEgD/eKsnABTwPmPbwZ4at/LYKc27sKvSmk6nqzpKCwbEbLJKO8gLuftJWYpg QvDYPUp1t27KDaxReHkNTxOqs/GqUoq3gF1PL0hd9Jomp27Q0KuSKBdrPrOjXq5oPxMd THa3m1I0RlBhthSZ2A00UM2xONd1LvIGz9ag6gTvaXBz5Tl/0YoKRNAewZOqoQx2ONq4 10Xb+hZLBowvB+fYjZc89ZulNVgNXVEs1BBotH0u4ZB37WDwmWxiXUc6Xxn+KK1nPGwm Bgjg== X-Forwarded-Encrypted: i=1; AJvYcCU9iaL6EBX5h8Y7r7SlN1NisH+iCPk+DAjaCA1/GKJCDCPEJhqmtRMhuc5LkiJQ6JphsCkOebIPXqqi2BE=@vger.kernel.org X-Gm-Message-State: AOJu0YyGBbUSFVOvzVh0raqAmlk7+xFMbuoE0yYzlv6cz4Cx5avVOZ2O 9Y4iGJAvFkg5WwVtdEZyMj79Gy88x7DXd1b8napmEOvlIRZjxdEmAHmA X-Gm-Gg: AY/fxX6v7sytF03i8IZ4GhrtQ8pRYj4+xozcvrWsAXn8CtEb+mjBvLKfwYdCyzMGZ/T r7oU8gUgE2A+uhnKb+Df/djzgwS4gAkG64soBXcPqg8m2i65PwdfyUkP9UO8vtcB8BmTcVJAWpl Rd4gKbfN4gy7axmbulpNznoBXB8EBFk9Zpk5AC5eo2OS/+X+XQNdMcAalPCMMDPuSUwS54TymCv iIqgn/1TvxTawXsYQM0ODdOzVut0LCkwzHNI0Ku8D5WU2rzEePMYG5ceSg7BtQLj6kJ7EHhq7rN bVyfoSdmZqUDL56uF+tdHX0TTTS0To5JzVSMwIsep565C3XTfHIc+j19mYs/1lm+cdA2KRjoUtq YWeO9rwZhmyq4x+1lybj6YIp5Z4EAUYC+jMDlx3GMCjUyQL3Wi2QqGFeW6/Z/3WX91j9aF7ziiq 
fnnNZ9J51y13Ben/MPK1rIA4jUbfJi9zpBDr+R5RbrL2HS X-Google-Smtp-Source: AGHT+IGZ+SYDHTTG5p48tl+QB5iPnahkcLty5U9a8Qi/hJOxw+cPuomkuu2gZKHVriePbcQqcWrLVA== X-Received: by 2002:a05:6808:221e:b0:43f:7287:a5de with SMTP id 5614622812f47-45a6bdfb427mr1080778b6e.41.1767800036459; Wed, 07 Jan 2026 07:33:56 -0800 (PST) Received: from localhost.localdomain ([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.33.54 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:33:56 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 05/21] dax: Add dax_set_ops() for setting dax_operations at bind time Date: Wed, 7 Jan 2026 09:33:14 -0600 Message-ID: <20260107153332.64727-6-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: John Groves The dax_device is created (in the non-pmem case) at 
hmem probe time via devm_create_dev_dax(), before we know which driver (device_dax, fsdev_dax, or kmem) will bind. Because alloc_dax() is called with NULL ops at that point, drivers (i.e. fsdev_dax) that need specific dax_operations must set them later. Add an exported dax_set_ops() function so fsdev_dax can set its ops at probe time and clear them on remove. device_dax doesn't need ops since it uses the mmap fault path directly. Use cmpxchg() to atomically set ops only if they are currently NULL, returning -EBUSY if ops are already set. This prevents accidental double-binding. Clearing ops (NULL) always succeeds. Signed-off-by: John Groves --- drivers/dax/fsdev.c | 12 ++++++++++++ drivers/dax/super.c | 38 +++++++++++++++++++++++++++++++++++++- include/linux/dax.h | 1 + 3 files changed, 50 insertions(+), 1 deletion(-) diff --git a/drivers/dax/fsdev.c b/drivers/dax/fsdev.c index 9e2f83aa2584..3f4f593896e3 100644 --- a/drivers/dax/fsdev.c +++ b/drivers/dax/fsdev.c @@ -330,12 +330,24 @@ static int fsdev_dax_probe(struct dev_dax *dev_dax) if (rc) return rc; =20 + /* Set the dax operations for fs-dax access path */ + rc =3D dax_set_ops(dax_dev, &dev_dax_ops); + if (rc) + return rc; + run_dax(dax_dev); return devm_add_action_or_reset(dev, fsdev_kill, dev_dax); } =20 +static void fsdev_dax_remove(struct dev_dax *dev_dax) +{ + /* Clear ops on unbind so they aren't used with a different driver */ + dax_set_ops(dev_dax->dax_dev, NULL); +} + static struct dax_device_driver fsdev_dax_driver =3D { .probe =3D fsdev_dax_probe, + .remove =3D fsdev_dax_remove, .type =3D DAXDRV_FSDEV_TYPE, }; =20 diff --git a/drivers/dax/super.c b/drivers/dax/super.c index c00b9dff4a06..ba0b4cd18a77 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -157,6 +157,9 @@ long dax_direct_access(struct dax_device *dax_dev, pgof= f_t pgoff, long nr_pages, if (!dax_alive(dax_dev)) return -ENXIO; =20 + if (!dax_dev->ops) + return -EOPNOTSUPP; + if (nr_pages < 0) return -EINVAL; =20 @@ -207,6 +210,10 @@ int
dax_zero_page_range(struct dax_device *dax_dev, pg= off_t pgoff, =20 if (!dax_alive(dax_dev)) return -ENXIO; + + if (!dax_dev->ops) + return -EOPNOTSUPP; + /* * There are no callers that want to zero more than one page as of now. * Once users are there, this check can be removed after the @@ -223,7 +230,7 @@ EXPORT_SYMBOL_GPL(dax_zero_page_range); size_t dax_recovery_write(struct dax_device *dax_dev, pgoff_t pgoff, void *addr, size_t bytes, struct iov_iter *iter) { - if (!dax_dev->ops->recovery_write) + if (!dax_dev->ops || !dax_dev->ops->recovery_write) return 0; return dax_dev->ops->recovery_write(dax_dev, pgoff, addr, bytes, iter); } @@ -307,6 +314,35 @@ void set_dax_nomc(struct dax_device *dax_dev) } EXPORT_SYMBOL_GPL(set_dax_nomc); =20 +/** + * dax_set_ops - set the dax_operations for a dax_device + * @dax_dev: the dax_device to configure + * @ops: the operations to set (may be NULL to clear) + * + * This allows drivers to set the dax_operations after the dax_device + * has been allocated. This is needed when the device is created before + * the driver that needs specific ops is bound (e.g., fsdev_dax binding + * to a dev_dax created by hmem). + * + * When setting non-NULL ops, fails if ops are already set (returns -EBUSY= ). + * When clearing ops (NULL), always succeeds. 
+ * + * Return: 0 on success, -EBUSY if ops already set + */ +int dax_set_ops(struct dax_device *dax_dev, const struct dax_operations *o= ps) +{ + if (ops) { + /* Setting ops: fail if already set */ + if (cmpxchg(&dax_dev->ops, NULL, ops) !=3D NULL) + return -EBUSY; + } else { + /* Clearing ops: always allowed */ + dax_dev->ops =3D NULL; + } + return 0; +} +EXPORT_SYMBOL_GPL(dax_set_ops); + bool dax_alive(struct dax_device *dax_dev) { lockdep_assert_held(&dax_srcu); diff --git a/include/linux/dax.h b/include/linux/dax.h index 74e098010016..3fcd8562b72b 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -246,6 +246,7 @@ static inline void dax_break_layout_final(struct inode = *inode) =20 bool dax_alive(struct dax_device *dax_dev); void *dax_get_private(struct dax_device *dax_dev); +int dax_set_ops(struct dax_device *dax_dev, const struct dax_operations *o= ps); long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_= pages, enum dax_access_mode mode, void **kaddr, unsigned long *pfn); size_t dax_copy_from_iter(struct dax_device *dax_dev, pgoff_t pgoff, void = *addr, --=20 2.49.0 From nobody Sat Feb 7 07:10:19 2026 Received: from mail-oi1-f180.google.com (mail-oi1-f180.google.com [209.85.167.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 70978318ECC for ; Wed, 7 Jan 2026 15:34:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800042; cv=none; b=nwPP/gSax8QjqzVEbph4KeURgKsSU0RRx6WOzZYyPiU3jeMsCplI/O6GY7jC6qeHMSo9UgGKh7wJhETN2IVpr2J9f3mUmnOxdB2DvutpND8LFb/IvIo7nKFr8ibaDUxCaRjx4Sg5xtt5h2NidwoNbZa70DXj5w+DPeRICJzMiT0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800042; c=relaxed/simple; bh=WRVZBT8kMG5wCLoPZZC5bOMvnEB+0NlJlCyntEWpc5Q=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jboP9X5F9TKlBy7+AsXce8Hoaa37b7TUulXyq0iHff6sKKsEGRFhtbnhRqoJXQz6FCVSs3WopOhFrr5mkNnAVSFsV3wn9AwYVc52tFtggoKfaO7pPVo1EtO37171A5aVX+icaypyiclHF79Jo/zSUGJ+nS1hDiHKtj6dZXKCtsE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=aEhR4xQC; arc=none smtp.client-ip=209.85.167.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="aEhR4xQC" Received: by mail-oi1-f180.google.com with SMTP id 5614622812f47-455bef556a8so1445250b6e.1 for ; Wed, 07 Jan 2026 07:34:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767800039; x=1768404839; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=MFpUE248sHWKO0jMsz45uNwCckbm+2ReGU0X8tOcxqc=; b=aEhR4xQCYQN3NA9/prqTA48FLx6GDk1H72V6kODI+yl1ODxSilRXDuEwIyV76CEy/p HEFbk6y3GqE7PwAVvx9pUIzpC8wqz4KOTqrU2pCOXpbfyEHfYb+gXkuG4TW0FWtPG8DG c19y5MAHm1U4E7QzIxQhIpYLMrjWDSxPZUhdesQngGpEfo1Dar0Vz5+oAOKlgRLv6dGs JNT/UEvWSDizb8H9D1IvU5b9i7ENu0tcpN59iOh6T348FGwyjPgCP0yZmvZYpdw+3EyK s/tUeECj2HEWt0bvWE2k+6b3/qfHTBKeNSYJXC7m+DNDHEMdbKMYwgVbpK22xzE7tWdl XAxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767800039; x=1768404839; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=MFpUE248sHWKO0jMsz45uNwCckbm+2ReGU0X8tOcxqc=; 
b=c3dW/PM+FudZ128xuzCi2dHeVEGuUc50T2WaManhKFtpzG1+cRDLfbF22SpxH3QP4N I3yDwm6tYRqfU4IY2yAV59VXS1lCMRcD3cwnix/BbAnCUNpEbG4+3CiT/CgVcI+Eudpy Hf6jw1f1LUGmWxwIy9/iL/6ke1AcC11gUXKGxVUWS0bFc88/hlP0CHainKRiTkipr1I2 XbLKZRJDUDN7hdFYXOIi067qmCBu8a7JjYyXSpUN6Iu13cqscGRxrAbo42qbPKxgU+6x U9GIiJjows+4P/6RQa0HNKFhb25EQVPQKze/0XITMdx2UGQhqgzKzKoXWcumJAc2HYbo 9icg== X-Forwarded-Encrypted: i=1; AJvYcCUWrZaORFCMWb2zduhQSjV4917HfW8rwRS1xQzCafoKktjMjk/D4RFK9dWywbMQCSHkj4g2aLZIt6Ks6tc=@vger.kernel.org X-Gm-Message-State: AOJu0YyuSKr8ZQmQ0i25cqMxNYQtPmE14D1UC52gjdU7hWCX5SHinHVw CbAOUunXAsfMSatDp1LyftY6CtzMgcaHYT76Qbps9I/JTt7ir4/foVJR X-Gm-Gg: AY/fxX46rnp2WGzabzi2HRGikPXdTtVBSKSuENAlyWcNJyxtHjXNLx3Qa8HOzseVhAh DQdqcQ82y7RpAYmP+0AZGGVxeqZitE1zSzpSiAvycpfHcH6Nu4MqmNoBjsOBLn1UIbT4XDF46V4 2JAMf5h/L77uXD15FA/sZDmXPss6iHR8W8ZSwsCpd2xkqO/9dRBiWUxNcyR8ecnXn8n6xlpOxt9 qYB5oVWcZnpdOKrfwImG94pxN9EwQO4F2E5RRY+VLx5PAIyc6FK5ulrelQzPi8PQ5s+GhXgaJ0Q BtfvHtjnYIEZwJth7yTi3WgFA6CReYLWa/rhceqQ50KhXOG0E88fp/LGewk/Dkn+/6HzNcbGgnY XUPZuC3gU7U1zO9HRpqEwvlc62rDqKM26TxqdmKfwb0QvwnKLQ++X0GASuIXLathUZQEZxAHirx sIhrSRo0CC5kVMmhOulUojepqyGduADm4xO/16a201iyaE X-Google-Smtp-Source: AGHT+IEszE+IwGAo4ineGymBiCez8ScClYQB6AM3nCGB9mlkfxe96RTOF5+tZUxQ9+aCXt3y02aEdQ== X-Received: by 2002:a05:6808:6412:b0:455:f4e7:d09a with SMTP id 5614622812f47-45a6bccc205mr1321889b6e.12.1767800039196; Wed, 07 Jan 2026 07:33:59 -0800 (PST) Received: from localhost.localdomain ([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.33.57 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:33:58 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . 
Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 06/21] dax: Add fs_dax_get() func to prepare dax for fs-dax usage Date: Wed, 7 Jan 2026 09:33:15 -0600 Message-ID: <20260107153332.64727-7-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The fs_dax_get() function should be called by fs-dax file systems after opening an fsdev dax device. It installs the holder and its holder_operations, which provide a memory failure callback path and enforce exclusivity between callers of fs_dax_get(). fs_dax_get() is specific to fsdev_dax, so it checks the driver type (which required touching bus.[ch]). fs_dax_get() fails if fsdev_dax is not bound to the memory. This function serves the same role as fs_dax_get_by_bdev(), which dax file systems call after opening the pmem block device. fs_dax_get() can't be located in fsdev.c because struct dax_device is opaque there. It will be called by fs/fuse/famfs.c in a subsequent commit.
Signed-off-by: John Groves --- drivers/dax/bus.c | 2 -- drivers/dax/bus.h | 2 ++ drivers/dax/super.c | 54 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/dax.h | 1 + 4 files changed, 57 insertions(+), 2 deletions(-) diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c index 0d7228acb913..6e0e28116edc 100644 --- a/drivers/dax/bus.c +++ b/drivers/dax/bus.c @@ -42,8 +42,6 @@ static int dax_bus_uevent(const struct device *dev, struc= t kobj_uevent_env *env) return add_uevent_var(env, "MODALIAS=3D" DAX_DEVICE_MODALIAS_FMT, 0); } =20 -#define to_dax_drv(__drv) container_of_const(__drv, struct dax_device_driv= er, drv) - static struct dax_id *__dax_match_id(const struct dax_device_driver *dax_d= rv, const char *dev_name) { diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h index 880bdf7e72d7..dc6f112ac4a4 100644 --- a/drivers/dax/bus.h +++ b/drivers/dax/bus.h @@ -42,6 +42,8 @@ struct dax_device_driver { void (*remove)(struct dev_dax *dev); }; =20 +#define to_dax_drv(__drv) container_of_const(__drv, struct dax_device_driv= er, drv) + int __dax_driver_register(struct dax_device_driver *dax_drv, struct module *module, const char *mod_name); #define dax_driver_register(driver) \ diff --git a/drivers/dax/super.c b/drivers/dax/super.c index ba0b4cd18a77..68c45b918cff 100644 --- a/drivers/dax/super.c +++ b/drivers/dax/super.c @@ -14,6 +14,7 @@ #include #include #include "dax-private.h" +#include "bus.h" =20 /** * struct dax_device - anchor object for dax services @@ -121,6 +122,59 @@ void fs_put_dax(struct dax_device *dax_dev, void *hold= er) EXPORT_SYMBOL_GPL(fs_put_dax); #endif /* CONFIG_BLOCK && CONFIG_FS_DAX */ =20 +#if IS_ENABLED(CONFIG_DEV_DAX_FS) +/** + * fs_dax_get() - get ownership of a devdax via holder/holder_ops + * + * fs-dax file systems call this function to prepare to use a devdax devic= e for + * fsdax. This is like fs_dax_get_by_bdev(), but the caller already has st= ruct + * dev_dax (and there is no bdev). The holder makes this exclusive. 
+ *
+ * @dax_dev: dev to be prepared for fs-dax usage
+ * @holder: filesystem or mapped device inside the dax_device
+ * @hops: operations for the inner holder
+ *
+ * Returns: 0 on success, <0 on failure
+ */
+int fs_dax_get(struct dax_device *dax_dev, void *holder,
+	       const struct dax_holder_operations *hops)
+{
+	struct dev_dax *dev_dax;
+	struct dax_device_driver *dax_drv;
+	int id;
+
+	id = dax_read_lock();
+	if (!dax_dev || !dax_alive(dax_dev) || !igrab(&dax_dev->inode)) {
+		dax_read_unlock(id);
+		return -ENODEV;
+	}
+	dax_read_unlock(id);
+
+	/* Verify the device is bound to fsdev_dax driver */
+	dev_dax = dax_get_private(dax_dev);
+	if (!dev_dax || !dev_dax->dev.driver) {
+		iput(&dax_dev->inode);
+		return -ENODEV;
+	}
+
+	dax_drv = to_dax_drv(dev_dax->dev.driver);
+	if (dax_drv->type != DAXDRV_FSDEV_TYPE) {
+		iput(&dax_dev->inode);
+		return -EOPNOTSUPP;
+	}
+
+	if (cmpxchg(&dax_dev->holder_data, NULL, holder)) {
+		iput(&dax_dev->inode);
+		return -EBUSY;
+	}
+
+	dax_dev->holder_ops = hops;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(fs_dax_get);
+#endif /* DEV_DAX_FS */
+
 enum dax_device_flags {
 	/* !alive + rcu grace period == no new operations / mappings */
 	DAXDEV_ALIVE,
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 3fcd8562b72b..76f2a75f3144 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -53,6 +53,7 @@ struct dax_holder_operations {
 struct dax_device *alloc_dax(void *private, const struct dax_operations *ops);
 
 #if IS_ENABLED(CONFIG_DEV_DAX_FS)
+int fs_dax_get(struct dax_device *dax_dev, void *holder, const struct dax_holder_operations *hops);
 struct dax_device *inode_dax(struct inode *inode);
 #endif
 void *dax_holder(struct dax_device *dax_dev);
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang, Matthew Wilcox,
 Jan Kara, Alexander Viro, David Hildenbrand, Christian Brauner,
 "Darrick J . Wong", Randy Dunlap, Jeff Layton, Amir Goldstein,
 Jonathan Cameron, Stefan Hajnoczi, Joanne Koong, Josef Bacik,
 Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba, Sean Christopherson,
 Shivank Garg, Ackerley Tng, Gregory Price, Aravind Ramesh, Ajay Joshi,
 venkataravis@micron.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
 linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 07/21] dax: prevent driver unbind while filesystem holds device
Date: Wed, 7 Jan 2026 09:33:16 -0600
Message-ID: <20260107153332.64727-8-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

From: John Groves

Add custom bind/unbind sysfs attributes for the dax bus that check
whether a filesystem has registered as a holder (via fs_dax_get()) before
allowing driver unbind.

When a filesystem like famfs mounts on a dax device, it registers itself
as the holder via dax_holder_ops.
Previously, there was no mechanism to prevent driver unbind while the
filesystem was mounted, which could cause some havoc.

The new unbind_store() checks dax_holder() and returns -EBUSY if a
holder is registered, giving userspace proper feedback that the device
is in use.

To use our custom bind/unbind handlers instead of the default ones, set
suppress_bind_attrs=true on all dax drivers during registration.

Signed-off-by: John Groves
---
 drivers/dax/bus.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 6e0e28116edc..ed453442739d 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -151,9 +151,61 @@ static ssize_t remove_id_store(struct device_driver *drv, const char *buf,
 }
 static DRIVER_ATTR_WO(remove_id);
 
+static const struct bus_type dax_bus_type;
+
+/*
+ * Custom bind/unbind handlers for dax bus.
+ * The unbind handler checks if a filesystem holds the dax device and
+ * returns -EBUSY if so, preventing driver unbind while in use.
+ */
+static ssize_t unbind_store(struct device_driver *drv, const char *buf,
+			    size_t count)
+{
+	struct device *dev;
+	int rc = -ENODEV;
+
+	dev = bus_find_device_by_name(&dax_bus_type, NULL, buf);
+	if (dev && dev->driver == drv) {
+		struct dev_dax *dev_dax = to_dev_dax(dev);
+
+		if (dax_holder(dev_dax->dax_dev)) {
+			dev_dbg(dev,
+				"%s: blocking unbind due to active holder\n",
+				__func__);
+			rc = -EBUSY;
+			goto out;
+		}
+		device_release_driver(dev);
+		rc = count;
+	}
+out:
+	put_device(dev);
+	return rc;
+}
+static DRIVER_ATTR_WO(unbind);
+
+static ssize_t bind_store(struct device_driver *drv, const char *buf,
+			  size_t count)
+{
+	struct device *dev;
+	int rc = -ENODEV;
+
+	dev = bus_find_device_by_name(&dax_bus_type, NULL, buf);
+	if (dev) {
+		rc = device_driver_attach(drv, dev);
+		if (!rc)
+			rc = count;
+	}
+	put_device(dev);
+	return rc;
+}
+static DRIVER_ATTR_WO(bind);
+
 static struct attribute *dax_drv_attrs[] = {
 	&driver_attr_new_id.attr,
 	&driver_attr_remove_id.attr,
+	&driver_attr_bind.attr,
+	&driver_attr_unbind.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(dax_drv);
@@ -1591,6 +1643,7 @@ int __dax_driver_register(struct dax_device_driver *dax_drv,
 	drv->name = mod_name;
 	drv->mod_name = mod_name;
 	drv->bus = &dax_bus_type;
+	drv->suppress_bind_attrs = true;
 
 	return driver_register(drv);
 }
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang, Matthew Wilcox,
 Jan Kara, Alexander Viro, David Hildenbrand, Christian Brauner,
 "Darrick J . Wong", Randy Dunlap, Jeff Layton, Amir Goldstein,
 Jonathan Cameron, Stefan Hajnoczi, Joanne Koong, Josef Bacik,
 Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba, Sean Christopherson,
 Shivank Garg, Ackerley Tng, Gregory Price, Aravind Ramesh, Ajay Joshi,
 venkataravis@micron.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
 linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 08/21] dax: export dax_dev_get()
Date: Wed, 7 Jan 2026 09:33:17 -0600
Message-ID: <20260107153332.64727-9-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

famfs needs to look up a dax_device by dev_t when resolving fmap entries
that reference character dax devices.

Signed-off-by: John Groves
---
 drivers/dax/super.c | 3 ++-
 include/linux/dax.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 68c45b918cff..c14b07be6a4e 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -511,7 +511,7 @@ static int dax_set(struct inode *inode, void *data)
 	return 0;
 }
 
-static struct dax_device *dax_dev_get(dev_t devt)
+struct dax_device *dax_dev_get(dev_t devt)
 {
 	struct dax_device *dax_dev;
 	struct inode *inode;
@@ -534,6 +534,7 @@ static struct dax_device *dax_dev_get(dev_t devt)
 
 	return dax_dev;
 }
+EXPORT_SYMBOL_GPL(dax_dev_get);
 
 struct dax_device *alloc_dax(void *private, const struct dax_operations *ops)
 {
diff --git a/include/linux/dax.h b/include/linux/dax.h
index 76f2a75f3144..2a04c3535806 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -56,6 +56,7 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops);
 int fs_dax_get(struct dax_device *dax_dev, void *holder, const struct dax_holder_operations *hops);
 struct dax_device *inode_dax(struct inode *inode);
 #endif
+struct dax_device *dax_dev_get(dev_t devt);
 void *dax_holder(struct dax_device *dax_dev);
 void put_dax(struct dax_device *dax_dev);
 void kill_dax(struct dax_device *dax_dev);
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang, Matthew Wilcox,
 Jan Kara, Alexander Viro, David Hildenbrand, Christian Brauner,
 "Darrick J . Wong", Randy Dunlap, Jeff Layton, Amir Goldstein,
 Jonathan Cameron, Stefan Hajnoczi, Joanne Koong, Josef Bacik,
 Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba, Sean Christopherson,
 Shivank Garg, Ackerley Tng, Gregory Price, Aravind Ramesh, Ajay Joshi,
 venkataravis@micron.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
 linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 09/21] famfs_fuse: magic.h: Add famfs magic numbers
Date: Wed, 7 Jan 2026 09:33:18 -0600
Message-ID: <20260107153332.64727-10-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

Famfs distinguishes between its on-media and in-memory superblocks. This
reserves the numbers, but they are only used by the user space
components of famfs.

Signed-off-by: John Groves
---
 include/uapi/linux/magic.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 638ca21b7a90..712b097bf2a5 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -38,6 +38,8 @@
 #define OVERLAYFS_SUPER_MAGIC	0x794c7630
 #define FUSE_SUPER_MAGIC	0x65735546
 #define BCACHEFS_SUPER_MAGIC	0xca451a4e
+#define FAMFS_SUPER_MAGIC	0x87b282ff
+#define FAMFS_STATFS_MAGIC	0x87b282fd
 
 #define MINIX_SUPER_MAGIC	0x137F		/* minix v1 fs, 14 char names */
 #define MINIX_SUPER_MAGIC2	0x138F		/* minix v1 fs, 30 char names */
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang, Matthew Wilcox,
 Jan Kara, Alexander Viro, David Hildenbrand, Christian Brauner,
 "Darrick J . Wong", Randy Dunlap, Jeff Layton, Amir Goldstein,
 Jonathan Cameron, Stefan Hajnoczi, Joanne Koong, Josef Bacik,
 Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba, Sean Christopherson,
 Shivank Garg, Ackerley Tng, Gregory Price, Aravind Ramesh, Ajay Joshi,
 venkataravis@micron.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev,
 linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 10/21] famfs_fuse: Kconfig
Date: Wed, 7 Jan 2026 09:33:19 -0600
Message-ID: <20260107153332.64727-11-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

Add FUSE_FAMFS_DAX config parameter, to control compilation of famfs
within fuse.

Signed-off-by: John Groves
---
 fs/fuse/Kconfig | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/fuse/Kconfig b/fs/fuse/Kconfig
index 3a4ae632c94a..3b6d3121fe40 100644
--- a/fs/fuse/Kconfig
+++ b/fs/fuse/Kconfig
@@ -76,3 +76,17 @@ config FUSE_IO_URING
 
 	  If you want to allow fuse server/client communication through io-uring,
 	  answer Y
+
+config FUSE_FAMFS_DAX
+	bool "FUSE support for fs-dax filesystems backed by devdax"
+	depends on FUSE_FS
+	depends on DEV_DAX
+	default FUSE_FS
+	select DEV_DAX_FS
+	help
+	  This enables the fabric-attached memory file system (famfs),
+	  which enables formatting devdax memory as a file system. Famfs
+	  is primarily intended for scale-out shared access to
+	  disaggregated memory.
+ + To enable famfs or other fuse/fs-dax file systems, answer Y --=20 2.49.0 From nobody Sat Feb 7 07:10:19 2026 Received: from mail-oi1-f169.google.com (mail-oi1-f169.google.com [209.85.167.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 068FE3933E9 for ; Wed, 7 Jan 2026 15:34:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800056; cv=none; b=toukd70rertQQ2+1LVSY+hwFzyCENtUQupL+3tn190oXLNBIIKGlVbSJ63i7hJ1dHvhZzkXXifEGD91zBEns9vqsBpfLAYCHAVvwFcSeBhRzaDXeI3SPo9FDwTKONxKXzA7AzSk33xjdRxREQy1gNI95ZNghbCqMM6T66HTYjiM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800056; c=relaxed/simple; bh=KDdIsA9D92H7U82UXu8uRfeAr/a85ssGj/h8ss0xZlY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=IkgEF3WoThmNmm47NHoZniWZwBrU0KNpUlTAI8nj7QPKRd4EBVOr2lx9ApstbFVAEhi7AeAUwgNIrvPvrr/un860RZh8PCIlfI1VmeOAHFpZ7oVjx/iA7RfTpBQHkXzHUVURyJGSMv+wuYdsjGY7EK48U+UdSYOvEoZXxnMkPIk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=UTSrqzPz; arc=none smtp.client-ip=209.85.167.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="UTSrqzPz" Received: by mail-oi1-f169.google.com with SMTP id 5614622812f47-459993ff4fcso917192b6e.1 for ; Wed, 07 Jan 2026 07:34:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767800053; 
x=1768404853; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=TBLWubQyI6xZVEp0jqdiIeZ2GXSmn2JLxEmTFZ45eqM=; b=UTSrqzPzZsLJoa+3EZGSx8ckaD7su+jBeOoFOF6p14d2SgZ5RoXejUYiX6UK0mSaJB 67y2EbucQNS8YGAsyOrB308ZXNrS4hkgBz/HAoKqPau76rbZDNzjbN4L9A+jub+zhAiX 6exFP0DGunPk+pHyobmuMzRkvCDhmjOThpqlxpYrrdvUYonF88ZKCAQXgakwxWCU9B6E ellAT2m+voJ+F9tET8rfwhDG78Ut/+yr+zqrhZUP02bakLr9c68Zakv8YFoRCJYVMe1h 1hwix+Hmn9E3LlLWHwbTjCP6e3CKMCL3zv0AvtUvJyGyUijG5APTE+q9HE8qtPSKSRJz x2QA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767800053; x=1768404853; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=TBLWubQyI6xZVEp0jqdiIeZ2GXSmn2JLxEmTFZ45eqM=; b=tkeeTvoMD0aLFpRO3hmKj/1j6/kB4XiCPfhkEQNAbM7k6FwXuZB1pMlHeu5a0uM57Y 9nvrSbEkXsax/SpawiF00w0QHD5g6HNPv/gWJfTQ8sunem8QlfHk7G1TywLYVJCvHyim EiPztFAAFHQDsnLoyk4LNMTJDnGdpvFHWMVbOEY2zTYmkqkffU9HvICSEfAyHWEQKurJ lWXfQpJLHP50/SHbnMQdgqiEc1qxuzuMrdrfprxi+7h4Si0TPfunNNIqsgYcNeYRwoVU k1DwePjtRUPQqV1Bw8b1P2zLuhN43U1eLdcy/uCvT+I3TWZXUI4tpnvUYRCaNU9wYU6G wdDA== X-Forwarded-Encrypted: i=1; AJvYcCUeCVx++468/2zLoLplMC8KWqYA9GsHQ97LWaXMC7HnQbbxJNC5gvhglc1tHP0XEb4clKgXfijZo6zw18s=@vger.kernel.org X-Gm-Message-State: AOJu0YynoG7CGgaNgYYGS8+TLftK5tPg3Bg/wBN7yQi+xmCAG0BpFI+R HJFiV+Nb93N7Qkgdk/kyk0ip+1OHA+g9I3LGW6YmRRFpqwWTq+APZt2n X-Gm-Gg: AY/fxX6LRUcCKWgA0CB+79A+XiF4t/dulOt7F0IIzyTZqjc+is1zMFhEckVnhjFyl/c 9Punuwb0UJm+QDS11eHYSQqzIBElsILtJ4hF8PuuH6ZsjNMQqoPqxVgdWFvsBVAJjhiXlczYKAs RcdI+MwyIGw02WWBKhTs4McJCvCDutmJCAlflq7oecLawlypdzWl2Pueb/vMyRPfsE1AW+CSAmQ EaH8wivhh6OPkTWnUy5BwQIcgh8bwVYVsWcOOADdJpUjCqSGiAocElVY6zfb3DpDnSqARg8hn3Q rBCY24tEVY0gByzUxY7blWb4EFgnnoeMG4c4vjssk42/xGgRWrRws1uAy1fmGH5sfuTQBKSJC14 
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
 Christian Brauner, "Darrick J. Wong", Randy Dunlap, Jeff Layton,
 Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
 Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 11/21] famfs_fuse: Update macro s/FUSE_IS_DAX/FUSE_IS_VIRTIO_DAX/
Date: Wed, 7 Jan 2026 09:33:20 -0600
Message-ID: <20260107153332.64727-12-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

Virtio_fs now needs to determine if an inode is DAX && not famfs.

Signed-off-by: John Groves
Reviewed-by: Joanne Koong
---
 fs/fuse/dir.c    |  2 +-
 fs/fuse/file.c   | 13 ++++++++-----
 fs/fuse/fuse_i.h |  6 +++++-
 fs/fuse/inode.c  |  4 ++--
 fs/fuse/iomode.c |  2 +-
 5 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 4b6b3d2758ff..1400c9d733ba 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -2153,7 +2153,7 @@ int fuse_do_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
 		is_truncate = true;
 	}
 
-	if (FUSE_IS_DAX(inode) && is_truncate) {
+	if (FUSE_IS_VIRTIO_DAX(fi) && is_truncate) {
 		filemap_invalidate_lock(mapping);
 		fault_blocked = true;
 		err = fuse_dax_break_layouts(inode, 0, -1);
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 01bc894e9c2b..093569033ed1 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -252,7 +252,7 @@ static int fuse_open(struct inode *inode, struct file *file)
 	int err;
 	bool is_truncate = (file->f_flags & O_TRUNC) && fc->atomic_o_trunc;
 	bool is_wb_truncate = is_truncate && fc->writeback_cache;
-	bool dax_truncate = is_truncate && FUSE_IS_DAX(inode);
+	bool dax_truncate = is_truncate && FUSE_IS_VIRTIO_DAX(fi);
 
 	if (fuse_is_bad(inode))
 		return -EIO;
@@ -1812,11 +1812,12 @@ static ssize_t fuse_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	struct file *file = iocb->ki_filp;
 	struct fuse_file *ff = file->private_data;
 	struct inode *inode = file_inode(file);
+	struct fuse_inode *fi = get_fuse_inode(inode);
 
 	if (fuse_is_bad(inode))
 		return -EIO;
 
-	if (FUSE_IS_DAX(inode))
+	if (FUSE_IS_VIRTIO_DAX(fi))
 		return fuse_dax_read_iter(iocb, to);
 
 	/* FOPEN_DIRECT_IO overrides FOPEN_PASSTHROUGH */
@@ -1833,11 +1834,12 @@ static ssize_t fuse_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	struct file *file = iocb->ki_filp;
 	struct fuse_file *ff = file->private_data;
 	struct inode *inode = file_inode(file);
+	struct fuse_inode *fi = get_fuse_inode(inode);
 
 	if (fuse_is_bad(inode))
 		return -EIO;
 
-	if (FUSE_IS_DAX(inode))
+	if (FUSE_IS_VIRTIO_DAX(fi))
 		return fuse_dax_write_iter(iocb, from);
 
 	/* FOPEN_DIRECT_IO overrides FOPEN_PASSTHROUGH */
@@ -2370,10 +2372,11 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
 	struct fuse_file *ff = file->private_data;
 	struct fuse_conn *fc = ff->fm->fc;
 	struct inode *inode = file_inode(file);
+	struct fuse_inode *fi = get_fuse_inode(inode);
 	int rc;
 
 	/* DAX mmap is superior to direct_io mmap */
-	if (FUSE_IS_DAX(inode))
+	if (FUSE_IS_VIRTIO_DAX(fi))
 		return fuse_dax_mmap(file, vma);
 
 	/*
@@ -2934,7 +2937,7 @@ static long fuse_file_fallocate(struct file *file, int mode, loff_t offset,
 		.mode = mode
 	};
 	int err;
-	bool block_faults = FUSE_IS_DAX(inode) &&
+	bool block_faults = FUSE_IS_VIRTIO_DAX(fi) &&
 		(!(mode & FALLOC_FL_KEEP_SIZE) ||
 		 (mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE)));
 
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 7f16049387d1..17736c0a6d2f 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1508,7 +1508,11 @@ void fuse_free_conn(struct fuse_conn *fc);
 
 /* dax.c */
 
-#define FUSE_IS_DAX(inode) (IS_ENABLED(CONFIG_FUSE_DAX) && IS_DAX(inode))
+/* This macro is used by virtio_fs, but now it also needs to filter for
+ * "not famfs"
+ */
+#define FUSE_IS_VIRTIO_DAX(fuse_inode) (IS_ENABLED(CONFIG_FUSE_DAX)	\
+					&& IS_DAX(&fuse_inode->inode))
 
 ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to);
 ssize_t fuse_dax_write_iter(struct kiocb *iocb, struct iov_iter *from);
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 819e50d66622..ed667920997f 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -162,7 +162,7 @@ static void fuse_evict_inode(struct inode *inode)
 	/* Will write inode on close/munmap and in all other dirtiers */
 	WARN_ON(inode_state_read_once(inode) & I_DIRTY_INODE);
 
-	if (FUSE_IS_DAX(inode))
+	if (FUSE_IS_VIRTIO_DAX(fi))
 		dax_break_layout_final(inode);
 
 	truncate_inode_pages_final(&inode->i_data);
@@ -170,7 +170,7 @@ static void fuse_evict_inode(struct inode *inode)
 	if (inode->i_sb->s_flags & SB_ACTIVE) {
 		struct fuse_conn *fc = get_fuse_conn(inode);
 
-		if (FUSE_IS_DAX(inode))
+		if (FUSE_IS_VIRTIO_DAX(fi))
 			fuse_dax_inode_cleanup(inode);
 		if (fi->nlookup) {
 			fuse_queue_forget(fc, fi->forget, fi->nodeid,
diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c
index 3728933188f3..31ee7f3304c6 100644
--- a/fs/fuse/iomode.c
+++ b/fs/fuse/iomode.c
@@ -203,7 +203,7 @@ int fuse_file_io_open(struct file *file, struct inode *inode)
 	 * io modes are not relevant with DAX and with server that does not
 	 * implement open.
 	 */
-	if (FUSE_IS_DAX(inode) || !ff->args)
+	if (FUSE_IS_VIRTIO_DAX(fi) || !ff->args)
 		return 0;
 
 	/*
-- 
2.49.0

From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
 Christian Brauner, "Darrick J. Wong", Randy Dunlap, Jeff Layton,
 Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
 Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 12/21] famfs_fuse: Basic fuse kernel ABI enablement for famfs
Date: Wed, 7 Jan 2026 09:33:21 -0600
Message-ID: <20260107153332.64727-13-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

* FUSE_DAX_FMAP flag in INIT request/reply

* fuse_conn->famfs_iomap (enable famfs-mapped files) to denote a
  famfs-enabled connection

Signed-off-by: John Groves
Reviewed-by: Joanne Koong
---
 fs/fuse/fuse_i.h          | 3 +++
 fs/fuse/inode.c           | 6 ++++++
 include/uapi/linux/fuse.h | 5 +++++
 3 files changed, 14 insertions(+)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 17736c0a6d2f..ec2446099010 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -921,6 +921,9 @@ struct fuse_conn {
 	/* Is synchronous FUSE_INIT allowed? */
 	unsigned int sync_init:1;
 
+	/* dev_dax_iomap support for famfs */
+	unsigned int famfs_iomap:1;
+
 	/* Use io_uring for communication */
 	unsigned int io_uring;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index ed667920997f..acabf92a11f8 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1456,6 +1456,10 @@ static void process_init_reply(struct fuse_mount *fm, struct fuse_args *args,
 
 			if (flags & FUSE_REQUEST_TIMEOUT)
 				timeout = arg->request_timeout;
+
+			if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) &&
+			    flags & FUSE_DAX_FMAP)
+				fc->famfs_iomap = 1;
 		} else {
 			ra_pages = fc->max_read / PAGE_SIZE;
 			fc->no_lock = 1;
@@ -1517,6 +1521,8 @@ static struct fuse_init_args *fuse_new_init(struct fuse_mount *fm)
 		flags |= FUSE_SUBMOUNTS;
 	if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
 		flags |= FUSE_PASSTHROUGH;
+	if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX))
+		flags |= FUSE_DAX_FMAP;
 
 	/*
 	 * This is just an information flag for fuse server. No need to check
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index c13e1f9a2f12..5e2c93433823 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -240,6 +240,9 @@
  *  - add FUSE_COPY_FILE_RANGE_64
  *  - add struct fuse_copy_file_range_out
  *  - add FUSE_NOTIFY_PRUNE
+ *
+ * 7.46
+ *  - Add FUSE_DAX_FMAP capability - ability to handle in-kernel fsdax maps
  */
 
 #ifndef _LINUX_FUSE_H
@@ -448,6 +451,7 @@ struct fuse_file_lock {
  * FUSE_OVER_IO_URING: Indicate that client supports io-uring
  * FUSE_REQUEST_TIMEOUT: kernel supports timing out requests.
  *			 init_out.request_timeout contains the timeout (in secs)
+ * FUSE_DAX_FMAP: kernel supports dev_dax_iomap (aka famfs) fmaps
  */
 #define FUSE_ASYNC_READ		(1 << 0)
 #define FUSE_POSIX_LOCKS	(1 << 1)
@@ -495,6 +499,7 @@ struct fuse_file_lock {
 #define FUSE_ALLOW_IDMAP	(1ULL << 40)
 #define FUSE_OVER_IO_URING	(1ULL << 41)
 #define FUSE_REQUEST_TIMEOUT	(1ULL << 42)
+#define FUSE_DAX_FMAP		(1ULL << 43)
 
 /**
  * CUSE INIT request/reply flags
-- 
2.49.0

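The FUSE_DAX_FMAP negotiation in patch 12/21 can be illustrated from the server side. What follows is a minimal user-space sketch, not code from this series: it assumes the usual FUSE convention that the 64-bit INIT flag word travels as two 32-bit halves (`flags` and `flags2`, with the upper half valid only when `FUSE_INIT_EXT` is set), and the helper names `init_flags64()` and `kernel_offers_fmap()` are hypothetical. Only the bit position of `FUSE_DAX_FMAP` (bit 43) is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit value as defined by the patch; the helpers below are illustrative. */
#define FUSE_DAX_FMAP	(1ULL << 43)
/* Set in the low half when the upper 32 flag bits (flags2) are valid. */
#define FUSE_INIT_EXT	(1U << 30)

/* Hypothetical helper: rebuild the 64-bit flag word from its two halves. */
static uint64_t init_flags64(uint32_t flags, uint32_t flags2)
{
	uint64_t all = flags;

	if (flags & FUSE_INIT_EXT)
		all |= (uint64_t)flags2 << 32;
	return all;
}

/* Hypothetical helper: did the kernel advertise the famfs fmap capability? */
static bool kernel_offers_fmap(uint32_t flags, uint32_t flags2)
{
	return (init_flags64(flags, flags2) & FUSE_DAX_FMAP) != 0;
}
```

Per the inode.c hunks above, the kernel advertises the bit in its INIT request and sets `fc->famfs_iomap` only if the bit also comes back in the reply, so a server that wants famfs-mapped files would presumably echo it after a check like this one.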
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
 Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
 Christian Brauner, "Darrick J. Wong", Randy Dunlap, Jeff Layton,
 Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
 Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
 Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
 Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 13/21] famfs_fuse: Famfs mount opt: -o shadow=
Date: Wed, 7 Jan 2026 09:33:22 -0600
Message-ID: <20260107153332.64727-14-john@groves.net>
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net>
 <20260107153332.64727-1-john@groves.net>

The shadow path is a file system area (usually in tmpfs) used by the
famfs user space to communicate with the famfs fuse server. The user
space tools must be able to resolve a mount point path to its shadow
path. Passing the 'shadow=' argument at mount time causes the shadow
path to be exposed via /proc/mounts, which solves this problem. The
shadow path is not otherwise used in the kernel.

Signed-off-by: John Groves
---
 fs/fuse/fuse_i.h | 25 ++++++++++++++++++++++++-
 fs/fuse/inode.c  | 28 +++++++++++++++++++++++++++-
 2 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index ec2446099010..84d0ee2a501d 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -620,9 +620,11 @@ struct fuse_fs_context {
 	unsigned int blksize;
 	const char *subtype;
 
-	/* DAX device, may be NULL */
+	/* DAX device for virtiofs, may be NULL */
 	struct dax_device *dax_dev;
 
+	const char *shadow; /* famfs - null if not famfs */
+
 	/* fuse_dev pointer to fill in, should contain NULL on entry */
 	void **fudptr;
 };
@@ -998,6 +1000,18 @@ struct fuse_conn {
 		/* Request timeout (in jiffies). 0 = no timeout */
 		unsigned int req_timeout;
 	} timeout;
+
+	/*
+	 * This is a workaround until fuse uses iomap for reads.
+	 * For fuseblk servers, this represents the blocksize passed in at
+	 * mount time and for regular fuse servers, this is equivalent to
+	 * inode->i_blkbits.
+	 */
+	u8 blkbits;
+
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	char *shadow;
+#endif
 };
 
 /*
@@ -1631,4 +1645,13 @@ extern void fuse_sysctl_unregister(void);
 #define fuse_sysctl_unregister()	do { } while (0)
 #endif /* CONFIG_SYSCTL */
 
+/* famfs.c */
+
+static inline void famfs_teardown(struct fuse_conn *fc)
+{
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	kfree(fc->shadow);
+#endif
+}
+
 #endif /* _FS_FUSE_I_H */
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index acabf92a11f8..2e0844aabbae 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -783,6 +783,9 @@ enum {
 	OPT_ALLOW_OTHER,
 	OPT_MAX_READ,
 	OPT_BLKSIZE,
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	OPT_SHADOW,
+#endif
 	OPT_ERR
 };
 
@@ -797,6 +800,9 @@ static const struct fs_parameter_spec fuse_fs_parameters[] = {
 	fsparam_u32	("max_read",		OPT_MAX_READ),
 	fsparam_u32	("blksize",		OPT_BLKSIZE),
 	fsparam_string	("subtype",		OPT_SUBTYPE),
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	fsparam_string("shadow",		OPT_SHADOW),
+#endif
 	{}
 };
 
@@ -892,6 +898,15 @@ static int fuse_parse_param(struct fs_context *fsc, struct fs_parameter *param)
 		ctx->blksize = result.uint_32;
 		break;
 
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	case OPT_SHADOW:
+		if (ctx->shadow)
+			return invalfc(fsc, "Multiple shadows specified");
+		ctx->shadow = param->string;
+		param->string = NULL;
+		break;
+#endif
+
 	default:
 		return -EINVAL;
 	}
@@ -905,6 +920,7 @@ static void fuse_free_fsc(struct fs_context *fsc)
 
 	if (ctx) {
 		kfree(ctx->subtype);
+		kfree(ctx->shadow);
 		kfree(ctx);
 	}
 }
@@ -936,7 +952,10 @@ static int fuse_show_options(struct seq_file *m, struct dentry *root)
 		else if (fc->dax_mode == FUSE_DAX_INODE_USER)
 			seq_puts(m, ",dax=inode");
 #endif
-
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	if (fc->shadow)
+		seq_printf(m, ",shadow=%s", fc->shadow);
+#endif
 	return 0;
 }
 
@@ -1041,6 +1060,8 @@ void fuse_conn_put(struct fuse_conn *fc)
 			WARN_ON(atomic_read(&bucket->count) != 1);
 			kfree(bucket);
 		}
+		famfs_teardown(fc);
+
 		if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
 			fuse_backing_files_free(fc);
 		call_rcu(&fc->rcu, delayed_release);
@@ -1916,6 +1937,11 @@ int fuse_fill_super_common(struct super_block *sb, struct fuse_fs_context *ctx)
 		*ctx->fudptr = fud;
 		wake_up_all(&fuse_dev_waitq);
 	}
+
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	fc->shadow = kstrdup(ctx->shadow, GFP_KERNEL);
+#endif
+
 	mutex_unlock(&fuse_mutex);
 	return 0;
 
-- 
2.49.0

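Patch 13/21 says the user space tools resolve a mount point to its shadow path by reading the `shadow=` option back from /proc/mounts. A minimal sketch of that lookup is shown below, assuming the caller has already isolated the comma-separated options field of the relevant /proc/mounts line. The helper name `find_shadow_opt()` is hypothetical and is not taken from the famfs tools.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of "shadow=" out of a comma-separated
 * mount option string (the options field of a /proc/mounts line).
 * Returns 0 on success, -1 if the option is absent or does not fit in out.
 */
static int find_shadow_opt(const char *opts, char *out, size_t outlen)
{
	static const char key[] = "shadow=";
	const size_t keylen = sizeof(key) - 1;
	const char *p = opts;

	while (p && *p) {
		/* Each option runs up to the next comma or the end of string. */
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len >= keylen && strncmp(p, key, keylen) == 0) {
			size_t vlen = len - keylen;

			if (vlen + 1 > outlen)
				return -1;
			memcpy(out, p + keylen, vlen);
			out[vlen] = '\0';
			return 0;
		}
		p = end ? end + 1 : NULL;
	}
	return -1;
}
```

This sketch does not handle a shadow path that itself contains a comma, and it does not undo any octal escaping applied to the line; a real tool would need to decide how to treat both cases.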
UlFk/DX3HFHZXD9CmjqawasfxhzNoe2+DhP9PdLxSx4puyhQLgEL2LkxAugXSQ6bvDm30/7sTXP uizFaAWEo7TZbBy6kwWuRW12JhjKREJ8y3GdP/LIMvfokMtPpH9JIhIpUj7u6Eeo0gi74HJUNYQ M8Noa77R465/vzFb8A/RRuukINVfkkAkYc0WKzRuvvs/gyy05CLg2m0WzCYo+MVfU02ZZmOsO5K ckFfI0B+DHvraM+Fyo0AQ9AIhtcO7QfInk1ZwTGKlNe6gS3zUKb5UAZqcqeA56MP4Yorc8YqTnV t4leDk6mfj+I9KTqkX6cgAgLmIKmTLlvnIEtYupw4677h6weAYHL//qKKpuDp82bAa/GMtJpH6q eMyYjZBPMs8a5qPKv5ieEL2TCWnMuz1jEDFQEZLy6xSmeh X-Google-Smtp-Source: AGHT+IHwCw+2kouS0gewRU1M+KsK1d+ZYKglvGvcMRJpE25Q1Aws6HPVpW7a/OLQu64TjuYD5OcbVw== X-Received: by 2002:a05:6808:5192:b0:459:b40f:8404 with SMTP id 5614622812f47-45a6bd3d92fmr1103755b6e.29.1767800061183; Wed, 07 Jan 2026 07:34:21 -0800 (PST) Received: from localhost.localdomain ([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.34.19 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:34:20 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . 
Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 14/21] famfs_fuse: Plumb the GET_FMAP message/response Date: Wed, 7 Jan 2026 09:33:23 -0600 Message-ID: <20260107153332.64727-15-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Upon completion of an OPEN, if we're in famfs-mode we do a GET_FMAP to retrieve and cache up the file-to-dax map in the kernel. If this succeeds, read/write/mmap are resolved direct-to-dax with no upcalls. 
Signed-off-by: John Groves --- MAINTAINERS | 8 +++++ fs/fuse/Makefile | 1 + fs/fuse/famfs.c | 74 +++++++++++++++++++++++++++++++++++++++ fs/fuse/file.c | 14 +++++++- fs/fuse/fuse_i.h | 47 ++++++++++++++++++++++++- fs/fuse/inode.c | 8 ++++- fs/fuse/iomode.c | 2 +- include/uapi/linux/fuse.h | 7 ++++ 8 files changed, 157 insertions(+), 4 deletions(-) create mode 100644 fs/fuse/famfs.c diff --git a/MAINTAINERS b/MAINTAINERS index 90429cb06090..526309943026 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10374,6 +10374,14 @@ F: fs/fuse/ F: include/uapi/linux/fuse.h F: tools/testing/selftests/filesystems/fuse/ =20 +FUSE [FAMFS Fabric-Attached Memory File System] +M: John Groves +M: John Groves +L: linux-cxl@vger.kernel.org +L: linux-fsdevel@vger.kernel.org +S: Supported +F: fs/fuse/famfs.c + FUTEX SUBSYSTEM M: Thomas Gleixner M: Ingo Molnar diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile index 22ad9538dfc4..3f8dcc8cbbd0 100644 --- a/fs/fuse/Makefile +++ b/fs/fuse/Makefile @@ -17,5 +17,6 @@ fuse-$(CONFIG_FUSE_DAX) +=3D dax.o fuse-$(CONFIG_FUSE_PASSTHROUGH) +=3D passthrough.o backing.o fuse-$(CONFIG_SYSCTL) +=3D sysctl.o fuse-$(CONFIG_FUSE_IO_URING) +=3D dev_uring.o +fuse-$(CONFIG_FUSE_FAMFS_DAX) +=3D famfs.o =20 virtiofs-y :=3D virtio_fs.o diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c new file mode 100644 index 000000000000..0f7e3f00e1e7 --- /dev/null +++ b/fs/fuse/famfs.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * famfs - dax file system for shared fabric-attached memory + * + * Copyright 2023-2025 Micron Technology, Inc. + * + * This file system, originally based on ramfs the dax support from xfs, + * is intended to allow multiple host systems to mount a common file system + * view of dax files that map to shared memory. 
+ */ +#include +#include +#include +#include +#include +#include +#include + +#include "fuse_i.h" + + +#define FMAP_BUFSIZE PAGE_SIZE + +int +fuse_get_fmap(struct fuse_mount *fm, struct inode *inode) +{ + struct fuse_inode *fi =3D get_fuse_inode(inode); + size_t fmap_bufsize =3D FMAP_BUFSIZE; + u64 nodeid =3D get_node_id(inode); + ssize_t fmap_size; + void *fmap_buf; + int rc; + + FUSE_ARGS(args); + + /* Don't retrieve if we already have the famfs metadata */ + if (fi->famfs_meta) + return 0; + + fmap_buf =3D kcalloc(1, FMAP_BUFSIZE, GFP_KERNEL); + if (!fmap_buf) + return -ENOMEM; + + args.opcode =3D FUSE_GET_FMAP; + args.nodeid =3D nodeid; + + /* Variable-sized output buffer + * this causes fuse_simple_request() to return the size of the + * output payload + */ + args.out_argvar =3D true; + args.out_numargs =3D 1; + args.out_args[0].size =3D fmap_bufsize; + args.out_args[0].value =3D fmap_buf; + + /* Send GET_FMAP command */ + rc =3D fuse_simple_request(fm, &args); + if (rc < 0) { + pr_err("%s: err=3D%d from fuse_simple_request()\n", __func__, rc); + kfree(fmap_buf); + return rc; + } + fmap_size =3D rc; + + /* We retrieved the "fmap" (the file's map to memory), but + * we haven't used it yet. A call to famfs_file_init_dax() will be added + * here in a subsequent patch, when we add the ability to attach + * fmaps to files.
+ */ + + kfree(fmap_buf); + return 0; +} diff --git a/fs/fuse/file.c b/fs/fuse/file.c index 093569033ed1..1f64bf68b5ee 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -277,6 +277,16 @@ static int fuse_open(struct inode *inode, struct file = *file) err =3D fuse_do_open(fm, get_node_id(inode), file, false); if (!err) { ff =3D file->private_data; + + if ((fm->fc->famfs_iomap) && (S_ISREG(inode->i_mode))) { + /* Get the famfs fmap - failure is fatal */ + err =3D fuse_get_fmap(fm, inode); + if (err) { + fuse_sync_release(fi, ff, file->f_flags); + goto out_nowrite; + } + } + err =3D fuse_finish_open(inode, file); if (err) fuse_sync_release(fi, ff, file->f_flags); @@ -284,12 +294,14 @@ static int fuse_open(struct inode *inode, struct file= *file) fuse_truncate_update_attr(inode, file); } =20 +out_nowrite: if (is_wb_truncate || dax_truncate) fuse_release_nowrite(inode); if (!err) { if (is_truncate) truncate_pagecache(inode, 0); - else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) + else if (!(ff->open_flags & FOPEN_KEEP_CACHE) && + !fuse_file_famfs(fi)) invalidate_inode_pages2(inode->i_mapping); } if (dax_truncate) diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 84d0ee2a501d..691c7850cf4e 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -223,6 +223,14 @@ struct fuse_inode { * so preserve the blocksize specified by the server. */ u8 cached_i_blkbits; + +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) + /* Pointer to the file's famfs metadata. 
Primary content is the + * in-memory version of the fmap - the map from file's offset range + * to DAX memory + */ + void *famfs_meta; +#endif }; =20 /** FUSE inode state bits */ @@ -1525,11 +1533,14 @@ void fuse_free_conn(struct fuse_conn *fc); =20 /* dax.c */ =20 +static inline int fuse_file_famfs(struct fuse_inode *fi); /* forward */ + /* This macro is used by virtio_fs, but now it also needs to filter for * "not famfs" */ #define FUSE_IS_VIRTIO_DAX(fuse_inode) (IS_ENABLED(CONFIG_FUSE_DAX) \ - && IS_DAX(&fuse_inode->inode)) + && IS_DAX(&fuse_inode->inode) \ + && !fuse_file_famfs(fuse_inode)) =20 ssize_t fuse_dax_read_iter(struct kiocb *iocb, struct iov_iter *to); ssize_t fuse_dax_write_iter(struct kiocb *iocb, struct iov_iter *from); @@ -1654,4 +1665,38 @@ static inline void famfs_teardown(struct fuse_conn *= fc) #endif } =20 +static inline struct fuse_backing *famfs_meta_set(struct fuse_inode *fi, + void *meta) +{ +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) + return xchg(&fi->famfs_meta, meta); +#else + return NULL; +#endif +} + +static inline void famfs_meta_free(struct fuse_inode *fi) +{ + /* Stub wil be connected in a subsequent commit */ +} + +static inline int fuse_file_famfs(struct fuse_inode *fi) +{ +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) + return (READ_ONCE(fi->famfs_meta) !=3D NULL); +#else + return 0; +#endif +} + +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) +int fuse_get_fmap(struct fuse_mount *fm, struct inode *inode); +#else +static inline int +fuse_get_fmap(struct fuse_mount *fm, struct inode *inode) +{ + return 0; +} +#endif + #endif /* _FS_FUSE_I_H */ diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 2e0844aabbae..9e121a1d63b7 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -120,6 +120,9 @@ static struct inode *fuse_alloc_inode(struct super_bloc= k *sb) if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH)) fuse_inode_backing_set(fi, NULL); =20 + if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)) + famfs_meta_set(fi, NULL); + return &fi->inode; =20 out_free_forget: @@ 
-141,6 +144,9 @@ static void fuse_free_inode(struct inode *inode) if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH)) fuse_backing_put(fuse_inode_backing(fi)); =20 + if (S_ISREG(inode->i_mode) && fuse_file_famfs(fi)) + famfs_meta_free(fi); + kmem_cache_free(fuse_inode_cachep, fi); } =20 @@ -162,7 +168,7 @@ static void fuse_evict_inode(struct inode *inode) /* Will write inode on close/munmap and in all other dirtiers */ WARN_ON(inode_state_read_once(inode) & I_DIRTY_INODE); =20 - if (FUSE_IS_VIRTIO_DAX(fi)) + if (FUSE_IS_VIRTIO_DAX(fi) || fuse_file_famfs(fi)) dax_break_layout_final(inode); =20 truncate_inode_pages_final(&inode->i_data); diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c index 31ee7f3304c6..948148316ef0 100644 --- a/fs/fuse/iomode.c +++ b/fs/fuse/iomode.c @@ -203,7 +203,7 @@ int fuse_file_io_open(struct file *file, struct inode *= inode) * io modes are not relevant with DAX and with server that does not * implement open. */ - if (FUSE_IS_VIRTIO_DAX(fi) || !ff->args) + if (FUSE_IS_VIRTIO_DAX(fi) || fuse_file_famfs(fi) || !ff->args) return 0; =20 /* diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index 5e2c93433823..bfb92a4aa8a9 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -669,6 +669,9 @@ enum fuse_opcode { FUSE_STATX =3D 52, FUSE_COPY_FILE_RANGE_64 =3D 53, =20 + /* Famfs / devdax opcodes */ + FUSE_GET_FMAP =3D 54, + /* CUSE specific operations */ CUSE_INIT =3D 4096, =20 @@ -1313,4 +1316,8 @@ struct fuse_uring_cmd_req { uint8_t padding[6]; }; =20 +/* Famfs fmap message components */ + +#define FAMFS_FMAP_MAX 32768 /* Largest supported fmap message */ + #endif /* _LINUX_FUSE_H */ --=20 2.49.0 From nobody Sat Feb 7 07:10:19 2026 Received: from mail-oi1-f176.google.com (mail-oi1-f176.google.com [209.85.167.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 532F739527F for ; Wed, 7 Jan 2026 15:34:25 +0000 
(UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800068; cv=none; b=N0/6W2wFjHcQcki2rJRJ0s+e+2vtMITvUqxdNc8/CkgZnh9lEB94mT7HhwyiW6YwlDyswttXUx5YxM3H6v1iVP75xPNJrrodTuMk2W8/ULh7yHZVMoHoitT6YPXrAFeWoPFUN5p64uMCR681nPrsWaloU55BJn8jDJLflwRFOzM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800068; c=relaxed/simple; bh=h9Sqjy47srXPlgEvqR6/9J2jNoOZTa7FEL0qXKABD2Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TRLPv2+mwR0eJ9ZPZT3FW/G+mVut47KOWM3gXgf5XBwgqpv8j/syEji5GhiQ4RbvkgGLZpVGbLSw/Qgk1Y62n/BtyQZFX+ni8NSWXpk2go0tJ1TWYS+D4+2Fiujq+UWBN+nGs5dL1BmwMZFkSMG6ztsSqZlkdMH4BeiSiy2WC5w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=SpfcrFJX; arc=none smtp.client-ip=209.85.167.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SpfcrFJX" Received: by mail-oi1-f176.google.com with SMTP id 5614622812f47-45358572a11so1326215b6e.3 for ; Wed, 07 Jan 2026 07:34:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767800064; x=1768404864; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=+lJSBmZkR9C+PpVibRCFcFXweP+k0Ln4vPDIoB1U5DA=; b=SpfcrFJX4wg9P1AnXM4FGaHKXJrR4NV6zRc7mKFDFUOV7IxPATooNdIV4nFWSI/lmf 2vb8a2JehdHmDlgREyrgtORrNSXGq2piTYLrEy6YwfAgaZPHIHSv/xUcNMKe18MOqLfK 
48N5LGQ1rF7CiGQK2CfAUzY/PK3SktXpd7VCPYHvHRYvga66z8MGe2A9XD7UebGImgSv baoq9QPVwhAZ2G9S3lfB2L8zlB3FUkwUYwBlCeAy0U1ihQaVaI38AOcyuLv3QWJW4KTR s+H+qXV27128OFmUtJeBzhHpfzKMtMbXpxo2W1uTpFATrTGPhbTa6Yl/0rBNWtJ7/yog D1Ag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767800064; x=1768404864; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=+lJSBmZkR9C+PpVibRCFcFXweP+k0Ln4vPDIoB1U5DA=; b=AxTpZjncL8L4yCN+4aGVD7i1eoGk+Izb2QwFpQICVefX2eQjFJaBeB84jsUcp6D4z5 tDyiQ69L3eekWZ/hYuOJPcqXjcvDvjL0MarO1gG8bhoN1Cjt2Tg4U35oVlbjYlmOudWa FqxRufmU1BaWJSFaCpvWRE6ZEACR1WxDomCAYIULJg1NrTV9CXlrhKZLI/g0xdB/+CYa d9VlC//8Yt/CJgpGY7j3GBA6GXsI+Y4G1i6v0fs1//SY+eDqK40edBLNQPsNT+MjEewv xxbN5m+V8hmjmKCIMxSMn8NFArSxe1oNoMUI+gD2v8KPoScVaumN/hjzX5Fb14xHvwt2 Y4cw== X-Forwarded-Encrypted: i=1; AJvYcCWQS4fMiPlEQHZUodsEUGj271eAJLNXdluPZOjEKcI+0OzJHfmI0g0ZQpq2vbsvBdTUXSCTsbKDuypoaZc=@vger.kernel.org X-Gm-Message-State: AOJu0YyRsK8hKLLCLSati62dwTKbodJsX9dsQ3hLqsV69o3VN0FhUmPK RrfaMQobpG1DgtwhgqvWx1iRWI2GfPJxADW5oWOAMW8fX/RK6+uZiRCp X-Gm-Gg: AY/fxX6YnfXBzeMcBBPMvTkajmTEmT/qHsGvuauTToEKicem6cOC75yPyH0KtrML377 Ohv7Wru2fFUeCeZsf8mQDeRfMlZsGslrpngi9hYlrrp3UuiE1PYjBXMvZhcbLPJR89AClV8tLHl oSUBPQnFVlWIwx1mFGfNV7lELBGksny62WC+o7eRJCBxQeze7SHeTCWQelE4E6/OOekPYUZpTtZ VlAQXtjQP+ixc1rfcGqH+7gridWiHNCZzW22WabBxb0gxl5Csyr7FlZX/2EKFJ02/rx0XSphr2N 38cMqNyihXUxpz2w+9IUmAm6mZ6kbYbzPHoBJDDf1XoOZgIP6zre++Ey0ZPCuCPJS88vA8Ig7f4 HrOcrxfZEzE1a2XlNh7fsDmPBbNjWX/TUKBmIU1kQ0pz1qezCjTYH5aGtJk+JQgQ1LT2dgh610H INo/1u0KRxEfNodORCQNESZa6XJBPJENFB5976k5ny3Mlk X-Google-Smtp-Source: AGHT+IHZ6YQhs0uhhFpPCmYEP9salfbJwKE+Z3jaekz2Pj+nmx8WoppnEXFheRQGdele+77MhXPwcA== X-Received: by 2002:a05:6808:1801:b0:44f:e512:4ca2 with SMTP id 5614622812f47-45a6be7e99fmr1313009b6e.40.1767800063919; Wed, 07 Jan 2026 07:34:23 -0800 (PST) Received: from localhost.localdomain 
([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.34.21 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:34:23 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 15/21] famfs_fuse: Create files with famfs fmaps Date: Wed, 7 Jan 2026 09:33:24 -0600 Message-ID: <20260107153332.64727-16-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" On completion of the GET_FMAP message/response, set up the full famfs metadata so that read/write/mmap can be handled directly to dax. Note that the devdax_iomap plumbing is not in yet... * Add famfs_kfmap.h: in-memory structures for resolving famfs file maps (fmaps) to dax.
* famfs.c: allocate, initialize and free fmaps * inode.c: only allow famfs mode if the fuse server has CAP_SYS_RAWIO * Update MAINTAINERS for the new files. Signed-off-by: John Groves --- MAINTAINERS | 1 + fs/fuse/famfs.c | 355 +++++++++++++++++++++++++++++++++++++- fs/fuse/famfs_kfmap.h | 67 +++++++ fs/fuse/fuse_i.h | 22 ++- fs/fuse/inode.c | 21 ++- include/uapi/linux/fuse.h | 56 ++++++ 6 files changed, 510 insertions(+), 12 deletions(-) create mode 100644 fs/fuse/famfs_kfmap.h diff --git a/MAINTAINERS b/MAINTAINERS index 526309943026..16b0606a3b85 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10381,6 +10381,7 @@ L: linux-cxl@vger.kernel.org L: linux-fsdevel@vger.kernel.org S: Supported F: fs/fuse/famfs.c +F: fs/fuse/famfs_kfmap.h =20 FUTEX SUBSYSTEM M: Thomas Gleixner diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c index 0f7e3f00e1e7..2aabd1d589fd 100644 --- a/fs/fuse/famfs.c +++ b/fs/fuse/famfs.c @@ -17,9 +17,355 @@ #include #include =20 +#include "famfs_kfmap.h" #include "fuse_i.h" =20 =20 +/*************************************************************************= **/ + +void +__famfs_meta_free(void *famfs_meta) +{ + struct famfs_file_meta *fmap =3D famfs_meta; + + if (!fmap) + return; + + if (fmap) { + switch (fmap->fm_extent_type) { + case SIMPLE_DAX_EXTENT: + kfree(fmap->se); + break; + case INTERLEAVED_EXTENT: + if (fmap->ie) + kfree(fmap->ie->ie_strips); + + kfree(fmap->ie); + break; + default: + pr_err("%s: invalid fmap type\n", __func__); + break; + } + } + kfree(fmap); +} + +static int +famfs_check_ext_alignment(struct famfs_meta_simple_ext *se) +{ + int errs =3D 0; + + if (se->dev_index !=3D 0) + errs++; + + /* TODO: pass in alignment so we can support the other page sizes */ + if (!IS_ALIGNED(se->ext_offset, PMD_SIZE)) + errs++; + + if (!IS_ALIGNED(se->ext_len, PMD_SIZE)) + errs++; + + return errs; +} + +/** + * famfs_fuse_meta_alloc() - Allocate famfs file metadata + * @metap: Pointer to an mcache_map_meta pointer + * @ext_count: The number of 
extents needed + * + * Returns: 0=3Dsuccess + * -errno=3Dfailure + */ +static int +famfs_fuse_meta_alloc( + void *fmap_buf, + size_t fmap_buf_size, + struct famfs_file_meta **metap) +{ + struct famfs_file_meta *meta =3D NULL; + struct fuse_famfs_fmap_header *fmh; + size_t extent_total =3D 0; + size_t next_offset =3D 0; + int errs =3D 0; + int i, j; + int rc; + + fmh =3D (struct fuse_famfs_fmap_header *)fmap_buf; + + /* Move past fmh in fmap_buf */ + next_offset +=3D sizeof(*fmh); + if (next_offset > fmap_buf_size) { + pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n", + __func__, __LINE__, next_offset, fmap_buf_size); + return -EINVAL; + } + + if (fmh->nextents < 1) { + pr_err("%s: nextents %d < 1\n", __func__, fmh->nextents); + return -EINVAL; + } + + if (fmh->nextents > FUSE_FAMFS_MAX_EXTENTS) { + pr_err("%s: nextents %d > max (%d) 1\n", + __func__, fmh->nextents, FUSE_FAMFS_MAX_EXTENTS); + return -E2BIG; + } + + meta =3D kzalloc(sizeof(*meta), GFP_KERNEL); + if (!meta) + return -ENOMEM; + + meta->error =3D false; + meta->file_type =3D fmh->file_type; + meta->file_size =3D fmh->file_size; + meta->fm_extent_type =3D fmh->ext_type; + + switch (fmh->ext_type) { + case FUSE_FAMFS_EXT_SIMPLE: { + struct fuse_famfs_simple_ext *se_in; + + se_in =3D (struct fuse_famfs_simple_ext *)(fmap_buf + next_offset); + + /* Move past simple extents */ + next_offset +=3D fmh->nextents * sizeof(*se_in); + if (next_offset > fmap_buf_size) { + pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n", + __func__, __LINE__, next_offset, fmap_buf_size); + rc =3D -EINVAL; + goto errout; + } + + meta->fm_nextents =3D fmh->nextents; + + meta->se =3D kcalloc(meta->fm_nextents, sizeof(*(meta->se)), + GFP_KERNEL); + if (!meta->se) { + rc =3D -ENOMEM; + goto errout; + } + + if ((meta->fm_nextents > FUSE_FAMFS_MAX_EXTENTS) || + (meta->fm_nextents < 1)) { + rc =3D -EINVAL; + goto errout; + } + + for (i =3D 0; i < fmh->nextents; i++) { + meta->se[i].dev_index =3D se_in[i].se_devindex; + 
meta->se[i].ext_offset =3D se_in[i].se_offset; + meta->se[i].ext_len =3D se_in[i].se_len; + + /* Record bitmap of referenced daxdev indices */ + meta->dev_bitmap |=3D (1ULL << meta->se[i].dev_index); + + errs +=3D famfs_check_ext_alignment(&meta->se[i]); + + extent_total +=3D meta->se[i].ext_len; + } + break; + } + + case FUSE_FAMFS_EXT_INTERLEAVE: { + s64 size_remainder =3D meta->file_size; + struct fuse_famfs_iext *ie_in; + int niext =3D fmh->nextents; + + meta->fm_niext =3D niext; + + /* Allocate interleaved extent */ + meta->ie =3D kcalloc(niext, sizeof(*(meta->ie)), GFP_KERNEL); + if (!meta->ie) { + rc =3D -ENOMEM; + goto errout; + } + + /* + * Each interleaved extent has a simple extent list of strips. + * Outer loop is over separate interleaved extents + */ + for (i =3D 0; i < niext; i++) { + u64 nstrips; + struct fuse_famfs_simple_ext *sie_in; + + /* ie_in =3D one interleaved extent in fmap_buf */ + ie_in =3D (struct fuse_famfs_iext *) + (fmap_buf + next_offset); + + /* Move past one interleaved extent header in fmap_buf */ + next_offset +=3D sizeof(*ie_in); + if (next_offset > fmap_buf_size) { + pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n", + __func__, __LINE__, next_offset, + fmap_buf_size); + rc =3D -EINVAL; + goto errout; + } + + nstrips =3D ie_in->ie_nstrips; + meta->ie[i].fie_chunk_size =3D ie_in->ie_chunk_size; + meta->ie[i].fie_nstrips =3D ie_in->ie_nstrips; + meta->ie[i].fie_nbytes =3D ie_in->ie_nbytes; + + if (!meta->ie[i].fie_nbytes) { + pr_err("%s: zero-length interleave!\n", + __func__); + rc =3D -EINVAL; + goto errout; + } + + /* sie_in =3D the strip extents in fmap_buf */ + sie_in =3D (struct fuse_famfs_simple_ext *) + (fmap_buf + next_offset); + + /* Move past strip extents in fmap_buf */ + next_offset +=3D nstrips * sizeof(*sie_in); + if (next_offset > fmap_buf_size) { + pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n", + __func__, __LINE__, next_offset, + fmap_buf_size); + rc =3D -EINVAL; + goto errout; + } + + if
((nstrips > FUSE_FAMFS_MAX_STRIPS) || (nstrips < 1)) { + pr_err("%s: invalid nstrips=3D%lld (max=3D%d)\n", + __func__, nstrips, + FUSE_FAMFS_MAX_STRIPS); + errs++; + } + + /* Allocate strip extent array */ + meta->ie[i].ie_strips =3D kcalloc(ie_in->ie_nstrips, + sizeof(meta->ie[i].ie_strips[0]), + GFP_KERNEL); + if (!meta->ie[i].ie_strips) { + rc =3D -ENOMEM; + goto errout; + } + + /* Inner loop is over strips */ + for (j =3D 0; j < nstrips; j++) { + struct famfs_meta_simple_ext *strips_out; + u64 devindex =3D sie_in[j].se_devindex; + u64 offset =3D sie_in[j].se_offset; + u64 len =3D sie_in[j].se_len; + + strips_out =3D meta->ie[i].ie_strips; + strips_out[j].dev_index =3D devindex; + strips_out[j].ext_offset =3D offset; + strips_out[j].ext_len =3D len; + + /* Record bitmap of referenced daxdev indices */ + meta->dev_bitmap |=3D (1ULL << devindex); + + extent_total +=3D len; + errs +=3D famfs_check_ext_alignment(&strips_out[j]); + size_remainder -=3D len; + } + } + + if (size_remainder > 0) { + /* Sum of interleaved extent sizes is less than file size!
*/ + pr_err("%s: size_remainder %lld (0x%llx)\n", + __func__, size_remainder, size_remainder); + rc =3D -EINVAL; + goto errout; + } + break; + } + + default: + pr_err("%s: invalid ext_type %d\n", __func__, fmh->ext_type); + rc =3D -EINVAL; + goto errout; + } + + if (errs > 0) { + pr_err("%s: %d alignment errors found\n", __func__, errs); + rc =3D -EINVAL; + goto errout; + } + + /* More sanity checks */ + if (extent_total < meta->file_size) { + pr_err("%s: file size %ld larger than map size %ld\n", + __func__, meta->file_size, extent_total); + rc =3D -EINVAL; + goto errout; + } + + if (cmpxchg(metap, NULL, meta) !=3D NULL) { + pr_debug("%s: fmap race detected\n", __func__); + rc =3D 0; /* fmap already installed */ + goto errout; + } + + return 0; +errout: + __famfs_meta_free(meta); + return rc; +} + +/** + * famfs_file_init_dax() - init famfs dax file metadata + * + * @fm: fuse_mount + * @inode: the inode + * @fmap_buf: fmap response message + * @fmap_size: Size of the fmap message + * + * Initialize famfs metadata for a file, based on the contents of the GET_= FMAP + * response + * + * Return: 0=3Dsuccess + * -errno=3Dfailure + */ +int +famfs_file_init_dax( + struct fuse_mount *fm, + struct inode *inode, + void *fmap_buf, + size_t fmap_size) +{ + struct fuse_inode *fi =3D get_fuse_inode(inode); + struct famfs_file_meta *meta =3D NULL; + int rc =3D 0; + + if (fi->famfs_meta) { + pr_notice("%s: i_no=3D%ld fmap_size=3D%ld ALREADY INITIALIZED\n", + __func__, + inode->i_ino, fmap_size); + return 0; + } + + rc =3D famfs_fuse_meta_alloc(fmap_buf, fmap_size, &meta); + if (rc) + goto errout; + + /* Publish the famfs metadata on fi->famfs_meta */ + inode_lock(inode); + if (fi->famfs_meta) { + rc =3D -EEXIST; /* file already has famfs metadata */ + } else { + if (famfs_meta_set(fi, meta) !=3D NULL) { + pr_debug("%s: file already had metadata\n", __func__); + __famfs_meta_free(meta); + /* rc is 0 - the file is valid */ + goto unlock_out; + } + i_size_write(inode, 
meta->file_size); + inode->i_flags |=3D S_DAX; + } + unlock_out: + inode_unlock(inode); + +errout: + if (rc) + __famfs_meta_free(meta); + + return rc; +} + #define FMAP_BUFSIZE PAGE_SIZE =20 int @@ -63,12 +409,9 @@ fuse_get_fmap(struct fuse_mount *fm, struct inode *inod= e) } fmap_size =3D rc; =20 - /* We retrieved the "fmap" (the file's map to memory), but - * we haven't used it yet. A call to famfs_file_init_dax() will be added - * here in a subsequent patch, when we add the ability to attach - * fmaps to files. - */ + /* Convert fmap into in-memory format and hang from inode */ + rc =3D famfs_file_init_dax(fm, inode, fmap_buf, fmap_size); =20 kfree(fmap_buf); - return 0; + return rc; } diff --git a/fs/fuse/famfs_kfmap.h b/fs/fuse/famfs_kfmap.h new file mode 100644 index 000000000000..058645cb10a1 --- /dev/null +++ b/fs/fuse/famfs_kfmap.h @@ -0,0 +1,67 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * famfs - dax file system for shared fabric-attached memory + * + * Copyright 2023-2025 Micron Technology, Inc. + */ +#ifndef FAMFS_KFMAP_H +#define FAMFS_KFMAP_H + +/* + * The structures below are the in-memory metadata format for famfs files. + * Metadata retrieved via the GET_FMAP response is converted to this format + * for use in resolving file mapping faults. + * + * The GET_FMAP response contains the same information, but in a more + * message-and-versioning-friendly format. 
Those structs can be found in t= he + * famfs section of include/uapi/linux/fuse.h (aka fuse_kernel.h in libfus= e) + */ + +enum famfs_file_type { + FAMFS_REG, + FAMFS_SUPERBLOCK, + FAMFS_LOG, +}; + +/* We anticipate the possibility of supporting additional types of extents= */ +enum famfs_extent_type { + SIMPLE_DAX_EXTENT, + INTERLEAVED_EXTENT, + INVALID_EXTENT_TYPE, +}; + +struct famfs_meta_simple_ext { + u64 dev_index; + u64 ext_offset; + u64 ext_len; +}; + +struct famfs_meta_interleaved_ext { + u64 fie_nstrips; + u64 fie_chunk_size; + u64 fie_nbytes; + struct famfs_meta_simple_ext *ie_strips; +}; + +/* + * Each famfs dax file has this hanging from its fuse_inode->famfs_meta + */ +struct famfs_file_meta { + bool error; + enum famfs_file_type file_type; + size_t file_size; + enum famfs_extent_type fm_extent_type; + u64 dev_bitmap; /* bitmap of referenced daxdevs by index */ + union { /* This will make code a bit more readable */ + struct { + size_t fm_nextents; + struct famfs_meta_simple_ext *se; + }; + struct { + size_t fm_niext; + struct famfs_meta_interleaved_ext *ie; + }; + }; +}; + +#endif /* FAMFS_KFMAP_H */ diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 691c7850cf4e..f9e920e95baf 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -1658,6 +1658,12 @@ extern void fuse_sysctl_unregister(void); =20 /* famfs.c */ =20 +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) +int famfs_file_init_dax(struct fuse_mount *fm, + struct inode *inode, void *fmap_buf, + size_t fmap_size); +void __famfs_meta_free(void *map); +#endif static inline void famfs_teardown(struct fuse_conn *fc) { #if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) @@ -1665,11 +1671,18 @@ static inline void famfs_teardown(struct fuse_conn = *fc) #endif } =20 +static inline void famfs_meta_init(struct fuse_inode *fi) +{ +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) + fi->famfs_meta =3D NULL; +#endif +} + static inline struct fuse_backing *famfs_meta_set(struct fuse_inode *fi, void *meta) { #if 
IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) - return xchg(&fi->famfs_meta, meta); + return cmpxchg(&fi->famfs_meta, NULL, meta); #else return NULL; #endif @@ -1677,7 +1690,12 @@ static inline struct fuse_backing *famfs_meta_set(st= ruct fuse_inode *fi, =20 static inline void famfs_meta_free(struct fuse_inode *fi) { - /* Stub wil be connected in a subsequent commit */ +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) + if (fi->famfs_meta !=3D NULL) { + __famfs_meta_free(fi->famfs_meta); + WRITE_ONCE(fi->famfs_meta, NULL); + } +#endif } =20 static inline int fuse_file_famfs(struct fuse_inode *fi) diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 9e121a1d63b7..391ead26bfa2 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -121,7 +121,7 @@ static struct inode *fuse_alloc_inode(struct super_bloc= k *sb) fuse_inode_backing_set(fi, NULL); =20 if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)) - famfs_meta_set(fi, NULL); + famfs_meta_init(fi); =20 return &fi->inode; =20 @@ -1485,8 +1485,21 @@ static void process_init_reply(struct fuse_mount *fm= , struct fuse_args *args, timeout =3D arg->request_timeout; =20 if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) && - flags & FUSE_DAX_FMAP) - fc->famfs_iomap =3D 1; + flags & FUSE_DAX_FMAP) { + /* famfs_iomap is only allowed if the fuse + * server has CAP_SYS_RAWIO. This was checked + * in fuse_send_init, and FUSE_DAX_FMAP was + * set in in_flags if so. Only allow enablement + * if we find it there. This function is + * normally not running in fuse server context, + * so we can't do the capability check here...
+ */ + u64 in_flags =3D ((u64)ia->in.flags2 << 32) + | ia->in.flags; + + if (in_flags & FUSE_DAX_FMAP) + fc->famfs_iomap =3D 1; + } } else { ra_pages =3D fc->max_read / PAGE_SIZE; fc->no_lock =3D 1; @@ -1548,7 +1561,7 @@ static struct fuse_init_args *fuse_new_init(struct fu= se_mount *fm) flags |=3D FUSE_SUBMOUNTS; if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH)) flags |=3D FUSE_PASSTHROUGH; - if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)) + if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) && capable(CAP_SYS_RAWIO)) flags |=3D FUSE_DAX_FMAP; =20 /* diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index bfb92a4aa8a9..e6dd3c24bb11 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -243,6 +243,13 @@ * * 7.46 * - Add FUSE_DAX_FMAP capability - ability to handle in-kernel fsdax m= aps + * - Add the following structures for the GET_FMAP message reply compon= ents: + * - struct fuse_famfs_simple_ext + * - struct fuse_famfs_iext + * - struct fuse_famfs_fmap_header + * - Add the following enumerated types + * - enum fuse_famfs_file_type + * - enum famfs_ext_type */ =20 #ifndef _LINUX_FUSE_H @@ -1318,6 +1325,55 @@ struct fuse_uring_cmd_req { =20 /* Famfs fmap message components */ =20 +#define FAMFS_FMAP_VERSION 1 + #define FAMFS_FMAP_MAX 32768 /* Largest supported fmap message */ +#define FUSE_FAMFS_MAX_EXTENTS 32 +#define FUSE_FAMFS_MAX_STRIPS 32 + +enum fuse_famfs_file_type { + FUSE_FAMFS_FILE_REG, + FUSE_FAMFS_FILE_SUPERBLOCK, + FUSE_FAMFS_FILE_LOG, +}; + +enum famfs_ext_type { + FUSE_FAMFS_EXT_SIMPLE =3D 0, + FUSE_FAMFS_EXT_INTERLEAVE =3D 1, +}; + +struct fuse_famfs_simple_ext { + uint32_t se_devindex; + uint32_t reserved; + uint64_t se_offset; + uint64_t se_len; +}; + +struct fuse_famfs_iext { /* Interleaved extent */ + uint32_t ie_nstrips; + uint32_t ie_chunk_size; + uint64_t ie_nbytes; /* Total bytes for this interleaved_ext; + * sum of strips may be more + */ + uint64_t reserved; +}; + +struct fuse_famfs_fmap_header { + uint8_t file_type; /* enum 
famfs_file_type */ + uint8_t reserved; + uint16_t fmap_version; + uint32_t ext_type; /* enum famfs_log_ext_type */ + uint32_t nextents; + uint32_t reserved0; + uint64_t file_size; + uint64_t reserved1; +}; + +static inline int32_t fmap_msg_min_size(void) +{ + /* Smallest fmap message is a header plus one simple extent */ + return (sizeof(struct fuse_famfs_fmap_header) + + sizeof(struct fuse_famfs_simple_ext)); +} =20 #endif /* _LINUX_FUSE_H */ --=20 2.49.0 From nobody Sat Feb 7 07:10:19 2026 Received: from mail-dy1-f169.google.com (mail-dy1-f169.google.com [74.125.82.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C04EB38F248 for ; Wed, 7 Jan 2026 17:09:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767805788; cv=none; b=cfEa3LltdbZD6fgBiUIiw982BdZjnMoeeL0BsJgPB5WGCrbpGLmeTyQoVPDyaH1Ab55+0Nv0cAZ7l+tAKd2Yw/MmIyd9c4bqcw0yUhRCivMoQJX4/+QdBSB5fvivPS8I7B6BekALG0QHUXTieMhezYDAeJfdTFJdh3/roXNY7Zg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767805788; c=relaxed/simple; bh=VILu4/aZEDEuDrhwIC96ZPrRjlL5TOp4r9I40ua7OBM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qhH6dlZcNZOA2r81fRplYbtdfjlxV+JO5bugXWNRYIiwQzHS+s2a2BNcRFvf7CHffrUGBcSmle100zor5jckfJdNkKgPqgR1EAwPMsEDyzvY8xRxHosVSYXKwsvBiuTg3QzZ7QSslPa6fjbz9qJH7bP05PQe6ae2JMkDHX9SNsA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=i27VgbB9; arc=none smtp.client-ip=74.125.82.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="i27VgbB9" Received: by mail-dy1-f169.google.com with SMTP id 5a478bee46e88-2abe15d8a4bso3574583eec.0 for ; Wed, 07 Jan 2026 09:09:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767805783; x=1768410583; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=Z26pH3kxXMl+NsnFCMmTOxJmvl6Ayq/rxp+1YLnLzkg=; b=i27VgbB9RsqdkO6lqsdH/nnl66FrSsX/NEBUmSJ2OhhrL82etGS+5dgKhQLFQOH1aT jUx1f1sa7/EV8/D8/YqDXx98NkAJvo9Om6QoJJz8+f1wVQNPmyhAbuiht9g1j0x3ewwk rNPGDLya62BGzOSdVkixjl6WcXmLaMjSfoQotYnUh9DsXdG5wEcIDXvOTP5NSArxkVfA 0kQ63/Cd3pJlBLXhSCcWCNaxooaaYwaOzhVDgatp0oC5pgSAwcGKOhsO47T4DqBq1IDD iMaTI0DxE/Qebm7Rfduj1R6xkr29lO8ezWChDbIDvSwvYBFGcsM75doSpSdSYugsHNJW TvXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767805783; x=1768410583; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Z26pH3kxXMl+NsnFCMmTOxJmvl6Ayq/rxp+1YLnLzkg=; b=XABxWYIgyOUjKs48vCzus6M2eNHUTBnM6gtoKBALSmBZNJ39gdCFWo20Jfn3YihkXn /BjW+vErU/3Q5RujnD5Eg1+siD+qSBYMieiaoVQJVG478xpU7CjUx538+/9GbTm7eW1V lkAEkiO+WnI0KGIknJ6MsZNehncwEWXIprGXnK2KtdKgzwPK6zIFn9b1FPJACnXh1yUm iJDCZvzxtLX+eETqoWx/UDPMEEEvkXtLr+P3gX7lVlADNx/lpPnm86oHjeiPvL4SNqwd YR+cx2B1m9PUiuBIqJUVF5X57jrp4Ccv3qQbIs8YtJsl3wpwY0bQkA/eOe2040J7KQjX daZw== X-Forwarded-Encrypted: i=1; AJvYcCXj1ynYv9T2oDUDnR4wBM2cvn5ti1pIlE/9htbJDw7K4yRZmXYNDUsBMbPWXUwMlWYXEiFgxj3YURaO5Yw=@vger.kernel.org X-Gm-Message-State: AOJu0Yw2lGcCCivbiYkqGr39mXK0B7mclsTal7oBTHUeaatfRUSBWLpJ i3Z15ypnGS9inYP4EF0w0/LQPmiqB1SsgEsPOp6mQd9/qtjc5khVQ28LuxfcYw== X-Gm-Gg: 
AY/fxX60mG+0t8MF2BDGYdPVgeNDigXH3vTYHbvNeDHPI56Eh+/BuRjuCAMqWDUYBNR ZFbdH0ncNv0xD3lQWhhiPFOCBcHThpFej8iIAXCMMrDAOdkYS7gUlLeDLwWVuWEipzeRSKmGqY+ SUXIs8/GaX+pRry4aT1TmNTiRT3yb+YVfTS36+qioeRxfRJy807WoviMQIXSdaH8v8NQpoa7hFp GxmtZ92WUjRLcTGoxy478M9hzRQKnmxnEdKSUW0FdDDntjIUGfDMhZnGdW1MY2HwgxUCh/CCkBe PxkeH8rcSYH83/EHQimX+Q3I2aKE2aB0AawrHTPqWCtx9vkd76tKHqAnxxl17I/OAycQlz9oJCw THitfV5xeftT4TanR1FTU0nJSjCieGc/3AJVnAjl9L8ZT3HS6+w+xhGHTs+if/SM0n3W3Y7Ii3P 04pO3oa4fQvScmG0bHQZB1oGvw7kKM1r0UUcS7aE213B4U X-Google-Smtp-Source: AGHT+IG/GJazDtrV8Iv0UenI1pYkQFFjdMGheA22/FDANC++il82RbCdKOOAl/rkQAbrZ5tuK05Wfg== X-Received: by 2002:a05:6808:2224:b0:450:d8ef:d804 with SMTP id 5614622812f47-45a6be37898mr1407082b6e.39.1767800066998; Wed, 07 Jan 2026 07:34:26 -0800 (PST) Received: from localhost.localdomain ([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.34.24 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:34:26 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . 
Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 16/21] famfs_fuse: GET_DAXDEV message and daxdev_table Date: Wed, 7 Jan 2026 09:33:25 -0600 Message-ID: <20260107153332.64727-17-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" * The new GET_DAXDEV message/response is added * The famfs.c:famfs_teardown() function is added as a primary teardown function for famfs. * The command is triggered by the famfs_update_daxdev_table() call, if there are any daxdevs in the subject fmap that are not yet represented in the daxdev_table.
* fs/namei.c: export may_open_dev() Signed-off-by: John Groves --- fs/fuse/famfs.c | 236 ++++++++++++++++++++++++++++++++++++++ fs/fuse/famfs_kfmap.h | 26 +++++ fs/fuse/fuse_i.h | 13 ++- fs/fuse/inode.c | 4 +- fs/namei.c | 1 + include/uapi/linux/fuse.h | 20 ++++ 6 files changed, 298 insertions(+), 2 deletions(-) diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c index 2aabd1d589fd..b5cd1b5c1d6c 100644 --- a/fs/fuse/famfs.c +++ b/fs/fuse/famfs.c @@ -20,6 +20,239 @@ #include "famfs_kfmap.h" #include "fuse_i.h" =20 +/* + * famfs_teardown() + * + * Deallocate famfs metadata for a fuse_conn + */ +void +famfs_teardown(struct fuse_conn *fc) +{ + struct famfs_dax_devlist *devlist =3D fc->dax_devlist; + int i; + + kfree(fc->shadow); + + fc->dax_devlist =3D NULL; + + if (!devlist) + return; + + if (!devlist->devlist) + goto out; + + /* Close & release all the daxdevs in our table */ + for (i =3D 0; i < devlist->nslots; i++) { + struct famfs_daxdev *dd =3D &devlist->devlist[i]; + + if (!dd->valid) + continue; + + /* Release reference from dax_dev_get() */ + if (dd->devp) + put_dax(dd->devp); + + kfree(dd->name); + } + kfree(devlist->devlist); + +out: + kfree(devlist); +} + +static int +famfs_verify_daxdev(const char *pathname, dev_t *devno) +{ + struct inode *inode; + struct path path; + int err; + + if (!pathname || !*pathname) + return -EINVAL; + + err =3D kern_path(pathname, LOOKUP_FOLLOW, &path); + if (err) + return err; + + inode =3D d_backing_inode(path.dentry); + if (!S_ISCHR(inode->i_mode)) { + err =3D -EINVAL; + goto out_path_put; + } + + if (!may_open_dev(&path)) { /* had to export this */ + err =3D -EACCES; + goto out_path_put; + } + + *devno =3D inode->i_rdev; + +out_path_put: + path_put(&path); + return err; +} + +/** + * famfs_fuse_get_daxdev() - Retrieve info for a DAX device from fuse serv= er + * + * Send a GET_DAXDEV message to the fuse server to retrieve info on a + * dax device. 
+ * + * @fm: fuse_mount + * @index: the index of the dax device; daxdevs are referred to by index + * in fmaps, and the server resolves the index to a particular da= xdev + * + * Returns: 0=3Dsuccess + * -errno=3Dfailure + */ +static int +famfs_fuse_get_daxdev(struct fuse_mount *fm, const u64 index) +{ + struct fuse_daxdev_out daxdev_out =3D { 0 }; + struct fuse_conn *fc =3D fm->fc; + struct famfs_daxdev *daxdev; + int err =3D 0; + + FUSE_ARGS(args); + + /* Validate the index against the size of our daxdev table */ + if (index >=3D fc->dax_devlist->nslots) { + pr_err("%s: index(%llu) >=3D nslots(%d)\n", + __func__, index, fc->dax_devlist->nslots); + err =3D -EINVAL; + goto out; + } + + args.opcode =3D FUSE_GET_DAXDEV; + args.nodeid =3D index; + + args.in_numargs =3D 0; + + args.out_numargs =3D 1; + args.out_args[0].size =3D sizeof(daxdev_out); + args.out_args[0].value =3D &daxdev_out; + + /* Send GET_DAXDEV command */ + err =3D fuse_simple_request(fm, &args); + if (err) { + pr_err("%s: err=3D%d from fuse_simple_request()\n", + __func__, err); + /* + * Error will be that the payload is smaller than FMAP_BUFSIZE, + * which is the max we can handle. Empty payload handled below.
+ */ + goto out; + } + + down_write(&fc->famfs_devlist_sem); + + daxdev =3D &fc->dax_devlist->devlist[index]; + + /* Abort if daxdev is now valid (race - another thread got it first) */ + if (daxdev->valid) { + up_write(&fc->famfs_devlist_sem); + /* We already have a valid entry at this index */ + pr_debug("%s: daxdev already known\n", __func__); + goto out; + } + + /* Verify that the dev is valid and can be opened and gets the devno */ + err =3D famfs_verify_daxdev(daxdev_out.name, &daxdev->devno); + if (err) { + up_write(&fc->famfs_devlist_sem); + pr_err("%s: err=3D%d from famfs_verify_daxdev()\n", __func__, err); + goto out; + } + + /* This will fail if it's not a dax device */ + daxdev->devp =3D dax_dev_get(daxdev->devno); + if (!daxdev->devp) { + up_write(&fc->famfs_devlist_sem); + pr_warn("%s: device %s not found or not dax\n", + __func__, daxdev_out.name); + err =3D -ENODEV; + goto out; + } + + daxdev->name =3D kstrdup(daxdev_out.name, GFP_KERNEL); + wmb(); /* all daxdev fields must be visible before marking it valid */ + daxdev->valid =3D 1; + + up_write(&fc->famfs_devlist_sem); + +out: + return err; +} + +/** + * famfs_update_daxdev_table() - Update the daxdev table + * @fm - fuse_mount + * @meta - famfs_file_meta, in-memory format, built from a GET_FMAP respon= se + * + * This function is called for each new file fmap, to verify whether all + * referenced daxdevs are already known (i.e. in the table). 
Any daxdev + * indices referenced in @meta but not in the table will be retrieved via + * famfs_fuse_get_daxdev() and added to the table + * + * Return: 0=3Dsuccess + * -errno=3Dfailure + */ +static int +famfs_update_daxdev_table( + struct fuse_mount *fm, + const struct famfs_file_meta *meta) +{ + struct famfs_dax_devlist *local_devlist; + struct fuse_conn *fc =3D fm->fc; + int err; + int i; + + /* First time through we will need to allocate the dax_devlist */ + if (unlikely(!fc->dax_devlist)) { + local_devlist =3D kcalloc(1, sizeof(*fc->dax_devlist), GFP_KERNEL); + if (!local_devlist) + return -ENOMEM; + + local_devlist->nslots =3D MAX_DAXDEVS; + + local_devlist->devlist =3D kcalloc(MAX_DAXDEVS, + sizeof(struct famfs_daxdev), + GFP_KERNEL); + if (!local_devlist->devlist) { + kfree(local_devlist); + return -ENOMEM; + } + + /* We don't need famfs_devlist_sem here because we use cmpxchg */ + if (cmpxchg(&fc->dax_devlist, NULL, local_devlist) !=3D NULL) { + kfree(local_devlist->devlist); + kfree(local_devlist); /* another thread beat us to it */ + } + } + + down_read(&fc->famfs_devlist_sem); + for (i =3D 0; i < fc->dax_devlist->nslots; i++) { + if (!(meta->dev_bitmap & (1ULL << i))) + continue; + + /* This file meta struct references devindex i + * if devindex i isn't in the table; get it... 
+ */ + if (!(fc->dax_devlist->devlist[i].valid)) { + up_read(&fc->famfs_devlist_sem); + + err =3D famfs_fuse_get_daxdev(fm, i); + if (err) + pr_err("%s: failed to get daxdev=3D%d\n", + __func__, i); + + down_read(&fc->famfs_devlist_sem); + } + } + up_read(&fc->famfs_devlist_sem); + + return 0; +} =20 /*************************************************************************= **/ =20 @@ -342,6 +575,9 @@ famfs_file_init_dax( if (rc) goto errout; =20 + /* Make sure this fmap doesn't reference any unknown daxdevs */ + famfs_update_daxdev_table(fm, meta); + /* Publish the famfs metadata on fi->famfs_meta */ inode_lock(inode); if (fi->famfs_meta) { diff --git a/fs/fuse/famfs_kfmap.h b/fs/fuse/famfs_kfmap.h index 058645cb10a1..e76b9057a1e0 100644 --- a/fs/fuse/famfs_kfmap.h +++ b/fs/fuse/famfs_kfmap.h @@ -64,4 +64,30 @@ struct famfs_file_meta { }; }; =20 +/* + * famfs_daxdev - tracking struct for a daxdev within a famfs file system + * + * This is the in-memory daxdev metadata that is populated by parsing + * the responses to GET_FMAP messages + */ +struct famfs_daxdev { + /* Include dev uuid? 
*/ + bool valid; + bool error; + dev_t devno; + struct dax_device *devp; + char *name; +}; + +#define MAX_DAXDEVS 24 + +/* + * famfs_dax_devlist - list of famfs_daxdev's + */ +struct famfs_dax_devlist { + int nslots; + int ndevs; + struct famfs_daxdev *devlist; +}; + #endif /* FAMFS_KFMAP_H */ diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index f9e920e95baf..d308b74c83ec 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -1018,6 +1018,8 @@ struct fuse_conn { u8 blkbits; =20 #if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) + struct rw_semaphore famfs_devlist_sem; + struct famfs_dax_devlist *dax_devlist; char *shadow; #endif }; @@ -1663,13 +1665,15 @@ int famfs_file_init_dax(struct fuse_mount *fm, struct inode *inode, void *fmap_buf, size_t fmap_size); void __famfs_meta_free(void *map); -#endif +void famfs_teardown(struct fuse_conn *fc); +#else static inline void famfs_teardown(struct fuse_conn *fc) { #if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) kfree(fc->shadow); #endif } +#endif =20 static inline void famfs_meta_init(struct fuse_inode *fi) { @@ -1688,6 +1692,13 @@ static inline struct fuse_backing *famfs_meta_set(st= ruct fuse_inode *fi, #endif } =20 +static inline void famfs_init_devlist_sem(struct fuse_conn *fc) +{ +#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) + init_rwsem(&fc->famfs_devlist_sem); +#endif +} + static inline void famfs_meta_free(struct fuse_inode *fi) { #if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX) diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 391ead26bfa2..78787efcfd07 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1497,8 +1497,10 @@ static void process_init_reply(struct fuse_mount *fm= , struct fuse_args *args, u64 in_flags =3D ((u64)ia->in.flags2 << 32) | ia->in.flags; =20 - if (in_flags & FUSE_DAX_FMAP) + if (in_flags & FUSE_DAX_FMAP) { + famfs_init_devlist_sem(fc); fc->famfs_iomap =3D 1; + } } } else { ra_pages =3D fc->max_read / PAGE_SIZE; diff --git a/fs/namei.c b/fs/namei.c index bf0f66f0e9b9..b47511ac7337 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ 
-4162,6 +4162,7 @@ bool may_open_dev(const struct path *path) return !(path->mnt->mnt_flags & MNT_NODEV) && !(path->mnt->mnt_sb->s_iflags & SB_I_NODEV); } +EXPORT_SYMBOL(may_open_dev); =20 static int may_open(struct mnt_idmap *idmap, const struct path *path, int acc_mode, int flag) diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index e6dd3c24bb11..2432ccc4f913 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -247,6 +247,9 @@ * - struct fuse_famfs_simple_ext * - struct fuse_famfs_iext * - struct fuse_famfs_fmap_header + * - Add the following structs for the GET_DAXDEV message and reply + * - struct fuse_get_daxdev_in + * - struct fuse_get_daxdev_out * - Add the following enumerated types * - enum fuse_famfs_file_type * - enum famfs_ext_type @@ -678,6 +681,7 @@ enum fuse_opcode { =20 /* Famfs / devdax opcodes */ FUSE_GET_FMAP =3D 54, + FUSE_GET_DAXDEV =3D 55, =20 /* CUSE specific operations */ CUSE_INIT =3D 4096, @@ -1369,6 +1373,22 @@ struct fuse_famfs_fmap_header { uint64_t reserved1; }; =20 +struct fuse_get_daxdev_in { + uint32_t daxdev_num; +}; + +#define DAXDEV_NAME_MAX 256 + +/* fuse_daxdev_out has enough space for a uuid if we need it */ +struct fuse_daxdev_out { + uint16_t index; + uint16_t reserved; + uint32_t reserved2; + uint64_t reserved3; + uint64_t reserved4; + char name[DAXDEV_NAME_MAX]; +}; + static inline int32_t fmap_msg_min_size(void) { /* Smallest fmap message is a header plus one simple extent */ --=20 2.49.0 From nobody Sat Feb 7 07:10:19 2026 Received: from mail-oi1-f180.google.com (mail-oi1-f180.google.com [209.85.167.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3EE0339527C for ; Wed, 7 Jan 2026 15:34:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.167.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1767800075; cv=none; b=ErGa5yVAqnwMa7xLbpo0eFlBfwcaBjHewzfkbxlC0FyJHLKMMH98ISlobHFDzE1trUVeuUWtdH1EGogOkVHbG7JL7DC/C1hIRbdJt6wre4BVMb9NdqwskImvXpZ15XvVADM8gYN1HWC8OWgUCq0yQwk5OWnX2LTtv0QNiYi5hd0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767800075; c=relaxed/simple; bh=x1HevHzr3na0O8/IiFxQw38tuXfKxwG/GQFHodSpajc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Be78PDWymYrlLKJFcybzlsrlmYmkjohXbk3ZjChwmhfvg+0bc0fdeMNG7ZOen6YZHO1+6J7HQnDhn3xtcsrZnYZ8DHs8a6QtpR67NYvFF2s6IfPVqzTQprpOMgosAzu+dCrLQnwdpL+kJR2GYenm1JZ2JDLr1fLpk84FGcdH26E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ZOEV+Q6z; arc=none smtp.client-ip=209.85.167.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ZOEV+Q6z" Received: by mail-oi1-f180.google.com with SMTP id 5614622812f47-459993ff4fcso917374b6e.1 for ; Wed, 07 Jan 2026 07:34:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767800071; x=1768404871; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=J5SqoQ/RrRvmvQNe0j4L2pW17kZHbFwtV+OnjS+wxCg=; b=ZOEV+Q6zYRoovoaizmwTVA4zFxhZlun8WPAKLi+NxJ8VzuDZJ3eBaqBvc/Vh1OG4H1 CaQQEEdEf/6ERiNjf9vLvDoyI0HFu42GTk5prFbD1UCmfyYTgtKJ2BvwrJk+WqzVi4QU MGdvyHVQOrMXj63MXO9IcaAEFvotMHFWKQp2v9pnAbksyd+DJfjmx46UVqVqwrFus4wq 9okNEul1i634EHkAAIK8t6bRZjAJtxz7bw/ULtURMWsfvtALpUTpNGQyJgZ6Xui8CZYw 
+IxEUuC9LLKNoYUNpWfVVy97x2hv6Jwqh2NL6bjDyl1E3IwsFGxjXwfmCKFl2tXzTO99 iCVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767800071; x=1768404871; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=J5SqoQ/RrRvmvQNe0j4L2pW17kZHbFwtV+OnjS+wxCg=; b=HgAjzhbMxeDy2QV7CFfS/hEaciGe94yxGTNHmzv0EPFZzVC30C16vwu4gDI1q5onMb o2NnHHK2bEXzwkR5x103KQIawRykz6BYR8K1BezVJFy19D7d6oOVz6HKWPMkcUWBA989 5zZtN/kDp8famjzR+s5zYFDaOtHIFkjBCbr2Bd5VvqJrXFUD8i+KxDDzWQ41YxECHFLe cVdevAWomCigv+IsyoRLOpjreIsJcgDjVRUvGPaLDhhhHZKPyzV72kcavBqjNHAntkSI Q5hwxkuuQRd9UDXnb9oliPrtMYuw3RhCLidhb+dOEawFZSHmFxzL6CPIEr76dbF5LTtW blfQ== X-Forwarded-Encrypted: i=1; AJvYcCWxFV20S7tfaHY9+tjAXjZbFXPMPPZA3YXmV1RjUFjI4N6UQJtODOar2K5gxFVgQDIDtIFboPGdRNhwBD8=@vger.kernel.org X-Gm-Message-State: AOJu0YxN3L1+n4bcCHWetB8M2UddaktC6mlChXGFDWcCJu/XUXGWiEyr JVpEFquwQ2fjgQods430JpEwY/melvp9R7r8mejg2Wu3+ntnD16nXMO/ X-Gm-Gg: AY/fxX4oO2iykbhM4Vit2FCqSAKvGg3+4D0O3KoQUaKzwWn6CVJBFX5uyx/juRiuDOA Y3CO/y2Wo88F7BIOlZBLJwdn4MkkHxJHmqPQwK9mOUbPOXe5x4X7CJv36yPl2fjhEJo3OFsY6Fr ItwPQ6lOgtKpblCRRXRFd8Y+nP9Lq5b1bDZncja3+/5uSO8sruY8KFzgrLUjZdJ5K2PBDA4Ilov UHB5++iuEIoFgx42Gywuk0jQP4Cvl38Mm093Mbdpt1UdzS5g2huX7498vX1NjlWIHxVVcJhd+Zh 9XxH7Sz/931VJa/dMkWbQWsR45WlbQUg4cdNyUErAu72DDnxDuDIO77+V3QdKjuUJh/tj/qJt1k VfLLTxi3YGsuIzEqeCqQyQGSfHne4N2QtBuYNNft104XQwNukN/M8E+mn9hvVM8A2Dx+DTsruSw V5rAOXfU3j1lqrUIvzi5K0YKykwPpW+Hz/ZrfCY/E9Enkg X-Google-Smtp-Source: AGHT+IER9TEcP0Gd0RbsM3hsxTOU129XPPmV1lfMnDIbnqKGjMILdd47hgMEMRu2pd2P+LaOJ3lJwA== X-Received: by 2002:a05:6808:c1b2:b0:450:907:b523 with SMTP id 5614622812f47-45a6bd54369mr1160511b6e.6.1767800071059; Wed, 07 Jan 2026 07:34:31 -0800 (PST) Received: from localhost.localdomain ([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.34.29 (version=TLS1_3 
cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:34:30 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 17/21] famfs_fuse: Plumb dax iomap and fuse read/write/mmap Date: Wed, 7 Jan 2026 09:33:26 -0600 Message-ID: <20260107153332.64727-18-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This commit fills in read/write/mmap handling for famfs files. The dev_dax_iomap interface is used - just like xfs in fs-dax mode. * Read/write are handled by famfs_fuse_[read|write]_iter() via dax_iomap_rw() to fsdev_dax. * Mmap is handled by famfs_fuse_mmap() * Faults are handled by famfs_filemap*fault(), using dax_iomap_fault() to fsdev_dax. 
* File offset to dax offset resolution is handled via famfs_fuse_iomap_begin(), which uses famfs "fmaps" to resolve the requested (file, offset) to an offset on a dax device (by way of famfs_fileofs_to_daxofs() and famfs_interleave_fileofs_to_daxofs()) Signed-off-by: John Groves --- fs/fuse/famfs.c | 458 +++++++++++++++++++++++++++++++++++++++++++++++ fs/fuse/file.c | 18 +- fs/fuse/fuse_i.h | 18 ++ 3 files changed, 492 insertions(+), 2 deletions(-) diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c index b5cd1b5c1d6c..c02b14789c6e 100644 --- a/fs/fuse/famfs.c +++ b/fs/fuse/famfs.c @@ -602,6 +602,464 @@ famfs_file_init_dax( return rc; } =20 +/********************************************************************* + * iomap_operations + * + * This stuff uses the iomap (dax-related) helpers to resolve file offsets= to + * offsets within a dax device. + */ + +static ssize_t famfs_file_bad(struct inode *inode); + +static int +famfs_interleave_fileofs_to_daxofs(struct inode *inode, struct iomap *ioma= p, + loff_t file_offset, off_t len, unsigned int flags) +{ + struct fuse_inode *fi =3D get_fuse_inode(inode); + struct famfs_file_meta *meta =3D fi->famfs_meta; + struct fuse_conn *fc =3D get_fuse_conn(inode); + loff_t local_offset =3D file_offset; + int i; + + /* This function is only for extent_type INTERLEAVED_EXTENT */ + if (meta->fm_extent_type !=3D INTERLEAVED_EXTENT) { + pr_err("%s: bad extent type\n", __func__); + goto err_out; + } + + if (famfs_file_bad(inode)) + goto err_out; + + iomap->offset =3D file_offset; + + for (i =3D 0; i < meta->fm_niext; i++) { + struct famfs_meta_interleaved_ext *fei =3D &meta->ie[i]; + u64 chunk_size =3D fei->fie_chunk_size; + u64 nstrips =3D fei->fie_nstrips; + u64 ext_size =3D fei->fie_nbytes; + + ext_size =3D min_t(u64, ext_size, meta->file_size); + + if (ext_size =3D=3D 0) { + pr_err("%s: ext_size=3D%lld file_size=3D%ld\n", + __func__, fei->fie_nbytes, meta->file_size); + goto err_out; + } + + /* Is the data in this striped
extent? */ + if (local_offset < ext_size) { + u64 chunk_num =3D local_offset / chunk_size; + u64 chunk_offset =3D local_offset % chunk_size; + u64 stripe_num =3D chunk_num / nstrips; + u64 strip_num =3D chunk_num % nstrips; + u64 chunk_remainder =3D chunk_size - chunk_offset; + u64 strip_offset =3D chunk_offset + (stripe_num * chunk_size); + u64 strip_dax_ofs =3D fei->ie_strips[strip_num].ext_offset; + u64 strip_devidx =3D fei->ie_strips[strip_num].dev_index; + + if (strip_devidx >=3D fc->dax_devlist->nslots) { + pr_err("%s: strip_devidx %llu >=3D nslots %d\n", + __func__, strip_devidx, + fc->dax_devlist->nslots); + goto err_out; + } + + if (!fc->dax_devlist->devlist[strip_devidx].valid) { + pr_err("%s: daxdev=3D%llu invalid\n", __func__, + strip_devidx); + goto err_out; + } + + iomap->addr =3D strip_dax_ofs + strip_offset; + iomap->offset =3D file_offset; + iomap->length =3D min_t(loff_t, len, chunk_remainder); + + iomap->dax_dev =3D fc->dax_devlist->devlist[strip_devidx].devp; + + iomap->type =3D IOMAP_MAPPED; + iomap->flags =3D flags; + + return 0; + } + local_offset -=3D ext_size; /* offset is beyond this striped extent */ + } + + err_out: + pr_err("%s: err_out\n", __func__); + + /* We fell out the end of the extent list. + * Set iomap to zero length in this case, and return 0 + * This just means that the r/w is past EOF + */ + iomap->addr =3D 0; /* there is no valid dax device offset */ + iomap->offset =3D file_offset; /* file offset */ + iomap->length =3D 0; /* this had better result in no access to dax mem */ + iomap->dax_dev =3D NULL; + iomap->type =3D IOMAP_MAPPED; + iomap->flags =3D flags; + + return 0; +} + +/** + * famfs_fileofs_to_daxofs() - Resolve (file, offset, len) to (daxdev, off= set, len) + * + * This function is called by famfs_fuse_iomap_begin() to resolve an offse= t in a + * file to an offset in a dax device. This is upcalled from dax from calls= to + * both dax_iomap_fault() and dax_iomap_rw().
Dax finishes the job reso= lving + * a fault to a specific physical page (the fault case) or doing a memcpy + * variant (the rw case) + * + * Pages can be PTE (4k), PMD (2MiB) or (theoretically) PUD (1GiB) + * (these sizes are for x86; may vary on other cpu architectures) + * + * @inode: The file where the fault occurred + * @iomap: To be filled in to indicate where to find the right memor= y, + * relative to a dax device. + * @file_offset: Within the file where the fault occurred (will be page bo= undary) + * @len: The length of the faulted mapping (will be a page multipl= e) + * (will be trimmed in *iomap if it's disjoint in the extent= list) + * @flags: + * + * Return values: 0. (info is returned in a modified @iomap struct) + */ +static int +famfs_fileofs_to_daxofs(struct inode *inode, struct iomap *iomap, + loff_t file_offset, off_t len, unsigned int flags) +{ + struct fuse_inode *fi =3D get_fuse_inode(inode); + struct famfs_file_meta *meta =3D fi->famfs_meta; + struct fuse_conn *fc =3D get_fuse_conn(inode); + loff_t local_offset =3D file_offset; + int i; + + if (!fc->dax_devlist) { + pr_err("%s: null dax_devlist\n", __func__); + goto err_out; + } + + if (famfs_file_bad(inode)) + goto err_out; + + if (meta->fm_extent_type =3D=3D INTERLEAVED_EXTENT) + return famfs_interleave_fileofs_to_daxofs(inode, iomap, + file_offset, + len, flags); + + iomap->offset =3D file_offset; + + for (i =3D 0; i < meta->fm_nextents; i++) { + /* TODO: check devindex too */ + loff_t dax_ext_offset =3D meta->se[i].ext_offset; + loff_t dax_ext_len =3D meta->se[i].ext_len; + u64 daxdev_idx =3D meta->se[i].dev_index; + + + /* TODO: test that superblock and log offsets only happen + * with superblock and log files. Requires instrumentation + * from user space...
+ */ + + /* local_offset is the offset minus the size of extents skipped + * so far; If local_offset < dax_ext_len, the data of interest + * starts in this extent + */ + if (local_offset < dax_ext_len) { + loff_t ext_len_remainder =3D dax_ext_len - local_offset; + struct famfs_daxdev *dd; + + if (daxdev_idx >=3D fc->dax_devlist->nslots) { + pr_err("%s: daxdev_idx %llu >=3D nslots %d\n", + __func__, daxdev_idx, + fc->dax_devlist->nslots); + goto err_out; + } + + dd =3D &fc->dax_devlist->devlist[daxdev_idx]; + + if (!dd->valid || dd->error) { + pr_err("%s: daxdev=3D%lld %s\n", __func__, + daxdev_idx, + dd->valid ? "error" : "invalid"); + goto err_out; + } + + /* + * OK, we found the file metadata extent where this + * data begins + * @local_offset - The offset within the current + * extent + * @ext_len_remainder - Remaining length of ext after + * skipping local_offset + * Outputs: + * iomap->addr: the offset within the dax device where + * the data starts + * iomap->offset: the file offset + * iomap->length: the valid length resolved here + */ + iomap->addr =3D dax_ext_offset + local_offset; + iomap->offset =3D file_offset; + iomap->length =3D min_t(loff_t, len, ext_len_remainder); + + iomap->dax_dev =3D fc->dax_devlist->devlist[daxdev_idx].devp; + + iomap->type =3D IOMAP_MAPPED; + iomap->flags =3D flags; + return 0; + } + local_offset -=3D dax_ext_len; /* Get ready for the next extent */ + } + + err_out: + pr_err("%s: err_out\n", __func__); + + /* We fell out the end of the extent list. 
+ * Set iomap to zero length in this case, and return 0 + * This just means that the r/w is past EOF + */ + iomap->addr =3D 0; /* there is no valid dax device offset */ + iomap->offset =3D file_offset; /* file offset */ + iomap->length =3D 0; /* this had better result in no access to dax mem */ + iomap->dax_dev =3D NULL; + iomap->type =3D IOMAP_MAPPED; + iomap->flags =3D flags; + + return 0; +} + +/** + * famfs_fuse_iomap_begin() - Handler for iomap_begin upcall from dax + * + * This function is pretty simple because files are + * * never partially allocated + * * never have holes (never sparse) + * * never "allocate on write" + * + * @inode: inode for the file being accessed + * @offset: offset within the file + * @length: Length being accessed at offset + * @flags: + * @iomap: iomap struct to be filled in, resolving (offset, length) to + * (daxdev, offset, len) + * @srcmap: + */ +static int +famfs_fuse_iomap_begin(struct inode *inode, loff_t offset, loff_t length, + unsigned int flags, struct iomap *iomap, struct iomap *srcmap) +{ + struct fuse_inode *fi =3D get_fuse_inode(inode); + struct famfs_file_meta *meta =3D fi->famfs_meta; + size_t size; + + size =3D i_size_read(inode); + + WARN_ON(size !=3D meta->file_size); + + return famfs_fileofs_to_daxofs(inode, iomap, offset, length, flags); +} + +/* Note: We never need a special set of write_iomap_ops because famfs never + * performs allocation on write. 
+ */ +const struct iomap_ops famfs_iomap_ops =3D { + .iomap_begin =3D famfs_fuse_iomap_begin, +}; + +/********************************************************************* + * vm_operations + */ +static vm_fault_t +__famfs_fuse_filemap_fault(struct vm_fault *vmf, unsigned int pe_size, + bool write_fault) +{ + struct inode *inode =3D file_inode(vmf->vma->vm_file); + vm_fault_t ret; + unsigned long pfn; + + if (!IS_DAX(file_inode(vmf->vma->vm_file))) { + pr_err("%s: file not marked IS_DAX!!\n", __func__); + return VM_FAULT_SIGBUS; + } + + if (write_fault) { + sb_start_pagefault(inode->i_sb); + file_update_time(vmf->vma->vm_file); + } + + ret =3D dax_iomap_fault(vmf, pe_size, &pfn, NULL, &famfs_iomap_ops); + if (ret & VM_FAULT_NEEDDSYNC) + ret =3D dax_finish_sync_fault(vmf, pe_size, pfn); + + if (write_fault) + sb_end_pagefault(inode->i_sb); + + return ret; +} + +static inline bool +famfs_is_write_fault(struct vm_fault *vmf) +{ + return (vmf->flags & FAULT_FLAG_WRITE) && + (vmf->vma->vm_flags & VM_SHARED); +} + +static vm_fault_t +famfs_filemap_fault(struct vm_fault *vmf) +{ + return __famfs_fuse_filemap_fault(vmf, 0, famfs_is_write_fault(vmf)); +} + +static vm_fault_t +famfs_filemap_huge_fault(struct vm_fault *vmf, unsigned int pe_size) +{ + return __famfs_fuse_filemap_fault(vmf, pe_size, famfs_is_write_fault(vmf)= ); +} + +static vm_fault_t +famfs_filemap_page_mkwrite(struct vm_fault *vmf) +{ + return __famfs_fuse_filemap_fault(vmf, 0, true); +} + +static vm_fault_t +famfs_filemap_pfn_mkwrite(struct vm_fault *vmf) +{ + return __famfs_fuse_filemap_fault(vmf, 0, true); +} + +static vm_fault_t +famfs_filemap_map_pages(struct vm_fault *vmf, pgoff_t start_pgoff, + pgoff_t end_pgoff) +{ + return filemap_map_pages(vmf, start_pgoff, end_pgoff); +} + +const struct vm_operations_struct famfs_file_vm_ops =3D { + .fault =3D famfs_filemap_fault, + .huge_fault =3D famfs_filemap_huge_fault, + .map_pages =3D famfs_filemap_map_pages, + .page_mkwrite =3D famfs_filemap_page_mkwrite, + 
	.pfn_mkwrite = famfs_filemap_pfn_mkwrite,
+};
+
+/*********************************************************************
+ * file_operations
+ */
+
+/**
+ * famfs_file_bad() - Check for files that aren't in a valid state
+ *
+ * @inode - inode
+ *
+ * Returns: 0=success
+ *          -errno=failure
+ */
+static ssize_t
+famfs_file_bad(struct inode *inode)
+{
+	struct fuse_inode *fi = get_fuse_inode(inode);
+	struct famfs_file_meta *meta = fi->famfs_meta;
+	size_t i_size = i_size_read(inode);
+
+	if (!meta) {
+		pr_err("%s: un-initialized famfs file\n", __func__);
+		return -EIO;
+	}
+	if (meta->error) {
+		pr_debug("%s: previously detected metadata errors\n", __func__);
+		return -EIO;
+	}
+	if (i_size != meta->file_size) {
+		pr_warn("%s: i_size overwritten from %ld to %ld\n",
+			__func__, meta->file_size, i_size);
+		meta->error = true;
+		return -ENXIO;
+	}
+	if (!IS_DAX(inode)) {
+		pr_debug("%s: inode %llx IS_DAX is false\n",
+			 __func__, (u64)inode);
+		return -ENXIO;
+	}
+	return 0;
+}
+
+static ssize_t
+famfs_fuse_rw_prep(struct kiocb *iocb, struct iov_iter *ubuf)
+{
+	struct inode *inode = iocb->ki_filp->f_mapping->host;
+	size_t i_size = i_size_read(inode);
+	size_t count = iov_iter_count(ubuf);
+	size_t max_count;
+	ssize_t rc;
+
+	rc = famfs_file_bad(inode);
+	if (rc)
+		return rc;
+
+	/* Avoid unsigned underflow if position is past EOF */
+	if (iocb->ki_pos >= i_size)
+		max_count = 0;
+	else
+		max_count = i_size - iocb->ki_pos;
+
+	if (count > max_count)
+		iov_iter_truncate(ubuf, max_count);
+
+	if (!iov_iter_count(ubuf))
+		return 0;
+
+	return rc;
+}
+
+ssize_t
+famfs_fuse_read_iter(struct kiocb *iocb, struct iov_iter *to)
+{
+	ssize_t rc;
+
+	rc = famfs_fuse_rw_prep(iocb, to);
+	if (rc)
+		return rc;
+
+	if (!iov_iter_count(to))
+		return 0;
+
+	rc = dax_iomap_rw(iocb, to, &famfs_iomap_ops);
+
+	file_accessed(iocb->ki_filp);
+	return rc;
+}
+
+ssize_t
+famfs_fuse_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+	ssize_t rc;
+
+	rc = famfs_fuse_rw_prep(iocb, from);
+	if (rc)
+		return rc;
+
+	if (!iov_iter_count(from))
+		return 0;
+
+	return dax_iomap_rw(iocb, from, &famfs_iomap_ops);
+}
+
+int
+famfs_fuse_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct inode *inode = file_inode(file);
+	ssize_t rc;
+
+	rc = famfs_file_bad(inode);
+	if (rc)
+		return (int)rc;
+
+	file_accessed(file);
+	vma->vm_ops = &famfs_file_vm_ops;
+	vm_flags_set(vma, VM_HUGEPAGE);
+	return 0;
+}
+
 #define FMAP_BUFSIZE PAGE_SIZE
 
 int
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 1f64bf68b5ee..45a09a7f0012 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1831,6 +1831,8 @@ static ssize_t fuse_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 
 	if (FUSE_IS_VIRTIO_DAX(fi))
 		return fuse_dax_read_iter(iocb, to);
+	if (fuse_file_famfs(fi))
+		return famfs_fuse_read_iter(iocb, to);
 
 	/* FOPEN_DIRECT_IO overrides FOPEN_PASSTHROUGH */
 	if (ff->open_flags & FOPEN_DIRECT_IO)
@@ -1853,6 +1855,8 @@ static ssize_t fuse_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 
 	if (FUSE_IS_VIRTIO_DAX(fi))
 		return fuse_dax_write_iter(iocb, from);
+	if (fuse_file_famfs(fi))
+		return famfs_fuse_write_iter(iocb, from);
 
 	/* FOPEN_DIRECT_IO overrides FOPEN_PASSTHROUGH */
 	if (ff->open_flags & FOPEN_DIRECT_IO)
@@ -1868,9 +1872,13 @@ static ssize_t fuse_splice_read(struct file *in, loff_t *ppos,
 				unsigned int flags)
 {
 	struct fuse_file *ff = in->private_data;
+	struct inode *inode = file_inode(in);
+	struct fuse_inode *fi = get_fuse_inode(inode);
 
 	/* FOPEN_DIRECT_IO overrides FOPEN_PASSTHROUGH */
-	if (fuse_file_passthrough(ff) && !(ff->open_flags & FOPEN_DIRECT_IO))
+	if (fuse_file_famfs(fi))
+		return -EIO; /* famfs does not use the page cache... */
+	else if (fuse_file_passthrough(ff) && !(ff->open_flags & FOPEN_DIRECT_IO))
 		return fuse_passthrough_splice_read(in, ppos, pipe, len, flags);
 	else
 		return filemap_splice_read(in, ppos, pipe, len, flags);
@@ -1880,9 +1888,13 @@ static ssize_t fuse_splice_write(struct pipe_inode_info *pipe, struct file *out,
 				 loff_t *ppos, size_t len, unsigned int flags)
 {
 	struct fuse_file *ff = out->private_data;
+	struct inode *inode = file_inode(out);
+	struct fuse_inode *fi = get_fuse_inode(inode);
 
 	/* FOPEN_DIRECT_IO overrides FOPEN_PASSTHROUGH */
-	if (fuse_file_passthrough(ff) && !(ff->open_flags & FOPEN_DIRECT_IO))
+	if (fuse_file_famfs(fi))
+		return -EIO; /* famfs does not use the page cache... */
+	else if (fuse_file_passthrough(ff) && !(ff->open_flags & FOPEN_DIRECT_IO))
 		return fuse_passthrough_splice_write(pipe, out, ppos, len, flags);
 	else
 		return iter_file_splice_write(pipe, out, ppos, len, flags);
@@ -2390,6 +2402,8 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma)
 	/* DAX mmap is superior to direct_io mmap */
 	if (FUSE_IS_VIRTIO_DAX(fi))
 		return fuse_dax_mmap(file, vma);
+	if (fuse_file_famfs(fi))
+		return famfs_fuse_mmap(file, vma);
 
 	/*
 	 * If inode is in passthrough io mode, because it has some file open
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index d308b74c83ec..5e52c3ba6e94 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1664,6 +1664,9 @@ extern void fuse_sysctl_unregister(void);
 int famfs_file_init_dax(struct fuse_mount *fm,
 			struct inode *inode, void *fmap_buf,
 			size_t fmap_size);
+ssize_t famfs_fuse_write_iter(struct kiocb *iocb, struct iov_iter *from);
+ssize_t famfs_fuse_read_iter(struct kiocb *iocb, struct iov_iter *to);
+int famfs_fuse_mmap(struct file *file, struct vm_area_struct *vma);
 void __famfs_meta_free(void *map);
 void famfs_teardown(struct fuse_conn *fc);
 #else
@@ -1673,6 +1676,21 @@ static inline void famfs_teardown(struct fuse_conn *fc)
 	kfree(fc->shadow);
 #endif
 }
+static inline ssize_t famfs_fuse_write_iter(struct kiocb *iocb,
+					    struct iov_iter *to)
+{
+	return -ENODEV;
+}
+static inline ssize_t famfs_fuse_read_iter(struct kiocb *iocb,
+					   struct iov_iter *to)
+{
+	return -ENODEV;
+}
+static inline int famfs_fuse_mmap(struct file *file,
+				  struct vm_area_struct *vma)
+{
+	return -ENODEV;
+}
 #endif
 
 static inline void famfs_meta_init(struct fuse_inode *fi)
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
	Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
	Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
	Christian Brauner, "Darrick J. Wong", Randy Dunlap, Jeff Layton,
	Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
	Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
	Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
	Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 18/21] famfs_fuse: Add holder_operations for dax notify_failure()
Date: Wed, 7 Jan 2026 09:33:27 -0600
Message-ID: <20260107153332.64727-19-john@groves.net>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Memory errors are at least somewhat more likely on disaggregated memory
than on on-board memory. This commit registers to be notified by fsdev_dax
in the event that a memory failure is detected.

When a file access resolves to a daxdev with memory errors, it will fail
with an appropriate error.

If a daxdev failed fs_dax_get(), we set dd->dax_err. If a daxdev called
our notify_failure(), we set dd->error. When a file access hits either
condition, we set (file)->error and stop allowing access to that file.

In general, the recovery from memory errors is to unmount the file system
and re-initialize the memory, but there may be usable degraded modes of
operation - particularly in the future when famfs supports file systems
backed by more than one daxdev. In those cases, accessing data that is on
a working daxdev can still work.

For now, return errors for any file that has encountered a memory or dax
error.
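For reviewers, a minimal user-space sketch of the check order described
above. It is not the kernel code: struct daxdev_state, daxdev_check() and
file_check() are hypothetical names used only for illustration, and the
EHWPOISON fallback value is an assumption for non-Linux builds.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; assumed fallback for other libcs */
#endif

/* Hypothetical stand-in for the per-daxdev state bits in this patch */
struct daxdev_state {
	bool valid;	/* slot is populated */
	bool dax_err;	/* fs_dax_get() failed */
	bool error;	/* notify_failure() reported a memory error */
};

/* Returns 0 if the daxdev is usable, otherwise a negative errno */
static int daxdev_check(const struct daxdev_state *dd)
{
	assert(dd != NULL);

	if (!dd->valid)
		return -EIO;
	if (dd->dax_err)
		return -EIO;
	if (dd->error)
		return -EHWPOISON;
	return 0;
}

/* A file that has failed once stays failed, whichever daxdev it hits next */
static int file_check(bool *file_error, const struct daxdev_state *dd)
{
	int rc;

	if (*file_error)
		return -EIO;

	rc = daxdev_check(dd);
	if (rc)
		*file_error = true;
	return rc;
}
```

In this sketch, the first access that resolves to a poisoned daxdev
returns -EHWPOISON, and every later access to the same file returns -EIO.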

Signed-off-by: John Groves
---
 fs/fuse/famfs.c       | 115 +++++++++++++++++++++++++++++++++++++++---
 fs/fuse/famfs_kfmap.h |   3 +-
 2 files changed, 109 insertions(+), 9 deletions(-)

diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c
index c02b14789c6e..4eb87c5c628e 100644
--- a/fs/fuse/famfs.c
+++ b/fs/fuse/famfs.c
@@ -20,6 +20,26 @@
 #include "famfs_kfmap.h"
 #include "fuse_i.h"
 
+static void famfs_set_daxdev_err(
+	struct fuse_conn *fc, struct dax_device *dax_devp);
+
+static int
+famfs_dax_notify_failure(struct dax_device *dax_devp, u64 offset,
+			 u64 len, int mf_flags)
+{
+	struct fuse_conn *fc = dax_holder(dax_devp);
+
+	famfs_set_daxdev_err(fc, dax_devp);
+
+	return 0;
+}
+
+static const struct dax_holder_operations famfs_fuse_dax_holder_ops = {
+	.notify_failure = famfs_dax_notify_failure,
+};
+
+/*****************************************************************************/
+
 /*
  * famfs_teardown()
  *
@@ -48,9 +68,12 @@ famfs_teardown(struct fuse_conn *fc)
 		if (!dd->valid)
 			continue;
 
-		/* Release reference from dax_dev_get() */
-		if (dd->devp)
+		/* Only call fs_put_dax if fs_dax_get succeeded */
+		if (dd->devp) {
+			if (!dd->dax_err)
+				fs_put_dax(dd->devp, fc);
 			put_dax(dd->devp);
+		}
 
 		kfree(dd->name);
 	}
@@ -174,6 +197,17 @@ famfs_fuse_get_daxdev(struct fuse_mount *fm, const u64 index)
 		goto out;
 	}
 
+	err = fs_dax_get(daxdev->devp, fc, &famfs_fuse_dax_holder_ops);
+	if (err) {
+		/* If fs_dax_get() fails, we don't attempt recovery;
+		 * We mark the daxdev valid with dax_err
+		 */
+		daxdev->dax_err = 1;
+		pr_err("%s: fs_dax_get(%lld) failed\n",
+		       __func__, (u64)daxdev->devno);
+		err = -EBUSY;
+	}
+
 	daxdev->name = kstrdup(daxdev_out.name, GFP_KERNEL);
 	wmb(); /* all daxdev fields must be visible before marking it valid */
 	daxdev->valid = 1;
@@ -254,6 +288,38 @@ famfs_update_daxdev_table(
 	return 0;
 }
 
+static void
+famfs_set_daxdev_err(
+	struct fuse_conn *fc,
+	struct dax_device *dax_devp)
+{
+	int i;
+
+	/* Gotta search the list by dax_devp;
+	 * read lock because we're not adding or removing daxdev entries
+	 */
+	down_read(&fc->famfs_devlist_sem);
+	for (i = 0; i < fc->dax_devlist->nslots; i++) {
+		if (fc->dax_devlist->devlist[i].valid) {
+			struct famfs_daxdev *dd = &fc->dax_devlist->devlist[i];
+
+			if (dd->devp != dax_devp)
+				continue;
+
+			dd->error = true;
+			up_read(&fc->famfs_devlist_sem);
+
+			pr_err("%s: memory error on daxdev %s (%d)\n",
+			       __func__, dd->name, i);
+			goto done;
+		}
+	}
+	up_read(&fc->famfs_devlist_sem);
+	pr_err("%s: memory err on unrecognized daxdev\n", __func__);
+
+done:
+}
+
 /***************************************************************************/
 
 void
@@ -611,6 +677,26 @@ famfs_file_init_dax(
 
 static ssize_t famfs_file_bad(struct inode *inode);
 
+static int famfs_dax_err(struct famfs_daxdev *dd)
+{
+	if (!dd->valid) {
+		pr_err("%s: daxdev=%s invalid\n",
+		       __func__, dd->name);
+		return -EIO;
+	}
+	if (dd->dax_err) {
+		pr_err("%s: daxdev=%s dax_err\n",
+		       __func__, dd->name);
+		return -EIO;
+	}
+	if (dd->error) {
+		pr_err("%s: daxdev=%s memory error\n",
+		       __func__, dd->name);
+		return -EHWPOISON;
+	}
+	return 0;
+}
+
 static int
 famfs_interleave_fileofs_to_daxofs(struct inode *inode, struct iomap *iomap,
 			 loff_t file_offset, off_t len, unsigned int flags)
@@ -648,6 +734,7 @@ famfs_interleave_fileofs_to_daxofs(struct inode *inode, struct iomap *iomap,
 
 		/* Is the data is in this striped extent? */
 		if (local_offset < ext_size) {
+			struct famfs_daxdev *dd;
 			u64 chunk_num = local_offset / chunk_size;
 			u64 chunk_offset = local_offset % chunk_size;
 			u64 stripe_num = chunk_num / nstrips;
@@ -656,6 +743,7 @@ famfs_interleave_fileofs_to_daxofs(struct inode *inode, struct iomap *iomap,
 			u64 strip_offset = chunk_offset + (stripe_num * chunk_size);
 			u64 strip_dax_ofs = fei->ie_strips[strip_num].ext_offset;
 			u64 strip_devidx = fei->ie_strips[strip_num].dev_index;
+			int rc;
 
 			if (strip_devidx >= fc->dax_devlist->nslots) {
 				pr_err("%s: strip_devidx %llu >= nslots %d\n",
@@ -670,6 +758,15 @@ famfs_interleave_fileofs_to_daxofs(struct inode *inode, struct iomap *iomap,
 				goto err_out;
 			}
 
+			dd = &fc->dax_devlist->devlist[strip_devidx];
+
+			rc = famfs_dax_err(dd);
+			if (rc) {
+				/* Shut down access to this file */
+				meta->error = true;
+				return rc;
+			}
+
 			iomap->addr = strip_dax_ofs + strip_offset;
 			iomap->offset = file_offset;
 			iomap->length = min_t(loff_t, len, chunk_remainder);
@@ -767,6 +864,7 @@ famfs_fileofs_to_daxofs(struct inode *inode, struct iomap *iomap,
 		if (local_offset < dax_ext_len) {
 			loff_t ext_len_remainder = dax_ext_len - local_offset;
 			struct famfs_daxdev *dd;
+			int rc;
 
 			if (daxdev_idx >= fc->dax_devlist->nslots) {
 				pr_err("%s: daxdev_idx %llu >= nslots %d\n",
@@ -777,11 +875,11 @@ famfs_fileofs_to_daxofs(struct inode *inode, struct iomap *iomap,
 
 			dd = &fc->dax_devlist->devlist[daxdev_idx];
 
-			if (!dd->valid || dd->error) {
-				pr_err("%s: daxdev=%lld %s\n", __func__,
-				       daxdev_idx,
-				       dd->valid ? "error" : "invalid");
-				goto err_out;
+			rc = famfs_dax_err(dd);
+			if (rc) {
+				/* Shut down access to this file */
+				meta->error = true;
+				return rc;
 			}
 
 			/*
@@ -966,7 +1064,8 @@ famfs_file_bad(struct inode *inode)
 		return -EIO;
 	}
 	if (meta->error) {
-		pr_debug("%s: previously detected metadata errors\n", __func__);
+		pr_debug("%s: previously detected metadata errors\n",
+			 __func__);
 		return -EIO;
 	}
 	if (i_size != meta->file_size) {
diff --git a/fs/fuse/famfs_kfmap.h b/fs/fuse/famfs_kfmap.h
index e76b9057a1e0..6a6420bdff48 100644
--- a/fs/fuse/famfs_kfmap.h
+++ b/fs/fuse/famfs_kfmap.h
@@ -73,7 +73,8 @@ struct famfs_file_meta {
 struct famfs_daxdev {
 	/* Include dev uuid? */
 	bool valid;
-	bool error;
+	bool error; /* Dax has reported a memory error (probably poison) */
+	bool dax_err; /* fs_dax_get() failed */
 	dev_t devno;
 	struct dax_device *devp;
 	char *name;
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
	Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
	Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
	Christian Brauner, "Darrick J. Wong", Randy Dunlap, Jeff Layton,
	Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
	Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
	Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
	Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 19/21] famfs_fuse: Add DAX address_space_operations with noop_dirty_folio
Date: Wed, 7 Jan 2026 09:33:28 -0600
Message-ID: <20260107153332.64727-20-john@groves.net>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: John Groves

Famfs is memory-backed; there is no place to write back to, and no
reason to mark pages dirty at all.

Signed-off-by: John Groves
---
 fs/fuse/famfs.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c
index 4eb87c5c628e..32c3d0c2ec48 100644
--- a/fs/fuse/famfs.c
+++ b/fs/fuse/famfs.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -38,6 +39,15 @@ static const struct dax_holder_operations famfs_fuse_dax_holder_ops = {
 	.notify_failure = famfs_dax_notify_failure,
 };
 
+/*
+ * DAX address_space_operations for famfs.
+ * famfs doesn't need dirty tracking - writes go directly to
+ * memory with no writeback required.
+ */
+static const struct address_space_operations famfs_dax_aops = {
+	.dirty_folio = noop_dirty_folio,
+};
+
 /*****************************************************************************/
 
 /*
@@ -657,6 +667,7 @@ famfs_file_init_dax(
 		}
 		i_size_write(inode, meta->file_size);
 		inode->i_flags |= S_DAX;
+		inode->i_data.a_ops = &famfs_dax_aops;
 	}
 unlock_out:
 	inode_unlock(inode);
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026
From: John Groves
To: John Groves, Miklos Szeredi, Dan Williams, Bernd Schubert,
	Alison Schofield
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
	Matthew Wilcox, Jan Kara, Alexander Viro, David Hildenbrand,
	Christian Brauner, "Darrick J. Wong", Randy Dunlap, Jeff Layton,
	Amir Goldstein, Jonathan Cameron, Stefan Hajnoczi, Joanne Koong,
	Josef Bacik, Bagas Sanjaya, Chen Linxuan, James Morse, Fuad Tabba,
	Sean Christopherson, Shivank Garg, Ackerley Tng, Gregory Price,
	Aravind Ramesh, Ajay Joshi, venkataravis@micron.com,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, John Groves
Subject: [PATCH V3 20/21] famfs_fuse: Add famfs fmap metadata documentation
Date: Wed, 7 Jan 2026 09:33:29 -0600
Message-ID: <20260107153332.64727-21-john@groves.net>
X-Mailer: git-send-email 2.50.1
In-Reply-To: <20260107153332.64727-1-john@groves.net>
References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: John Groves

This describes the fmap metadata, both simple and interleaved.

Signed-off-by: John Groves
---
 fs/fuse/famfs_kfmap.h | 73 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/fs/fuse/famfs_kfmap.h b/fs/fuse/famfs_kfmap.h
index 6a6420bdff48..ac5971d4c63a 100644
--- a/fs/fuse/famfs_kfmap.h
+++ b/fs/fuse/famfs_kfmap.h
@@ -7,6 +7,79 @@
 #ifndef FAMFS_KFMAP_H
 #define FAMFS_KFMAP_H
 
+/* KABI version 43 (aka v2) fmap structures
+ *
+ * The location of the memory backing for a famfs file is described by
+ * the response to the GET_FMAP fuse message (defined in
+ * include/uapi/linux/fuse.h)
+ *
+ * There are currently two extent formats: Simple and Interleaved.
+ *
+ * Simple extents are just (devindex, offset, length) tuples, where devindex
+ * references a devdax device that must be retrievable via the GET_DAXDEV
+ * message/response.
+ *
+ * The extent list size must be >= file_size.
+ *
+ * Interleaved extents merit some additional explanation. Interleaved
+ * extents stripe data across a collection of strips. Each strip is a
+ * contiguous allocation from a single devdax device - and is described by
+ * a simple_extent structure.
+ *
+ * Interleaved_extent example:
+ *   ie_nstrips = 4
+ *   ie_chunk_size = 2MiB
+ *   ie_nbytes = 24MiB
+ *
+ * ┌────────────┐────────────┐────────────┐────────────┐
+ * │Chunk = 0   │Chunk = 1   │Chunk = 2   │Chunk = 3   │
+ * │Strip = 0   │Strip = 1   │Strip = 2   │Strip = 3   │
+ * │Stripe = 0  │Stripe = 0  │Stripe = 0  │Stripe = 0  │
+ * │            │            │            │            │
+ * └────────────┘────────────┘────────────┘────────────┘
+ * │Chunk = 4   │Chunk = 5   │Chunk = 6   │Chunk = 7   │
+ * │Strip = 0   │Strip = 1   │Strip = 2   │Strip = 3   │
+ * │Stripe = 1  │Stripe = 1  │Stripe = 1  │Stripe = 1  │
+ * │            │            │            │            │
+ * └────────────┘────────────┘────────────┘────────────┘
+ * │Chunk = 8   │Chunk = 9   │Chunk = 10  │Chunk = 11  │
+ * │Strip = 0   │Strip = 1   │Strip = 2   │Strip = 3   │
+ * │Stripe = 2  │Stripe = 2  │Stripe = 2  │Stripe = 2  │
+ * │            │            │            │            │
+ * └────────────┘────────────┘────────────┘────────────┘
+ *
+ * * Data is laid out across chunks in chunk # order
+ * * Columns are strips
+ * * Strips are contiguous devdax extents, normally each coming from a
+ *   different memory device
+ * * Rows are stripes
+ * * The number of chunks is (int)((file_size + chunk_size - 1) / chunk_size)
+ *   (and obviously the last chunk could be partial)
+ * * The stripe_size = (nstrips * chunk_size)
+ * * chunk_num(offset) = offset / chunk_size    //integer division
+ * * strip_num(offset) = chunk_num(offset) % nstrips
+ * * stripe_num(offset) = offset / stripe_size  //integer division
+ * * ...You get the idea - see the code for more details...
+ *
+ * Some concrete examples from the layout above:
+ * * Offset 0 in the file is offset 0 in chunk 0, which is offset 0 in
+ *   strip 0
+ * * Offset 4MiB in the file is offset 0 in chunk 2, which is offset 0 in
+ *   strip 2
+ * * Offset 15MiB in the file is offset 1MiB in chunk 7, which is offset
+ *   3MiB in strip 3
+ *
+ * Notes about this metadata format:
+ *
+ * * For various reasons, chunk_size must be a multiple of the applicable
+ *   PAGE_SIZE
+ * * Since chunk_size and nstrips are constant within an interleaved_extent,
+ *   resolving a file offset to a strip offset within a single
+ *   interleaved_ext is order 1.
+ * * If nstrips==1, a list of interleaved_ext structures degenerates to a
+ *   regular extent list (albeit with some wasted struct space).
+ */
+
 /*
  * The structures below are the in-memory metadata format for famfs files.
  * Metadata retrieved via the GET_FMAP response is converted to this format
-- 
2.49.0

From nobody Sat Feb 7 07:10:19 2026 Received: from mail-dl1-f67.google.com (mail-dl1-f67.google.com [74.125.82.67]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DD222347FC4 for ; Wed, 7 Jan 2026 17:34:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.82.67 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767807252; cv=none; b=CvF0iOqqF81K3bdGqkY+I027ysTYsD8CbITgLMdXyJevP8v3JB8lfEBksR5wHpR1MtiVFhRBRU1r6XDRolSUq12ygGBbFdAdoFeLybByxx5dYQzUHeNI7ccMdHSbO7DBSaxonp1fongZMbH0oEjDJgxxOPFr2hD2tDkgqHCIMxY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767807252; c=relaxed/simple; bh=Iv8QEQFRzeBfeVTlG+UF1DvDWR96yJ8hOeXcNbevDAU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=HGa1snf62zKBILyoYC0RehTauZXBTl5Knt3wz1vSZ1x03fF+4uHGf3vpuBXp46F8cThYWSubmW1IJjlUMkCkidlCvpNS1zP3/YlIh0MoYfhpXsGflnVzScGaqPeDFngRaf1IY49LEAqKqwy8soYvz7Jgk0Wn8uPBoX1pXtYw13I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=JuF0lHQZ; arc=none smtp.client-ip=74.125.82.67 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=Groves.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="JuF0lHQZ" Received: by mail-dl1-f67.google.com with SMTP id a92af1059eb24-11f42e97340so349693c88.0 for ; Wed, 07 Jan 2026 09:34:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1767807250; x=1768412050; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:from:to:cc:subject:date :message-id:reply-to; bh=ByDlzKUbACiwWH3Dgtn+v+MBoRieZFlgupkDEPd2zG0=; b=JuF0lHQZSI5WBNI1vEdUccsB/iBeJRzxRlM4MdiwlelP9IDf2pjR82l+gRQ2QfXaNx /07XdQWHEqTAerVAT0b2F3ze5WefrUNYGnabdqE/GuxUjUhhCtRlKkdVSf3ztqiF6Y5p JoLUYgCcL4Zg3Xe+GO6MxLyoy7t0eP8vsFCXL0yvyk//SKle8oDNO0mydC3f4iBGbO7Z IQHm20qMSnwGGQu8D3dp1v+NttLoyi1lHWAoy4etQRs89xlgLY0zcyyBfl08E/mgeUjN vVpeDlraSmsARP3DHa/UCeD5lHJM9asahbhsFYLXwbaOK/hsLToCALFzLKNecg29MYio o/mA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1767807250; x=1768412050; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:sender:x-gm-gg :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ByDlzKUbACiwWH3Dgtn+v+MBoRieZFlgupkDEPd2zG0=; b=widICGCkaBfT2C1wCEJbKeB+R5q7/r4TGbVwnxUcmIABRnr3SHPPfIKTDF5Ls7jSrz 
aTVNMTUoAr5o9rSa3BZEpHb4Xf4b6ePpfDZCUc/eXJimJUp9orNRnOQwmfTjgDvBPwRk dcho+MUGN9k0YIjzilMRcOxtG3RTAEM1erJVl+otiWs+zCHtowlgoI80q4fPpIWrDVRH Ro3MLoOYvuuX0jwLn1RhFCO9+40az8wb9zQBRjAmwC88ga7Fz1LH47plA4JlgaMt+OZ6 vT111yxFB2czEwlML9eGICx+IdkC1RMtO0qTTBMF5BUKeu9+Tu9E+rhm/OgXc973rlkT i8eg== X-Forwarded-Encrypted: i=1; AJvYcCWM4BKAbx+OIq/ZkbWtE43ZeXqGPi+EeRJF7rUommBQgMm2kTke01q88LZ+jL2CM2ylkbWZB/klfsQ3RJc=@vger.kernel.org X-Gm-Message-State: AOJu0Yzx4H47unzNVSbszOwAC7hSY+c9glUvRB49fdLm3fl+3R7m1OfL bvNFi3GzoJ/Wt+G2BpdLaALDvO9pPH+m7dPAzBJxnvJRPD4WlzSf/n4okz430aU8 X-Gm-Gg: AY/fxX7hPKa/Is2mEOHC9I96lPyYrWMFYXMd7XgxZrUsX5wRpSwxo+PceiygLUspPq2 AQAfX5kGw6gdWwI7+kxkO7Vd9z6dK1A+lW2xGn21u33RrMc1S5rbNcUqdy/OdpLTZwfjlqanso4 OMLEjTy6qvNw8XsaIXhqEvMCN6jTn7Zt6P0LdKg4xA8WBP+9kok0mYvNC+DxGTOof25usTjs5Bl 8oBUPGey2Z4g27gh0opIqGfPpqHHQz0pAf9QpoQwf+Fp8pGat4HeosOH28iT8E1DpfFNnd1agfo oRaN0Y4SVUaOOrkSfficDsF21qQhplTXiVjiShRf2bU5iQ0x5XJaH4j1XbrexTaVrmDHp/MjdaU IVSofN8gsJ3iOUeas02Gp0N87xkvjIF2hqtvHvjgCyXXtvGkNQ0bANoW7pmpXjNi/M8Ea0HnB+X hSVKc+70LY4GcvSsTjdRE7lxhP6IIwqSexcTP9GsCA0YF+ X-Google-Smtp-Source: AGHT+IHH6c3F0YfxvIazIzqmrv5Gfwpzb8ium784lL1LqhHDGR+rc1ftda1nHLwiQjItEG5YQPZNcQ== X-Received: by 2002:a05:6808:6d91:b0:453:50af:c463 with SMTP id 5614622812f47-45a6bebb603mr920549b6e.41.1767800083021; Wed, 07 Jan 2026 07:34:43 -0800 (PST) Received: from localhost.localdomain ([2603:8080:1500:3d89:a917:5124:7300:7cef]) by smtp.gmail.com with ESMTPSA id 5614622812f47-45a5e2f1de5sm2398106b6e.22.2026.01.07.07.34.41 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 07 Jan 2026 07:34:42 -0800 (PST) Sender: John Groves From: John Groves X-Google-Original-From: John Groves To: John Groves , Miklos Szeredi , Dan Williams , Bernd Schubert , Alison Schofield Cc: John Groves , Jonathan Corbet , Vishal Verma , Dave Jiang , Matthew Wilcox , Jan Kara , Alexander Viro , David Hildenbrand , Christian Brauner , "Darrick J . 
Wong" , Randy Dunlap , Jeff Layton , Amir Goldstein , Jonathan Cameron , Stefan Hajnoczi , Joanne Koong , Josef Bacik , Bagas Sanjaya , Chen Linxuan , James Morse , Fuad Tabba , Sean Christopherson , Shivank Garg , Ackerley Tng , Gregory Price , Aravind Ramesh , Ajay Joshi , venkataravis@micron.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org, linux-fsdevel@vger.kernel.org, John Groves Subject: [PATCH V3 21/21] famfs_fuse: Add documentation Date: Wed, 7 Jan 2026 09:33:30 -0600 Message-ID: <20260107153332.64727-22-john@groves.net> X-Mailer: git-send-email 2.50.1 In-Reply-To: <20260107153332.64727-1-john@groves.net> References: <20260107153244.64703-1-john@groves.net> <20260107153332.64727-1-john@groves.net> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add Documentation/filesystems/famfs.rst and update MAINTAINERS Reviewed-by: Randy Dunlap Tested-by: Randy Dunlap Signed-off-by: John Groves Reviewed-by: Jonathan Cameron --- Documentation/filesystems/famfs.rst | 142 ++++++++++++++++++++++++++++ Documentation/filesystems/index.rst | 1 + MAINTAINERS | 1 + 3 files changed, 144 insertions(+) create mode 100644 Documentation/filesystems/famfs.rst diff --git a/Documentation/filesystems/famfs.rst b/Documentation/filesystem= s/famfs.rst new file mode 100644 index 000000000000..0d3c9ba9b7a8 --- /dev/null +++ b/Documentation/filesystems/famfs.rst @@ -0,0 +1,142 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. 
_famfs_index: + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +famfs: The fabric-attached memory file system +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +- Copyright (C) 2024-2025 Micron Technology, Inc. + +Introduction +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Compute Express Link (CXL) provides a mechanism for disaggregated or +fabric-attached memory (FAM). This creates opportunities for data sharing; +clustered apps that would otherwise have to shard or replicate data can +share one copy in disaggregated memory. + +Famfs, which is not CXL-specific in any way, provides a mechanism for +multiple hosts to concurrently access data in shared memory, by giving it +a file system interface. With famfs, any app that understands files can +access data sets in shared memory. Although famfs supports read and write, +the real point is to support mmap, which provides direct (dax) access to +the memory - either writable or read-only. + +Shared memory can pose complex coherency and synchronization issues, but +there are also simple cases. Two simple and eminently useful patterns that +occur frequently in data analytics and AI are: + +* Serial Sharing - Only one host or process at a time has access to a file +* Read-only Sharing - Multiple hosts or processes share read-only access + to a file + +The famfs fuse file system is part of the famfs framework; user space +components [1] handle metadata allocation and distribution, and provide a +low-level fuse server to expose files that map directly to [presumably +shared] memory. + +The famfs framework manages coherency of its own metadata and structures, +but does not attempt to manage coherency for applications. 
+ +Famfs also provides data isolation between files. That is, even though +the host has access to an entire memory "device" (as a devdax device), apps +cannot write to memory for which the file is read-only, and mapping one +file provides isolation from the memory of all other files. This is pretty +basic, but some experimental shared memory usage patterns provide no such +isolation. + +Principles of Operation +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Famfs is a file system with one or more devdax devices as its first-class +backing devices. Metadata maintenance and query operations happen +entirely in user space. + +The famfs low-level fuse server daemon provides file maps (fmaps) and +devdax device info to the fuse/famfs kernel component so that +read/write/mapping faults can be handled without up-calls for all active +files. + +The famfs user space is responsible for maintaining and distributing +consistent metadata. This is currently handled via an append-only +metadata log within the memory, but this is orthogonal to the fuse/famfs +kernel code. + +Once instantiated, "the same file" on each host points to the same shared +memory, but in-memory metadata (inodes, etc.) is ephemeral on each host +that has a famfs instance mounted. Use cases are free to allow or not +allow mutations to data on a file-by-file basis. + +When an app accesses a data object in a famfs file, there is no page cache +involvement. The CPU cache is loaded directly from the shared memory. In +some use cases, this is an enormous reduction in read amplification compared +to loading an entire page into the page cache. + + +Famfs is Not a Conventional File System +--------------------------------------- + +Famfs files can be accessed by conventional means, but there are +limitations.
The kernel component of fuse/famfs is not involved in the +allocation of backing memory for files at all; the famfs user space +creates files and responds as a low-level fuse server with fmaps and +devdax device info upon request. + +Famfs differs in some important ways from conventional file systems: + +* Files must be pre-allocated by the famfs framework; allocation is never + performed on (or after) write. +* Any operation that changes a file's size is considered to put the file + in an invalid state, disabling access to the data. It may be possible to + revisit this in the future. (Typically the famfs user space can restore + files to a valid state by replaying the famfs metadata log.) + +Famfs exists to apply the existing file system abstractions to shared +memory so applications and workflows can more easily adapt to an +environment with disaggregated shared memory. + +Memory Error Handling +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Possible memory errors include timeouts, poison and unexpected +reconfiguration of an underlying dax device. In all of these cases, famfs +receives a call from the devdax layer via its +dax_holder_operations->notify_failure() function. If any memory errors have +been detected, access to the affected daxdev is disabled to avoid further +errors or corruption. + +In all known cases, famfs can be unmounted cleanly. In most cases errors +can be cleared by re-initializing the memory - at which point a new famfs +file system can be created. + +Key Requirements +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +The primary requirements for famfs are: + +1. Must support a file system abstraction backed by sharable devdax memory +2. Files must efficiently handle VMA faults +3. Must support metadata distribution in a sharable way +4. Must handle clients with a stale copy of metadata + +The famfs kernel component takes care of requirements 1 and 2 above by +caching each file's mapping metadata in the kernel.
+ +Requirements 3 and 4 are handled by the user space components, and are +largely orthogonal to the functionality of the famfs kernel module. + +Requirements 3 and 4 cannot be met by conventional fs-dax file systems +(e.g. xfs) because they use write-back metadata; it is not valid to mount +such a file system on two hosts from the same in-memory image. + + +Famfs Usage +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Famfs usage is documented at [1]. + + +References +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +- [1] Famfs user space repository and documentation + https://github.com/cxl-micron-reskit/famfs diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystem= s/index.rst index f4873197587d..e6fb467c1680 100644 --- a/Documentation/filesystems/index.rst +++ b/Documentation/filesystems/index.rst @@ -89,6 +89,7 @@ Documentation for filesystem implementations. ext3 ext4/index f2fs + famfs gfs2/index hfs hfsplus diff --git a/MAINTAINERS b/MAINTAINERS index 16b0606a3b85..b74ac9395264 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -10380,6 +10380,7 @@ M: John Groves L: linux-cxl@vger.kernel.org L: linux-fsdevel@vger.kernel.org S: Supported +F: Documentation/filesystems/famfs.rst F: fs/fuse/famfs.c F: fs/fuse/famfs_kfmap.h =20 --=20 2.49.0
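
The interleave arithmetic from the famfs_kfmap.h comment earlier in this
series can be sketched in user-space C. This is a minimal illustration and
not the kernel implementation: struct interleave_geom, struct strip_addr,
resolve_offset() and chunk_count() are hypothetical names, and the 2MiB
chunk size with four strips is inferred from the worked examples (offset
4MiB landing at the start of chunk 2, and the four-column diagram). The
strip index is taken modulo the number of strips, which is what those
examples require.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry of one interleaved extent (illustration only). */
struct interleave_geom {
	uint64_t chunk_size;	/* bytes per chunk */
	uint64_t nstrips;	/* number of strips (columns) */
};

/* Hypothetical result of resolving a file offset. */
struct strip_addr {
	uint64_t chunk_num;	/* chunk index in file order */
	uint64_t strip_num;	/* which strip (column) holds the chunk */
	uint64_t stripe_num;	/* which stripe (row) holds the chunk */
	uint64_t strip_offset;	/* byte offset within that strip */
};

/* Number of chunks needed for a file; the last chunk may be partial. */
static inline uint64_t chunk_count(uint64_t file_size, uint64_t chunk_size)
{
	assert(chunk_size != 0);
	return (file_size + chunk_size - 1) / chunk_size;
}

/* Resolve a file offset to (chunk, strip, stripe, offset-in-strip). */
static inline struct strip_addr
resolve_offset(const struct interleave_geom *g, uint64_t offset)
{
	struct strip_addr a;
	uint64_t stripe_size;

	assert(g->chunk_size != 0 && g->nstrips != 0);
	stripe_size = g->nstrips * g->chunk_size;

	a.chunk_num = offset / g->chunk_size;		/* integer division */
	a.strip_num = a.chunk_num % g->nstrips;
	a.stripe_num = offset / stripe_size;		/* integer division */
	/* Each earlier stripe contributes one full chunk to this strip. */
	a.strip_offset = a.stripe_num * g->chunk_size +
			 offset % g->chunk_size;
	return a;
}
```

With chunk_size = 2MiB and nstrips = 4, resolve_offset() for file offset
15MiB gives chunk 7, strip 3, stripe 1 and strip offset 3MiB, which matches
the third example in the comment. Every step is a constant number of integer
operations, consistent with the note that resolution within one interleaved
extent is order 1.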