From: John Groves
To: John Groves, Dan Williams, Miklos Szeredi, Bernd Schubert
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
	Matthew Wilcox, Jan Kara, Alexander Viro, Christian Brauner,
	"Darrick J . Wong", Luis Henriques, Randy Dunlap, Jeff Layton,
	Kent Overstreet, Petr Vorel, Brian Foster,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, Amir Goldstein, Jonathan Cameron,
	Stefan Hajnoczi, Joanne Koong, Josef Bacik, Aravind Ramesh,
	Ajay Joshi, John Groves
Subject: [RFC PATCH 14/19] famfs_fuse: GET_DAXDEV message and daxdev_table
Date: Sun, 20 Apr 2025 20:33:41 -0500
Message-Id: <20250421013346.32530-15-john@groves.net>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To: <20250421013346.32530-1-john@groves.net>
References: <20250421013346.32530-1-john@groves.net>
X-Mailing-List: linux-kernel@vger.kernel.org

* The new GET_DAXDEV message/response is enabled
* The command is triggered by the update_daxdev_table() call, if there are
  any daxdevs in the subject fmap that are not represented in the
  daxdev_table yet.

Signed-off-by: John Groves
---
 fs/fuse/famfs.c           | 281 ++++++++++++++++++++++++++++++++++++--
 fs/fuse/famfs_kfmap.h     |  23 ++++
 fs/fuse/fuse_i.h          |   4 +
 fs/fuse/inode.c           |   2 +
 fs/namei.c                |   1 +
 include/uapi/linux/fuse.h |  15 ++
 6 files changed, 316 insertions(+), 10 deletions(-)

diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c
index e62c047d0950..2e182cb7d7c9 100644
--- a/fs/fuse/famfs.c
+++ b/fs/fuse/famfs.c
@@ -20,6 +20,250 @@
 #include "famfs_kfmap.h"
 #include "fuse_i.h"
 
+/*
+ * famfs_teardown()
+ *
+ * Deallocate famfs metadata for a fuse_conn
+ */
+void
+famfs_teardown(struct fuse_conn *fc)
+{
+	struct famfs_dax_devlist *devlist = fc->dax_devlist;
+	int i;
+
+	fc->dax_devlist = NULL;
+
+	if (!devlist)
+		return;
+
+	if (!devlist->devlist)
+		goto out;
+
+	/* Close & release all the daxdevs in our table */
+	for (i = 0; i < devlist->nslots; i++) {
+		if (devlist->devlist[i].valid && devlist->devlist[i].devp)
+			fs_put_dax(devlist->devlist[i].devp, fc);
+	}
+	kfree(devlist->devlist);
+
+out:
+	kfree(devlist);
+}
+
+static int
+famfs_verify_daxdev(const char *pathname, dev_t *devno)
+{
+	struct inode *inode;
+	struct path path;
+	int err;
+
+	if (!pathname || !*pathname)
+		return -EINVAL;
+
+	err = kern_path(pathname, LOOKUP_FOLLOW, &path);
+	if (err)
+		return err;
+
+	inode = d_backing_inode(path.dentry);
+	if (!S_ISCHR(inode->i_mode)) {
+		err = -EINVAL;
+		goto out_path_put;
+	}
+
+	if (!may_open_dev(&path)) { /* had to export this */
+		err = -EACCES;
+		goto out_path_put;
+	}
+
+	*devno = inode->i_rdev;
+
+out_path_put:
+	path_put(&path);
+	return err;
+}
+
+/**
+ * famfs_fuse_get_daxdev()
+ *
+ * Send a GET_DAXDEV message to the fuse server to retrieve info on a
+ * dax device.
+ *
+ * @fm - fuse_mount
+ * @index - the index of the dax device; daxdevs are referred to by index
+ *          in fmaps, and the server resolves the index to a particular daxdev
+ *
+ * Returns: 0=success
+ *          -errno=failure
+ */
+static int
+famfs_fuse_get_daxdev(struct fuse_mount *fm, const u64 index)
+{
+	struct fuse_daxdev_out daxdev_out = { 0 };
+	struct fuse_conn *fc = fm->fc;
+	struct famfs_daxdev *daxdev;
+	int err = 0;
+
+	FUSE_ARGS(args);
+
+	pr_notice("%s: index=%lld\n", __func__, index);
+
+	/* Store the daxdev in our table */
+	if (index >= fc->dax_devlist->nslots) {
+		pr_err("%s: index(%lld) > nslots(%d)\n",
+		       __func__, index, fc->dax_devlist->nslots);
+		err = -EINVAL;
+		goto out;
+	}
+
+	args.opcode = FUSE_GET_DAXDEV;
+	args.nodeid = index;
+
+	args.in_numargs = 0;
+
+	args.out_numargs = 1;
+	args.out_args[0].size = sizeof(daxdev_out);
+	args.out_args[0].value = &daxdev_out;
+
+	/* Send GET_DAXDEV command */
+	err = fuse_simple_request(fm, &args);
+	if (err) {
+		pr_err("%s: err=%d from fuse_simple_request()\n",
+		       __func__, err);
+		/* Error will be that the payload is smaller than FMAP_BUFSIZE,
+		 * which is the max we can handle. Empty payload handled below.
+		 */
+		goto out;
+	}
+
+	down_write(&fc->famfs_devlist_sem);
+
+	daxdev = &fc->dax_devlist->devlist[index];
+	pr_debug("%s: dax_devlist %llx daxdev[%lld]=%llx\n", __func__,
+		 (u64)fc->dax_devlist, index, (u64)daxdev);
+
+	/* Abort if daxdev is now valid */
+	if (daxdev->valid) {
+		up_write(&fc->famfs_devlist_sem);
+		/* We already have a valid entry at this index */
+		err = -EALREADY;
+		goto out;
+	}
+
+	/* This verifies that the dev is valid and can be opened and gets the devno */
+	pr_debug("%s: famfs_verify_daxdev(%s)\n", __func__, daxdev_out.name);
+	err = famfs_verify_daxdev(daxdev_out.name, &daxdev->devno);
+	if (err) {
+		up_write(&fc->famfs_devlist_sem);
+		pr_err("%s: err=%d from famfs_verify_daxdev()\n", __func__, err);
+		goto out;
+	}
+
+	/* This will fail if it's not a dax device */
+	pr_debug("%s: dax_dev_get(%x)\n", __func__, daxdev->devno);
+	daxdev->devp = dax_dev_get(daxdev->devno);
+	if (!daxdev->devp) {
+		up_write(&fc->famfs_devlist_sem);
+		pr_warn("%s: device %s not found or not dax\n",
+			__func__, daxdev_out.name);
+		err = -ENODEV;
+		goto out;
+	}
+
+	daxdev->name = kstrdup(daxdev_out.name, GFP_KERNEL);
+	wmb(); /* all daxdev fields must be visible before marking it valid */
+	daxdev->valid = 1;
+
+	up_write(&fc->famfs_devlist_sem);
+
+	pr_debug("%s: daxdev(%lld, %s)=%llx opened and marked valid\n",
+		 __func__, index, daxdev->name, (u64)daxdev);
+
+out:
+	return err;
+}
+
+/**
+ * famfs_update_daxdev_table()
+ *
+ * This function is called for each new file fmap, to verify whether all
+ * referenced daxdevs are already known (i.e. in the table). Any daxdev
+ * indices that are not in the table will be retrieved via
+ * famfs_fuse_get_daxdev()
+ * @fm - fuse_mount
+ * @meta - famfs_file_meta, in-memory format, built from a GET_FMAP response
+ *
+ * Returns: 0=success
+ *          -errno=failure
+ */
+static int
+famfs_update_daxdev_table(
+	struct fuse_mount *fm,
+	const struct famfs_file_meta *meta)
+{
+	struct famfs_dax_devlist *local_devlist;
+	struct fuse_conn *fc = fm->fc;
+	int err;
+	int i;
+
+	pr_debug("%s: dev_bitmap=0x%llx\n", __func__, meta->dev_bitmap);
+
+	/* First time through we will need to allocate the dax_devlist */
+	if (!fc->dax_devlist) {
+		local_devlist = kcalloc(1, sizeof(*fc->dax_devlist), GFP_KERNEL);
+		if (!local_devlist)
+			return -ENOMEM;
+
+		local_devlist->nslots = MAX_DAXDEVS;
+		pr_debug("%s: allocate dax_devlist=%llx\n", __func__,
+			 (u64)local_devlist);
+
+		local_devlist->devlist = kcalloc(MAX_DAXDEVS,
+						 sizeof(struct famfs_daxdev),
+						 GFP_KERNEL);
+		if (!local_devlist->devlist) {
+			kfree(local_devlist);
+			return -ENOMEM;
+		}
+
+		/* We don't need the famfs_devlist_sem here because we use cmpxchg... */
+		if (cmpxchg(&fc->dax_devlist, NULL, local_devlist) != NULL) {
+			pr_debug("%s: aborting new devlist\n", __func__);
+			kfree(local_devlist->devlist);
+			kfree(local_devlist); /* another thread beat us to it */
+		} else {
+			pr_debug("%s: published new dax_devlist %llx / %llx\n",
+				 __func__, (u64)local_devlist,
+				 (u64)local_devlist->devlist);
+		}
+	}
+
+	down_read(&fc->famfs_devlist_sem);
+	for (i = 0; i < fc->dax_devlist->nslots; i++) {
+		if (meta->dev_bitmap & (1ULL << i)) {
+			/* This file meta struct references devindex i
+			 * if devindex i isn't in the table; get it...
+			 */
+			if (!(fc->dax_devlist->devlist[i].valid)) {
+				up_read(&fc->famfs_devlist_sem);
+
+				pr_notice("%s: daxdev=%d (%llx) invalid...getting\n",
+					  __func__, i,
+					  (u64)(&fc->dax_devlist->devlist[i]));
+				err = famfs_fuse_get_daxdev(fm, i);
+				if (err)
+					pr_err("%s: failed to get daxdev=%d\n",
+					       __func__, i);
+
+				down_read(&fc->famfs_devlist_sem);
+			}
+		}
+	}
+	up_read(&fc->famfs_devlist_sem);
+
+	return 0;
+}
+
+/***************************************************************************/
 
 void
 __famfs_meta_free(void *famfs_meta)
@@ -67,12 +311,15 @@ famfs_check_ext_alignment(struct famfs_meta_simple_ext *se)
 }
 
 /**
- * famfs_meta_alloc() - Allocate famfs file metadata
+ * famfs_fuse_meta_alloc() - Allocate famfs file metadata
  * @metap: Pointer to an mcache_map_meta pointer
  * @ext_count: The number of extents needed
+ *
+ * Returns: 0=success
+ *          -errno=failure
  */
 static int
-famfs_meta_alloc_v3(
+famfs_fuse_meta_alloc(
 	void *fmap_buf,
 	size_t fmap_buf_size,
 	struct famfs_file_meta **metap)
@@ -92,28 +339,25 @@ famfs_meta_alloc_v3(
 	if (next_offset > fmap_buf_size) {
 		pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n",
 		       __func__, __LINE__, next_offset, fmap_buf_size);
-		rc = -EINVAL;
-		goto errout;
+		return -EINVAL;
 	}
 
 	if (fmh->nextents < 1) {
 		pr_err("%s: nextents %d < 1\n", __func__, fmh->nextents);
-		rc = -EINVAL;
-		goto errout;
+		return -EINVAL;
 	}
 
 	if (fmh->nextents > FUSE_FAMFS_MAX_EXTENTS) {
 		pr_err("%s: nextents %d > max (%d) 1\n",
 		       __func__, fmh->nextents, FUSE_FAMFS_MAX_EXTENTS);
-		rc = -E2BIG;
-		goto errout;
+		return -E2BIG;
 	}
 
 	meta = kzalloc(sizeof(*meta), GFP_KERNEL);
 	if (!meta)
 		return -ENOMEM;
-	meta->error = false;
 
+	meta->error = false;
 	meta->file_type = fmh->file_type;
 	meta->file_size = fmh->file_size;
 	meta->fm_extent_type = fmh->ext_type;
@@ -298,6 +542,20 @@ famfs_meta_alloc_v3(
 	return rc;
 }
 
+/**
+ * famfs_file_init_dax()
+ *
+ * Initialize famfs metadata for a file, based on the contents of the GET_FMAP
+ * response
+ *
+ * @fm - fuse_mount
+ * @inode - the inode
+ * @fmap_buf - fmap response message
+ * @fmap_size - Size of the fmap message
+ *
+ * Returns: 0=success
+ *          -errno=failure
+ */
 int
 famfs_file_init_dax(
 	struct fuse_mount *fm,
@@ -316,10 +574,13 @@ famfs_file_init_dax(
 		return -EEXIST;
 	}
 
-	rc = famfs_meta_alloc_v3(fmap_buf, fmap_size, &meta);
+	rc = famfs_fuse_meta_alloc(fmap_buf, fmap_size, &meta);
 	if (rc)
 		goto errout;
 
+	/* Make sure this fmap doesn't reference any unknown daxdevs */
+	famfs_update_daxdev_table(fm, meta);
+
 	/* Publish the famfs metadata on fi->famfs_meta */
 	inode_lock(inode);
 	if (fi->famfs_meta) {
diff --git a/fs/fuse/famfs_kfmap.h b/fs/fuse/famfs_kfmap.h
index ce785d76719c..325adb8b99c5 100644
--- a/fs/fuse/famfs_kfmap.h
+++ b/fs/fuse/famfs_kfmap.h
@@ -60,4 +60,27 @@ struct famfs_file_meta {
 	};
 };
 
+/*
+ * dax_devlist
+ *
+ * This is the in-memory daxdev metadata that is populated by
+ * the responses to GET_FMAP messages
+ */
+struct famfs_daxdev {
+	/* Include dev uuid? */
+	bool valid;
+	bool error;
+	dev_t devno;
+	struct dax_device *devp;
+	char *name;
+};
+
+#define MAX_DAXDEVS 24
+
+struct famfs_dax_devlist {
+	int nslots;
+	int ndevs;
+	struct famfs_daxdev *devlist; /* XXX: make this an xarray! */
+};
+
 #endif /* FAMFS_KFMAP_H */
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index d8e0ac784224..4c4c4f0ff280 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1561,7 +1561,11 @@ extern void fuse_sysctl_unregister(void);
 int famfs_file_init_dax(struct fuse_mount *fm,
 			struct inode *inode, void *fmap_buf,
 			size_t fmap_size);
+ssize_t famfs_dax_write_iter(struct kiocb *iocb, struct iov_iter *from);
+ssize_t famfs_dax_read_iter(struct kiocb *iocb, struct iov_iter *to);
+int famfs_file_mmap(struct file *file, struct vm_area_struct *vma);
 void __famfs_meta_free(void *map);
+void famfs_teardown(struct fuse_conn *fc);
 #endif
 
 static inline struct fuse_backing *famfs_meta_set(struct fuse_inode *fi,
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index e86bf330117f..af1629b07a30 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1051,6 +1051,8 @@ void fuse_conn_put(struct fuse_conn *fc)
 		}
 		if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
 			fuse_backing_files_free(fc);
+		if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX))
+			famfs_teardown(fc);
 		call_rcu(&fc->rcu, delayed_release);
 	}
 }
diff --git a/fs/namei.c b/fs/namei.c
index ecb7b95c2ca3..75a1e1d46593 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3380,6 +3380,7 @@ bool may_open_dev(const struct path *path)
 	return !(path->mnt->mnt_flags & MNT_NODEV) &&
 		!(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
 }
+EXPORT_SYMBOL(may_open_dev);
 
 static int may_open(struct mnt_idmap *idmap, const struct path *path,
 		    int acc_mode, int flag)
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 0f6ff1ffb23d..982d4fc66ef8 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -1328,4 +1328,19 @@ struct fuse_famfs_fmap_header {
 	uint64_t file_size;
 	uint64_t reserved1;
 };
+
+struct fuse_get_daxdev_in {
+	uint32_t daxdev_num;
+};
+
+#define DAXDEV_NAME_MAX 256
+struct fuse_daxdev_out {
+	uint16_t index;
+	uint16_t reserved;
+	uint32_t reserved2;
+	uint64_t reserved3; /* enough space for a uuid if we need it */
+	uint64_t reserved4;
+	char name[DAXDEV_NAME_MAX];
+};
+
 #endif /* _LINUX_FUSE_H */
-- 
2.49.0
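
Illustration only, not part of the patch: the commit message says GET_DAXDEV is sent only for daxdevs that the fmap references but the table does not yet hold. A minimal userspace sketch of that selection follows. It assumes a plain array of valid flags in place of struct famfs_daxdev, and it leaves out the rwsem handling and the FUSE request itself. Every sketch_* name and SKETCH_MAX_DAXDEVS are hypothetical stand-ins, not identifiers from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DAXDEVS 24	/* same value as MAX_DAXDEVS in the patch */

/* Hypothetical stand-in for struct famfs_daxdev; only the flag matters here */
struct sketch_daxdev {
	bool valid;
};

/*
 * Collect every index that is set in dev_bitmap but not yet marked valid
 * in the table. These are the indices for which the kernel side would
 * issue GET_DAXDEV. Returns the number of indices written to missing[],
 * which must have room for nslots entries.
 */
static size_t sketch_missing_daxdevs(uint64_t dev_bitmap,
				     const struct sketch_daxdev *table,
				     size_t nslots,
				     unsigned int *missing)
{
	size_t n = 0;
	size_t i;

	assert(table != NULL && missing != NULL);

	/* The bitmap is 64 bits wide, so never shift by 64 or more */
	for (i = 0; i < nslots && i < 64; i++) {
		if (!(dev_bitmap & (1ULL << i)))
			continue;	/* this fmap does not reference device i */
		if (!table[i].valid)
			missing[n++] = (unsigned int)i;
	}
	return n;
}
```

With a table in which only slot 0 is valid, a dev_bitmap of 0x7 yields indices 1 and 2, and an empty bitmap yields nothing. The kernel code differs in that it drops the read lock around each lookup and re-checks the valid flag under the write lock, so a concurrent caller that loses the race gets -EALREADY rather than a second table entry.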