From: John Groves
To: John Groves, Dan Williams, Miklos Szeredi, Bernd Schubert
Cc: John Groves, Jonathan Corbet, Vishal Verma, Dave Jiang,
 Matthew Wilcox, Jan Kara, Alexander Viro, Christian Brauner,
 "Darrick J. Wong", Randy Dunlap, Jeff Layton, Kent Overstreet,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 nvdimm@lists.linux.dev, linux-cxl@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Amir Goldstein, Jonathan Cameron,
 Stefan Hajnoczi, Joanne Koong, Josef Bacik, Aravind Ramesh,
 Ajay Joshi, John Groves
Subject: [RFC V2 13/18] famfs_fuse: Create files with famfs fmaps
Date: Thu, 3 Jul 2025 13:50:27 -0500
Message-Id: <20250703185032.46568-14-john@groves.net>
In-Reply-To: <20250703185032.46568-1-john@groves.net>
References: <20250703185032.46568-1-john@groves.net>

On completion of GET_FMAP message/response, set up the full famfs metadata
such that it's possible to handle read/write/mmap directly to dax.
Note that the devdax_iomap plumbing is not in yet...

Update MAINTAINERS for the new files.

Signed-off-by: John Groves
---
 MAINTAINERS               |   9 +
 fs/fuse/Makefile          |   2 +-
 fs/fuse/famfs.c           | 360 ++++++++++++++++++++++++++++++++++++++
 fs/fuse/famfs_kfmap.h     |  63 +++++++
 fs/fuse/file.c            |  15 +-
 fs/fuse/fuse_i.h          |  16 +-
 fs/fuse/inode.c           |   2 +-
 include/uapi/linux/fuse.h |  56 ++++++
 8 files changed, 518 insertions(+), 5 deletions(-)
 create mode 100644 fs/fuse/famfs.c
 create mode 100644 fs/fuse/famfs_kfmap.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c0d5232a473b..02688f27a4d0 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8808,6 +8808,15 @@ F:	Documentation/networking/failover.rst
 F:	include/net/failover.h
 F:	net/core/failover.c
 
+FAMFS
+M:	John Groves
+M:	John Groves
+L:	linux-cxl@vger.kernel.org
+L:	linux-fsdevel@vger.kernel.org
+S:	Supported
+F:	fs/fuse/famfs.c
+F:	fs/fuse/famfs_kfmap.h
+
 FANOTIFY
 M:	Jan Kara
 R:	Amir Goldstein
diff --git a/fs/fuse/Makefile b/fs/fuse/Makefile
index 3f0f312a31c1..65a12975d734 100644
--- a/fs/fuse/Makefile
+++ b/fs/fuse/Makefile
@@ -16,5 +16,5 @@ fuse-$(CONFIG_FUSE_DAX) += dax.o
 fuse-$(CONFIG_FUSE_PASSTHROUGH) += passthrough.o
 fuse-$(CONFIG_SYSCTL) += sysctl.o
 fuse-$(CONFIG_FUSE_IO_URING) += dev_uring.o
-
+fuse-$(CONFIG_FUSE_FAMFS_DAX) += famfs.o
 virtiofs-y := virtio_fs.o
diff --git a/fs/fuse/famfs.c b/fs/fuse/famfs.c
new file mode 100644
index 000000000000..41c4d92f1451
--- /dev/null
+++ b/fs/fuse/famfs.c
@@ -0,0 +1,360 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * famfs - dax file system for shared fabric-attached memory
+ *
+ * Copyright 2023-2025 Micron Technology, Inc.
+ *
+ * This file system, originally based on ramfs the dax support from xfs,
+ * is intended to allow multiple host systems to mount a common file system
+ * view of dax files that map to shared memory.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "famfs_kfmap.h"
+#include "fuse_i.h"
+
+
+void
+__famfs_meta_free(void *famfs_meta)
+{
+	struct famfs_file_meta *fmap = famfs_meta;
+
+	if (!fmap)
+		return;
+
+	if (fmap) {
+		switch (fmap->fm_extent_type) {
+		case SIMPLE_DAX_EXTENT:
+			kfree(fmap->se);
+			break;
+		case INTERLEAVED_EXTENT:
+			if (fmap->ie)
+				kfree(fmap->ie->ie_strips);
+
+			kfree(fmap->ie);
+			break;
+		default:
+			pr_err("%s: invalid fmap type\n", __func__);
+			break;
+		}
+	}
+	kfree(fmap);
+}
+
+static int
+famfs_check_ext_alignment(struct famfs_meta_simple_ext *se)
+{
+	int errs = 0;
+
+	if (se->dev_index != 0)
+		errs++;
+
+	/* TODO: pass in alignment so we can support the other page sizes */
+	if (!IS_ALIGNED(se->ext_offset, PMD_SIZE))
+		errs++;
+
+	if (!IS_ALIGNED(se->ext_len, PMD_SIZE))
+		errs++;
+
+	return errs;
+}
+
+/**
+ * famfs_fuse_meta_alloc() - Allocate famfs file metadata
+ * @metap: Pointer to an mcache_map_meta pointer
+ * @ext_count: The number of extents needed
+ *
+ * Returns: 0=success
+ *          -errno=failure
+ */
+static int
+famfs_fuse_meta_alloc(
+	void *fmap_buf,
+	size_t fmap_buf_size,
+	struct famfs_file_meta **metap)
+{
+	struct famfs_file_meta *meta = NULL;
+	struct fuse_famfs_fmap_header *fmh;
+	size_t extent_total = 0;
+	size_t next_offset = 0;
+	int errs = 0;
+	int i, j;
+	int rc;
+
+	fmh = (struct fuse_famfs_fmap_header *)fmap_buf;
+
+	/* Move past fmh in fmap_buf */
+	next_offset += sizeof(*fmh);
+	if (next_offset > fmap_buf_size) {
+		pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n",
+		       __func__, __LINE__, next_offset, fmap_buf_size);
+		return -EINVAL;
+	}
+
+	if (fmh->nextents < 1) {
+		pr_err("%s: nextents %d < 1\n", __func__, fmh->nextents);
+		return -EINVAL;
+	}
+
+	if (fmh->nextents > FUSE_FAMFS_MAX_EXTENTS) {
+		pr_err("%s: nextents %d > max (%d) 1\n",
+		       __func__, fmh->nextents, FUSE_FAMFS_MAX_EXTENTS);
+		return -E2BIG;
+	}
+
+	meta = kzalloc(sizeof(*meta), GFP_KERNEL);
+	if (!meta)
+		return -ENOMEM;
+
+	meta->error = false;
+	meta->file_type = fmh->file_type;
+	meta->file_size = fmh->file_size;
+	meta->fm_extent_type = fmh->ext_type;
+
+	switch (fmh->ext_type) {
+	case FUSE_FAMFS_EXT_SIMPLE: {
+		struct fuse_famfs_simple_ext *se_in;
+
+		se_in = (struct fuse_famfs_simple_ext *)(fmap_buf + next_offset);
+
+		/* Move past simple extents */
+		next_offset += fmh->nextents * sizeof(*se_in);
+		if (next_offset > fmap_buf_size) {
+			pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n",
+			       __func__, __LINE__, next_offset, fmap_buf_size);
+			rc = -EINVAL;
+			goto errout;
+		}
+
+		meta->fm_nextents = fmh->nextents;
+
+		meta->se = kcalloc(meta->fm_nextents, sizeof(*(meta->se)),
+				   GFP_KERNEL);
+		if (!meta->se) {
+			rc = -ENOMEM;
+			goto errout;
+		}
+
+		if ((meta->fm_nextents > FUSE_FAMFS_MAX_EXTENTS) ||
+		    (meta->fm_nextents < 1)) {
+			rc = -EINVAL;
+			goto errout;
+		}
+
+		for (i = 0; i < fmh->nextents; i++) {
+			meta->se[i].dev_index = se_in[i].se_devindex;
+			meta->se[i].ext_offset = se_in[i].se_offset;
+			meta->se[i].ext_len = se_in[i].se_len;
+
+			/* Record bitmap of referenced daxdev indices */
+			meta->dev_bitmap |= (1 << meta->se[i].dev_index);
+
+			errs += famfs_check_ext_alignment(&meta->se[i]);
+
+			extent_total += meta->se[i].ext_len;
+		}
+		break;
+	}
+
+	case FUSE_FAMFS_EXT_INTERLEAVE: {
+		s64 size_remainder = meta->file_size;
+		struct fuse_famfs_iext *ie_in;
+		int niext = fmh->nextents;
+
+		meta->fm_niext = niext;
+
+		/* Allocate interleaved extent */
+		meta->ie = kcalloc(niext, sizeof(*(meta->ie)), GFP_KERNEL);
+		if (!meta->ie) {
+			rc = -ENOMEM;
+			goto errout;
+		}
+
+		/*
+		 * Each interleaved extent has a simple extent list of strips.
+		 * Outer loop is over separate interleaved extents
+		 */
+		for (i = 0; i < niext; i++) {
+			u64 nstrips;
+			struct fuse_famfs_simple_ext *sie_in;
+
+			/* ie_in = one interleaved extent in fmap_buf */
+			ie_in = (struct fuse_famfs_iext *)
+				(fmap_buf + next_offset);
+
+			/* Move past one interleaved extent header in fmap_buf */
+			next_offset += sizeof(*ie_in);
+			if (next_offset > fmap_buf_size) {
+				pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n",
+				       __func__, __LINE__, next_offset,
+				       fmap_buf_size);
+				rc = -EINVAL;
+				goto errout;
+			}
+
+			nstrips = ie_in->ie_nstrips;
+			meta->ie[i].fie_chunk_size = ie_in->ie_chunk_size;
+			meta->ie[i].fie_nstrips = ie_in->ie_nstrips;
+			meta->ie[i].fie_nbytes = ie_in->ie_nbytes;
+
+			if (!meta->ie[i].fie_nbytes) {
+				pr_err("%s: zero-length interleave!\n",
+				       __func__);
+				rc = -EINVAL;
+				goto errout;
+			}
+
+			/* sie_in = the strip extents in fmap_buf */
+			sie_in = (struct fuse_famfs_simple_ext *)
+				(fmap_buf + next_offset);
+
+			/* Move past strip extents in fmap_buf */
+			next_offset += nstrips * sizeof(*sie_in);
+			if (next_offset > fmap_buf_size) {
+				pr_err("%s:%d: fmap_buf underflow offset/size %ld/%ld\n",
+				       __func__, __LINE__, next_offset,
+				       fmap_buf_size);
+				rc = -EINVAL;
+				goto errout;
+			}
+
+			if ((nstrips > FUSE_FAMFS_MAX_STRIPS) || (nstrips < 1)) {
+				pr_err("%s: invalid nstrips=%lld (max=%d)\n",
+				       __func__, nstrips,
+				       FUSE_FAMFS_MAX_STRIPS);
+				errs++;
+			}
+
+			/* Allocate strip extent array */
+			meta->ie[i].ie_strips = kcalloc(ie_in->ie_nstrips,
+					sizeof(meta->ie[i].ie_strips[0]),
+					GFP_KERNEL);
+			if (!meta->ie[i].ie_strips) {
+				rc = -ENOMEM;
+				goto errout;
+			}
+
+			/* Inner loop is over strips */
+			for (j = 0; j < nstrips; j++) {
+				struct famfs_meta_simple_ext *strips_out;
+				u64 devindex = sie_in[j].se_devindex;
+				u64 offset = sie_in[j].se_offset;
+				u64 len = sie_in[j].se_len;
+
+				strips_out = meta->ie[i].ie_strips;
+				strips_out[j].dev_index = devindex;
+				strips_out[j].ext_offset = offset;
+				strips_out[j].ext_len = len;
+
+				/* Record bitmap of referenced daxdev indices */
+				meta->dev_bitmap |= (1 << devindex);
+
+				extent_total += len;
+				errs += famfs_check_ext_alignment(&strips_out[j]);
+				size_remainder -= len;
+			}
+		}
+
+		if (size_remainder > 0) {
+			/* Sum of interleaved extent sizes is less than file size! */
+			pr_err("%s: size_remainder %lld (0x%llx)\n",
+			       __func__, size_remainder, size_remainder);
+			rc = -EINVAL;
+			goto errout;
+		}
+		break;
+	}
+
+	default:
+		pr_err("%s: invalid ext_type %d\n", __func__, fmh->ext_type);
+		rc = -EINVAL;
+		goto errout;
+	}
+
+	if (errs > 0) {
+		pr_err("%s: %d alignment errors found\n", __func__, errs);
+		rc = -EINVAL;
+		goto errout;
+	}
+
+	/* More sanity checks */
+	if (extent_total < meta->file_size) {
+		pr_err("%s: file size %ld larger than map size %ld\n",
+		       __func__, meta->file_size, extent_total);
+		rc = -EINVAL;
+		goto errout;
+	}
+
+	*metap = meta;
+
+	return 0;
+errout:
+	__famfs_meta_free(meta);
+	return rc;
+}
+
+/**
+ * famfs_file_init_dax() - init famfs dax file metadata
+ *
+ * @fm: fuse_mount
+ * @inode: the inode
+ * @fmap_buf: fmap response message
+ * @fmap_size: Size of the fmap message
+ *
+ * Initialize famfs metadata for a file, based on the contents of the GET_FMAP
+ * response
+ *
+ * Return: 0=success
+ *         -errno=failure
+ */
+int
+famfs_file_init_dax(
+	struct fuse_mount *fm,
+	struct inode *inode,
+	void *fmap_buf,
+	size_t fmap_size)
+{
+	struct fuse_inode *fi = get_fuse_inode(inode);
+	struct famfs_file_meta *meta = NULL;
+	int rc;
+
+	if (fi->famfs_meta) {
+		pr_notice("%s: i_no=%ld fmap_size=%ld ALREADY INITIALIZED\n",
+			  __func__,
+			  inode->i_ino, fmap_size);
+		return -EEXIST;
+	}
+
+	rc = famfs_fuse_meta_alloc(fmap_buf, fmap_size, &meta);
+	if (rc)
+		goto errout;
+
+	/* Publish the famfs metadata on fi->famfs_meta */
+	inode_lock(inode);
+	if (fi->famfs_meta) {
+		rc = -EEXIST; /* file already has famfs metadata */
+	} else {
+		if (famfs_meta_set(fi, meta) != NULL) {
+			pr_err("%s: file already had metadata\n", __func__);
+			rc = -EALREADY;
+			goto errout;
+		}
+		i_size_write(inode, meta->file_size);
+		inode->i_flags |= S_DAX;
+	}
+	inode_unlock(inode);
+
+ errout:
+	if (rc)
+		__famfs_meta_free(meta);
+
+	return rc;
+}
+
diff --git a/fs/fuse/famfs_kfmap.h b/fs/fuse/famfs_kfmap.h
new file mode 100644
index 000000000000..ce785d76719c
--- /dev/null
+++ b/fs/fuse/famfs_kfmap.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * famfs - dax file system for shared fabric-attached memory
+ *
+ * Copyright 2023-2025 Micron Technology, Inc.
+ */
+#ifndef FAMFS_KFMAP_H
+#define FAMFS_KFMAP_H
+
+/*
+ * These structures are the in-memory metadata format for famfs files. Metadata
+ * retrieved via the GET_FMAP response is converted to this format for use in
+ * resolving file mapping faults.
+ */
+
+enum famfs_file_type {
+	FAMFS_REG,
+	FAMFS_SUPERBLOCK,
+	FAMFS_LOG,
+};
+
+/* We anticipate the possibility of supporting additional types of extents */
+enum famfs_extent_type {
+	SIMPLE_DAX_EXTENT,
+	INTERLEAVED_EXTENT,
+	INVALID_EXTENT_TYPE,
+};
+
+struct famfs_meta_simple_ext {
+	u64 dev_index;
+	u64 ext_offset;
+	u64 ext_len;
+};
+
+struct famfs_meta_interleaved_ext {
+	u64 fie_nstrips;
+	u64 fie_chunk_size;
+	u64 fie_nbytes;
+	struct famfs_meta_simple_ext *ie_strips;
+};
+
+/*
+ * Each famfs dax file has this hanging from its fuse_inode->famfs_meta
+ */
+struct famfs_file_meta {
+	bool error;
+	enum famfs_file_type file_type;
+	size_t file_size;
+	enum famfs_extent_type fm_extent_type;
+	u64 dev_bitmap; /* bitmap of referenced daxdevs by index */
+	union { /* This will make code a bit more readable */
+		struct {
+			size_t fm_nextents;
+			struct famfs_meta_simple_ext *se;
+		};
+		struct {
+			size_t fm_niext;
+			struct famfs_meta_interleaved_ext *ie;
+		};
+	};
+};
+
+#endif /* FAMFS_KFMAP_H */
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 8616fb0a6d61..5d205eadb48f 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -237,6 +237,7 @@ static void fuse_truncate_update_attr(struct inode *inode, struct file *file)
 static int
 fuse_get_fmap(struct fuse_mount *fm, struct inode *inode, u64 nodeid)
 {
+	struct fuse_inode *fi = get_fuse_inode(inode);
 	struct fuse_get_fmap_in inarg = { 0 };
 	size_t fmap_bufsize = FMAP_BUFSIZE;
 	ssize_t fmap_size;
@@ -246,6 +247,10 @@ fuse_get_fmap(struct fuse_mount *fm, struct inode *inode, u64 nodeid)
 
 	FUSE_ARGS(args);
 
+	/* Don't retrieve if we already have the famfs metadata */
+	if (fi->famfs_meta)
+		return 0;
+
 	fmap_buf = kcalloc(1, FMAP_BUFSIZE, GFP_KERNEL);
 	if (!fmap_buf)
 		return -EIO;
@@ -285,6 +290,13 @@ fuse_get_fmap(struct fuse_mount *fm, struct inode *inode, u64 nodeid)
 		 */
 		fmap_bufsize = *((uint32_t *)fmap_buf);
 
+		if (fmap_bufsize < fmap_msg_min_size()
+		    || fmap_bufsize > FAMFS_FMAP_MAX) {
+			pr_err("%s: fmap_size=%ld out of range\n",
+			       __func__, fmap_bufsize);
+			return -EIO;
+		}
+
 		--retries;
 		kfree(fmap_buf);
 		fmap_buf = kcalloc(1, fmap_bufsize, GFP_KERNEL);
@@ -294,7 +306,8 @@ fuse_get_fmap(struct fuse_mount *fm, struct inode *inode, u64 nodeid)
 		goto retry_once;
 	}
 
-	/* Will call famfs_file_init_dax() when that gets added */
+	/* Convert fmap into in-memory format and hang from inode */
+	famfs_file_init_dax(fm, inode, fmap_buf, fmap_size);
 
 	kfree(fmap_buf);
 	return 0;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e01d6e5c6e93..fb6095655403 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -1560,11 +1560,18 @@ extern void fuse_sysctl_unregister(void);
 #endif /* CONFIG_SYSCTL */
 
 /* famfs.c */
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+int famfs_file_init_dax(struct fuse_mount *fm,
+			struct inode *inode, void *fmap_buf,
+			size_t fmap_size);
+void __famfs_meta_free(void *map);
+#endif
+
 static inline struct fuse_backing *famfs_meta_set(struct fuse_inode *fi,
 						  void *meta)
 {
 #if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
-	return xchg(&fi->famfs_meta, meta);
+	return cmpxchg(&fi->famfs_meta, NULL, meta);
 #else
 	return NULL;
 #endif
@@ -1572,7 +1579,12 @@ static inline struct fuse_backing *famfs_meta_set(struct fuse_inode *fi,
 
 static inline void famfs_meta_free(struct fuse_inode *fi)
 {
-	/* Stub wil be connected in a subsequent commit */
+#if IS_ENABLED(CONFIG_FUSE_FAMFS_DAX)
+	if (fi->famfs_meta != NULL) {
+		__famfs_meta_free(fi->famfs_meta);
+		famfs_meta_set(fi, NULL);
+	}
+#endif
 }
 
 static inline int fuse_file_famfs(struct fuse_inode *fi)
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index b071d16f7d04..1682755abf30 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -118,7 +118,7 @@ static struct inode *fuse_alloc_inode(struct super_block *sb)
 		fuse_inode_backing_set(fi, NULL);
 
 	if (IS_ENABLED(CONFIG_FUSE_FAMFS_DAX))
-		famfs_meta_set(fi, NULL);
+		fi->famfs_meta = NULL; /* XXX new inodes currently not zeroed; why not? */
 
 	return &fi->inode;
 
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index dff5aa62543e..ecaaa62910f0 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -231,6 +231,13 @@
  *    - enum fuse_uring_cmd
  *  7.43
  *  - Add FUSE_DAX_FMAP capability - ability to handle in-kernel fsdax maps
+ *  - Add the following structures for the GET_FMAP message reply components:
+ *    - struct fuse_famfs_simple_ext
+ *    - struct fuse_famfs_iext
+ *    - struct fuse_famfs_fmap_header
+ *  - Add the following enumerated types
+ *    - enum fuse_famfs_file_type
+ *    - enum famfs_ext_type
  */
 
 #ifndef _LINUX_FUSE_H
@@ -1300,6 +1307,55 @@ struct fuse_uring_cmd_req {
 
 /* Famfs fmap message components */
 
+#define FAMFS_FMAP_VERSION 1
+
 #define FAMFS_FMAP_MAX 32768 /* Largest supported fmap message */
+#define FUSE_FAMFS_MAX_EXTENTS 32
+#define FUSE_FAMFS_MAX_STRIPS 32
+
+enum fuse_famfs_file_type {
+	FUSE_FAMFS_FILE_REG,
+	FUSE_FAMFS_FILE_SUPERBLOCK,
+	FUSE_FAMFS_FILE_LOG,
+};
+
+enum famfs_ext_type {
+	FUSE_FAMFS_EXT_SIMPLE = 0,
+	FUSE_FAMFS_EXT_INTERLEAVE = 1,
+};
+
+struct fuse_famfs_simple_ext {
+	uint32_t se_devindex;
+	uint32_t reserved;
+	uint64_t se_offset;
+	uint64_t se_len;
+};
+
+struct fuse_famfs_iext { /* Interleaved extent */
+	uint32_t ie_nstrips;
+	uint32_t ie_chunk_size;
+	uint64_t ie_nbytes; /* Total bytes for this interleaved_ext;
+			     * sum of strips may be more
+			     */
+	uint64_t reserved;
+};
+
+struct fuse_famfs_fmap_header {
+	uint8_t file_type; /* enum famfs_file_type */
+	uint8_t reserved;
+	uint16_t fmap_version;
+	uint32_t ext_type; /* enum famfs_log_ext_type */
+	uint32_t nextents;
+	uint32_t reserved0;
+	uint64_t file_size;
+	uint64_t reserved1;
+};
+
+static inline int32_t fmap_msg_min_size(void)
+{
+	/* Smallest fmap message is a header plus one simple extent */
+	return (sizeof(struct fuse_famfs_fmap_header)
+		+ sizeof(struct fuse_famfs_simple_ext));
+}
 
 #endif /* _LINUX_FUSE_H */
-- 
2.49.0
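
For readers who want to poke at the GET_FMAP message layout outside the
kernel, here is a minimal userspace sketch of the simple-extent checks that
famfs_fuse_meta_alloc() performs. It is an illustration under stated
assumptions, not the patch's code: the two structs and the two constants are
copied from the uapi hunk, while sketch_check_simple_fmap(),
sketch_build_and_check() and SKETCH_EXT_ALIGN are hypothetical names, and
2 MiB is assumed as a stand-in for PMD_SIZE. Unlike the kernel function, which
counts alignment errors and reports them at the end, the sketch returns on the
first failure, and it does not handle interleaved extents.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copied from the patch's include/uapi/linux/fuse.h hunk */
#define FUSE_FAMFS_MAX_EXTENTS 32
#define FUSE_FAMFS_EXT_SIMPLE 0

struct fuse_famfs_simple_ext {
	uint32_t se_devindex;
	uint32_t reserved;
	uint64_t se_offset;
	uint64_t se_len;
};

struct fuse_famfs_fmap_header {
	uint8_t file_type;
	uint8_t reserved;
	uint16_t fmap_version;
	uint32_t ext_type;
	uint32_t nextents;
	uint32_t reserved0;
	uint64_t file_size;
	uint64_t reserved1;
};

/* ASSUMPTION: 2 MiB stands in for the kernel's PMD_SIZE */
#define SKETCH_EXT_ALIGN (2ULL * 1024 * 1024)

/*
 * Hypothetical helper (not in the patch): validate a simple-extent fmap
 * message held in buf[0..size). Returns 0 or a negative errno value.
 */
static int sketch_check_simple_fmap(const void *buf, size_t size)
{
	struct fuse_famfs_fmap_header fmh;
	struct fuse_famfs_simple_ext se;
	const unsigned char *p = buf;
	uint64_t extent_total = 0;
	size_t off = sizeof(fmh);
	uint32_t i;

	assert(buf != NULL);

	/* The header must fit in the buffer */
	if (size < off)
		return -EINVAL;
	memcpy(&fmh, p, sizeof(fmh));

	/* Only simple extents are handled in this sketch */
	if (fmh.ext_type != FUSE_FAMFS_EXT_SIMPLE)
		return -EINVAL;
	if (fmh.nextents < 1)
		return -EINVAL;
	if (fmh.nextents > FUSE_FAMFS_MAX_EXTENTS)
		return -E2BIG;

	/* All extents must fit in what remains of the buffer */
	if ((size - off) / sizeof(se) < fmh.nextents)
		return -EINVAL;

	for (i = 0; i < fmh.nextents; i++) {
		memcpy(&se, p + off + (size_t)i * sizeof(se), sizeof(se));

		/* Same conditions as famfs_check_ext_alignment() */
		if (se.se_devindex != 0)
			return -EINVAL;
		if (se.se_offset % SKETCH_EXT_ALIGN ||
		    se.se_len % SKETCH_EXT_ALIGN)
			return -EINVAL;

		extent_total += se.se_len;
	}

	/* The extents must cover the whole file */
	return extent_total < fmh.file_size ? -EINVAL : 0;
}

/* Hypothetical helper: build a one-extent message and validate it */
static int sketch_build_and_check(uint64_t file_size, uint64_t offset,
				  uint64_t len)
{
	unsigned char msg[sizeof(struct fuse_famfs_fmap_header) +
			  sizeof(struct fuse_famfs_simple_ext)];
	struct fuse_famfs_fmap_header fmh;
	struct fuse_famfs_simple_ext se;

	memset(&fmh, 0, sizeof(fmh));
	memset(&se, 0, sizeof(se));
	fmh.fmap_version = 1;
	fmh.ext_type = FUSE_FAMFS_EXT_SIMPLE;
	fmh.nextents = 1;
	fmh.file_size = file_size;
	se.se_offset = offset;
	se.se_len = len;

	memcpy(msg, &fmh, sizeof(fmh));
	memcpy(msg + sizeof(fmh), &se, sizeof(se));
	return sketch_check_simple_fmap(msg, sizeof(msg));
}
```

Calling sketch_build_and_check(4096, 0, SKETCH_EXT_ALIGN) exercises the
accepting path; an unaligned offset, or a file size larger than the extent
total, takes the -EINVAL paths that correspond to the alignment and "file size
larger than map size" checks in the patch.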