From nobody Tue Jun 16 19:24:14 2026
From: Pavel Begunkov
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Alexander Viro, Christian Brauner, Andrew Morton, Sumit Semwal,
	Christian König, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org
Cc: asml.silence@gmail.com, Nitesh Shetty, Kanchan Joshi, Anuj Gupta,
	Tushar Gohad, William Power, Phil Cayton, Jason Gunthorpe
Subject: [PATCH v3 01/10] file: add callback for creating long-term dmabuf maps
Date: Wed, 29 Apr 2026 16:25:47 +0100
X-Mailer: git-send-email 2.53.0
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Introduce a new file callback that allows creating a long-term DMA
mapping. All necessary information, together with a dmabuf, will be
passed in the second argument of type struct io_dmabuf_token, which will
be defined in the following patches.

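For illustration only, here is a minimal user-space sketch of how an
optional callback of this shape could be dispatched. Everything except
the ->create_dmabuf_token() signature is an assumption: the token layout
is a placeholder (the real struct io_dmabuf_token is defined later in the
series), file_create_dmabuf_token() and demo_create_dmabuf_token() are
hypothetical names, and returning -EOPNOTSUPP for files without the
callback is just one plausible convention.

```c
#include <errno.h>
#include <stddef.h>

/* Placeholder layout: the real struct io_dmabuf_token comes in a later patch. */
struct io_dmabuf_token {
	void *dmabuf;	/* stand-in for the dma-buf to be mapped */
	void *priv;	/* stand-in for driver-owned mapping state */
};

struct file;

/* Mock of struct file_operations, reduced to the new callback. */
struct file_operations {
	int (*create_dmabuf_token)(struct file *, struct io_dmabuf_token *);
};

struct file {
	const struct file_operations *f_op;
};

/* Hypothetical caller-side helper: the callback is optional. */
static int file_create_dmabuf_token(struct file *file,
				    struct io_dmabuf_token *token)
{
	if (!file->f_op || !file->f_op->create_dmabuf_token)
		return -EOPNOTSUPP;
	return file->f_op->create_dmabuf_token(file, token);
}

/* Hypothetical driver implementation: pretends to map the buffer. */
static int demo_create_dmabuf_token(struct file *file,
				    struct io_dmabuf_token *token)
{
	(void)file;
	if (!token->dmabuf)
		return -EINVAL;
	token->priv = token->dmabuf;	/* placeholder for a real mapping */
	return 0;
}

/* A file type that supports the callback, and one that does not. */
static const struct file_operations demo_fops = {
	.create_dmabuf_token = demo_create_dmabuf_token,
};

static const struct file_operations plain_fops = { NULL };
```

A file whose f_op leaves the callback unset would simply not support
dmabuf maps in this sketch, and callers would fall back to regular
buffers.
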
Signed-off-by: Pavel Begunkov
---
 include/linux/fs.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index b5b01bb22d12..c5558aab4628 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1920,6 +1920,7 @@ struct dir_context {
 
 struct io_uring_cmd;
 struct offset_ctx;
+struct io_dmabuf_token;
 
 typedef unsigned int __bitwise fop_flags_t;
 
@@ -1967,6 +1968,7 @@ struct file_operations {
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
 	int (*mmap_prepare)(struct vm_area_desc *);
+	int (*create_dmabuf_token)(struct file *, struct io_dmabuf_token *);
 } __randomize_layout;
 
 /* Supports async buffered reads */
-- 
2.53.0

From nobody Tue Jun 16 19:24:14 2026
From: Pavel Begunkov
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Alexander Viro, Christian Brauner, Andrew Morton, Sumit Semwal,
	Christian König, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org
Cc: asml.silence@gmail.com, Nitesh Shetty, Kanchan Joshi, Anuj Gupta,
	Tushar Gohad, William Power, Phil Cayton, Jason Gunthorpe
Subject: [PATCH v3 02/10] iov_iter: add iterator type for dmabuf maps
Date: Wed, 29 Apr 2026 16:25:48 +0100
Message-ID: <20a233d2f35274817aa643cc0fe113707eb47e72.1777475843.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.53.0
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Introduce a new iterator type for dmabuf maps. The map is an opaque
object with internals and format specific to the subsystem / driver, and
only that subsystem / driver can use it for issuing IO. The task of the
middle layers is to pass the map / iterator further down, maybe doing
basic splitting and length checking.

The iterator can only be used by operations of the file the associated
map was created for.

Suggested-by: Keith Busch
Signed-off-by: Pavel Begunkov
Reviewed-by: Christoph Hellwig
---
 include/linux/uio.h | 11 +++++++++++
 lib/iov_iter.c      | 29 +++++++++++++++++++++++------
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index a9bc5b3067e3..75051aed70de 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -12,6 +12,7 @@
 
 struct page;
 struct folio_queue;
+struct io_dmabuf_map;
 
 typedef unsigned int __bitwise iov_iter_extraction_t;
 
@@ -29,6 +30,7 @@ enum iter_type {
 	ITER_FOLIOQ,
 	ITER_XARRAY,
 	ITER_DISCARD,
+	ITER_DMABUF_MAP,
 };
 
 #define ITER_SOURCE	1	// == WRITE
@@ -71,6 +73,7 @@ struct iov_iter {
 				const struct folio_queue *folioq;
 				struct xarray *xarray;
 				void __user *ubuf;
+				struct io_dmabuf_map *dmabuf_map;
 			};
 			size_t count;
 		};
@@ -155,6 +158,11 @@ static inline bool iov_iter_is_xarray(const struct iov_iter *i)
 	return iov_iter_type(i) == ITER_XARRAY;
 }
 
+static inline bool iov_iter_is_dmabuf_map(const struct iov_iter *i)
+{
+	return iov_iter_type(i) == ITER_DMABUF_MAP;
+}
+
 static inline unsigned char iov_iter_rw(const struct iov_iter *i)
 {
 	return i->data_source ? WRITE : READ;
@@ -300,6 +308,9 @@ void iov_iter_folio_queue(struct iov_iter *i, unsigned int direction,
 			  unsigned int first_slot, unsigned int offset, size_t count);
 void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray,
 		     loff_t start, size_t count);
+void iov_iter_dmabuf_map(struct iov_iter *i, unsigned int direction,
+			 struct io_dmabuf_map *map,
+			 loff_t off, size_t count);
 ssize_t iov_iter_get_pages2(struct iov_iter *i, struct page **pages,
 			size_t maxsize, unsigned maxpages, size_t *start);
 ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i, struct page ***pages,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 243662af1af7..e2253684b991 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -575,7 +575,8 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
 {
 	if (unlikely(i->count < size))
 		size = i->count;
-	if (likely(iter_is_ubuf(i)) || unlikely(iov_iter_is_xarray(i))) {
+	if (likely(iter_is_ubuf(i)) || unlikely(iov_iter_is_xarray(i)) ||
+	    unlikely(iov_iter_is_dmabuf_map(i))) {
 		i->iov_offset += size;
 		i->count -= size;
 	} else if (likely(iter_is_iovec(i) || iov_iter_is_kvec(i))) {
@@ -631,7 +632,8 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
 		return;
 	}
 	unroll -= i->iov_offset;
-	if (iov_iter_is_xarray(i) || iter_is_ubuf(i)) {
+	if (iov_iter_is_xarray(i) || iter_is_ubuf(i) ||
+	    iov_iter_is_dmabuf_map(i)) {
 		BUG(); /* We should never go beyond the start of the specified
 			* range since we might then be straying into pages that
 			* aren't pinned.
@@ -775,6 +777,20 @@ void iov_iter_xarray(struct iov_iter *i, unsigned int direction,
 }
 EXPORT_SYMBOL(iov_iter_xarray);
 
+void iov_iter_dmabuf_map(struct iov_iter *i, unsigned int direction,
+			 struct io_dmabuf_map *map,
+			 loff_t off, size_t count)
+{
+	WARN_ON(direction & ~(READ | WRITE));
+	*i = (struct iov_iter){
+		.iter_type = ITER_DMABUF_MAP,
+		.data_source = direction,
+		.dmabuf_map = map,
+		.count = count,
+		.iov_offset = off,
+	};
+}
+
 /**
  * iov_iter_discard - Initialise an I/O iterator that discards data
  * @i: The iterator to initialise.
@@ -841,7 +857,7 @@ static unsigned long iov_iter_alignment_bvec(const struct iov_iter *i)
 
 unsigned long iov_iter_alignment(const struct iov_iter *i)
 {
-	if (likely(iter_is_ubuf(i))) {
+	if (likely(iter_is_ubuf(i)) || iov_iter_is_dmabuf_map(i)) {
 		size_t size = i->count;
 		if (size)
 			return ((unsigned long)i->ubuf + i->iov_offset) | size;
@@ -872,7 +888,7 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
 	size_t size = i->count;
 	unsigned k;
 
-	if (iter_is_ubuf(i))
+	if (iter_is_ubuf(i) || iov_iter_is_dmabuf_map(i))
 		return 0;
 
 	if (WARN_ON(!iter_is_iovec(i)))
@@ -1469,11 +1485,12 @@ EXPORT_SYMBOL_GPL(import_ubuf);
 void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state)
 {
 	if (WARN_ON_ONCE(!iov_iter_is_bvec(i) && !iter_is_iovec(i) &&
-			 !iter_is_ubuf(i)) && !iov_iter_is_kvec(i))
+			 !iter_is_ubuf(i) && !iov_iter_is_kvec(i) &&
+			 !iov_iter_is_dmabuf_map(i)))
 		return;
 	i->iov_offset = state->iov_offset;
 	i->count = state->count;
-	if (iter_is_ubuf(i))
+	if (iter_is_ubuf(i) || iov_iter_is_dmabuf_map(i))
 		return;
 	/*
 	 * For the *vec iters, nr_segs + iov is constant - if we increment
-- 
2.53.0

From nobody Tue Jun 16 19:24:14 2026
From: Pavel Begunkov
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
	Alexander Viro, Christian Brauner, Andrew Morton, Sumit Semwal,
	Christian König, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org
Cc: asml.silence@gmail.com, Nitesh Shetty, Kanchan Joshi, Anuj Gupta,
	Tushar Gohad, William Power, Phil Cayton, Jason Gunthorpe
Subject: [PATCH v3 03/10] block: move bvec init into __bio_clone
Date: Wed, 29 Apr 2026 16:25:49 +0100
Message-ID: <43a91f54d61d3329316e40c69ace781b4d35fe0b.1777475843.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.53.0
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

To quote Christoph: "Historically __bio_clone itself does not clone the
payload, just the bio. But we got rid of the callers that want to clone
a bio but not the payload long time ago".

So let's move the ->bi_io_vec assignment into __bio_clone(), so we have
a single point where it's set.

Suggested-by: Christoph Hellwig Signed-off-by: Pavel Begunkov --- block/bio.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/bio.c b/block/bio.c index 4d46af0cd256..0734b50d4992 100644 --- a/block/bio.c +++ b/block/bio.c @@ -851,6 +851,7 @@ static int __bio_clone(struct bio *bio, struct bio *bio= _src, gfp_t gfp) bio->bi_write_hint =3D bio_src->bi_write_hint; bio->bi_write_stream =3D bio_src->bi_write_stream; bio->bi_iter =3D bio_src->bi_iter; + bio->bi_io_vec =3D bio_src->bi_io_vec; =20 if (bio->bi_bdev) { if (bio->bi_bdev =3D=3D bio_src->bi_bdev && @@ -893,8 +894,6 @@ struct bio *bio_alloc_clone(struct block_device *bdev, = struct bio *bio_src, bio_put(bio); return NULL; } - bio->bi_io_vec =3D bio_src->bi_io_vec; - return bio; } EXPORT_SYMBOL(bio_alloc_clone); @@ -914,7 +913,7 @@ int bio_init_clone(struct block_device *bdev, struct bi= o *bio, { int ret; =20 - bio_init(bio, bdev, bio_src->bi_io_vec, 0, bio_src->bi_opf); + bio_init(bio, bdev, NULL, 0, bio_src->bi_opf); ret =3D __bio_clone(bio, bio_src, gfp); if (ret) bio_uninit(bio); --=20 2.53.0 From nobody Tue Jun 16 19:24:14 2026 Received: from mail-wr1-f54.google.com (mail-wr1-f54.google.com [209.85.221.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 41DF8341AB8 for ; Wed, 29 Apr 2026 15:26:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476407; cv=none; b=gwijkKZLo1QcGmOjPDxI0cr7aYYYJRR4ua/S/TDIJl7c7wCBXGRtKKuON5kkWAvyb98W5lMSenz7TpksSlN4SgqYjReG4r3gMH+YAu/FDPCw5/oxJ5Bc92ko0a6dY/o6b+mnFrRRev2yD2Omv8oVQ9CNzz/88eO3I4ax4lIQmr0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476407; c=relaxed/simple; bh=wPna77yg1sN41atnG+TaUrGkyr7sGycheND1pjkgNrA=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OyBBWJW0qSqhG8yW1vOG7Z/fP98EdoK/4PN0Vdk3Rmd6yyZaazi/18frLEcBm9QrLE1cpBn0Bi3tYBqTHpJA7+2v9bQm1NYvY2F/ymGV/wyGyArUD1ePuM7pr6uCVaDF8zSFVJqrBMk/EjHO4JxItMoI7fq8dJLYPw0V7tNmBKU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=TY6XL8+S; arc=none smtp.client-ip=209.85.221.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="TY6XL8+S" Received: by mail-wr1-f54.google.com with SMTP id ffacd0b85a97d-43cf8d550bdso10720173f8f.0 for ; Wed, 29 Apr 2026 08:26:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777476402; x=1778081202; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=swjDqvfsydtMCDgC1VH/dViD3xm2NdukCe3jlmQ9xuA=; b=TY6XL8+SopWnzYIn0SbdzNv5g86pO9n5+v3Cvu0qi5bOnGzBX7BxvB1PnP1RLoCH3P F8Ae4llxZwbHeUX/A6xvs0pPRBr8geA5jkkMMZ+u2oFGlYCIa/B5U/m8mkgiuDqZa2GT MRr3T7Tp38ZmmFAfACqOYpqY3vnQh1uFzHALigHNaRPEmChrY5tNXL0kMPWoWkaXtSB4 NGWN1G68pJTZ0anmuqdmk83WmK3e1OzREoLUjPSQAotRh87+OX1uG1MakOwNJMZqHnb6 98o+lgiWw98QvScr7M7vN4rUfbIwJHsQlwBCFT8SiKiNqiw0a/sxTBAEUAV4a7iwyCwP tqbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777476402; x=1778081202; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=swjDqvfsydtMCDgC1VH/dViD3xm2NdukCe3jlmQ9xuA=; 
b=QQzA284mTHrJf54fqxxR1hvSBZiu+86xx4Y07o8+UQZz75qoqODRfVZNOfLqctpl93 HN8+cCwW1oouLpihm+BSqb6hbiTUm1bLPEAcNQKV+3QF6lZS3zTLoFz21A/Xw1GrDSka pZTYrPUYuHhT4U+D1IyZfF0utlYpQ5GD5//6E2gm0uLGuf4todnjueLan+9oj3wKbQjq VtpNi/kxme1O0AZmZ5ruXGHUTZXc8Mgpa0XWi99t5KgXcAkUsQAJ1LQMQFGiwLJRr/vC FsH3cy05fx95jCl6ieCJ/uB9Y6/0fnnbbtZyQlQoPe6Q+etiUgXahHlATEW9T3nCI636 4lJA== X-Forwarded-Encrypted: i=1; AFNElJ9cFYvibVYXFazICCG0qZWc3OTggGIzJtc9CedAlTDalLdVUZHYBilrSEb+uB3GaD42uCJw6K2B1fGB8is=@vger.kernel.org X-Gm-Message-State: AOJu0Yx7cSet8VT8KhOwjdSNnT99DzaZRFOzueKJWyuXW6XiFYLS4VM5 Jy+oGKhQ9kO3BSwWrUSRzVq4xhbYNRF0VXe3N4UWRlm50GeKXjgh2tkK X-Gm-Gg: AeBDievP1gvcghD+6wn6L8+x8rgMJun3J+lRnxt6TVAl/Teh4y/BYGBjuBd+n9/FEfG Uct7PdhR4Hcz92DHs+jBqwDl2cW7PPgYiTIxKuWVefWHhfOetROR9DvjBk43+2pgp+YCETYI+QD rirLTpg5iT3eUzj8HX1SHJWOXl3FWu1zf2M9X5SCyi9fGJUR1NIQSVqnIHD+lh6DkybaVEA3bec Bp5cI7mWI45NoOz/qWGCwSllTLRwGuVeSLChQ7zTa/7tVYajD3FRtlGiwbYuaeiwwlb849leSjD lqogaoT0dW+ygmhQY6xP1FBpegTYrBb2LyTH3oPTY2PwAeuzPaJsgXJg3y1455xVOSAl79V1Lv+ jwI8/sdwqx0B2PhUyN8hlaLU93A2WwuFhplcOG/OZoAyuJL0oxFUeWR33JNYj7Pqz0tWRogORSV Tbo1gR4yAkUvokOPwx6BaTy/a2DwhrdPU/KbEGOg2naCyQHpMxcWBxsRFrd/P52CwzO8Hqp18XG FMjt51vsTEFOUW26o/osRRIAW2xu+txzXHEA79Nis7j X-Received: by 2002:a05:6000:1888:b0:43f:debd:feb1 with SMTP id ffacd0b85a97d-44649ba18b5mr14136832f8f.39.1777476397928; Wed, 29 Apr 2026 08:26:37 -0700 (PDT) Received: from 127.0.0.1localhost ([82.132.184.31]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-447b76e5c22sm6382951f8f.28.2026.04.29.08.26.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2026 08:26:37 -0700 (PDT) From: Pavel Begunkov To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, linux-media@vger.kernel.org, 
dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Cc: asml.silence@gmail.com, Nitesh Shetty , Kanchan Joshi , Anuj Gupta , Tushar Gohad , William Power , Phil Cayton , Jason Gunthorpe Subject: [PATCH v3 04/10] block: introduce dma map backed bio type Date: Wed, 29 Apr 2026 16:25:50 +0100 Message-ID: <646ecd6fde8d9e146cb051efb514deb27ce3883e.1777475843.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Premapped buffers don't require a generic bio_vec since these have already been dma mapped. Repurpose the bi_io_vec space to strore dmabuf maps as they are mutually exclusive. Suggested-by: Keith Busch Signed-off-by: Pavel Begunkov --- block/bio.c | 25 ++++++++++++++++++++++++- block/blk-merge.c | 14 ++++++++++++++ block/blk.h | 3 ++- block/fops.c | 2 ++ include/linux/bio.h | 19 ++++++++++++++++--- include/linux/blk_types.h | 8 +++++++- 6 files changed, 65 insertions(+), 6 deletions(-) diff --git a/block/bio.c b/block/bio.c index 0734b50d4992..bdc91777c288 100644 --- a/block/bio.c +++ b/block/bio.c @@ -851,7 +851,13 @@ static int __bio_clone(struct bio *bio, struct bio *bi= o_src, gfp_t gfp) bio->bi_write_hint =3D bio_src->bi_write_hint; bio->bi_write_stream =3D bio_src->bi_write_stream; bio->bi_iter =3D bio_src->bi_iter; - bio->bi_io_vec =3D bio_src->bi_io_vec; + + if (!bio_flagged(bio_src, BIO_DMABUF_MAP)) { + bio->bi_io_vec =3D bio_src->bi_io_vec; + } else { + bio->dmabuf_map =3D bio_src->dmabuf_map; + bio_set_flag(bio, BIO_DMABUF_MAP); + } =20 if (bio->bi_bdev) { if (bio->bi_bdev =3D=3D bio_src->bi_bdev && @@ -1183,6 +1189,18 @@ void bio_iov_bvec_set(struct bio *bio, const struct = iov_iter *iter) bio_set_flag(bio, BIO_CLONED); } =20 +void bio_dmabuf_map_set(struct bio *bio, struct iov_iter *iter) +{ + 
WARN_ON_ONCE(bio->bi_max_vecs); + + bio->dmabuf_map =3D iter->dmabuf_map; + bio->bi_vcnt =3D 0; + bio->bi_iter.bi_bvec_done =3D iter->iov_offset; + bio->bi_iter.bi_size =3D iov_iter_count(iter); + bio->bi_opf |=3D REQ_NOMERGE; + bio_set_flag(bio, BIO_DMABUF_MAP); +} + /* * Aligns the bio size to the len_align_mask, releasing excessive bio vecs= that * __bio_iov_iter_get_pages may have inserted, and reverts the trimmed len= gth @@ -1252,6 +1270,11 @@ int bio_iov_iter_get_pages(struct bio *bio, struct i= ov_iter *iter, iov_iter_advance(iter, bio->bi_iter.bi_size); return 0; } + if (iov_iter_is_dmabuf_map(iter)) { + bio_dmabuf_map_set(bio, iter); + iov_iter_advance(iter, bio->bi_iter.bi_size); + return 0; + } =20 if (iov_iter_extract_will_pin(iter)) bio_set_flag(bio, BIO_PAGE_PINNED); diff --git a/block/blk-merge.c b/block/blk-merge.c index fcf09325b22e..fc2c0c428001 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -348,6 +348,19 @@ int bio_split_io_at(struct bio *bio, const struct queu= e_limits *lim, len_align_mask |=3D (bc->bc_key->crypto_cfg.data_unit_size - 1); } =20 + if (bio_flagged(bio, BIO_DMABUF_MAP)) { + nsegs =3D 1; + + if ((bio->bi_iter.bi_bvec_done & lim->dma_alignment) || + (bio->bi_iter.bi_size & len_align_mask)) + return -EINVAL; + if (bio->bi_iter.bi_size > max_bytes) { + bytes =3D max_bytes; + goto split; + } + goto out; + } + bio_for_each_bvec(bv, bio, iter) { if (bv.bv_offset & start_align_mask || bv.bv_len & len_align_mask) @@ -378,6 +391,7 @@ int bio_split_io_at(struct bio *bio, const struct queue= _limits *lim, bvprvp =3D &bvprv; } =20 +out: *segs =3D nsegs; bio->bi_bvec_gap_bit =3D ffs(gaps); return 0; diff --git a/block/blk.h b/block/blk.h index b998a7761faf..b4b09abebce8 100644 --- a/block/blk.h +++ b/block/blk.h @@ -424,7 +424,8 @@ static inline struct bio *__bio_split_to_limits(struct = bio *bio, switch (bio_op(bio)) { case REQ_OP_READ: case REQ_OP_WRITE: - if (bio_may_need_split(bio, lim)) + if (bio_may_need_split(bio, lim) || + 
bio_flagged(bio, BIO_DMABUF_MAP)) return bio_split_rw(bio, lim, nr_segs); *nr_segs =3D 1; return bio; diff --git a/block/fops.c b/block/fops.c index bb6642b45937..713a3ba3f457 100644 --- a/block/fops.c +++ b/block/fops.c @@ -349,6 +349,8 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *i= ocb, * bio_iov_iter_get_pages() and set the bvec directly. */ bio_iov_bvec_set(bio, iter); + } else if (iov_iter_is_dmabuf_map(iter)) { + bio_dmabuf_map_set(bio, iter); } else { ret =3D blkdev_iov_iter_get_pages(bio, iter, bdev); if (unlikely(ret)) diff --git a/include/linux/bio.h b/include/linux/bio.h index 97d747320b35..0c43fa6b0900 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -108,16 +108,26 @@ static inline bool bio_next_segment(const struct bio = *bio, #define bio_for_each_segment_all(bvl, bio, iter) \ for (bvl =3D bvec_init_iter_all(&iter); bio_next_segment((bio), &iter); ) =20 +static inline void bio_advance_iter_dmabuf_map(struct bvec_iter *iter, + unsigned int bytes) +{ + iter->bi_bvec_done +=3D bytes; + iter->bi_size -=3D bytes; +} + static inline void bio_advance_iter(const struct bio *bio, struct bvec_iter *iter, unsigned int bytes) { iter->bi_sector +=3D bytes >> 9; =20 - if (bio_no_advance_iter(bio)) + if (bio_no_advance_iter(bio)) { iter->bi_size -=3D bytes; - else + } else if (bio_flagged(bio, BIO_DMABUF_MAP)) { + bio_advance_iter_dmabuf_map(iter, bytes); + } else { bvec_iter_advance(bio->bi_io_vec, iter, bytes); /* TODO: It is reasonable to complete bio with error here. 
*/ + } } =20 /* @bytes should be less or equal to bvec[i->bi_idx].bv_len */ @@ -129,6 +139,8 @@ static inline void bio_advance_iter_single(const struct= bio *bio, =20 if (bio_no_advance_iter(bio)) iter->bi_size -=3D bytes; + else if (bio_flagged(bio, BIO_DMABUF_MAP)) + bio_advance_iter_dmabuf_map(iter, bytes); else bvec_iter_advance_single(bio->bi_io_vec, iter, bytes); } @@ -391,7 +403,7 @@ static inline void bio_wouldblock_error(struct bio *bio) */ static inline int bio_iov_vecs_to_alloc(struct iov_iter *iter, int max_seg= s) { - if (iov_iter_is_bvec(iter)) + if (iov_iter_is_bvec(iter) || iov_iter_is_dmabuf_map(iter)) return 0; return iov_iter_npages(iter, max_segs); } @@ -471,6 +483,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_= iter *iter, unsigned len_align_mask); =20 void bio_iov_bvec_set(struct bio *bio, const struct iov_iter *iter); +void bio_dmabuf_map_set(struct bio *bio, struct iov_iter *iter); void __bio_release_pages(struct bio *bio, bool mark_dirty); extern void bio_set_pages_dirty(struct bio *bio); extern void bio_check_pages_dirty(struct bio *bio); diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 8808ee76e73c..d5ad085b701d 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -233,7 +233,12 @@ struct bio { atomic_t __bi_remaining; =20 /* The actual vec list, preserved by bio_reset() */ - struct bio_vec *bi_io_vec; + union { + struct bio_vec *bi_io_vec; + /* Driver specific dma map, present only with BIO_DMABUF_MAP */ + struct io_dmabuf_map *dmabuf_map; + }; + struct bvec_iter bi_iter; =20 union { @@ -322,6 +327,7 @@ enum { BIO_REMAPPED, BIO_ZONE_WRITE_PLUGGING, /* bio handled through zone write plugging */ BIO_EMULATES_ZONE_APPEND, /* bio emulates a zone append operation */ + BIO_DMABUF_MAP, /* Using premapped dma buffers */ BIO_FLAG_LAST }; =20 --=20 2.53.0 From nobody Tue Jun 16 19:24:14 2026 Received: from mail-wr1-f51.google.com (mail-wr1-f51.google.com [209.85.221.51]) (using TLSv1.2
with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9AE8D37CD25 for ; Wed, 29 Apr 2026 15:26:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476408; cv=none; b=EDkd6iZAM06Vs4d4K+J9TyTw+kAZxr+fADtzTGZ+BsVICdRpeK/eLYGXncxw033udTOHlZPWOAM60/dlY+9mZCuAPKj4IIBBeQ+ACtzKFJRZ9Saa9SNDF4z0Ft3Cym5vDf/G9+o/hLL3qzE+/6fU29haSViKCQs1x7b4OwujGRs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476408; c=relaxed/simple; bh=oJzJ3baWzxt8SL6mu8Gavx3suCHTBf+39zFidf5RUps=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ivGZ2knDm3IRRAcJQutSzWAwEIcR7Mrs1E/y95mWQy71ITGSq3xZGEaw+axvNh3f5HmJo9Vb1Tk4b4FtWTRdTYv+R/TFwiM2bgBRVwss+nNjKET+8nmKtdwRFFHG/rIQrjZBKhHx3mcxM9sS5z3PP8ehylBLcEF53/acOsKy32A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=CLVtxWJW; arc=none smtp.client-ip=209.85.221.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="CLVtxWJW" Received: by mail-wr1-f51.google.com with SMTP id ffacd0b85a97d-43d6fbd0954so9596298f8f.1 for ; Wed, 29 Apr 2026 08:26:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777476404; x=1778081204; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; 
bh=GR5ndKNNRfx4IwKN9v/ALZWj8JN4jvbZ2IeaIPIqzHs=; b=CLVtxWJWfeNWeCrzX9F9I0ZmyXB0gr603IqxKmnWTFVghLEhS7tnDd1f+lOQxwXf81 MRSgmnfdj/uV2WEnhotAe8tOPhYvREqgsFGQz3aRHcAtcMSE74Van6p77iVoJK04ovyg Z7FtaDnhPivvUiqRzi0smFLQRt7kv0DtaqlrP0vRg0t8iYGQLHy5cSuTc4oGrDt4tiHx 0KQADqrNORUTaCq5p+P9Fd4kMmu4OUrawXSe5ndeiubJYFDb0vBBOZ+7rpCqNoHJFPyD X3+lWxlrHwS2Vl7/eX+TtF/OIZVN9MdItFB/S1zbXCHdjHWpgi9+Rod5goHoDA0wbAD0 lkxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777476404; x=1778081204; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=GR5ndKNNRfx4IwKN9v/ALZWj8JN4jvbZ2IeaIPIqzHs=; b=byp5weuR4fwNsAQSNqqY674TsgSzDNX+W7pIF/bzg9k+ltVLcSl/AL/YWZ0BXqppA0 60eDQDdxzMX/T35Z4qTAA6FggOmQLNS+d7OYFmCQmNK6xiAj5zL4OxVnMQeDyZBSf6gP GcEEYuO0atuS4QkDzDyl4EXGnXbQLESSMjiGztrNDkQOgLIbXklra9wPkxvg9CAKmq6+ HIUNr3EBoKziUUJi2m2Hs13kUosW1vgRPNwea40w8MLW4MUjiM3JVN4+h2ArZVH4db3p wcKwQGWXMvwws5yoJZoVCzGRKrWGftZmjPq5qRauv3+fmCzKzcQfQ9o64aoo8h95M6ju Q2Pw== X-Forwarded-Encrypted: i=1; AFNElJ/G3VcUk2V1JRC51EiHEPYFrIN023P3eVeYP+xzTc3KLvjeCcac5AwtZ5s9unck+SZhDZ0lyOydpK8LbqU=@vger.kernel.org X-Gm-Message-State: AOJu0YwNfRhFk+2CVSZL3woWUwzJuBlzwPAYbijdpMPhitz2cHGy7QY6 LVduhC24kotxMl2T3HHbvL3JN0zK7Y9HA+Js5vj+hppQxROTO6NAtfAM X-Gm-Gg: AeBDieuXUgm3WQbkIop0mbeyMwHrlN2ARO+bV9DNIjT3DXLyGB1mSLQvj4tJYX3yAXr lG1Ebwpazcu467vVHpvOZtoMg12hRxzsmbCsRfgpknYN+Fb83invwQfsInwDvquMUGVgGXdp31Z Wur/uKJ4mNTHEr9zhlLlgJELRLGOsMvFos8ApRu9hrwP0YQN/w9u3x+BdgLKaUfXZjRFr9spL2O 7IBuTE3GriO1Tr18e6Ud9acq3CMWLVFOhdD9AOOeXuC4I+UTprdJrM3iQSLVxdUUUwd7SHxuquN ENTNL7z1+4alH60ytJorWQgI0Vh8vK1pmZhAQxQkcV3DqStsL/+/Z9/AnoDBY7a2qXTtV47ROP4 J09HNbUJ4x3RH3kHhHc3khvrnk5k1rMPFPs4LAvoOLbvKAGg/ty8Pk+5rGDJpVvG0mCPbT4Ag8V cU6qmIe/kDbRDc0EvWrI0oF2FLRE/zcI9+Gm57QTpUzW5uRW3/oxejG9E9w9O05K5UJLwC7Yvmb Hpf6EbJPXgS+cY0WoN4Pmh2WIsxU9mQLfxNzAkRjUdG X-Received: by 2002:a05:6000:2681:b0:441:1ca1:6404 with SMTP id 
ffacd0b85a97d-4478ee6236amr7736191f8f.18.1777476403705; Wed, 29 Apr 2026 08:26:43 -0700 (PDT) Received: from 127.0.0.1localhost ([82.132.184.31]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-447b76e5c22sm6382951f8f.28.2026.04.29.08.26.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2026 08:26:43 -0700 (PDT) From: Pavel Begunkov To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Cc: asml.silence@gmail.com, Nitesh Shetty , Kanchan Joshi , Anuj Gupta , Tushar Gohad , William Power , Phil Cayton , Jason Gunthorpe Subject: [PATCH v3 05/10] lib: add dmabuf token infrastructure Date: Wed, 29 Apr 2026 16:25:51 +0100 Message-ID: X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" There are two main objects: struct io_dmabuf_token and struct io_dmabuf_map. The token is used during initial registration and serves as an interface between an upper layer user like io_uring and the importer subsystem / driver. io_dmabuf_map represents the actual dma map established for the target device[s] with dma_buf_map_attachment() and stored in a device specific format. The separation into two different objects exists to support map invalidation (see dma_buf_invalidate_mappings()). A token can create multiple maps during its lifetime, but there can only be one (active) map attached to it. It's also possible to have no active map.
Invalidation drops the active map if present, and a new map is only created once a new request wants to use the token. The primary task of the io_dmabuf_map object is to count all requests currently using it, which is done with percpu refcounts. When a map is invalidated, we remove it from the token so that no new requests can take it, then add its fence to the dmabuf reservation object. Once all the requests complete, we signal the fence and unmap the map. [Un]mapping and any work with dma addresses are delegated to the importer driver via an ops table stored in the token, see struct io_dmabuf_token_dev_ops. That's required because the generic layer doesn't have knowledge about the device it's going to be used with, and there will be more complex use cases with multiple devices. Signed-off-by: Pavel Begunkov --- include/linux/io_dmabuf_token.h | 92 +++++++++++ lib/Kconfig | 4 + lib/Makefile | 2 + lib/io_dmabuf_token.c | 272 ++++++++++++++++++++++++++++++++ 4 files changed, 370 insertions(+) create mode 100644 include/linux/io_dmabuf_token.h create mode 100644 lib/io_dmabuf_token.c diff --git a/include/linux/io_dmabuf_token.h b/include/linux/io_dmabuf_toke= n.h new file mode 100644 index 000000000000..b94bda684812 --- /dev/null +++ b/include/linux/io_dmabuf_token.h @@ -0,0 +1,92 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_DMA_TOKEN_H +#define _LINUX_DMA_TOKEN_H + +#include + +struct io_dmabuf_fence; +struct io_dmabuf_token; +struct io_dmabuf_map; + +struct io_dmabuf_token_dev_ops { + /* + * Create a new map for the given token. It should be initialised + * with io_dmabuf_init_map(). The callback is executed with the + * reservation lock held. + */ + struct io_dmabuf_map *(*map)(struct io_dmabuf_token *); + + /* + * Clean up device specific parts of the map. The callback is + * executed with the reservation lock held.
+ */ + void (*unmap)(struct io_dmabuf_token *, struct io_dmabuf_map *); + + /* + * The user tries to destroy the token. Release all device specific + * parts of the token. + */ + void (*release)(struct io_dmabuf_token *); +}; + +struct io_dmabuf_map { + /* + * Counts attached requests and other users. Device specific unmapping + * is deferred until all refs are dropped. + */ + struct percpu_ref refs; + + struct work_struct release_work; + struct io_dmabuf_fence *fence; + struct io_dmabuf_token *token; +}; + +struct io_dmabuf_token { + struct io_dmabuf_map __rcu *map; + struct dma_buf *dmabuf; + enum dma_data_direction dir; + + atomic_t fence_seq; + u64 fence_ctx; + struct work_struct release_work; + refcount_t refs; + + void *dev_priv; + const struct io_dmabuf_token_dev_ops *dev_ops; +}; + +int io_dmabuf_token_create(struct file *file, + struct io_dmabuf_token *token, + struct dma_buf *dmabuf, + enum dma_data_direction dir); +void io_dmabuf_token_release(struct io_dmabuf_token *token); + +struct io_dmabuf_map *io_dmabuf_create_map(struct io_dmabuf_token *token); + +static inline struct io_dmabuf_map *io_dmabuf_get_map(struct io_dmabuf_tok= en *token) +{ + struct io_dmabuf_map *map; + + guard(rcu)(); + + map =3D rcu_dereference(token->map); + if (unlikely(!map || !percpu_ref_tryget_live_rcu(&map->refs))) + return NULL; + + return map; +} + +static inline void io_dmabuf_map_drop(struct io_dmabuf_map *map) +{ + percpu_ref_put(&map->refs); +} + +/* + * Device API + */ + +void io_dmabuf_token_invalidate_mappings(struct io_dmabuf_token *token); +int io_dmabuf_init_map(struct io_dmabuf_token *token, struct io_dmabuf_map= *map); + + +#endif diff --git a/lib/Kconfig b/lib/Kconfig index 0f2fb9610647..853f10bf8e1a 100644 --- a/lib/Kconfig +++ b/lib/Kconfig @@ -636,3 +636,7 @@ config UNION_FIND =20 config MIN_HEAP bool + +config DMABUF_TOKEN + def_bool y + depends on DMA_SHARED_BUFFER diff --git a/lib/Makefile b/lib/Makefile index ea660cca04f4..4a42cfcaa80c 100644 --- 
a/lib/Makefile +++ b/lib/Makefile @@ -246,6 +246,8 @@ obj-$(CONFIG_IRQ_POLL) +=3D irq_poll.o =20 obj-$(CONFIG_POLYNOMIAL) +=3D polynomial.o =20 +obj-$(CONFIG_DMABUF_TOKEN) +=3D io_dmabuf_token.o + # stackdepot.c should not be instrumented or call instrumented functions. # Prevent the compiler from calling builtins like memcmp() or bcmp() from = this # file. diff --git a/lib/io_dmabuf_token.c b/lib/io_dmabuf_token.c new file mode 100644 index 000000000000..808b5ad33dbc --- /dev/null +++ b/lib/io_dmabuf_token.c @@ -0,0 +1,272 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Common infrastructure for supporting dma-buf in the I/O path. + * + * Copyright (C) 2026 Pavel Begunkov + */ +#include +#include + +struct io_dmabuf_fence { + struct dma_fence base; + spinlock_t lock; +}; + +static const char *io_dmabuf_fence_drv_name(struct dma_fence *fence) +{ + /* default fence release kfree's the base pointer */ + BUILD_BUG_ON(offsetof(struct io_dmabuf_fence, base)); + + return "DMABUF token"; +} + +static const char *io_dmabuf_fence_timeline_name(struct dma_fence *fence) +{ + return "DMABUF token"; +} + +const struct dma_fence_ops io_dmabuf_fence_ops =3D { + .get_driver_name =3D io_dmabuf_fence_drv_name, + .get_timeline_name =3D io_dmabuf_fence_timeline_name, +}; + +static void io_dmabuf_token_destroy_work(struct work_struct *work) +{ + struct io_dmabuf_token *token =3D container_of(work, struct io_dmabuf_tok= en, + release_work); + + if (WARN_ON_ONCE(refcount_read(&token->refs))) + return; + + token->dev_ops->release(token); + dma_buf_put(token->dmabuf); + kfree(token); +} + +static void io_dmabuf_map_release_work(struct work_struct *work) +{ + struct io_dmabuf_map *map =3D container_of(work, struct io_dmabuf_map, + release_work); + struct io_dmabuf_fence *fence =3D map->fence; + struct io_dmabuf_token *token =3D map->token; + struct dma_buf *dmabuf =3D token->dmabuf; + + /* the release path must wait for fences */ + if (WARN_ON_ONCE(refcount_read(&token->refs) =3D=3D 0)) +
return; + + /* Prevent the token from being destroyed while unmapping */ + refcount_inc(&token->refs); + + /* + * There are no more requests using the map, we can signal the fence. + * It should be done before taking the resv lock as someone could be + * waiting for the fence while holding the lock. + */ + dma_fence_signal(&fence->base); + + dma_resv_lock(dmabuf->resv, NULL); + token->dev_ops->unmap(token, map); + dma_resv_unlock(dmabuf->resv); + + dma_fence_put(&fence->base); + percpu_ref_exit(&map->refs); + kfree(map); + + if (refcount_dec_and_test(&token->refs)) { + /* + * Destruction needs to wait for I/O and dma fences. Defer it to + * simplify locking. + */ + INIT_WORK(&token->release_work, io_dmabuf_token_destroy_work); + queue_work(system_wq, &token->release_work); + } +} + +static void io_dmabuf_map_refs_release(struct percpu_ref *ref) +{ + struct io_dmabuf_map *map =3D container_of(ref, struct io_dmabuf_map, ref= s); + + /* might sleep, use a worker */ + INIT_WORK(&map->release_work, io_dmabuf_map_release_work); + queue_work(system_wq, &map->release_work); +} + +int io_dmabuf_init_map(struct io_dmabuf_token *token, struct io_dmabuf_map= *map) +{ + struct io_dmabuf_fence *fence =3D NULL; + int ret; + + fence =3D kzalloc(sizeof(*fence), GFP_KERNEL); + if (!fence) + return -ENOMEM; + + ret =3D percpu_ref_init(&map->refs, io_dmabuf_map_refs_release, 0, GFP_KE= RNEL); + if (ret) { + kfree(fence); + return ret; + } + + spin_lock_init(&fence->lock); + dma_fence_init(&fence->base, &io_dmabuf_fence_ops, &fence->lock, + token->fence_ctx, atomic_inc_return(&token->fence_seq)); + map->fence =3D fence; + map->token =3D token; + return 0; +} +EXPORT_SYMBOL_NS_GPL(io_dmabuf_init_map, "DMA_BUF"); + +struct io_dmabuf_map *io_dmabuf_create_map(struct io_dmabuf_token *token) +{ + struct dma_buf *dmabuf =3D token->dmabuf; + struct io_dmabuf_map *map; + long ret; + +retry: + /* + * dev_ops->map() will be calling dma_buf_map_attachment(), for which + * we'll need to wait for fences.
Be a bit nicer and try to wait + * without the resv lock first. + */ + ret =3D dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_KERNEL, + true, MAX_SCHEDULE_TIMEOUT); + if (!ret) + ret =3D -EAGAIN; + if (ret < 0) + return ERR_PTR(ret); + + dma_resv_lock(dmabuf->resv, NULL); + map =3D io_dmabuf_get_map(token); + if (map) { + ret =3D 0; + goto out; + } + + if (dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_KERNEL, + true, 0) < 0) { + dma_resv_unlock(dmabuf->resv); + goto retry; + } + + map =3D token->dev_ops->map(token); + if (IS_ERR(map)) { + ret =3D PTR_ERR(map); + goto out; + } + + percpu_ref_get(&map->refs); + rcu_assign_pointer(token->map, map); +out: + dma_resv_unlock(dmabuf->resv); + if (ret < 0) + return ERR_PTR(ret); + return map; +} + +static void io_dmabuf_drop_map(struct io_dmabuf_token *token) +{ + struct dma_buf *dmabuf =3D token->dmabuf; + struct io_dmabuf_map *map; + int ret; + + dma_resv_assert_held(dmabuf->resv); + + map =3D rcu_dereference_protected(token->map, + dma_resv_held(dmabuf->resv)); + if (!map) + return; + rcu_assign_pointer(token->map, NULL); + + ret =3D dma_resv_reserve_fences(dmabuf->resv, 1); + if (WARN_ON_ONCE(ret)) { + struct dma_fence *fence =3D &map->fence->base; + + dma_fence_get(fence); + percpu_ref_kill(&map->refs); + dma_fence_wait(fence, false); + dma_fence_put(fence); + return; + } + + dma_resv_add_fence(dmabuf->resv, &map->fence->base, + DMA_RESV_USAGE_KERNEL); + /* + * Delay destruction until all inflight requests using the map are + * gone. It'll also signal the fence then.
+ */ + percpu_ref_kill(&map->refs); +} + +void io_dmabuf_token_invalidate_mappings(struct io_dmabuf_token *token) +{ + io_dmabuf_drop_map(token); +} +EXPORT_SYMBOL_NS_GPL(io_dmabuf_token_invalidate_mappings, "DMA_BUF"); + +static void io_dmabuf_token_release_work(struct work_struct *work) +{ + struct io_dmabuf_token *token =3D container_of(work, struct io_dmabuf_tok= en, + release_work); + struct dma_buf *dmabuf =3D token->dmabuf; + long ret; + + dma_resv_lock(dmabuf->resv, NULL); + /* Remove the last map, there should be no new ones going forward. */ + io_dmabuf_drop_map(token); + dma_resv_unlock(dmabuf->resv); + + /* Wait until all maps are destroyed. */ + ret =3D dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_KERNEL, + false, MAX_SCHEDULE_TIMEOUT); + + if (WARN_ON_ONCE(ret <=3D 0)) + return; + if (WARN_ON_ONCE(rcu_dereference_protected(token->map, true))) + return; + + if (refcount_dec_and_test(&token->refs)) + io_dmabuf_token_destroy_work(&token->release_work); +} + +void io_dmabuf_token_release(struct io_dmabuf_token *token) +{ + /* + * Destruction needs to wait for I/O and dma fences. Defer it to + * simplify locking. 
+ */ + INIT_WORK(&token->release_work, io_dmabuf_token_release_work); + queue_work(system_wq, &token->release_work); +} + +int io_dmabuf_token_create(struct file *file, + struct io_dmabuf_token *token, + struct dma_buf *dmabuf, + enum dma_data_direction dir) +{ + int ret; + + if (!file->f_op->create_dmabuf_token) + return -EOPNOTSUPP; + + memset(token, 0, sizeof(*token)); + token->fence_ctx =3D dma_fence_context_alloc(1); + token->dir =3D dir; + token->dmabuf =3D dmabuf; + refcount_set(&token->refs, 1); + get_dma_buf(dmabuf); + + ret =3D file->f_op->create_dmabuf_token(file, token); + if (ret) { + memset(token, 0, sizeof(*token)); + dma_buf_put(dmabuf); + return ret; + } + + if (WARN_ON_ONCE(!token->dev_ops || + !token->dev_ops->map || + !token->dev_ops->unmap || + !token->dev_ops->release)) + return -EINVAL; + + return ret; +} --=20 2.53.0 From nobody Tue Jun 16 19:24:14 2026 Received: from mail-wr1-f49.google.com (mail-wr1-f49.google.com [209.85.221.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9DE803815E3 for ; Wed, 29 Apr 2026 15:26:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476416; cv=none; b=kmaCqW8zE19EbHlggPINCTGti8yTuNIzmr4gPO0F3Esc0WR+D6tyHCkqWTFmFn2O/xZpF4L5wAbmGrEzXuy2oPhBTSHk+nID2bxmJQ5Uqz/r+SVXnTtFj6MK45OI7mFiDYR/BR1weBgwxVOOngIB8GMmdJYeq+71tSVAxvXfEPY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476416; c=relaxed/simple; bh=DL12ZWqeJwgmWM4073U0nNO7Rud8B90RO8c36PxK6lA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=J4QpblFNddbcusYIMOvdiQAR6+Ma+ZSLRQJ0HDQwH5lRtPcSTGUVtChnqaSnY6DpfMNrK33RZhuy2DNqi0sg40al6HlfRIYWR+Pe3jDfW0rxtjzErocCb49frtihkTq8kECNsDnCvLz5AgJsJxf2KDk7SFVEtgJGbTL5aCeC0eo= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=EevxMA7L; arc=none smtp.client-ip=209.85.221.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="EevxMA7L" Received: by mail-wr1-f49.google.com with SMTP id ffacd0b85a97d-43d6fbd0954so9596367f8f.1 for ; Wed, 29 Apr 2026 08:26:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777476409; x=1778081209; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Pdajlfqb0Mkqw7FfTkWcBSlkNtUxKMEiU4sOjeajLqA=; b=EevxMA7LPtKpXQoHdd237w6Vc8pJbzdxm6cjW6PNSoDJlLosPzx0qIFsRDgPYNbdG7 mexMEQ4B7y3zWxfct269nOC9bu4er8Y6GvfmyrrJRneIVajWkRzJhh1B0Y8Rak9y6l8U JBBZfSuG7lH+3EHCHD3CbnDANBUEW90SUEboSu1+/E8aF9lElvHqUuTUTUWwSCv0jTek eB4W4U2CzXvdYQJ0yK9hi/Fd/NnwU9NR9zcOpQpDQbD7Ksp9i/dB9dfdeVbtLEAivF1p rPNP2fBYkQWVm+FyjgRqQaI8WlJvCVhPpVCW/tZ0pIGGuC7zB+WQvEfLOHlCqhvGkBt3 yoHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777476409; x=1778081209; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Pdajlfqb0Mkqw7FfTkWcBSlkNtUxKMEiU4sOjeajLqA=; b=Z/F1gEDvpZBv82ZX04T+Vr/fz0UizgnXYT/8SJHBUZ6cuhvP7lustXl+wb1rpXkN4R T4xOQA0gO96IY4GhXZlC2JgX1cBfGMLWICVyY5ww0/l28PiFSLklbUJyO7hayhgj5Xua k5mTJNKgLP2ObIgVG8jp44Z/B8X2vyZvlK1qkZqwT8JAMlIf/E5qZXO4cCMD8DqlDsZd xw2z0WsxOKBbPkjChLF4enUhsqXB7z2oE/UpWzk21drXNRIyQpWoRiLEK6om1D5Wq+zg 
QBvAosNGCeNCPdh9ZTv8MQBiLkfDv6CLaXYO27h64qrYjrprQTZz/xj2eKzIxbxCYcEq KjuA== X-Forwarded-Encrypted: i=1; AFNElJ+HS3keLXsZ/hcqXhZjYnCn0S1Tn8wQe3gyYH1YKJYggO1D2BQSRw4xnCewGMqmTKnFKQT1bGxxpLF8MaA=@vger.kernel.org X-Gm-Message-State: AOJu0YxzRoXmiRCsNNFkuDCz0ZRwbTqcm3c9CE+KDHZYtrPQjxfid5Vz eJL3+qInaZsVsYxQSEw0aEbrQmMQ/paQyadWiEOpWjbBR8BkLH5DVfMQ X-Gm-Gg: AeBDietT09W7R/XTyYJVLgFmyWStA58BjpVD3kRXITdr9Yfsrh9HnHHon57ypIf262y PIlVpr9JXC2n5Sr4CbaInWb6doY1G04OKQ0qrVkO13zNQaoN9adhOUQ1mArJCVcG2mwi1M0lnFa AXUavd0xQmWjA0XhSUCnwHXQ2DjK6nIZ1EYhOBcUBu5YhoKEPDkVHPv0YGeZxd7B09Q2jKPEkRq 9CH3gVdlDO1yd/nvSINMGqieDLgHXgIOso/NwQvp1q3uqi8xn1h1AHZVjMgxEjt9HL1G4jroaZ2 vKmS/VQ3SIoiOp+Ba/VdlEIhAU0bEldl09/DDImwCDktf0Yo4Phj/ZED8hOPgmYvd1uvZNWZZR8 f9XgsxY/2b4mT+toQvaiEooj1/p2W0cc6vLo9B6J+PbWBRsqUW/RVxAO8ygEhteak2rPCVHDLUT cDkDRN3qbN1/oFnnDciF0kYyw/RaPyegljcax7lYlwkEfYQP6S6YCyZs/RwitumhyA8wOGc7BJK p+69QdyxFvbZheTReahNMo1RGBMIIlsDPxULrrmhKCR X-Received: by 2002:a05:6000:2002:b0:43f:e9ee:5610 with SMTP id ffacd0b85a97d-44790d12be0mr7198547f8f.43.1777476408973; Wed, 29 Apr 2026 08:26:48 -0700 (PDT) Received: from 127.0.0.1localhost ([82.132.184.31]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-447b76e5c22sm6382951f8f.28.2026.04.29.08.26.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2026 08:26:48 -0700 (PDT) From: Pavel Begunkov To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Cc: asml.silence@gmail.com, Nitesh Shetty , Kanchan Joshi , Anuj Gupta , Tushar Gohad , William Power , Phil Cayton , Jason Gunthorpe Subject: [PATCH v3 06/10] block: forward create_dmabuf_token to drivers Date: Wed, 29 Apr 2026 16:25:52 
+0100 Message-ID: <559756c5e22dcfa183080a979de039910d1b896d.1777475843.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a trivial implementation of the create_dmabuf_token call for block devices that forwards the call to a new blk-mq callback if it's available. Signed-off-by: Pavel Begunkov --- block/fops.c | 14 ++++++++++++++ include/linux/blk-mq.h | 9 +++++++++ 2 files changed, 23 insertions(+) diff --git a/block/fops.c b/block/fops.c index 713a3ba3f457..3d8a48a7d645 100644 --- a/block/fops.c +++ b/block/fops.c @@ -951,6 +951,19 @@ static int blkdev_mmap_prepare(struct vm_area_desc *de= sc) return generic_file_mmap_prepare(desc); } =20 +static int blkdev_create_dmabuf_token(struct file *file, + struct io_dmabuf_token *token) +{ + struct request_queue *q =3D bdev_get_queue(file_bdev(file)); + + if (!(file->f_flags & O_DIRECT)) + return -EINVAL; + if (!q->mq_ops || !q->mq_ops->create_dmabuf_token) + return -EINVAL; + + return q->mq_ops->create_dmabuf_token(q, token); +} + const struct file_operations def_blk_fops =3D { .open =3D blkdev_open, .release =3D blkdev_release, @@ -969,6 +982,7 @@ const struct file_operations def_blk_fops =3D { .fallocate =3D blkdev_fallocate, .uring_cmd =3D blkdev_uring_cmd, .fop_flags =3D FOP_BUFFER_RASYNC, + .create_dmabuf_token =3D blkdev_create_dmabuf_token, }; =20 static __init int blkdev_init(void) diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 18a2388ba581..ee31fb3ada10 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -15,6 +15,8 @@ struct blk_mq_tags; struct blk_flush_queue; struct io_comp_batch; =20 +struct io_dmabuf_token; + #define BLKDEV_MIN_RQ 4 #define BLKDEV_DEFAULT_RQ 128 =20 @@ -684,6 +686,13 @@ struct blk_mq_ops { */ void 
(*show_rq)(struct seq_file *m, struct request *rq); #endif + + /** + * @create_dmabuf_token: Create a dmabuf token, which will be used to map + * a dmabuf for IO requests. + */ + int (*create_dmabuf_token)(struct request_queue *, + struct io_dmabuf_token *token); }; =20 /* Keep hctx_flag_name[] in sync with the definitions below */ --=20 2.53.0 From nobody Tue Jun 16 19:24:14 2026 Received: from mail-wr1-f51.google.com (mail-wr1-f51.google.com [209.85.221.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 114533815F8 for ; Wed, 29 Apr 2026 15:26:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476419; cv=none; b=gw3jy0iJm2D6mCpHZT2pdvssMj2ih19YNPiyLHQ1muYgqlhZv+dGLLmNUvKf6RAnFGBeBTEgBgd5oXN+0Y4Uge9qmktatl0LxpPdbpgiGpGSKlaHpo0CewtoKyMkFx0JG2iCGjk806MSM6fkSY4IV4ZpUTJpam0lT2ZIXInOBT4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476419; c=relaxed/simple; bh=Wh6RxPyI66yH5ETIHW2PWMsa8ELteQ8VlbfRUpAojEw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pEu5fwKyovxADyCvgBRwVc9JLn4NqmpkqDHYPhNbOEAfRjiTGsimwjaHuqv2gmrzVtGCALFlbQD4TRD6mszPiDFc8lxAGgU26iFAoeYLGsn5Yiorp7INNSBkeitbi1/9GU4d91z01HALMcuwade+NB8uAWMfdj3Qis6U+BeCTgk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=k0wwavOr; arc=none smtp.client-ip=209.85.221.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com
header.i=@gmail.com header.b="k0wwavOr" Received: by mail-wr1-f51.google.com with SMTP id ffacd0b85a97d-43d70b3e159so6820139f8f.0 for ; Wed, 29 Apr 2026 08:26:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777476415; x=1778081215; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5tNLlPR5H1MSAY4d3fHMpPecWT4SmT3leRzkE1JmSrg=; b=k0wwavOrhL2+fKFQ6mIy4Q3Gyn/rsRnsOeI2ie/YQMPwx9X7Ymh6lhDPq9Rn5WzIKC 0yDfOrUZSb64tta39Sw0I4AH118wpEuIM9bt/5i0QrXpcQeQb2TWjetmp5rSsTdb6vL+ nf5weWE/ZW4HvkzRFaO7oc52iA3iI8CG3ZURJiIEpRP0fKvZQONkrv+72B7vfeo8s/j+ iDQE469DaOKL2qkYwXBbtXTr4Qmg7mvROQKJEHjteR+I3dO3Lwzp4QDb2i/gKpQfKeEd fwHBzdi4NygkgsJGhazvKYe7+nRG8nVgNp99I0TTthpP39SJkfiXWxTEDmRz5qiRlS1M DrkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777476415; x=1778081215; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=5tNLlPR5H1MSAY4d3fHMpPecWT4SmT3leRzkE1JmSrg=; b=koZH5qkk8qvu7BpBW+YejFE3C1RUyNzTnY2LT2sSRlC5EJAF3v6rGMz7w4dFciXajq +4WAqE+Uzf5cJeIKRkSGZUlzcrkwgYVUSLEovHxWehfQkOTDqJH/Qd04LNzJvygvu4Xo Y5r/f0R7+suKT1lb2D28qA471XPO0A9d2NyGEVM25nzvsZHtvsRl6JmXdyfzIw5Poe+F zky5wd9FrLbc/FPtJe6ibVSwRhUFxQIM8pgSaJ8K2gUiLr1IV2Stsjx2XqNGONuD06T0 /ihPR5dLA+sAax+P1Oqsab5jOOihiOukPezSv/hHXWsNFbwRkF4eI4ByQIpNFZw2hg7G ygbA== X-Forwarded-Encrypted: i=1; AFNElJ/J4RhHgq2OnqMzwDCsw+ZGpyBOuRWPGZdIp0JOLsPW4lfl4SdyL1tTQkcX/juW6bvf40KWbchEoDACXgo=@vger.kernel.org X-Gm-Message-State: AOJu0Yz0ME3eXc4ZCDXNamVXFmaXACEGTCObKoNJuyi1MPrAckVCKkOP Ogm5CM2vdsKaldkIlogiMrZBXFCcNYgtUVFJ7oPsiBUG053UZ/NuZ0KS X-Gm-Gg: AeBDievHAoQGVoHuQVmwIY9IZhpVMvwjMfppT5eYyaSOQ4lNmKmOhOzNHLJ6bFYFGq0 X4fYcA/U0a0/9FiaduxMF4dJ9lg6pMNv5YzWDACVGurFRGNx6dKbKoWhN1kggEn6XofFbNSH57F 
DLPmmUdrzEN8f34Tf9kTz9hS1uT6mL2lfQ5PRpnYKkZn7Gc6y6xrYRHxxums8+hWoB+potSO4Ae rwvja2pzkrP5nbdRkOeIBDn+q2OMQ3rsXIrtPwatcyNP0fkkshjpiccn+envretqcNV9uUQ9nHx bSEuBdKvYYLo+tiyzJugHWfNGI0cib6atlTkGYcdn79rDU16pQsgreIAxU1a8yAo9Yz4om+gmNZ P/7kWIU/Ilt3BAMVFTwVFKHicoxqEH17zkjvs6lKWJqAEVR3YMLqGuBZiO7qNHnFHjCX9l6V9XH oXCclXTpF5H/jrpFiTWlD953FZoTmIDT5Gc+K8zNub9580nCS1KMN4O+xVuEPQKLPI/ekcFxQGF ozlnYn5irrjbJxBhqqZPLKuKIVoobs7HWXMlrw10KnF X-Received: by 2002:a05:6000:601:b0:43f:e22d:9a73 with SMTP id ffacd0b85a97d-44647808b94mr14223288f8f.2.1777476415006; Wed, 29 Apr 2026 08:26:55 -0700 (PDT) Received: from 127.0.0.1localhost ([82.132.184.31]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-447b76e5c22sm6382951f8f.28.2026.04.29.08.26.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2026 08:26:54 -0700 (PDT) From: Pavel Begunkov To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Cc: asml.silence@gmail.com, Nitesh Shetty , Kanchan Joshi , Anuj Gupta , Tushar Gohad , William Power , Phil Cayton , Jason Gunthorpe Subject: [PATCH v3 07/10] nvme-pci: implement dma_token backed requests Date: Wed, 29 Apr 2026 16:25:53 +0100 Message-ID: <5cecb1157ab784f9f303a91449fdf11b03aa6002.1777475843.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Enable BIO_DMABUF_MAP backed requests. 
It creates a prp list for the dmabuf when it's mapped, which is then used to initialise requests. Suggested-by: Keith Busch Signed-off-by: Pavel Begunkov --- drivers/nvme/host/pci.c | 282 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 282 insertions(+) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index db5fc9bf6627..d2629853a972 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -27,6 +27,8 @@ #include #include #include +#include +#include =20 #include "trace.h" #include "nvme.h" @@ -393,6 +395,17 @@ struct nvme_queue { struct completion delete_done; }; =20 +struct nvme_dmabuf_token { + struct dma_buf_attachment *attach; +}; + +struct nvme_dmabuf_map { + struct io_dmabuf_map base; + dma_addr_t *dma_list; + struct sg_table *sgt; + unsigned nr_entries; +}; + /* bits for iod->flags */ enum nvme_iod_flags { /* this command has been aborted by the timeout handler */ @@ -854,6 +867,134 @@ static void nvme_free_descriptors(struct request *req) } } =20 +static void nvme_dmabuf_map_sync(struct nvme_dev *nvme_dev, struct request= *req, + bool for_cpu) +{ + int length =3D blk_rq_payload_bytes(req); + struct device *dev =3D nvme_dev->dev; + enum dma_data_direction dma_dir; + struct bio *bio =3D req->bio; + struct nvme_dmabuf_map *map; + dma_addr_t *dma_list; + int offset, map_idx; + + dma_dir =3D rq_data_dir(req) =3D=3D READ ? 
DMA_FROM_DEVICE : DMA_TO_DEVIC= E; + map =3D container_of(bio->dmabuf_map, struct nvme_dmabuf_map, base); + dma_list =3D map->dma_list; + + offset =3D bio->bi_iter.bi_bvec_done; + map_idx =3D offset / NVME_CTRL_PAGE_SIZE; + length +=3D offset & (NVME_CTRL_PAGE_SIZE - 1); + + while (length > 0) { + u64 dma_addr =3D dma_list[map_idx++]; + + if (for_cpu) + __dma_sync_single_for_cpu(dev, dma_addr, + NVME_CTRL_PAGE_SIZE, dma_dir); + else + __dma_sync_single_for_device(dev, dma_addr, + NVME_CTRL_PAGE_SIZE, + dma_dir); + length -=3D NVME_CTRL_PAGE_SIZE; + } +} + +static void nvme_rq_clean_dmabuf_map(struct nvme_dev *dev, + struct request *req) +{ + struct nvme_iod *iod =3D blk_mq_rq_to_pdu(req); + + nvme_dmabuf_map_sync(dev, req, true); + + if (!(iod->flags & IOD_SINGLE_SEGMENT)) + nvme_free_descriptors(req); +} + +static blk_status_t nvme_rq_setup_dmabuf_map(struct request *req, + struct nvme_queue *nvmeq) +{ + struct nvme_iod *iod =3D blk_mq_rq_to_pdu(req); + int length =3D blk_rq_payload_bytes(req); + u64 dma_addr, prp1_dma, prp2_dma; + struct bio *bio =3D req->bio; + struct nvme_dmabuf_map *map; + dma_addr_t *dma_list; + dma_addr_t prp_dma; + __le64 *prp_list; + int i, map_idx; + int offset; + + nvme_dmabuf_map_sync(nvmeq->dev, req, false); + + map =3D container_of(bio->dmabuf_map, struct nvme_dmabuf_map, base); + dma_list =3D map->dma_list; + + offset =3D bio->bi_iter.bi_bvec_done; + map_idx =3D offset / NVME_CTRL_PAGE_SIZE; + offset &=3D (NVME_CTRL_PAGE_SIZE - 1); + prp1_dma =3D dma_list[map_idx++] + offset; + + length -=3D (NVME_CTRL_PAGE_SIZE - offset); + if (length <=3D 0) { + prp2_dma =3D 0; + goto done; + } + + if (length <=3D NVME_CTRL_PAGE_SIZE) { + prp2_dma =3D dma_list[map_idx]; + goto done; + } + + if (DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE) <=3D + NVME_SMALL_POOL_SIZE / sizeof(__le64)) + iod->flags |=3D IOD_SMALL_DESCRIPTOR; + + prp_list =3D dma_pool_alloc(nvme_dma_pool(nvmeq, iod), GFP_ATOMIC, + &prp_dma); + if (!prp_list) + return BLK_STS_RESOURCE; + 
+ iod->descriptors[iod->nr_descriptors++] =3D prp_list; + prp2_dma =3D prp_dma; + i =3D 0; + for (;;) { + if (i =3D=3D NVME_CTRL_PAGE_SIZE >> 3) { + __le64 *old_prp_list =3D prp_list; + + prp_list =3D dma_pool_alloc(nvmeq->descriptor_pools.large, + GFP_ATOMIC, &prp_dma); + if (!prp_list) + goto free_prps; + iod->descriptors[iod->nr_descriptors++] =3D prp_list; + prp_list[0] =3D old_prp_list[i - 1]; + old_prp_list[i - 1] =3D cpu_to_le64(prp_dma); + i =3D 1; + } + + dma_addr =3D dma_list[map_idx++]; + prp_list[i++] =3D cpu_to_le64(dma_addr); + + length -=3D NVME_CTRL_PAGE_SIZE; + if (length <=3D 0) + break; + } +done: + iod->cmd.common.dptr.prp1 =3D cpu_to_le64(prp1_dma); + iod->cmd.common.dptr.prp2 =3D cpu_to_le64(prp2_dma); + return BLK_STS_OK; +free_prps: + nvme_free_descriptors(req); + return BLK_STS_RESOURCE; +} + +static inline bool nvme_rq_is_dmabuf_attached(struct request *req) +{ + if (!IS_ENABLED(CONFIG_DMABUF_TOKEN)) + return false; + return req->bio && bio_flagged(req->bio, BIO_DMABUF_MAP); +} + static void nvme_free_prps(struct request *req, unsigned int attrs) { struct nvme_iod *iod =3D blk_mq_rq_to_pdu(req); @@ -932,6 +1073,11 @@ static void nvme_unmap_data(struct request *req) struct device *dma_dev =3D nvmeq->dev->dev; unsigned int attrs =3D 0; =20 + if (nvme_rq_is_dmabuf_attached(req)) { + nvme_rq_clean_dmabuf_map(nvmeq->dev, req); + return; + } + if (iod->flags & IOD_SINGLE_SEGMENT) { static_assert(offsetof(union nvme_data_ptr, prp1) =3D=3D offsetof(union nvme_data_ptr, sgl.addr)); @@ -1222,6 +1368,9 @@ static blk_status_t nvme_map_data(struct request *req) struct blk_dma_iter iter; blk_status_t ret; =20 + if (nvme_rq_is_dmabuf_attached(req)) + return nvme_rq_setup_dmabuf_map(req, nvmeq); + /* * Try to skip the DMA iterator for single segment requests, as that * significantly improves performances for small I/O sizes. 
@@ -2238,6 +2387,134 @@ static int nvme_create_queue(struct nvme_queue *nvm= eq, int qid, bool polled) return result; } =20 +#ifdef CONFIG_DMABUF_TOKEN +static void nvme_dmabuf_invalidate_mappings(struct dma_buf_attachment *att= ach) +{ + struct io_dmabuf_token *token =3D attach->importer_priv; + + io_dmabuf_token_invalidate_mappings(token); +} + +const struct dma_buf_attach_ops nvme_dmabuf_importer_ops =3D { + .invalidate_mappings =3D nvme_dmabuf_invalidate_mappings, + .allow_peer2peer =3D true, +}; + +static struct io_dmabuf_map *nvme_dmabuf_token_map(struct io_dmabuf_token = *token) +{ + struct nvme_dmabuf_token *data =3D token->dev_priv; + struct dma_buf_attachment *attach =3D data->attach; + dma_addr_t *dma_list =3D NULL; + unsigned long tmp, i =3D 0; + struct nvme_dmabuf_map *map; + struct scatterlist *sg; + struct sg_table *sgt =3D NULL; + unsigned nr_entries; + int ret; + + dma_resv_assert_held(token->dmabuf->resv); + + map =3D kmalloc(sizeof(*map), GFP_KERNEL); + if (!map) + return ERR_PTR(-ENOMEM); + + nr_entries =3D token->dmabuf->size / NVME_CTRL_PAGE_SIZE; + dma_list =3D kmalloc_array(nr_entries, sizeof(dma_list[0]), GFP_KERNEL); + if (!dma_list) { + ret =3D -ENOMEM; + goto err; + } + + sgt =3D dma_buf_map_attachment(attach, token->dir); + if (IS_ERR(sgt)) { + ret =3D PTR_ERR(sgt); + sgt =3D NULL; + goto err; + } + + for_each_sgtable_dma_sg(sgt, sg, tmp) { + dma_addr_t dma_addr =3D sg_dma_address(sg); + unsigned long sg_len =3D sg_dma_len(sg); + + if (sg_len % NVME_CTRL_PAGE_SIZE) { + ret =3D -EINVAL; + goto err; + } + + while (sg_len) { + dma_list[i++] =3D dma_addr; + dma_addr +=3D NVME_CTRL_PAGE_SIZE; + sg_len -=3D NVME_CTRL_PAGE_SIZE; + } + } + + ret =3D io_dmabuf_init_map(token, &map->base); + if (ret) + goto err; + map->nr_entries =3D nr_entries; + map->dma_list =3D dma_list; + map->sgt =3D sgt; + return &map->base; +err: + if (sgt) + dma_buf_unmap_attachment(attach, sgt, token->dir); + kfree(map); + kfree(dma_list); + return ERR_PTR(ret); +} + +static 
void nvme_dmabuf_token_unmap(struct io_dmabuf_token *token, + struct io_dmabuf_map *map_base) +{ + struct nvme_dmabuf_token *data =3D token->dev_priv; + struct nvme_dmabuf_map *map =3D container_of(map_base, + struct nvme_dmabuf_map, base); + + dma_resv_assert_held(token->dmabuf->resv); + + dma_buf_unmap_attachment(data->attach, map->sgt, token->dir); + kfree(map->dma_list); +} + +static void nvme_dmabuf_token_release(struct io_dmabuf_token *token) +{ + struct nvme_dmabuf_token *data =3D token->dev_priv; + + dma_buf_detach(token->dmabuf, data->attach); + kfree(data); +} + +const struct io_dmabuf_token_dev_ops nvme_dma_token_ops =3D { + .map =3D nvme_dmabuf_token_map, + .unmap =3D nvme_dmabuf_token_unmap, + .release =3D nvme_dmabuf_token_release, +}; + +static int nvme_create_dmabuf_token(struct request_queue *q, + struct io_dmabuf_token *token) +{ + struct nvme_dmabuf_token *data; + struct dma_buf_attachment *attach; + struct nvme_ns *ns =3D q->queuedata; + struct nvme_dev *dev =3D to_nvme_dev(ns->ctrl); + struct dma_buf *dmabuf =3D token->dmabuf; + + data =3D kzalloc(sizeof(*data), GFP_KERNEL); + if (!data) + return -ENOMEM; + + token->dev_priv =3D data; + token->dev_ops =3D &nvme_dma_token_ops; + + attach =3D dma_buf_dynamic_attach(dmabuf, dev->dev, + &nvme_dmabuf_importer_ops, token); + if (IS_ERR(attach)) + return PTR_ERR(attach); + data->attach =3D attach; + return 0; +} +#endif + static const struct blk_mq_ops nvme_mq_admin_ops =3D { .queue_rq =3D nvme_queue_rq, .complete =3D nvme_pci_complete_rq, @@ -2256,6 +2533,10 @@ static const struct blk_mq_ops nvme_mq_ops =3D { .map_queues =3D nvme_pci_map_queues, .timeout =3D nvme_timeout, .poll =3D nvme_poll, + +#ifdef CONFIG_DMABUF_TOKEN + .create_dmabuf_token =3D nvme_create_dmabuf_token, +#endif }; =20 static void nvme_dev_remove_admin(struct nvme_dev *dev) @@ -4289,5 +4570,6 @@ MODULE_AUTHOR("Matthew Wilcox = "); MODULE_LICENSE("GPL"); MODULE_VERSION("1.0"); MODULE_DESCRIPTION("NVMe host PCIe transport driver"); 
+MODULE_IMPORT_NS("DMA_BUF"); module_init(nvme_init); module_exit(nvme_exit); --=20 2.53.0 From nobody Tue Jun 16 19:24:14 2026 Received: from mail-wr1-f49.google.com (mail-wr1-f49.google.com [209.85.221.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 540D93822BD for ; Wed, 29 Apr 2026 15:27:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.49 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476423; cv=none; b=fz8D1YVXHhJOeltRDK2v/LUW8lbyvtYGLprNWtvoE9ItcrKp+1zI4tgr+jR/Yy0eVyVKRfNup1UtPO6MZg3Kf2b62gzTLkbspJYPYfDpxV//PSK1OlLIHVZBgRSHMuz9bDgWg6rEFEFTZa+uUKdm3gV+hNIsWewi1VgjzDEmWBo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476423; c=relaxed/simple; bh=8Q724PKAeWMa6roOSMmfJBPXkxkRSKnlWqU8HQUxKDI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WdPAfzoZkgfmrkywcQ8KT2jiMGbSTqvJcAyBd+WBVKbXGnspPW4GXxSiyqGE1fYDmzQA9nqzAQH9OoBgY6dexApFLo+TRtfN8lA/1lUUnYEsqiCbYEwGh7JZ/2PbW3HRS3fn1soCs9bYjEPObGfnVIbfF7WTfaVy/VtYfFMkEs0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=itq6qZ7P; arc=none smtp.client-ip=209.85.221.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="itq6qZ7P" Received: by mail-wr1-f49.google.com with SMTP id ffacd0b85a97d-43d77f60944so9087130f8f.3 for ; Wed, 29 Apr 2026 08:27:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777476420; 
x=1778081220; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Okho95RNKRmNJtlBZrtRfvb3jbBLTZKlsmbr0Q4pBPI=; b=itq6qZ7PTBJI54cVGL7/SgM5oWV1agvjlAAOYuZW4kRdal5vy2XeXvfkrJCaMztAQp ngePVXhwmcThB6gUEOgDruHlCvQsy15ncOeLhubY/U7i1aWdUk/YrEekg31w/lAJaBbe Uf3mjWpRVSn+a/bFdvdIL7x60AjxfTWYjvMUhsl4RpUUXsRMADB91a4ivaSVHAxR3ztv /ftx/Z4W7eTDhwSqOYwKWJ6iqn+hx7M2LrFxwH7i4PpeynJmBRuB6m61vgdJI2iJMbnG tL9HtOOsVgKQ2r3gjPXx7HyyEHrCs5vBSGF+KEsTXOIxTV82BU87wO3L5A/yXzZnb6CA dJoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777476420; x=1778081220; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Okho95RNKRmNJtlBZrtRfvb3jbBLTZKlsmbr0Q4pBPI=; b=APpTlsvIlx+O1NjbwyGl3xPCuIWBDk/76t2Xby/AmOmKkxm/G0aIQ/w1mXnfScf6eh 4RwDwIkawSNrOC2e4akkMxsbGlpcIVgoq5LYmsbXS6p9P+ABQmSEJhRiAY0Wl1uASWzb qBoHTjjtycu/vu9ZeGCL++1M5F/oBOjL5YrS8etFhiCVWGLZCDWrUUaaB4CPG2+59hDw ilu6y8ZyL4y0mytacRWoeVUIHOS9FveomfOw23F4kkhlJ/fWdxEtZnm/AKNlCZE3RFza PHdTAhSRSvW2+/uMQcProRgGZVHZ+2wX/MdiXjnuGmHx4RC5Bot317tu9uG00SMKoT8Y OJKg== X-Forwarded-Encrypted: i=1; AFNElJ9Xcf4fAZrweMuTAH4T9v1F2F4emc2xs/RBZ8Ke69r3czugIgLoUNdHq4bG7uS1fVEwLrzbrlCfvs54mYA=@vger.kernel.org X-Gm-Message-State: AOJu0YxsHyJ/WznypjyDeWFlTbA259jtA/YFFgY6s8xmKQ/nnfPHyLwA JAktJK+qI0kgq07Ke88shoxhDdj6uzGoUljX0zD0XxGh4zbAO7vKws82 X-Gm-Gg: AeBDietvIS+iZoCVh0Y6yofn/nOasot4ZeZ6odezqeQexBPM/FlVYOI4prMDNxGx/+1 aaHOyg1Zh0K4xtGRO3BpAerMqHQ0l/ZeayCdV3LavsdKaLTC4PsRrUsT8EgehB4CP+IyPxjig+U qkoHvzoPXtoWLcgPunl3LuSs8RZdOLEKPS7ndaWYej2dcr4sbwx3+lPq6DaVRfeBwNZQJA6XjqK vcgYOB3Jznn13h9Eo/pCXOPAtNNnPOShgvELKsy/UM6KLL3TtzByCcSrFZ/L62a28+lKQDkF+Eq tSsJpkrQK/mGRcTwIlKxhsRTFpDLF1k9jX7X6DwIBuAE8/lpN2lvrLp0TWLuSYGoyQPGDR6Yh7Q AL3VvzvgCtmm1xKHtGzpqUunJHKWstLEPrHbkqrfqsJ/hbXE/hn7Mzc37sGoid0HqyFr94+GiVf 
Th33VntLhp4SXa9WYInQncepN7R1nZJp61Mn70p8hy8Ia3i7wbf2ZXSnCmpO+Py14j3IsKxq8FB 2lU2d0dj/XoZHfliIRaoAkXrz/QNhfnMeYOWZap1Ag3 X-Received: by 2002:a05:6000:1a89:b0:43d:7d6f:f529 with SMTP id ffacd0b85a97d-44790a325e5mr7826316f8f.31.1777476419464; Wed, 29 Apr 2026 08:26:59 -0700 (PDT) Received: from 127.0.0.1localhost ([82.132.184.31]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-447b76e5c22sm6382951f8f.28.2026.04.29.08.26.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2026 08:26:58 -0700 (PDT) From: Pavel Begunkov To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Cc: asml.silence@gmail.com, Nitesh Shetty , Kanchan Joshi , Anuj Gupta , Tushar Gohad , William Power , Phil Cayton , Jason Gunthorpe Subject: [PATCH v3 08/10] io_uring/rsrc: introduce buf registration structure Date: Wed, 29 Apr 2026 16:25:54 +0100 Message-ID: <881422d8d613a8370ed98b158d2b57b46bb37230.1777475843.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In preparation for the following changes, introduce a new structure for buffer registration instead of passing an iovec. It'll be moved to uapi later, but for now it's initialised early from a user provided iovec. 
Signed-off-by: Pavel Begunkov --- io_uring/rsrc.c | 50 +++++++++++++++++++++++++++++++++---------------- 1 file changed, 34 insertions(+), 16 deletions(-) diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index c4a7a77d1ee9..ba00238941ed 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -27,8 +27,14 @@ struct io_rsrc_update { u32 offset; }; =20 +struct io_uring_regbuf_desc { + __u64 uaddr; + __u64 size; +}; + static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, - struct iovec *iov, struct page **last_hpage); + struct io_uring_regbuf_desc *desc, + struct page **last_hpage); =20 /* only define max */ #define IORING_MAX_FIXED_FILES (1U << 20) @@ -36,6 +42,15 @@ static struct io_rsrc_node *io_sqe_buffer_register(struc= t io_ring_ctx *ctx, =20 #define IO_CACHED_BVECS_SEGS 32 =20 +static void io_iov_to_regbuf_desc(const struct iovec *iov, + struct io_uring_regbuf_desc *desc) +{ + *desc =3D (struct io_uring_regbuf_desc) { + .uaddr =3D (u64)iov->iov_base, + .size =3D iov->iov_len, + }; +} + int __io_account_mem(struct user_struct *user, unsigned long nr_pages) { unsigned long page_limit, cur_pages, new_pages; @@ -291,6 +306,7 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *= ctx, return -EINVAL; =20 for (done =3D 0; done < nr_args; done++) { + struct io_uring_regbuf_desc desc; struct io_rsrc_node *node; u64 tag =3D 0; =20 @@ -304,7 +320,9 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *= ctx, err =3D -EFAULT; break; } - node =3D io_sqe_buffer_register(ctx, iov, &last_hpage); + + io_iov_to_regbuf_desc(iov, &desc); + node =3D io_sqe_buffer_register(ctx, &desc, &last_hpage); if (IS_ERR(node)) { err =3D PTR_ERR(node); break; @@ -760,27 +778,27 @@ bool io_check_coalesce_buffer(struct page **page_arra= y, int nr_pages, } =20 static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, - struct iovec *iov, - struct page **last_hpage) + struct io_uring_regbuf_desc *desc, + struct page **last_hpage) { + unsigned long uaddr 
=3D (unsigned long)desc->uaddr; + size_t size =3D desc->size; struct io_mapped_ubuf *imu =3D NULL; struct page **pages =3D NULL; struct io_rsrc_node *node; unsigned long off; - size_t size; int ret, nr_pages, i; struct io_imu_folio_data data; bool coalesced =3D false; =20 - if (!iov->iov_base) { - if (iov->iov_len) + if (!uaddr) { + if (size) return ERR_PTR(-EFAULT); /* remove the buffer without installing a new one */ return NULL; } =20 - ret =3D io_validate_user_buf_range((unsigned long)iov->iov_base, - iov->iov_len); + ret =3D io_validate_user_buf_range(uaddr, size); if (ret) return ERR_PTR(ret); =20 @@ -789,8 +807,7 @@ static struct io_rsrc_node *io_sqe_buffer_register(stru= ct io_ring_ctx *ctx, return ERR_PTR(-ENOMEM); =20 ret =3D -ENOMEM; - pages =3D io_pin_pages((unsigned long) iov->iov_base, iov->iov_len, - &nr_pages); + pages =3D io_pin_pages(uaddr, size, &nr_pages); if (IS_ERR(pages)) { ret =3D PTR_ERR(pages); pages =3D NULL; @@ -812,10 +829,9 @@ static struct io_rsrc_node *io_sqe_buffer_register(str= uct io_ring_ctx *ctx, if (ret) goto done; =20 - size =3D iov->iov_len; /* store original address for later verification */ - imu->ubuf =3D (unsigned long) iov->iov_base; - imu->len =3D iov->iov_len; + imu->ubuf =3D uaddr; + imu->len =3D size; imu->folio_shift =3D PAGE_SHIFT; imu->release =3D io_release_ubuf; imu->priv =3D imu; @@ -825,7 +841,7 @@ static struct io_rsrc_node *io_sqe_buffer_register(stru= ct io_ring_ctx *ctx, imu->folio_shift =3D data.folio_shift; refcount_set(&imu->refs, 1); =20 - off =3D (unsigned long)iov->iov_base & ~PAGE_MASK; + off =3D uaddr & ~PAGE_MASK; if (coalesced) off +=3D data.first_folio_page_idx << PAGE_SHIFT; =20 @@ -878,6 +894,7 @@ int io_sqe_buffers_register(struct io_ring_ctx *ctx, vo= id __user *arg, memset(iov, 0, sizeof(*iov)); =20 for (i =3D 0; i < nr_args; i++) { + struct io_uring_regbuf_desc desc; struct io_rsrc_node *node; u64 tag =3D 0; =20 @@ -901,7 +918,8 @@ int io_sqe_buffers_register(struct io_ring_ctx *ctx, vo= 
id __user *arg, } } =20 - node =3D io_sqe_buffer_register(ctx, iov, &last_hpage); + io_iov_to_regbuf_desc(iov, &desc); + node =3D io_sqe_buffer_register(ctx, &desc, &last_hpage); if (IS_ERR(node)) { ret =3D PTR_ERR(node); break; --=20 2.53.0 From nobody Tue Jun 16 19:24:14 2026 Received: from mail-wr1-f48.google.com (mail-wr1-f48.google.com [209.85.221.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C3E3389116 for ; Wed, 29 Apr 2026 15:27:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476428; cv=none; b=tntVlCd/oqasUqCIDYFkTIHhvTKwIN0iv/iO9nDTqp4OY0p/oCiwilF6N8NIDzBt9exKS4xWx09qu7ce2Z+uwTTf4xpUumh5s6lUpLtlGb5UvzQ9HvjVI9dw3IMo0yGvlGfh/Jcw+vym/qHjqD7Fe4BK2/y8hHzWUHwQWWhi5Bo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476428; c=relaxed/simple; bh=MzCLUVNjom1kdvfOy4mbWiP95oxUQojGE7qkZH3udWI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=p6gz7Y/pOB21ITsZfCMlM3klRiseQ7I5Cx1IZJleXDie7dD5dscLg35bDOHrv6zCDRF3G9r0dPMLXIhfv2ZwHNcAvDyrmJ8yPFm/qVPqEVA5ciMHSehK5Lf71A0uYhV8e1LzLy4lNCW09y6O0wZnanpN7qoykbaGGvaDtE0ZuhI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=l51hFs9t; arc=none smtp.client-ip=209.85.221.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="l51hFs9t" Received: by mail-wr1-f48.google.com with SMTP id 
ffacd0b85a97d-43cfd1f9fd1so7925325f8f.3 for ; Wed, 29 Apr 2026 08:27:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777476423; x=1778081223; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=lUujl6+MDkMLMC27AkxpxJZzY2/aRxcdD8WLm5JxhAU=; b=l51hFs9tUU7CUqP7PIYM0H8C1Vzu/tTPsqh2+6zJvV7kKmiLIeNk32HRXfiVdgrAD2 PQrXGBO2CzV+JX/oBVCCUMlnDRFvx4VxZ6K2HR0zCDrpdmu3MnHlvtByYFfyu2CRAqc1 r7UkVrykwjJEot5zsWsHlX26bPgPRNscaOgWhT8dTFpoKSqLYJGN5rJ6+6H+memeMMB5 ibIWsm8c94FZn8/ivMg0lZpbq2rpuwfnkvWz0OEkXjg8HMoDekq1ej0etNGvbp7z4S4V IqkFI+h5lJg2Xh0SSyfIjaJMo1iDLLIp7qSMkfedJXVvhgcQ54JpzwXZkJB1VB4+ejO3 MdnA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777476423; x=1778081223; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=lUujl6+MDkMLMC27AkxpxJZzY2/aRxcdD8WLm5JxhAU=; b=mXeVoNRxxm6yboNBLXoSPoX7nhqYtDh7PMbB9vqUyiEtepei3V59ORP+O5Ejkpnlg/ XQr7EySclCUL/+6MQY/eI2C5pKkxO+k81r+ovbw3Ent+xi7sL1PvwMQQhth9oqYd5ioo cHIeRlcw0NCQXpfQtknhupxg+Be5ZXOMNZqKtaxSRTyl/aT7LLeVTg+DSEoxvsc/wrle GaeNyfaIQVlhK22LcoDx98aPKOp/0JrezpxMXTWAAZLLN9HmXI9mHCCY2obQADlaAXaS hCrXZdzqycOdYA3l1yuSr2wClPzx7yMGeYoEsP+wiThNZCj4YxNXOq7UKGVh2Q1AdmpY lruQ== X-Forwarded-Encrypted: i=1; AFNElJ9bez1mS6vZ2ion+OptmPMBbwhLCUXMI2H/P8DrYZDQ+NJVEGy4H38Hjz7WZyfO0WfF7yrYHHJBacfmRmk=@vger.kernel.org X-Gm-Message-State: AOJu0Yy+c9UHkeErNULJ9z9FxqFI1br0kB5RVlEomknKLNgf9wtd8GOK 3uLwJP5kNgiVjq4snXjs3FrOofzieTYqrz4dhjSkyRquZLWrsmxJ5j15 X-Gm-Gg: AeBDievpd0GGbcdUmpCDwihaLeo99bOp9Bz+BWPqKTyG73bcmqnjsArp1zaM+vYInPW 3YFK6J7MdzB3gZk11NzcdgN6gEP5L7Fz6xNqS4w/e6zrvOxke2Pg1jBPE3Q6USFAyKFUlbz44IE oZ28Bv8or59PWmK8BmVMeMEUDtlT83SA3Uap1So+qB6yuGF/46hzsnu0Vp8vjNDRRK726dVuDKD 
7zG/mF+rfV8mXGeA8TtDa+F094oYgfJI1r80fCToXn/x+AmcI4HzaZuJwlZaPfM1/ZM8PyRLvzA 8VR0irb0Wub6W5olgpvgfhPIyOdog158n8tJNMiQB5d1CizHQ5BhgVDJtkNBL0tCYbRku+xAcbN mP8GeIh0tqGCc1vgJfbyvZaQL+2eyHbS/iYKtnIpMMvIG7vpvSHq482H8YXrLM4tFQpaeNFiLLu 23zeDvnnBBh9/rkNtpu0yMQOkdDn9S5wlilbJD4YX5MhzN8kIgi8+al+eZekzcPxenW36ewuyn5 Es6vS3wHx/U4ytmXvT+ekS01wKur9MKmfaPw/NPMuZS X-Received: by 2002:a05:6000:2903:b0:43d:4b00:9ee7 with SMTP id ffacd0b85a97d-4464b1b8722mr14754743f8f.33.1777476423217; Wed, 29 Apr 2026 08:27:03 -0700 (PDT) Received: from 127.0.0.1localhost ([82.132.184.31]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-447b76e5c22sm6382951f8f.28.2026.04.29.08.26.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2026 08:27:02 -0700 (PDT) From: Pavel Begunkov To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Cc: asml.silence@gmail.com, Nitesh Shetty , Kanchan Joshi , Anuj Gupta , Tushar Gohad , William Power , Phil Cayton , Jason Gunthorpe Subject: [PATCH v3 09/10] io_uring/rsrc: extend buffer update Date: Wed, 29 Apr 2026 16:25:55 +0100 Message-ID: X-Mailer: git-send-email 2.53.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We need to pass more information to buffer registration than we can fit into a single struct iovec. This patch allows users to optionally pass struct io_uring_regbuf_desc. Apart from having more space for future use cases, it also introduces registration types. 
Currently, the type can be either IO_REGBUF_TYPE_UADDR, which mirrors the iovec path, or IO_REGBUF_TYPE_EMPTY for leaving a buffer table slot empty. The next patch introduces a dmabuf backed type, and types can also be useful for other extensions such as splicing a list of user addresses (i.e. iovec[]), interoperability with zcrx, or kernel allocated memory as was brought up by Christoph. Note, the type only represents a registration option, which is distinct from how io_uring internally stores it. The flags field is not used yet but is always useful to have, e.g. we can encode read-only / write-only restrictions using it. Signed-off-by: Pavel Begunkov --- include/uapi/linux/io_uring.h | 27 +++++++++++++- io_uring/rsrc.c | 69 ++++++++++++++++++++++------------- 2 files changed, 69 insertions(+), 27 deletions(-) diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 17ac1b785440..05c3fd078767 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -790,13 +790,38 @@ struct io_uring_rsrc_update { =20 struct io_uring_rsrc_update2 { __u32 offset; - __u32 resv; + __u32 flags; __aligned_u64 data; __aligned_u64 tags; __u32 nr; __u32 resv2; }; =20 +/* struct io_uring_rsrc_update2::flags */ +enum io_uring_rsrc_reg_flags { + /* + * Use the extended descriptor format for buffer updates, + * see struct io_uring_regbuf_desc + */ + IORING_RSRC_UPDATE_EXTENDED =3D 1U << 1, +}; + +/* Buffer registration type, passed in struct io_uring_regbuf_desc::type */ +enum io_uring_regbuf_type { + IO_REGBUF_TYPE_EMPTY, + IO_REGBUF_TYPE_UADDR, + + __IO_REGBUF_TYPE_MAX, +}; + +struct io_uring_regbuf_desc { + __u32 type; /* enum io_uring_regbuf_type */ + __u32 flags; + __u64 size; + __u64 uaddr; + __u64 __resv[7]; +}; + /* Skip updating fd indexes set to this value in the fd table */ #define IORING_REGISTER_FILES_SKIP (-2) =20 diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index ba00238941ed..f8696b01cb54 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ 
-27,11 +27,6 @@ struct io_rsrc_update { u32 offset; }; =20 -struct io_uring_regbuf_desc { - __u64 uaddr; - __u64 size; -}; - static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, struct io_uring_regbuf_desc *desc, struct page **last_hpage); @@ -46,9 +41,12 @@ static void io_iov_to_regbuf_desc(const struct iovec *io= v, struct io_uring_regbuf_desc *desc) { *desc =3D (struct io_uring_regbuf_desc) { + .type =3D IO_REGBUF_TYPE_UADDR, .uaddr =3D (u64)iov->iov_base, .size =3D iov->iov_len, }; + if (!desc->uaddr) + desc->type =3D IO_REGBUF_TYPE_EMPTY; } =20 int __io_account_mem(struct user_struct *user, unsigned long nr_pages) @@ -236,6 +234,8 @@ static int __io_sqe_files_update(struct io_ring_ctx *ct= x, return -ENXIO; if (up->offset + nr_args > ctx->file_table.data.nr) return -EINVAL; + if (up->flags) + return -EINVAL; =20 for (done =3D 0; done < nr_args; done++) { u64 tag =3D 0; @@ -292,10 +292,9 @@ static int __io_sqe_buffers_update(struct io_ring_ctx = *ctx, struct io_uring_rsrc_update2 *up, unsigned int nr_args) { + bool extended =3D up->flags & IORING_RSRC_UPDATE_EXTENDED; u64 __user *tags =3D u64_to_user_ptr(up->tags); - struct iovec fast_iov, *iov; struct page *last_hpage =3D NULL; - struct iovec __user *uvec; u64 user_data =3D up->data; __u32 done; int i, err; @@ -304,29 +303,49 @@ static int __io_sqe_buffers_update(struct io_ring_ctx= *ctx, return -ENXIO; if (up->offset + nr_args > ctx->buf_table.nr) return -EINVAL; + if (up->flags & ~IORING_RSRC_UPDATE_EXTENDED) + return -EINVAL; =20 for (done =3D 0; done < nr_args; done++) { struct io_uring_regbuf_desc desc; struct io_rsrc_node *node; u64 tag =3D 0; =20 - uvec =3D u64_to_user_ptr(user_data); - iov =3D iovec_from_user(uvec, 1, 1, &fast_iov, io_is_compat(ctx)); - if (IS_ERR(iov)) { - err =3D PTR_ERR(iov); - break; - } if (tags && copy_from_user(&tag, &tags[done], sizeof(tag))) { err =3D -EFAULT; break; } =20 - io_iov_to_regbuf_desc(iov, &desc); + if (extended) { + if (copy_from_user(&desc, 
u64_to_user_ptr(user_data), + sizeof(desc))) { + err =3D -EFAULT; + break; + } + user_data +=3D sizeof(desc); + } else { + struct iovec __user *uvec =3D u64_to_user_ptr(user_data); + struct iovec fast_iov, *iov; + + if (io_is_compat(ctx)) + user_data +=3D sizeof(struct compat_iovec); + else + user_data +=3D sizeof(struct iovec); + + iov =3D iovec_from_user(uvec, 1, 1, &fast_iov, io_is_compat(ctx)); + if (IS_ERR(iov)) { + err =3D PTR_ERR(iov); + break; + } + io_iov_to_regbuf_desc(iov, &desc); + } + node =3D io_sqe_buffer_register(ctx, &desc, &last_hpage); if (IS_ERR(node)) { err =3D PTR_ERR(node); break; } + if (tag) { if (!node) { err =3D -EINVAL; @@ -337,10 +356,6 @@ static int __io_sqe_buffers_update(struct io_ring_ctx = *ctx, i =3D array_index_nospec(up->offset + done, ctx->buf_table.nr); io_reset_rsrc_node(ctx, &ctx->buf_table, i); ctx->buf_table.nodes[i] =3D node; - if (io_is_compat(ctx)) - user_data +=3D sizeof(struct compat_iovec); - else - user_data +=3D sizeof(struct iovec); } return done ? 
done : err; } @@ -375,7 +390,7 @@ int io_register_files_update(struct io_ring_ctx *ctx, v= oid __user *arg, memset(&up, 0, sizeof(up)); if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update))) return -EFAULT; - if (up.resv || up.resv2) + if (up.resv2) return -EINVAL; return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args); } @@ -389,7 +404,7 @@ int io_register_rsrc_update(struct io_ring_ctx *ctx, vo= id __user *arg, return -EINVAL; if (copy_from_user(&up, arg, sizeof(up))) return -EFAULT; - if (!up.nr || up.resv || up.resv2) + if (!up.nr || up.resv2) return -EINVAL; return __io_register_rsrc_update(ctx, type, &up, up.nr); } @@ -489,12 +504,9 @@ int io_files_update(struct io_kiocb *req, unsigned int= issue_flags) struct io_uring_rsrc_update2 up2; int ret; =20 + memset(&up2, 0, sizeof(up2)); up2.offset =3D up->offset; up2.data =3D up->arg; - up2.nr =3D 0; - up2.tags =3D 0; - up2.resv =3D 0; - up2.resv2 =3D 0; =20 if (up->offset =3D=3D IORING_FILE_INDEX_ALLOC) { ret =3D io_files_update_with_index_alloc(req, issue_flags); @@ -791,8 +803,13 @@ static struct io_rsrc_node *io_sqe_buffer_register(str= uct io_ring_ctx *ctx, struct io_imu_folio_data data; bool coalesced =3D false; =20 - if (!uaddr) { - if (size) + if (desc->type >=3D __IO_REGBUF_TYPE_MAX) + return ERR_PTR(-EINVAL); + if (!mem_is_zero(&desc->__resv, sizeof(desc->__resv))) + return ERR_PTR(-EINVAL); + + if (desc->type =3D=3D IO_REGBUF_TYPE_EMPTY) { + if (uaddr || size) return ERR_PTR(-EFAULT); /* remove the buffer without installing a new one */ return NULL; --=20 2.53.0 From nobody Tue Jun 16 19:24:14 2026 Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E34939BFF4 for ; Wed, 29 Apr 2026 15:27:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.44 
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476432; cv=none; b=A1JvYGEPo7jNud2A9ao+NuS3vLDl6lG9SpMRcJ69ZoMG4KjiA0sLuCX8iby11hIdvf4t7RXptf/r+zAx9umj+gPd4n9FsBXt1wX7FYiMVkkqUWn4MSsp80vrcsGaQUv49IJz857C1CHkM9yaS5ybC/xqgrkh/ue14tLHkJddjPQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777476432; c=relaxed/simple; bh=zMYEBj1l+Q3AjQaXQiQWvzV0eGjk8vXy2arrpbw7GaY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FZydhBhJXevZuhRbnjPWsMivPiJ5Bo97L6D8hvfD6UEASSXcXoUuHI610YFIFc/gnMIUCGeu6H7T0AtVq1mkEgLY1F7cY749BRPp5KeJuL/c614UpTKvteyzbGMjvQ0+c1tfmqRzIt77exvFOrvah1WslEwN8FU7MmaiwMK4PuQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Nzabc46q; arc=none smtp.client-ip=209.85.221.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Nzabc46q" Received: by mail-wr1-f44.google.com with SMTP id ffacd0b85a97d-43cfbd17589so9917108f8f.0 for ; Wed, 29 Apr 2026 08:27:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1777476428; x=1778081228; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=kBiinN/v9OplpViyP3kAACFhmG1eMp4XWQQRH+zMTH4=; b=Nzabc46qUmDEFLPq7hkiTm3kQb24Hmt3v/h/DssUqIl/zJyavbQzSwU/8DqnwG/Cd6 V0TKaB31eSUgAtVVwsVo91TfoU9FVXgtyqQNhFrokjkkD3iO4AkyeTdq5yJvgWh8Lpd2 a87CC1kIsPpE08KRqPrJml0zPdQ+v1Q7bDyWyf4Fkmj6P7Hjj6RNSbFy30hgmWmoLACc xOYFO9cw20zzpyOPRxYgqXAlyICRwriTjyWFN+xyLIYXraDtqq6YH/qlNNJjSHy6WH38 
n0ZBorn6dpfmlVI3QUDIjvGp2c5z3XPj+EjUvl4rf2Z9xey5N2m8EuLpGv3VoRI8QoPf VR4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1777476428; x=1778081228; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=kBiinN/v9OplpViyP3kAACFhmG1eMp4XWQQRH+zMTH4=; b=qGZx3lIyVZbxZ+AiXe5eK3lc0dlqhw01AeHVIhZHeqMvMCEaMf32iT1gOCCop4GlVP j3LUWRQ/iqODDQvglPlzieuDgTuaryScKH4eknow2xqIFHOlK9RVlwe9QYH+DSjyQ0lD 8ZLpGWxYmAj0/gzFVBAXoDyE3ME5iRSRc1ePCk8XWIvC+zsWbMYqvvxx3yLtuDAgToEU cOZar0u6FydlAYTcAXYTl/AHhukfZ/bm1p76RHN19lbd8BD+8IRtyCxU3/qxt2W6KgUW jnOCGnEtHVYvZOcKUD+6B/GqhR4IMsOVOBtWJC72N1BFt6Gtd2SeWOdEiWNpIrBZU/x1 lLZw== X-Forwarded-Encrypted: i=1; AFNElJ8sCXSxSljPIwDNQMsi9QhAEiiyc/2UT+UBb7Wh40mGjwsiIhQYpaWGbFKRHByWAnhWXsMUcdm2rTRQzfs=@vger.kernel.org X-Gm-Message-State: AOJu0YwMyxHPmIcvK1A6hl+IRP2ybKbX18j7NkICR0W0e0GF4aiNSDN+ NXHKg3IK4JUsIEcoe+tR+d5Ar9/pYQH8tkRMtLlqcfxArXRrk5ExF7HP X-Gm-Gg: AeBDiesNMXruNeSbmj91nxPx5yeeL9bCUWFCtmNeZxjqWCZ7+SjSKtUoyaVNiP+4q/Z PJaK2nLa1unVFphT6OOLegZkeg0AgzJ7PkGB4daGU0iHVgrZTxe7pI8kqMMR1RaIXuCkNwY+BlL N1hPBwKWI/s+ueW3U8fl/Dh05iHvA2b7MABecBGTM7oD5XwbHHsL0iCWz3UkYAYDlQz4rvzonjn KKUZNYWdEajmvFKG/cFpeacWBSAhrFJ4PvEkdQcvakXhsLfdtU7AOvvsk1jibZReQSv1kgB8/cV rVAxmReIlRyIhDbPWKEvI8gCDuZ9PmlfkrnQ2KcJpv+PfReDGDnJIBvTZ8G6Au+ooQFinAqjkfa BiAVOTEMQdxqWeifSiTW1eLhjbb0oXlRcS7fPPKbH5pc+NX+OFhaPXyahH7O7ex1kdnH+6NhKQL RXqmZ0wk4t7U8JOVZGhRA2KXw2GqzRwbf5PtNmLLMq2mTQ0A3Cz9t45kY9+RBhOLynLx2h63lxP wgt/ydSyC9UstQTJofm6lV3TxOkE2VsalFpw8Jd9ZoP X-Received: by 2002:a05:6000:2f85:b0:43b:8f38:3b88 with SMTP id ffacd0b85a97d-446494ea255mr14984759f8f.25.1777476427532; Wed, 29 Apr 2026 08:27:07 -0700 (PDT) Received: from 127.0.0.1localhost ([82.132.184.31]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-447b76e5c22sm6382951f8f.28.2026.04.29.08.27.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Apr 2026 
08:27:06 -0700 (PDT)
From: Pavel Begunkov
To: Jens Axboe, Keith Busch, Christoph Hellwig, Sagi Grimberg,
 Alexander Viro, Christian Brauner, Andrew Morton, Sumit Semwal,
 Christian König, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
 linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org,
 linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linaro-mm-sig@lists.linaro.org
Cc: asml.silence@gmail.com, Nitesh Shetty, Kanchan Joshi, Anuj Gupta,
 Tushar Gohad, William Power, Phil Cayton, Jason Gunthorpe
Subject: [PATCH v3 10/10] io_uring/rsrc: add dmabuf backed registered buffers
Date: Wed, 29 Apr 2026 16:25:56 +0100
Message-ID: <0040156480814237fc099878756fa0fb079e14d2.1777475843.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.53.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Implement dmabuf backed registered buffers. To register one, the user
should specify IO_REGBUF_TYPE_DMABUF for the registration and pass the
desired dmabuf fd together with the file it should be registered for.
From there, it can be used with io_uring read/write requests
(IORING_OP_{READ,WRITE}_FIXED) as normal. The requests must be issued
against the file specified during registration, otherwise they will
fail. The user should also be prepared to handle spurious -EAGAIN by
reissuing the request.

Internally, dmabuf registered buffers are an opt-in feature for
io_uring request opcodes: an opcode has to pass a special flag on
import to use them.

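As a rough userspace illustration of the description above, the sketch below fills a descriptor for a dmabuf registration and reissues a request on -EAGAIN. It is not part of the patch. The struct is a local mirror: the members from flags onward follow the uapi hunk in this patch, while the leading type member and its 32-bit width are an assumption, since its declaration is outside the quoted context. All sketch_* names are hypothetical, and how the descriptor is handed to the kernel is not shown in this patch, so that step is left out.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirror of the descriptor, for illustration only. Members from
 * flags onward follow the uapi hunk; the leading `type` member and its
 * width are an ASSUMPTION (the kernel reads desc->type, but the
 * declaration is not part of the quoted diff context).
 */
struct regbuf_desc_sketch {
	uint32_t type;		/* assumed: first member, 32 bits */
	uint32_t flags;
	uint64_t size;
	uint64_t uaddr;
	int32_t  dmabuf_fd;
	int32_t  target_fd;
	uint64_t resv[6];
};

/* Enum order as in the patch: EMPTY, UADDR, DMABUF. */
enum { SKETCH_TYPE_EMPTY, SKETCH_TYPE_UADDR, SKETCH_TYPE_DMABUF };

/* Fill a descriptor for a dmabuf registration bound to target_fd. */
static void sketch_fill_dmabuf_desc(struct regbuf_desc_sketch *d,
				    int dmabuf_fd, int target_fd)
{
	memset(d, 0, sizeof(*d));	/* uaddr, size and resv stay zero */
	d->type = SKETCH_TYPE_DMABUF;
	d->dmabuf_fd = dmabuf_fd;
	d->target_fd = target_fd;
}

/* Hypothetical callback: issue one request, return >= 0 or -errno. */
typedef int (*sketch_issue_fn)(void *arg);

/* Reissue while the result is -EAGAIN, for at most max_tries attempts. */
static int sketch_issue_retrying(sketch_issue_fn issue, void *arg,
				 int max_tries)
{
	int ret = -EAGAIN;

	while (max_tries-- > 0) {
		ret = issue(arg);
		if (ret != -EAGAIN)
			break;
	}
	return ret;
}

/* Demo callback: fails with -EAGAIN twice, then reports 4096 bytes. */
static int sketch_flaky_issue(void *arg)
{
	int *calls = arg;

	return ++(*calls) <= 2 ? -EAGAIN : 4096;
}
```

Under the assumed layout, the two 32-bit fds take the place of one reserved 64-bit word, so the mirror stays at 80 bytes. A bounded retry count is a choice made for the sketch; the patch only says the request should be reissued.
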
Suggested-by: David Wei
Suggested-by: Vishal Verma
Suggested-by: Tushar Gohad
Signed-off-by: Pavel Begunkov
---
 include/linux/io_uring_types.h |   5 +
 include/uapi/linux/io_uring.h  |   6 +-
 io_uring/io_uring.c            |   3 +-
 io_uring/rsrc.c                | 163 +++++++++++++++++++++++++++++++--
 io_uring/rsrc.h                |  30 +++++-
 io_uring/rw.c                  |   4 +-
 6 files changed, 200 insertions(+), 11 deletions(-)

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 7aee83e5ea0e..f9a33099421a 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -10,6 +10,7 @@
 
 struct iou_loop_params;
 struct io_uring_bpf_ops;
+struct io_dmabuf_map;
 
 enum {
 	/*
@@ -567,6 +568,7 @@ enum {
 	REQ_F_IMPORT_BUFFER_BIT,
 	REQ_F_SQE_COPIED_BIT,
 	REQ_F_IOPOLL_BIT,
+	REQ_F_DROP_DMABUF_BIT,
 
 	/* not a real bit, just to check we're not overflowing the space */
 	__REQ_F_LAST_BIT,
@@ -662,6 +664,8 @@ enum {
 	REQ_F_SQE_COPIED	= IO_REQ_FLAG(REQ_F_SQE_COPIED_BIT),
 	/* request must be iopolled to completion (set in ->issue()) */
 	REQ_F_IOPOLL		= IO_REQ_FLAG(REQ_F_IOPOLL_BIT),
+	/* there is a dma map attached to request that needs to be dropped */
+	REQ_F_DROP_DMABUF	= IO_REQ_FLAG(REQ_F_DROP_DMABUF_BIT),
 };
 
 struct io_tw_req {
@@ -786,6 +790,7 @@ struct io_kiocb {
 	/* custom credentials, valid IFF REQ_F_CREDS is set */
 	const struct cred		*creds;
 	struct io_wq_work		work;
+	struct io_dmabuf_map		*dmabuf_map;
 
 	struct io_big_cqe {
 		u64			extra1;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 05c3fd078767..3cd6ce28f9f5 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -810,6 +810,7 @@ enum io_uring_rsrc_reg_flags {
 enum io_uring_regbuf_type {
 	IO_REGBUF_TYPE_EMPTY,
 	IO_REGBUF_TYPE_UADDR,
+	IO_REGBUF_TYPE_DMABUF,
 
 	__IO_REGBUF_TYPE_MAX,
 };
@@ -819,7 +820,10 @@ struct io_uring_regbuf_desc {
 	__u32 flags;
 	__u64 size;
 	__u64 uaddr;
-	__u64 __resv[7];
+
+	__s32 dmabuf_fd;
+	__s32 target_fd;
+	__u64 __resv[6];
 };
 
 /* Skip updating fd indexes set to this value in the fd table */
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 6068448a5aaa..e8a8eef45c3f 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -108,7 +108,7 @@
 
 #define IO_REQ_CLEAN_SLOW_FLAGS (REQ_F_REFCOUNT | IO_REQ_LINK_FLAGS | \
 				 REQ_F_REISSUE | REQ_F_POLLED | \
-				 IO_REQ_CLEAN_FLAGS)
+				 IO_REQ_CLEAN_FLAGS | REQ_F_DROP_DMABUF)
 
 #define IO_TCTX_REFS_CACHE_NR	(1U << 10)
 
@@ -1115,6 +1115,7 @@ static void io_free_batch_list(struct io_ring_ctx *ctx,
 				io_queue_next(req);
 			if (unlikely(req->flags & IO_REQ_CLEAN_FLAGS))
 				io_clean_op(req);
+			io_req_drop_dmabuf(req);
 		}
 		io_put_file(req);
 		io_req_put_rsrc_nodes(req);
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index f8696b01cb54..bb61de308543 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -789,6 +790,93 @@ bool io_check_coalesce_buffer(struct page **page_array, int nr_pages,
 	return true;
 }
 
+struct io_regbuf_dma {
+	struct io_dmabuf_token		token;
+	struct file			*target_file;
+};
+
+static void io_release_reg_dmabuf(void *priv)
+{
+	struct io_regbuf_dma *db = priv;
+
+	fput(db->target_file);
+	io_dmabuf_token_release(&db->token);
+}
+
+static struct io_rsrc_node *io_register_dmabuf(struct io_ring_ctx *ctx,
+					       struct io_uring_regbuf_desc *desc)
+{
+	struct io_rsrc_node *node = NULL;
+	struct io_mapped_ubuf *imu = NULL;
+	struct io_regbuf_dma *regbuf = NULL;
+	struct file *target_file = NULL;
+	struct dma_buf *dmabuf = NULL;
+	int ret;
+
+	if (!IS_ENABLED(CONFIG_DMABUF_TOKEN))
+		return ERR_PTR(-EOPNOTSUPP);
+	if (desc->uaddr || desc->size)
+		return ERR_PTR(-EINVAL);
+
+	ret = -ENOMEM;
+	node = io_rsrc_node_alloc(ctx, IORING_RSRC_BUFFER);
+	if (!node)
+		return ERR_PTR(-ENOMEM);
+	imu = io_alloc_imu(ctx, 0);
+	if (!imu)
+		goto err;
+	regbuf = kzalloc(sizeof(*regbuf), GFP_KERNEL);
+	if (!regbuf)
+		goto err;
+
+	ret = -EBADF;
+	target_file = fget(desc->target_fd);
+	if (!target_file)
+		goto err;
+
+	dmabuf = dma_buf_get(desc->dmabuf_fd);
+	if (IS_ERR(dmabuf)) {
+		ret = PTR_ERR(dmabuf);
+		dmabuf = NULL;
+		goto err;
+	}
+	if (dmabuf->size > SZ_1G) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	ret = io_dmabuf_token_create(target_file, &regbuf->token, dmabuf,
+				     DMA_BIDIRECTIONAL);
+	if (ret)
+		goto err;
+
+	regbuf->target_file = target_file;
+	imu->nr_bvecs = 1;
+	imu->ubuf = 0;
+	imu->len = dmabuf->size;
+	imu->folio_shift = 0;
+	imu->release = io_release_reg_dmabuf;
+	imu->priv = regbuf;
+	imu->flags = IO_REGBUF_F_DMABUF;
+	imu->dir = IO_BUF_DEST | IO_BUF_SOURCE;
+	refcount_set(&imu->refs, 1);
+	node->buf = imu;
+	dma_buf_put(dmabuf);
+	return node;
+err:
+	kfree(regbuf);
+	if (imu)
+		io_free_imu(ctx, imu);
+	if (node)
+		io_cache_free(&ctx->node_cache, node);
+	if (target_file)
+		fput(target_file);
+	if (dmabuf)
+		dma_buf_put(dmabuf);
+	return ERR_PTR(ret);
+}
+
+
 static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
 						   struct io_uring_regbuf_desc *desc,
 						   struct page **last_hpage)
@@ -808,6 +896,12 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx,
 	if (!mem_is_zero(&desc->__resv, sizeof(desc->__resv)))
 		return ERR_PTR(-EINVAL);
 
+	if (desc->type == IO_REGBUF_TYPE_DMABUF)
+		return io_register_dmabuf(ctx, desc);
+
+	if (desc->dmabuf_fd || desc->target_fd)
+		return ERR_PTR(-EINVAL);
+
 	if (desc->type == IO_REGBUF_TYPE_EMPTY) {
 		if (uaddr || size)
 			return ERR_PTR(-EFAULT);
@@ -1134,9 +1228,57 @@ static int io_import_kbuf(int ddir, struct iov_iter *iter,
 	return 0;
 }
 
-static int io_import_fixed(int ddir, struct iov_iter *iter,
+void io_drop_dmabuf_node(struct io_kiocb *req)
+{
+	struct io_mapped_ubuf *imu;
+
+	if (!IS_ENABLED(CONFIG_DMABUF_TOKEN))
+		return;
+	if (WARN_ON_ONCE(req->buf_node->type != IORING_RSRC_BUFFER))
+		return;
+	imu = req->buf_node->buf;
+	if (WARN_ON_ONCE(!(imu->flags & IO_REGBUF_F_DMABUF)))
+		return;
+	io_dmabuf_map_drop(req->dmabuf_map);
+}
+
+static int io_import_dmabuf(struct io_kiocb *req,
+			   int ddir, struct iov_iter *iter,
 			   struct io_mapped_ubuf *imu,
-			   u64 buf_addr, size_t len)
+			   size_t len, size_t offset,
+			   unsigned issue_flags)
+{
+	struct io_regbuf_dma *db = imu->priv;
+	struct io_dmabuf_map *map;
+
+	if (!IS_ENABLED(CONFIG_DMABUF_TOKEN))
+		return -EOPNOTSUPP;
+	if (!len)
+		return -EFAULT;
+	if (req->file != db->target_file)
+		return -EBADF;
+
+	map = io_dmabuf_get_map(&db->token);
+	if (unlikely(!map)) {
+		if (!(issue_flags & IO_URING_F_UNLOCKED))
+			return -EAGAIN;
+		map = io_dmabuf_create_map(&db->token);
+		if (IS_ERR(map))
+			return PTR_ERR(map);
+	}
+
+	req->dmabuf_map = map;
+	req->flags |= REQ_F_DROP_DMABUF;
+	iov_iter_dmabuf_map(iter, ddir, map, offset, len);
+	return 0;
+}
+
+static int io_import_fixed(struct io_kiocb *req,
+			   int ddir, struct iov_iter *iter,
+			   struct io_mapped_ubuf *imu,
+			   u64 buf_addr, size_t len,
+			   unsigned issue_flags,
+			   unsigned import_flags)
 {
 	const struct bio_vec *bvec;
 	size_t folio_mask;
@@ -1156,6 +1298,12 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
 
 	offset = buf_addr - imu->ubuf;
 
+	if (imu->flags & IO_REGBUF_F_DMABUF) {
+		if (!(import_flags & IO_REGBUF_IMPORT_ALLOW_DMABUF))
+			return -EFAULT;
+		return io_import_dmabuf(req, ddir, iter, imu, len, offset,
+					issue_flags);
+	}
 	if (imu->flags & IO_REGBUF_F_KBUF)
 		return io_import_kbuf(ddir, iter, imu, len, offset);
 
@@ -1209,16 +1357,17 @@ inline struct io_rsrc_node *io_find_buf_node(struct io_kiocb *req,
 	return NULL;
 }
 
-int io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter,
+int __io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter,
 			u64 buf_addr, size_t len, int ddir,
-			unsigned issue_flags)
+			unsigned issue_flags, unsigned import_flags)
 {
 	struct io_rsrc_node *node;
 
 	node = io_find_buf_node(req, issue_flags);
 	if (!node)
 		return -EFAULT;
-	return io_import_fixed(ddir, iter, node->buf, buf_addr, len);
+	return io_import_fixed(req, ddir, iter, node->buf, buf_addr, len,
+				issue_flags, import_flags);
 }
 
 /* Lock two rings at once. The rings must be different! */
@@ -1577,7 +1726,9 @@ int io_import_reg_vec(int ddir, struct iov_iter *iter,
 	iovec_off = vec->nr - nr_iovs;
 	iov = vec->iovec + iovec_off;
 
-	if (imu->flags & IO_REGBUF_F_KBUF) {
+	if (imu->flags & IO_REGBUF_F_DMABUF) {
+		return -EOPNOTSUPP;
+	} else if (imu->flags & IO_REGBUF_F_KBUF) {
 		int ret = io_kern_bvec_size(iov, nr_iovs, imu, &nr_segs);
 
 		if (unlikely(ret))
diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
index 8d48195faf9d..005a273ba107 100644
--- a/io_uring/rsrc.h
+++ b/io_uring/rsrc.h
@@ -25,6 +25,11 @@ struct io_rsrc_node {
 
 enum {
 	IO_REGBUF_F_KBUF		= 1,
+	IO_REGBUF_F_DMABUF		= 2,
+};
+
+enum {
+	IO_REGBUF_IMPORT_ALLOW_DMABUF	= 1,
 };
 
 struct io_mapped_ubuf {
@@ -60,9 +65,19 @@ int io_rsrc_data_alloc(struct io_rsrc_data *data, unsigned nr);
 
 struct io_rsrc_node *io_find_buf_node(struct io_kiocb *req,
 				      unsigned issue_flags);
+int __io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter,
+			u64 buf_addr, size_t len, int ddir,
+			unsigned issue_flags, unsigned import_flags);
+
+static inline
 int io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter,
 			u64 buf_addr, size_t len, int ddir,
-			unsigned issue_flags);
+			unsigned issue_flags)
+{
+	return __io_import_reg_buf(req, iter, buf_addr, len, ddir,
+				   issue_flags, 0);
+}
+
 int io_import_reg_vec(int ddir, struct iov_iter *iter,
 			struct io_kiocb *req, struct iou_vec *vec,
 			unsigned nr_iovs, unsigned issue_flags);
@@ -147,4 +162,17 @@ static inline void io_alloc_cache_vec_kasan(struct iou_vec *iv)
 		io_vec_free(iv);
 }
 
+void io_drop_dmabuf_node(struct io_kiocb *req);
+
+static inline void io_req_drop_dmabuf(struct io_kiocb *req)
+{
+	if (!IS_ENABLED(CONFIG_DMABUF_TOKEN))
+		return;
+	if (!(req->flags & REQ_F_DROP_DMABUF))
+		return;
+	if (WARN_ON_ONCE(!(req->flags & REQ_F_BUF_NODE)))
+		return;
+	io_drop_dmabuf_node(req);
+}
+
 #endif
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 20654deff84d..d50da5fa8bb9 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -380,8 +380,8 @@ static int io_init_rw_fixed(struct io_kiocb *req, unsigned int issue_flags,
 	if (io->bytes_done)
 		return 0;
 
-	ret = io_import_reg_buf(req, &io->iter, rw->addr, rw->len, ddir,
-				issue_flags);
+	ret = __io_import_reg_buf(req, &io->iter, rw->addr, rw->len, ddir,
+				issue_flags, IO_REGBUF_IMPORT_ALLOW_DMABUF);
 	iov_iter_save_state(&io->iter, &io->iter_state);
 	return ret;
 }
-- 
2.53.0
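
For readers following the registration path, the order of the descriptor checks in io_sqe_buffer_register() and at the top of io_register_dmabuf() can be modelled in userspace roughly as below. This is a sketch and not kernel code. The struct is a local mirror whose leading type member is an assumption, the model_* names are hypothetical, CONFIG_DMABUF_TOKEN is assumed to be enabled, and later failures (a bad target fd, a bad dmabuf fd, a dmabuf larger than SZ_1G) are not modelled.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror for illustration; the `type` member is an ASSUMPTION. */
struct regbuf_desc_model {
	uint32_t type;
	uint32_t flags;
	uint64_t size;
	uint64_t uaddr;
	int32_t  dmabuf_fd;
	int32_t  target_fd;
	uint64_t resv[6];
};

/* Enum order as in the patch, ending with the exclusive upper bound. */
enum {
	MODEL_TYPE_EMPTY,
	MODEL_TYPE_UADDR,
	MODEL_TYPE_DMABUF,
	MODEL_TYPE_MAX,
};

/*
 * Return 0 if the descriptor passes the up-front checks, else a negative
 * errno, testing the fields in the same order as the patched kernel code.
 */
static int model_check_desc(const struct regbuf_desc_model *d)
{
	static const uint64_t zero[6];

	if (d->type >= MODEL_TYPE_MAX)
		return -EINVAL;
	if (memcmp(d->resv, zero, sizeof(zero)))
		return -EINVAL;		/* reserved words must be zero */

	if (d->type == MODEL_TYPE_DMABUF) {
		/* io_register_dmabuf(): no user address or size allowed */
		if (d->uaddr || d->size)
			return -EINVAL;
		return 0;
	}

	/* the fd fields are only meaningful for dmabuf registrations */
	if (d->dmabuf_fd || d->target_fd)
		return -EINVAL;

	if (d->type == MODEL_TYPE_EMPTY && (d->uaddr || d->size))
		return -EFAULT;
	return 0;
}
```

One consequence visible in the model: for the non-dmabuf types both fd fields have to be left at zero, so callers that zero-initialise the descriptor are unaffected.
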