From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wm1-f51.google.com (mail-wm1-f51.google.com [209.85.128.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2C87B279DAD for ; Sun, 23 Nov 2025 22:51:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938305; cv=none; b=OjV6ri4ntnr6oTDxtry6WO8a/2RYZsCMH9J72NRU2cnZI6cD4lA4O/IIP7pOzBb1nTAvJfpUwr/hjN5n+IJMnXQohDWRp9JxvP9oIWqaJSalAysuc4vi9Z1iCLnmISx6Qx1eqhkXT/rMtFUCKZI7R+sYKO7rjpELxKr/tjxc4Rg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938305; c=relaxed/simple; bh=TaA27OxA1hBFyRB5sQEZB6XkCgdFpDueoKtHtp+vFGM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GjBZDDw24ZL5cXvQlrGzEdZfA+bVSFZkJstQyg/lPX2fQkBZlEdI752gwa6KqEBwBVeGcZwGGJmeiDUHQmwufYHqT44QB8+y9DyNn+5oE7VOyxQ3VoBzdN+rTh6kbqPKae51aw8UaC6QAKewUya2KmeU5pki/E8n9Z2FZickLQY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ftiNcTx1; arc=none smtp.client-ip=209.85.128.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ftiNcTx1" Received: by mail-wm1-f51.google.com with SMTP id 5b1f17b1804b1-4779d47be12so29010575e9.2 for ; Sun, 23 Nov 2025 14:51:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938301; x=1764543101; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PR8DhyhbmfM3XR45v/DX3VtwAUQ7pza2V3aAfSC4t8o=; b=ftiNcTx13YdP4WuIFjBYEQY9qh3Zw5EizPLOwUI0Oo29LmHJH65J9m5pF9tilsgK4R Rahz3xxfxLMMQwK14JXW532fLPTYPEwkK3ER1iCLC6qfx/26ECwO0Rv2iU494tZ7Nj+n kNNDxSdI9r705Qgcb7Msyeau1ATIy0ChwWxeIzgVQRW8cVAikB/9IKpzTRZmBmDB6vW6 fkxIN2vlVHGp7KWT88WoGGAU7COpoAKtaSOaDVh5PRcaok/AAmVbDQqger9TwVxKz6/X Xn6IB6EtTFeqJLZ6w5szBwq5njjyzULzOsqBp2lw4WThiqTSW1xjDSUFxEnTbVJOKv58 oC2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938301; x=1764543101; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=PR8DhyhbmfM3XR45v/DX3VtwAUQ7pza2V3aAfSC4t8o=; b=r7ELr6LOkkYlL86e9S8hD8ZHA3/tGLsMTyvIMVavJQPOD2Rq6NEsKVKUxlboFkWocP 2IzPgUoDzWwJuIn0fCnpq3c3ECug9YPB3uF6+hdINQ3GO6aPMpOd/yP2Hrx9DepbKaN3 J8avsFrOF2OjYa4erTkR7iJpMTsNkjEOY2DP7rElM8FCrJE/89qqE2nyXTE8MsPReM1s +984zdDOIB99GWtPCzlG/m2w4j3aXUq8lrXWq7csZ9GspLY6EXhRz0/N8TekSvma6l1j NUzfGgxIZI3R14clLeMSF7IJXdKymi4utzqIRp+GF/nXDvR4lIpglmwjTwsGS+x+IhSc h47A== X-Forwarded-Encrypted: i=1; AJvYcCV0nH2CPbYQBeqVe2EWKqv3t/FuLIYXKhp1hO0JJFZJ3+o93o1gnsdeQSTggnGHuMHyU7p+9lQuA2S83Uk=@vger.kernel.org X-Gm-Message-State: AOJu0YzoY6yD+0eAL7pnamWyfy6j7p0+IAVMm8Wf1U2Sajhq9yVfBQVL y9lTdyk6y4vsBusVUGF6mhC9wEL0xERHjwD9/rn5sQ+LntCljTc41mD4 X-Gm-Gg: ASbGncvczJnTP4FDW6QD9tkq8OrTnZ1s2uIAIjaJ+XzezmTqfJkcB1idWrIhzfVufsB qA7c/L/kxebXT5W3SfvgqcPwFInYAM+TnpXxbS1KcWN4RuD2kzATOtEP9Kz2/Gya3ZTwuj3ULtV FR49AIpVnnUW3G4NvH6iw34ktQPdtIDduskvgnYya/A4y11+39MgbX6VyZA9w3nLRPIiExnWw9L plJiMPqYuFqbymWeGacMXu2eDGk7M1IqB6tCn3/5p5cg8daPO3qaGeSYN9HkMx7+aSPuwNy7w/8 bK3quub2DapgAilCLHq6rYus0shAa0+6rnT8/YmKNt0hSN//QTmJ9AfPlge5KOa/Q9BauNRKvZ7 aMOI1eVGvFiDPFJAE8ptLKnrQaW73GAWdhg7e1wGad+DCa9MGLwq4SHZ/MwF9kWBX8/L5IzMuiA xkvZGZAaHH+G4sUQ== 
X-Google-Smtp-Source: AGHT+IF2fkOx1psmIiI8t6n1UVP6WxxL86e2rgF4bc5zvWBkXEsJkKhYvtgfzcrreOJ89mrZR3pH7Q==
X-Received: by 2002:a05:6000:40da:b0:42b:47da:c316 with SMTP id ffacd0b85a97d-42cc1cc30c2mr10114899f8f.26.1763938301503; Sun, 23 Nov 2025 14:51:41 -0800 (PST)
Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:40 -0800 (PST)
From: Pavel Begunkov
To: linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: Vishal Verma, tushar.gohad@intel.com, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, Alexander Viro, Christian Brauner,
	Andrew Morton, Sumit Semwal, =?UTF-8?q?Christian=20K=C3=B6nig?=,
	Pavel Begunkov, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org
Subject: [RFC v2 01/11] file: add callback for pre-mapping dmabuf
Date: Sun, 23 Nov 2025 22:51:21 +0000
Message-ID: <74d689540fa200fe37f1a930165357a92fe9e68c.1763725387.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add a file callback that maps a dmabuf for the given file and returns
an opaque token of type struct dma_token representing the mapping. The
implementation details are hidden from the caller, and implementors are
normally expected to extend the structure.

Callers of the callback will be able to pass the token with an IO
request, which is implemented in the following patches as a new
iterator type. The user should release the token once it is no longer
needed by calling the provided release callback via the appropriate
helpers.
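
As an illustration of "implementors are expected to extend the
structure", here is a minimal userspace sketch of the embedding pattern.
It is not part of the series: struct demo_token, demo_dma_map(),
demo_token_release() and demo_tokens_live are hypothetical names, malloc()
stands in for whatever a real driver would allocate, and the sketch returns
NULL on failure where kernel code would return an ERR_PTR().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the struct from the patch: only a release hook is visible. */
struct dma_token {
	void (*release)(struct dma_token *);
};

/* Same helper as in the patch. */
static inline void dma_token_release(struct dma_token *token)
{
	token->release(token);
}

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical implementor-side structure that extends the token. */
struct demo_token {
	struct dma_token base;
	size_t nr_pages;	/* made-up private state */
};

static int demo_tokens_live;	/* number of outstanding demo tokens */

static void demo_token_release(struct dma_token *token)
{
	/* Recover the private structure from the opaque token. */
	struct demo_token *dt = container_of(token, struct demo_token, base);

	demo_tokens_live--;
	free(dt);
}

/* Hypothetical stand-in for a ->dma_map() implementation. */
static struct dma_token *demo_dma_map(size_t nr_pages)
{
	struct demo_token *dt = malloc(sizeof(*dt));

	if (!dt)
		return NULL;	/* kernel code would return ERR_PTR(-ENOMEM) */
	dt->base.release = demo_token_release;
	dt->nr_pages = nr_pages;
	demo_tokens_live++;
	return &dt->base;
}
```

The caller only ever sees struct dma_token and dma_token_release(); the
private fields stay reachable to the implementor through container_of().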

Signed-off-by: Pavel Begunkov
---
 include/linux/dma_token.h | 35 +++++++++++++++++++++++++++++++++++
 include/linux/fs.h        |  4 ++++
 2 files changed, 39 insertions(+)
 create mode 100644 include/linux/dma_token.h

diff --git a/include/linux/dma_token.h b/include/linux/dma_token.h
new file mode 100644
index 000000000000..9194b34282c2
--- /dev/null
+++ b/include/linux/dma_token.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_DMA_TOKEN_H
+#define _LINUX_DMA_TOKEN_H
+
+#include
+
+struct dma_token_params {
+	struct dma_buf *dmabuf;
+	enum dma_data_direction dir;
+};
+
+struct dma_token {
+	void (*release)(struct dma_token *);
+};
+
+static inline void dma_token_release(struct dma_token *token)
+{
+	token->release(token);
+}
+
+static inline struct dma_token *
+dma_token_create(struct file *file, struct dma_token_params *params)
+{
+	struct dma_token *res;
+
+	if (!file->f_op->dma_map)
+		return ERR_PTR(-EOPNOTSUPP);
+	res = file->f_op->dma_map(file, params);
+
+	WARN_ON_ONCE(!IS_ERR(res) && !res->release);
+
+	return res;
+}
+
+#endif
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c895146c1444..0ce9a53fabec 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2262,6 +2262,8 @@ struct dir_context {
 struct iov_iter;
 struct io_uring_cmd;
 struct offset_ctx;
+struct dma_token;
+struct dma_token_params;
 
 typedef unsigned int __bitwise fop_flags_t;
 
@@ -2309,6 +2311,8 @@ struct file_operations {
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
 	int (*mmap_prepare)(struct vm_area_desc *);
+	struct dma_token *(*dma_map)(struct file *,
+				     struct dma_token_params *);
 } __randomize_layout;
 
 /* Supports async buffered reads */
-- 
2.52.0

From nobody Tue Dec 2 01:22:05 2025
Received: from mail-wr1-f50.google.com (mail-wr1-f50.google.com [209.85.221.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org
(Postfix) with ESMTPS id EED7727FD40 for ; Sun, 23 Nov 2025 22:51:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938308; cv=none; b=uZKX2W6OvVNBAhp/++rGIeuG1h0NtVXP5EqkLIgDl1otoZK7ihZAGVGSnYiyu+o7bDI12nR7EjEIgrElShm56u1psSgdL/t4V1GsZ56F7XfFVBxZAISZq9QWY5uih17D1oc7k3zsFcuUbFzu1IVZM2P50btKZ4J4H9YVxVfDblI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938308; c=relaxed/simple; bh=biIYoLFZIunfY+RjkNduyhE0he3gpQJOgrFilhAOH4g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=F7J5a98babpUUsrVE2wGaM0NFRxC/bUJHG8G5E2Ry/tnusNTQL+bPMGZqPSSCOARz4rI1kanfO2R+Shq27iguL61Yqx0aRtjEVSF9+mHtG9qAlxrHBPaK2Mj/5jd9RnE+G7pU12Ipsoy8336BRZoDX90ElOq0ajbI1ZoNCh13m0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=WpmGMvm3; arc=none smtp.client-ip=209.85.221.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="WpmGMvm3" Received: by mail-wr1-f50.google.com with SMTP id ffacd0b85a97d-42b3d7c1321so2269757f8f.3 for ; Sun, 23 Nov 2025 14:51:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938304; x=1764543104; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Eb2Dznv7cN1HWvl69eUL3qKJ9duXd8EJVDjyGYBP64I=; b=WpmGMvm3Tt0Y2XyUxUpwW2d5teUAZCtS8tXckR5X/RDdFiEfJCS4kG6UaJnYUThtom 
hVeXn8TIJ9meTxceBvBKwkHciPinGwCqfv3wjWwIf+m88DPx3FN4hpvukifhHXo98Zgf Z5WVvQboSs4e2gH8x979VuM5bzCFt6GmbE24Kvbr4tFLnXRGk/0iExyengLb2OvbRPVc TDwO+nPjaWdkwTGWQsEJozk5nUZ1UUcVRUS05QTkW3lCmRFGs3QFAmSHXsGZmUKgcjnC LGqCFndJS+bJKiRwz7ujKWxSlhm7A/S03RXh/ctR3jiEVpK1SkHNGZ36PnMNEchv4zDe naPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938304; x=1764543104; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Eb2Dznv7cN1HWvl69eUL3qKJ9duXd8EJVDjyGYBP64I=; b=Wx/P9EBbs7NGK2r9QIWk2nYCQIuRwW425+Emuo3kaxHMZX/jMLCiwNrQgxZU9ybOl8 HyVJ0P3MiAy4li13K/TDaXa+n9krzvj5AoikIGFe+S4ggwERnyDOnkPi9UEUZwN3aR0A DOhBYyHHNOqAxHAVTsVn4KrYwps2ZcV6jyy4wbzouMmGtOwjCXYgztpDYmMYwQpZxdtr 0lk1fl9MsxCgVcG2v+bx0qLxT8QyBOKbz1p4XR5iBmSJd5wRlmr37E9XWnM7onKQS8ZQ 29KafmGhd4KHKiSHNmECrKj/6Pf3fRsU3xyEFIs7oCu6zLyPCbqHeQBOMENRZTEnkjQ3 JFwQ== X-Forwarded-Encrypted: i=1; AJvYcCWORM69EY/yf1MeEHCs1HWR3fdwgEm87pwKhIzP9tYU0dHVEO6Ce8+URFT/CmCOzNjJMrq+ypDKdsNRmIk=@vger.kernel.org X-Gm-Message-State: AOJu0YxwsIXGf7PpPS+qO26UFu7FDhOazWv+ilWEVnCXKQmNN5JrVr68 BQVQnwuQVBagh0tExdFrNu/3h4dkStTED6EIhzf79JCmL8UMPjgdU6hK X-Gm-Gg: ASbGncvyrCZK5QgMHtdJYFPuxQhrNNzoKJacm6eXtRAEi/C1Ete9wDplzjs9+jNJGxR MrazITcRQYFiuw2q9fzS+lG26XQozedb1JbSY6vWykwklSIPZ76B4/sbFn8yN6gyXmyREBZTSAo ZIIhrf9n8utt4jGelx5hRlTSjdbsjUzSO8Xp2yGJs/zAzhH5D3d/AQASuEqPxh8eNOsuBLdiZke KyDNVLIU6vFzyYulzIFpTn59G/pXD4fJNDL/Dl2FKal0Xz/XiKWC7Mgak2x5s+DQGzkTaAM+R3l JJqR9RYI7bXzveyAY63voY5rboI8dQJbGMkrVD2WsT146vl0z1F6KfR7SGz3iyjlkM+FeVLXlyR bB9fvYAMPtkSbQ6qkJs6RgZdTWoOHCjaPlrNBgxDRd9HoDcGe9VXXh3AWDwCuZAShwsKoc98uP4 teNggUttMoPaz0PQ== X-Google-Smtp-Source: AGHT+IED8z7T9BZIQ3+G6TZGcukyRm9aG9PTE7rjzPMs+94mQ+eXcoj3xDrYuPfTG51yWGc4ynh5VQ== X-Received: by 2002:a05:6000:601:b0:42b:5592:ebd1 with SMTP id ffacd0b85a97d-42cc19f0b39mr11000772f8f.0.1763938304146; Sun, 23 Nov 2025 14:51:44 -0800 (PST) Received: from 
127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:43 -0800 (PST)
From: Pavel Begunkov
To: linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: Vishal Verma, tushar.gohad@intel.com, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, Alexander Viro, Christian Brauner,
	Andrew Morton, Sumit Semwal, =?UTF-8?q?Christian=20K=C3=B6nig?=,
	Pavel Begunkov, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org
Subject: [RFC v2 02/11] iov_iter: introduce iter type for pre-registered dma
Date: Sun, 23 Nov 2025 22:51:22 +0000
Message-ID:
X-Mailer: git-send-email 2.52.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce a new iterator type backed by a pre-mapped dmabuf represented
by struct dma_token. The token is specific to the file for which it was
created, and the user must avoid passing the token or the iterator to
any other file. This limitation will be relaxed in the future.
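
To make the iterator semantics concrete, here is a simplified userspace
model of how such an iterator tracks its position: an opaque token plus a
byte offset and a remaining count, with advance clamped to the remaining
bytes. It is only a sketch under stated assumptions: struct demo_iter and
the demo_iter_*() helpers are hypothetical and not the kernel's
struct iov_iter API, and demo_iter_revert() reports an error where the
kernel code would hit BUG().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a token-backed iterator. */
struct demo_iter {
	const void *token;	/* opaque handle for the mapping */
	size_t offset;		/* byte offset into the mapping */
	size_t count;		/* bytes remaining */
};

static void demo_iter_init(struct demo_iter *it, const void *token,
			   size_t off, size_t count)
{
	it->token = token;
	it->offset = off;
	it->count = count;
}

/* Consume up to @size bytes; never advance past the remaining count. */
static void demo_iter_advance(struct demo_iter *it, size_t size)
{
	if (size > it->count)
		size = it->count;
	it->offset += size;
	it->count -= size;
}

/*
 * Undo @unroll bytes of progress. Returns -1 and leaves the iterator
 * untouched if that would move before offset zero.
 */
static int demo_iter_revert(struct demo_iter *it, size_t unroll)
{
	if (unroll > it->offset)
		return -1;
	it->offset -= unroll;
	it->count += unroll;
	return 0;
}
```

There are no segments to walk in this model, so advancing and reverting
are plain arithmetic on the offset and count.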

Suggested-by: Keith Busch
Signed-off-by: Pavel Begunkov
---
 include/linux/uio.h | 10 ++++++++++
 lib/iov_iter.c      | 30 ++++++++++++++++++++++++------
 2 files changed, 34 insertions(+), 6 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 5b127043a151..1b22594ca35b 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -29,6 +29,7 @@ enum iter_type {
 	ITER_FOLIOQ,
 	ITER_XARRAY,
 	ITER_DISCARD,
+	ITER_DMA_TOKEN,
 };
 
 #define ITER_SOURCE	1	// == WRITE
@@ -71,6 +72,7 @@ struct iov_iter {
 				const struct folio_queue *folioq;
 				struct xarray *xarray;
 				void __user *ubuf;
+				struct dma_token *dma_token;
 			};
 			size_t count;
 		};
@@ -155,6 +157,11 @@ static inline bool iov_iter_is_xarray(const struct iov_iter *i)
 	return iov_iter_type(i) == ITER_XARRAY;
 }
 
+static inline bool iov_iter_is_dma_token(const struct iov_iter *i)
+{
+	return iov_iter_type(i) == ITER_DMA_TOKEN;
+}
+
 static inline unsigned char iov_iter_rw(const struct iov_iter *i)
 {
 	return i->data_source ? WRITE : READ;
@@ -300,6 +307,9 @@ void iov_iter_folio_queue(struct iov_iter *i, unsigned int direction,
 			  unsigned int first_slot, unsigned int offset, size_t count);
 void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray,
 		     loff_t start, size_t count);
+void iov_iter_dma_token(struct iov_iter *i, unsigned int direction,
+			struct dma_token *token,
+			loff_t off, size_t count);
 ssize_t iov_iter_get_pages2(struct iov_iter *i, struct page **pages,
 			size_t maxsize, unsigned maxpages, size_t *start);
 ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i, struct page ***pages,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 2fe66a6b8789..26fa8f8f13c0 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -563,7 +563,8 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
 {
 	if (unlikely(i->count < size))
 		size = i->count;
-	if (likely(iter_is_ubuf(i)) || unlikely(iov_iter_is_xarray(i))) {
+	if (likely(iter_is_ubuf(i)) || unlikely(iov_iter_is_xarray(i)) ||
+	    unlikely(iov_iter_is_dma_token(i))) {
 		i->iov_offset += size;
 		i->count -= size;
 	} else if (likely(iter_is_iovec(i) || iov_iter_is_kvec(i))) {
@@ -619,7 +620,8 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
 		return;
 	}
 	unroll -= i->iov_offset;
-	if (iov_iter_is_xarray(i) || iter_is_ubuf(i)) {
+	if (iov_iter_is_xarray(i) || iter_is_ubuf(i) ||
+	    iov_iter_is_dma_token(i)) {
 		BUG(); /* We should never go beyond the start of the specified
 			* range since we might then be straying into pages that
 			* aren't pinned.
@@ -763,6 +765,21 @@ void iov_iter_xarray(struct iov_iter *i, unsigned int direction,
 }
 EXPORT_SYMBOL(iov_iter_xarray);
 
+void iov_iter_dma_token(struct iov_iter *i, unsigned int direction,
+			struct dma_token *token,
+			loff_t off, size_t count)
+{
+	WARN_ON(direction & ~(READ | WRITE));
+	*i = (struct iov_iter){
+		.iter_type = ITER_DMA_TOKEN,
+		.data_source = direction,
+		.dma_token = token,
+		.iov_offset = 0,
+		.count = count,
+		.iov_offset = off,
+	};
+}
+
 /**
  * iov_iter_discard - Initialise an I/O iterator that discards data
  * @i: The iterator to initialise.
@@ -829,7 +846,7 @@ static unsigned long iov_iter_alignment_bvec(const struct iov_iter *i)
 
 unsigned long iov_iter_alignment(const struct iov_iter *i)
 {
-	if (likely(iter_is_ubuf(i))) {
+	if (likely(iter_is_ubuf(i)) || iov_iter_is_dma_token(i)) {
 		size_t size = i->count;
 		if (size)
 			return ((unsigned long)i->ubuf + i->iov_offset) | size;
@@ -860,7 +877,7 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
 	size_t size = i->count;
 	unsigned k;
 
-	if (iter_is_ubuf(i))
+	if (iter_is_ubuf(i) || iov_iter_is_dma_token(i))
 		return 0;
 
 	if (WARN_ON(!iter_is_iovec(i)))
@@ -1457,11 +1474,12 @@ EXPORT_SYMBOL_GPL(import_ubuf);
 void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state)
 {
 	if (WARN_ON_ONCE(!iov_iter_is_bvec(i) && !iter_is_iovec(i) &&
-			 !iter_is_ubuf(i)) && !iov_iter_is_kvec(i))
+			 !iter_is_ubuf(i) && !iov_iter_is_kvec(i) &&
+			 !iov_iter_is_dma_token(i)))
 		return;
 	i->iov_offset = state->iov_offset;
 	i->count = state->count;
-	if (iter_is_ubuf(i))
+	if (iter_is_ubuf(i) || iov_iter_is_dma_token(i))
 		return;
 	/*
 	 * For the *vec iters, nr_segs + iov is constant - if we increment
-- 
2.52.0

From nobody Tue Dec 2 01:22:05 2025
Received: from mail-wr1-f54.google.com (mail-wr1-f54.google.com [209.85.221.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F67A27FD72 for ; Sun, 23 Nov 2025 22:51:47 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.54
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938309; cv=none; b=HRi2Mc6wJe2N+fMKwjWhPZsswTfRA5Kayf25hluOIWlj6ryPWkFTr2pwt+QE12IXH5tPi+Yz/mXX3jL2AY7lGFqznWzGFiuNf1xOOp4ix7UjgI3S8cvCpw8Z6fJPi00IDMT1yw98qz7wc/OfUiKZKhRPar3JeMXunPDffZMejYw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938309; c=relaxed/simple; bh=2QAhGEFKNmLFH3rV8w4M1xXti7VQ8PITqhdQLrQ5sfg=;
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TrOs1m5yTAUvjtWynYbTxpcycXj7u/MoOoHY5uTBycxMbE8kpnH+qVdXxG9C0Yk3fJjIFGHqoEI2URvNDdVVI52XLlUVfxswDq7VECsvlwCId6uwEo67a0rkWQjYqiL2hqjrri68n2n3OrFNHHbtSBTzKvUswIMygBVSlQAMEGM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ExMtn78q; arc=none smtp.client-ip=209.85.221.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ExMtn78q" Received: by mail-wr1-f54.google.com with SMTP id ffacd0b85a97d-42b566859ecso3321330f8f.2 for ; Sun, 23 Nov 2025 14:51:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938306; x=1764543106; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=g6fykkTkHS7qppZkkxaQlxQiAfIqxjHL5zv+seY74S0=; b=ExMtn78q7qR6pIUKU/7VGEUjfVV6bk4JomIs0V7n3OZ75PBi4JcasAliBmf1wTvoyx b92OsjRkMwvIJuHVXeRbVuryPNFAwkzKf4EGn7EO8szNtfK1k5krVOdZmS8No32ZESGv ImPPNTwX7bmkJ1IhqUNvIU76EZji1bIfOQtq1ljfyidK+oNXUJ7XWwa6MNxvVFiP8HHs B48v9HxDqE527c99oqmHROtvLt21uqMQkqSXO9ArUQjRm7NuOAtbNBiuF1oPNo1YAURf n7K4uaIjeMzrer6TJnVDiPQbQLbmf5Oe9zgZDmAiVIE3QFQMktiqlGywBqIateN1ldED 3Ohw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938306; x=1764543106; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=g6fykkTkHS7qppZkkxaQlxQiAfIqxjHL5zv+seY74S0=; 
b=l36hzdf4W1x4KH+fdoysBP7ke8rXPu/e68OQJypWTv0zUgZ7HYnW+UPU95vC2Uqxue CfZdKpE7uavggrxhoxHnpwv4bCzJg+vHQZwVEk6tIx0NU9LEa+FX2OKfA1e7z0IpJVFW hfwnqZLcadLGFNPIsa191+srYR+4wTu9rAUj9vBmx1BeSctqyFhSCwgDFEPvbnT1U2vW Voa+p/dgVBZ/oQnWZbOpW6u0JTkWYbaW9XI0sFxRVEj7Hc86NA89xCsm+CQrKRw7DBFd vAiQV3uCIKzlOUBdtCLCakaajXY5JLou6Dkt7unK/YcY5nt+XJdVoMYyZQZ47Ayw3TGV ak7w== X-Forwarded-Encrypted: i=1; AJvYcCWnp+0hKuWgBFgGz+Cr2/CtoUaYcucQfqCkUGftaqdWpo1wJMsdRiQnaVt+Wjme24rM5ZP/S7sh4d3iSrQ=@vger.kernel.org X-Gm-Message-State: AOJu0YyGwoe9jqi8iaHc10Xm6tNY6DpD4uzOBPN1YP2a1wapsz+Zk3+B wsbZk02h3L4iyl9hOukRmAscxZEyXNt4uVW0Uh/Z/XY8ZHhyiaJgvQfo X-Gm-Gg: ASbGncv2hPbCH11JqVbOFIM57Eq96iS94S5CL3xop/OgAl/UxOAp9uwE0n33oW4YVoz 8IOpDr/fOqIvC8sI6ZiTFa1cxFzddkeG3XcB//8EF0zCRiiUEryu6ugi3494Ud8exfUgFtIomcK Ao0fTeM4drCHLG7ZR10OrtjN2HXLpNI2lRlO4VgDzCtwrDOoPmD09OPrmspkEOQ6yB3T6bx5W5l O6+vyzKNIQLS3d4bBPy4F91BIuuKC/JnYio7RK8E276oTVNQvRJptksmfufoAN5gOz/Na7sfGgq iuOwk1vvioAHMZdl7ACoTEjQB72Zr3MNHfadGoLehPgfE8QMFXcV+11x+i3RLLFIJ/eaXap5C9Q Rqk7Y7Y9grDhxJ+EUnRcp3wBVaVFDpJJpWDtznIevEycQKZ6jR2IngcHtnqv3jn/Ey8O2q/qRpD hmDblxZ5cgt62skg== X-Google-Smtp-Source: AGHT+IG6zETWdv1uNEXL1H7qlBK8rzCntcx0G5cI2YCnUx99f9VimwWWKVBIKAVqjQPLH9huSPPJwA== X-Received: by 2002:a05:6000:1448:b0:429:d3e9:65b with SMTP id ffacd0b85a97d-42cc1d23c3emr9659226f8f.59.1763938305625; Sun, 23 Nov 2025 14:51:45 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:44 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, 
	linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org
Subject: [RFC v2 03/11] block: move around bio flagging helpers
Date: Sun, 23 Nov 2025 22:51:23 +0000
Message-ID: <6cb3193d3249ab5ca54e8aecbfc24086db09b753.1763725387.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.52.0
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

We'll need bio_flagged() earlier in bio.h in the next patch. Move it
together with all related helpers, and mark bio_flagged()'s bio
argument as const.

Signed-off-by: Pavel Begunkov
---
 include/linux/bio.h | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index ad2d57908c1c..c75a9b3672aa 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -46,6 +46,21 @@ static inline unsigned int bio_max_segs(unsigned int nr_segs)
 #define bio_data_dir(bio) \
 	(op_is_write(bio_op(bio)) ? WRITE : READ)
 
+static inline bool bio_flagged(const struct bio *bio, unsigned int bit)
+{
+	return bio->bi_flags & (1U << bit);
+}
+
+static inline void bio_set_flag(struct bio *bio, unsigned int bit)
+{
+	bio->bi_flags |= (1U << bit);
+}
+
+static inline void bio_clear_flag(struct bio *bio, unsigned int bit)
+{
+	bio->bi_flags &= ~(1U << bit);
+}
+
 /*
  * Check whether this bio carries any data or not. A NULL bio is allowed.
  */
@@ -225,21 +240,6 @@ static inline void bio_cnt_set(struct bio *bio, unsigned int count)
 	atomic_set(&bio->__bi_cnt, count);
 }
 
-static inline bool bio_flagged(struct bio *bio, unsigned int bit)
-{
-	return bio->bi_flags & (1U << bit);
-}
-
-static inline void bio_set_flag(struct bio *bio, unsigned int bit)
-{
-	bio->bi_flags |= (1U << bit);
-}
-
-static inline void bio_clear_flag(struct bio *bio, unsigned int bit)
-{
-	bio->bi_flags &= ~(1U << bit);
-}
-
 static inline struct bio_vec *bio_first_bvec_all(struct bio *bio)
 {
 	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
-- 
2.52.0

From nobody Tue Dec 2 01:22:05 2025
Received: from mail-wm1-f49.google.com (mail-wm1-f49.google.com [209.85.128.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4B6152853F7 for ; Sun, 23 Nov 2025 22:51:50 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.49
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938312; cv=none; b=mG7MyOI/waZVrF6+IB/nN0JIgQ0Z+mvCo1FvSwEQ+d/ABW4qrEnF6pyIiXVt+Lurauf1sZwJBcExghsX8sA/pFmhHSTmrt1+DkznAstXovftg8I+SmTR3W7pXAg8FGh5OxCfgX3RZf+yRc84SVc+aTaCp2RVzg8klFmXhch8oP4=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938312; c=relaxed/simple; bh=XGx1f8qezWKlKgKLLZ301GCSMIBZ1JsSW3EFwWISAzM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pcEmqKWYT9Mvei0JDerudVFzP9PMLCEY+//FaQLV31nMOtyfmyBqtyqs/pNFqFQs9wzrOK+eebbQ+k2hq6FpmWd2OJp3EWF1xcFR4d79H/Yh7j2ACj3DpBTsGPUfcYVSUuqfOVDYfQ6WzKpLSKsCq9oJ2ORfZP9oyFY2JGvAduA=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=Usl72E/4; arc=none smtp.client-ip=209.85.128.49
Authentication-Results:
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Usl72E/4" Received: by mail-wm1-f49.google.com with SMTP id 5b1f17b1804b1-47118259fd8so31532475e9.3 for ; Sun, 23 Nov 2025 14:51:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938309; x=1764543109; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zxuKr9/wD0hv1o+9Q16IeSgRkNQrWSlYyRysfxG3AMc=; b=Usl72E/4lDVylaOAdrgBenfNmYwPULhoCn7bcUbX9lMyLAwFpLkhpQgSbSxfxpz0Fw U7Zx3kltDeQeXDAJ6unFxL+Aivuml7u+pH+YdsCp2YTEGuVWvtNC1xcCd7uMu3Q/Zi1H So+w5R3mk2TxH/02U99gTzJNVSGSWSO0Mil8Z3yTwlYMMX7WRi1uK8f5XqOsNiqkT51C e625YD7BMenXbT1xQlh1jNZ+HEx5hCbLGVP0bL2vVoqbYMLH7zX/HaZb8J8gDdVHI6ik M7pyw1B520SlVXnVjbRibWutqDawrzaDB1cUS2V6YdZJ6TaUrsqPTLb/UdqGpThc+NeJ b6gQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938309; x=1764543109; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=zxuKr9/wD0hv1o+9Q16IeSgRkNQrWSlYyRysfxG3AMc=; b=uw9M1pW+KMOyzeMlXaFkwbeCgXjxF/q7jSEd9fagr+nutlm+vo017ayjySvUxlqmo3 dmeG3kt90nZM4ZebP2mnRZuF2K6v/qM/XESh0HRm/N2CNmREsbwHRYFAjH1yr7ooSx28 mkE8Y/qmQ3Omn1ZDJTDTSlWeweoClzBwTqlVgc/5beBPmQ1eEhpjrRfeW2Rs708f1lQX ot8jouol4EAo7XHGj4pnCLvBFZ8asosUf1hRzFdSd/vpWqS/E+nDxgeagMrqnlbjhZIm 8fEMz4I9OIds2UrOYMVd+RP83nCxzjse73X3v9A2JY+3FQnGWztdD6yzAdJkZ4ciGBGe 1e3Q== X-Forwarded-Encrypted: i=1; AJvYcCV7rUuyJ8KK+HXy7HVIeBbZMvLr7iiP34A1z70/qrP5qEpOejr6Yaxo0GGUg6sz1EiWQBa9FKU75zXlO1A=@vger.kernel.org X-Gm-Message-State: 
AOJu0YxSSoCyy8M4q0wDUho5sR14Y2DwyZfNPSEH6alN2CUoZBAYID6K etUIlPrxZ1reF9aL2Cz7w3QcuBSB9MblYFUOS79qfnU6xzX/Lasw87rW X-Gm-Gg: ASbGncsXV/+02E5TSnEAcV7sXn8Agumj++E2pHpgkhOY71TNkUsVSDrC4IKKixnRVVN reFDhoidM7vm5cp+/smtZlgti+Kjjp63Aeg3BFPkTInnT4RWeLIOIVzuN2DmcR/oSHkfHIun3Kn JwE1qKNEm6KpFCb01hJwh5RxsZoJ2f2MUeJYvntk4J8/4MdyLkNv5JiQ/AtCoGL0fIk+7D5XFx7 JCcRGzStmVjZk/zpoDk+3lfaCBxUOvSfSLscnVJIo7mLtYhSQ9RF55bLj7n7qgN9ZJNXrPqAyaN 2qPEgvS0l5RljM95c/ah9bqD/rEo8ZGVVs/2mz4gJOhx0Ue6eFgi0KlEg3YS4SvSKOb1dhB0uyR MPWnl+wfWFbXGb3KuS9cCNwBC4AkRbPSVszjDFrpEjk4+ZfZG8gvgnjlJ4l1viXhBtt9+l9vRSp uKH63/QclbCFIa2i2IXUccnL11 X-Google-Smtp-Source: AGHT+IF9Tqh/ORJZEo3R5Xg7ZJWy7sWzF8MjqLc9q+tvrWwYorpVbwyCo218uq/Fj/Pad8uOgoxn1g== X-Received: by 2002:a05:600c:840f:b0:477:7479:f081 with SMTP id 5b1f17b1804b1-477c0181443mr112383465e9.12.1763938308511; Sun, 23 Nov 2025 14:51:48 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:47 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Subject: [RFC v2 04/11] block: introduce dma token backed bio type Date: Sun, 23 Nov 2025 22:51:24 +0000 Message-ID: <12530de6d1907afb44be3e76e7668b935f1fd441.1763725387.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 
Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Premapped buffers don't require a generic bio_vec since these have already been dma mapped. Repurpose the bi_io_vec space for the dma token as they are mutually exclusive, and provide setup to support dma tokens. In order to use this, a driver must implement the dma_map blk-mq op, in which case it must be aware that any given bio may be using a dma_tag instead of a bio_vec. Suggested-by: Keith Busch Signed-off-by: Pavel Begunkov --- block/bio.c | 21 +++++++++++++++++++++ block/blk-merge.c | 23 +++++++++++++++++++++++ block/blk.h | 3 ++- block/fops.c | 2 ++ include/linux/bio.h | 19 ++++++++++++++++--- include/linux/blk_types.h | 8 +++++++- 6 files changed, 71 insertions(+), 5 deletions(-) diff --git a/block/bio.c b/block/bio.c index 7b13bdf72de0..8793f1ee559d 100644 --- a/block/bio.c +++ b/block/bio.c @@ -843,6 +843,11 @@ static int __bio_clone(struct bio *bio, struct bio *bi= o_src, gfp_t gfp) bio_clone_blkg_association(bio, bio_src); } =20 + if (bio_flagged(bio_src, BIO_DMA_TOKEN)) { + bio->dma_token =3D bio_src->dma_token; + bio_set_flag(bio, BIO_DMA_TOKEN); + } + if (bio_crypt_clone(bio, bio_src, gfp) < 0) return -ENOMEM; if (bio_integrity(bio_src) && @@ -1167,6 +1172,18 @@ void bio_iov_bvec_set(struct bio *bio, const struct = iov_iter *iter) bio_set_flag(bio, BIO_CLONED); } =20 +void bio_iov_dma_token_set(struct bio *bio, struct iov_iter *iter) +{ + WARN_ON_ONCE(bio->bi_max_vecs); + + bio->dma_token =3D iter->dma_token; + bio->bi_vcnt =3D 0; + bio->bi_iter.bi_bvec_done =3D iter->iov_offset; + bio->bi_iter.bi_size =3D iov_iter_count(iter); + bio->bi_opf |=3D REQ_NOMERGE; + bio_set_flag(bio, BIO_DMA_TOKEN); +} + static unsigned int get_contig_folio_len(unsigned int *num_pages, struct page **pages, unsigned int i, struct folio *folio, size_t left, @@ -1349,6 +1366,10 @@ int bio_iov_iter_get_pages(struct bio *bio, struct i= ov_iter *iter, bio_iov_bvec_set(bio, iter); 
iov_iter_advance(iter, bio->bi_iter.bi_size); return 0; + } else if (iov_iter_is_dma_token(iter)) { + bio_iov_dma_token_set(bio, iter); + iov_iter_advance(iter, bio->bi_iter.bi_size); + return 0; } =20 if (iov_iter_extract_will_pin(iter)) diff --git a/block/blk-merge.c b/block/blk-merge.c index d3115d7469df..c02a5f9c99e6 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -328,6 +328,29 @@ int bio_split_io_at(struct bio *bio, const struct queu= e_limits *lim, unsigned nsegs =3D 0, bytes =3D 0, gaps =3D 0; struct bvec_iter iter; =20 + if (bio_flagged(bio, BIO_DMA_TOKEN)) { + int offset =3D offset_in_page(bio->bi_iter.bi_bvec_done); + + nsegs =3D ALIGN(bio->bi_iter.bi_size + offset, PAGE_SIZE); + nsegs >>=3D PAGE_SHIFT; + + if (offset & lim->dma_alignment || bytes & len_align_mask) + return -EINVAL; + + if (bio->bi_iter.bi_size > max_bytes) { + bytes =3D max_bytes; + nsegs =3D (bytes + offset) >> PAGE_SHIFT; + goto split; + } else if (nsegs > lim->max_segments) { + nsegs =3D lim->max_segments; + bytes =3D PAGE_SIZE * nsegs - offset; + goto split; + } + + *segs =3D nsegs; + return 0; + } + bio_for_each_bvec(bv, bio, iter) { if (bv.bv_offset & lim->dma_alignment || bv.bv_len & len_align_mask) diff --git a/block/blk.h b/block/blk.h index e4c433f62dfc..2c72f2630faf 100644 --- a/block/blk.h +++ b/block/blk.h @@ -398,7 +398,8 @@ static inline struct bio *__bio_split_to_limits(struct = bio *bio, switch (bio_op(bio)) { case REQ_OP_READ: case REQ_OP_WRITE: - if (bio_may_need_split(bio, lim)) + if (bio_may_need_split(bio, lim) || + bio_flagged(bio, BIO_DMA_TOKEN)) return bio_split_rw(bio, lim, nr_segs); *nr_segs =3D 1; return bio; diff --git a/block/fops.c b/block/fops.c index 5e3db9fead77..41f8795874a9 100644 --- a/block/fops.c +++ b/block/fops.c @@ -354,6 +354,8 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *i= ocb, * bio_iov_iter_get_pages() and set the bvec directly. 
*/ bio_iov_bvec_set(bio, iter); + } else if (iov_iter_is_dma_token(iter)) { + bio_iov_dma_token_set(bio, iter); } else { ret =3D blkdev_iov_iter_get_pages(bio, iter, bdev); if (unlikely(ret)) diff --git a/include/linux/bio.h b/include/linux/bio.h index c75a9b3672aa..f83342640e71 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -108,16 +108,26 @@ static inline bool bio_next_segment(const struct bio = *bio, #define bio_for_each_segment_all(bvl, bio, iter) \ for (bvl =3D bvec_init_iter_all(&iter); bio_next_segment((bio), &iter); ) =20 +static inline void bio_advance_iter_dma_token(struct bvec_iter *iter, + unsigned int bytes) +{ + iter->bi_bvec_done +=3D bytes; + iter->bi_size -=3D bytes; +} + static inline void bio_advance_iter(const struct bio *bio, struct bvec_iter *iter, unsigned int bytes) { iter->bi_sector +=3D bytes >> 9; =20 - if (bio_no_advance_iter(bio)) + if (bio_no_advance_iter(bio)) { iter->bi_size -=3D bytes; - else + } else if (bio_flagged(bio, BIO_DMA_TOKEN)) { + bio_advance_iter_dma_token(iter, bytes); + } else { bvec_iter_advance(bio->bi_io_vec, iter, bytes); /* TODO: It is reasonable to complete bio with error here. 
*/ + } } =20 /* @bytes should be less or equal to bvec[i->bi_idx].bv_len */ @@ -129,6 +139,8 @@ static inline void bio_advance_iter_single(const struct= bio *bio, =20 if (bio_no_advance_iter(bio)) iter->bi_size -=3D bytes; + else if (bio_flagged(bio, BIO_DMA_TOKEN)) + bio_advance_iter_dma_token(iter, bytes); else bvec_iter_advance_single(bio->bi_io_vec, iter, bytes); } @@ -398,7 +410,7 @@ static inline void bio_wouldblock_error(struct bio *bio) */ static inline int bio_iov_vecs_to_alloc(struct iov_iter *iter, int max_seg= s) { - if (iov_iter_is_bvec(iter)) + if (iov_iter_is_bvec(iter) || iov_iter_is_dma_token(iter)) return 0; return iov_iter_npages(iter, max_segs); } @@ -452,6 +464,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_= iter *iter, unsigned len_align_mask); =20 void bio_iov_bvec_set(struct bio *bio, const struct iov_iter *iter); +void bio_iov_dma_token_set(struct bio *bio, struct iov_iter *iter); void __bio_release_pages(struct bio *bio, bool mark_dirty); extern void bio_set_pages_dirty(struct bio *bio); extern void bio_check_pages_dirty(struct bio *bio); diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index cbbcb9051ec3..3bc7f89d4e66 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -275,7 +275,12 @@ struct bio { =20 atomic_t __bi_cnt; /* pin count */ =20 - struct bio_vec *bi_io_vec; /* the actual vec list */ + union { + struct bio_vec *bi_io_vec; /* the actual vec list */ + /* Driver specific dma map, present only with BIO_DMA_TOKEN */ + struct dma_token *dma_token; + }; + =20 struct bio_set *bi_pool; }; @@ -315,6 +320,7 @@ enum { BIO_REMAPPED, BIO_ZONE_WRITE_PLUGGING, /* bio handled through zone write plugging */ BIO_EMULATES_ZONE_APPEND, /* bio emulates a zone append operation */ + BIO_DMA_TOKEN, /* Using premapped dma buffers */ BIO_FLAG_LAST }; =20 --=20 2.52.0 From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wr1-f46.google.com (mail-wr1-f46.google.com [209.85.221.46]) (using TLSv1.2 
with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A883228688C for ; Sun, 23 Nov 2025 22:51:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938315; cv=none; b=iEhXOyybnHurA7rU5jJFW+dzISpdybBzLu28mwHWrz2jJApmbUeKINhbxpD2aAXORVXDMc+KS+V6NQAS74EdURU5ZIJivr4tiSDX51Rv1vMFX7LggkyrOwjHgWtDrLjkuAXH2epc5r11DqCfCz0YJfo1DQJcqG/juuuy1acmfiA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938315; c=relaxed/simple; bh=069Y8B4LQyZ9577N4Tpt4w72wqG515Cr2xlW9Y7a6Cg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KTL3WhVQDAHnjQ21TFN8cQX9re4AcdljxKha1cTdw1J7KzA4TO20g5w8NkXETHHPjJwA7wh2tHCQFhCafbu9PE9bmvoAQgwbqYDikOFK134JB8Q54yIvfe3mh+5e2v3cqz+wnHnJbkI8Vd1xqnx5I12MnleBRUwbtIGqz3NZ72c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=WSVzQSuZ; arc=none smtp.client-ip=209.85.221.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="WSVzQSuZ" Received: by mail-wr1-f46.google.com with SMTP id ffacd0b85a97d-429c8632fcbso2222067f8f.1 for ; Sun, 23 Nov 2025 14:51:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938311; x=1764543111; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; 
bh=vtyRM3SiK93qfGEm4pAX9GDP/9MNGPDztBsVCrDNvOw=; b=WSVzQSuZFASv9x+Ew3i3SIjYdBEW2G1mYB2kSAVZh0ZeZdgeTgK2oWLXdH0EaRSdBy U1s5++Nt0Ly00qErV8mdmfFI7UlvgHaEZt9yfxrFZwzzkQrlBMzgpkPhMEZ9rUEnJ6P/ OYw70cxs9KcAusMiZ/MZESMKr/itGclAM6mc6PLUEyo8dVxB3HRMiLxUdHz5zssFcfk9 ik26ZNn/tOoetCq0oRIdS0yE2k8vHJKbtHKbeyzuDDKchgh3EIpLgwTCX2a7pE+MELfv 6vkfp9MlmkAmSmutLvMaamjpNdxuliQB2iY1yXubcDbU/lHrMs3/3DPq1B0fWR0OmnQ5 WTDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938311; x=1764543111; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=vtyRM3SiK93qfGEm4pAX9GDP/9MNGPDztBsVCrDNvOw=; b=hUsCkSr4jaFKKrEefwyyKdXx1GkuLsY8e+8U4rEdnnCIiRR185DVPjBd5/E6BEEYka gL/v4b8FSPjeO9aqvcHIkmNfPDpUg5cBmr4E5G6O3wTaVuyf1qN68SGdQd8g2TrZPeZX g5TbR90XJHm2Syj0CpuYI5PB9JL3p+qWsz0Q6hr/ZtNh+FcRQhpb59FkkoYh+KcVVK7M 30B5EV3i8UR262j1XKkCmIM/EotFRtXZYHLbFDDU9Jd5DiLsXj96XKmJXXD2qKytLQZY g/v/l5FS2/Z3JAMRrQi92qTs5Wm5lfJZ4kWItWIvWihJ70iXHtEzGV7eGOv+4uezWn6l idTA== X-Forwarded-Encrypted: i=1; AJvYcCXx1KjXaxt7hMYOvT+c5tVpbrWzM+QmZky1JZbIa7mmcXe5YcEdRsdd+XNett7KSZEWHsMOn+zCcu8edMg=@vger.kernel.org X-Gm-Message-State: AOJu0YyVmyTvt1FRPXHjDnz7OoauVpzLjXqIK6TMuJB5N9pJhgXYpwq/ 8rOoaxZyYc9uzSCnEc8h4zAEIY3MpaRiH0E8WXlv0gI++7z8V6tqAi/l X-Gm-Gg: ASbGnctlSdE02tD5XwVkzbGXuBJ54vMRi6X2nHW4lFS1GmqLV4gMhudH+e+52vMKK5W 8MLA3vbKxTPUyij3bat5SBuh31kqNNqFnQMc858SDZdpEWWhCFxiLN74fnS8TR+eRedK0yug1wF NHQsgoe6NkubJB1vHtiMU9gbZ25mNF8QyfYD5CBQKJYFgyKLehdueEWthQI4P8KrDnhNLdV93vv HdLn/IWSU0e4ryGssPhwnTPwAsWs0q1eP/gDZ/Xxdjq5Tcjal+8EoZyBXSrHRRkxLwhgTcdz00j ns0t4fYE7cNJQ7WjFIHdqjWYfa5GZhvuzHU9NCAuAlfS8ZkIhQs818pIBoan7d17ePl7FTLVbbc wN2l4XuFgjFk99c8eVg11/9/hLC1pswM+NIrjmKyqOLQBnzN6p7b8btWdoPTAmFyqDitCNJW8qH OFdhrV9KdMUTWZyw== X-Google-Smtp-Source: AGHT+IGfOL+4KpbQvV0qNRfo8OovcJ37mYapE/bWdNNKsAObOGKg7LEW9kMhWjZqfvqDPLpynO+F1g== X-Received: by 2002:a05:6000:3102:b0:429:d40e:fa40 with SMTP 
id ffacd0b85a97d-42cc1d0cab6mr8756848f8f.45.1763938310890; Sun, 23 Nov 2025 14:51:50 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:50 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Subject: [RFC v2 05/11] block: add infra to handle dmabuf tokens Date: Sun, 23 Nov 2025 22:51:25 +0000 Message-ID: <51cddd97b31d80ec8842a88b9f3c9881419e8a7b.1763725387.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add blk-mq infrastructure to handle dmabuf tokens. There are two main objects. The first is struct blk_mq_dma_token, which is an extension of struct dma_token and passed in an iterator. The second is struct blk_mq_dma_map, which keeps the actual mapping and unlike the token, can be ejected (e.g. by move_notify) and recreated. The token keeps an rcu protected pointer to the mapping, so when it resolves a token into a mapping to pass it to a request, it'll do an rcu protected lookup and get a percpu reference to the mapping. If there is no current mapping attached to a token, it'll need to be created by calling the driver (e.g. nvme) via a new callback. 
Creating the mapping requires waiting, so it can't be done for nowait requests or deeper in the stack, e.g. during nvme request submission. The structure split is needed because move_notify can request to invalidate the dma mapping at any moment, and we need a way to concurrently remove it and wait for the inflight requests using the previous mapping to complete. Signed-off-by: Pavel Begunkov --- block/Makefile | 1 + block/bdev.c | 14 ++ block/blk-mq-dma-token.c | 236 +++++++++++++++++++++++++++++++ block/blk-mq.c | 20 +++ block/fops.c | 1 + include/linux/blk-mq-dma-token.h | 60 ++++++++ include/linux/blk-mq.h | 21 +++ include/linux/blkdev.h | 3 + 8 files changed, 356 insertions(+) create mode 100644 block/blk-mq-dma-token.c create mode 100644 include/linux/blk-mq-dma-token.h diff --git a/block/Makefile b/block/Makefile index c65f4da93702..0190e5aa9f00 100644 --- a/block/Makefile +++ b/block/Makefile @@ -36,3 +36,4 @@ obj-$(CONFIG_BLK_INLINE_ENCRYPTION) +=3D blk-crypto.o blk= -crypto-profile.o \ blk-crypto-sysfs.o obj-$(CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK) +=3D blk-crypto-fallback.o obj-$(CONFIG_BLOCK_HOLDER_DEPRECATED) +=3D holder.o +obj-$(CONFIG_DMA_SHARED_BUFFER) +=3D blk-mq-dma-token.o diff --git a/block/bdev.c b/block/bdev.c index 810707cca970..da89d20f33f3 100644 --- a/block/bdev.c +++ b/block/bdev.c @@ -28,6 +28,7 @@ #include #include #include +#include #include "../fs/internal.h" #include "blk.h" =20 @@ -61,6 +62,19 @@ struct block_device *file_bdev(struct file *bdev_file) } EXPORT_SYMBOL(file_bdev); =20 +struct dma_token *blkdev_dma_map(struct file *file, + struct dma_token_params *params) +{ + struct request_queue *q =3D bdev_get_queue(file_bdev(file)); + + if (!(file->f_flags & O_DIRECT)) + return ERR_PTR(-EINVAL); + if (!q->mq_ops) + return ERR_PTR(-EINVAL); + + return blk_mq_dma_map(q, params); +} + static void bdev_write_inode(struct block_device *bdev) { struct inode *inode =3D BD_INODE(bdev); diff --git a/block/blk-mq-dma-token.c 
b/block/blk-mq-dma-token.c new file mode 100644 index 000000000000..cd62c4d09422 --- /dev/null +++ b/block/blk-mq-dma-token.c @@ -0,0 +1,236 @@ +#include +#include + +struct blk_mq_dma_fence { + struct dma_fence base; + spinlock_t lock; +}; + +static const char *blk_mq_fence_drv_name(struct dma_fence *fence) +{ + return "blk-mq"; +} + +const struct dma_fence_ops blk_mq_dma_fence_ops =3D { + .get_driver_name =3D blk_mq_fence_drv_name, + .get_timeline_name =3D blk_mq_fence_drv_name, +}; + +static void blk_mq_dma_token_free(struct blk_mq_dma_token *token) +{ + token->q->mq_ops->clean_dma_token(token->q, token); + dma_buf_put(token->dmabuf); + kfree(token); +} + +static inline void blk_mq_dma_token_put(struct blk_mq_dma_token *token) +{ + if (refcount_dec_and_test(&token->refs)) + blk_mq_dma_token_free(token); +} + +static void blk_mq_dma_mapping_free(struct blk_mq_dma_map *map) +{ + struct blk_mq_dma_token *token =3D map->token; + + if (map->sgt) + token->q->mq_ops->dma_unmap(token->q, map); + + dma_fence_put(&map->fence->base); + percpu_ref_exit(&map->refs); + kfree(map); + blk_mq_dma_token_put(token); +} + +static void blk_mq_dma_map_work_free(struct work_struct *work) +{ + struct blk_mq_dma_map *map =3D container_of(work, struct blk_mq_dma_map, + free_work); + + dma_fence_signal(&map->fence->base); + blk_mq_dma_mapping_free(map); +} + +static void blk_mq_dma_map_refs_free(struct percpu_ref *ref) +{ + struct blk_mq_dma_map *map =3D container_of(ref, struct blk_mq_dma_map, r= efs); + + INIT_WORK(&map->free_work, blk_mq_dma_map_work_free); + queue_work(system_wq, &map->free_work); +} + +static struct blk_mq_dma_map *blk_mq_alloc_dma_mapping(struct blk_mq_dma_t= oken *token) +{ + struct blk_mq_dma_fence *fence =3D NULL; + struct blk_mq_dma_map *map; + int ret =3D -ENOMEM; + + map =3D kzalloc(sizeof(*map), GFP_KERNEL); + if (!map) + return ERR_PTR(-ENOMEM); + + fence =3D kzalloc(sizeof(*fence), GFP_KERNEL); + if (!fence) + goto err; + + ret =3D 
percpu_ref_init(&map->refs, blk_mq_dma_map_refs_free, 0, + GFP_KERNEL); + if (ret) + goto err; + + dma_fence_init(&fence->base, &blk_mq_dma_fence_ops, &fence->lock, + token->fence_ctx, atomic_inc_return(&token->fence_seq)); + spin_lock_init(&fence->lock); + map->fence =3D fence; + map->token =3D token; + refcount_inc(&token->refs); + return map; +err: + kfree(map); + kfree(fence); + return ERR_PTR(ret); +} + +static inline +struct blk_mq_dma_map *blk_mq_get_token_map(struct blk_mq_dma_token *token) +{ + struct blk_mq_dma_map *map; + + guard(rcu)(); + + map =3D rcu_dereference(token->map); + if (unlikely(!map || !percpu_ref_tryget_live_rcu(&map->refs))) + return NULL; + return map; +} + +static struct blk_mq_dma_map * +blk_mq_create_dma_map(struct blk_mq_dma_token *token) +{ + struct dma_buf *dmabuf =3D token->dmabuf; + struct blk_mq_dma_map *map; + long ret; + + guard(mutex)(&token->mapping_lock); + + map =3D blk_mq_get_token_map(token); + if (map) + return map; + + map =3D blk_mq_alloc_dma_mapping(token); + if (IS_ERR(map)) + return NULL; + + dma_resv_lock(dmabuf->resv, NULL); + ret =3D dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_BOOKKEEP, + true, MAX_SCHEDULE_TIMEOUT); + ret =3D ret ? 
ret : -ETIME; + if (ret > 0) + ret =3D token->q->mq_ops->dma_map(token->q, map); + dma_resv_unlock(dmabuf->resv); + + if (ret) + return ERR_PTR(ret); + + percpu_ref_get(&map->refs); + rcu_assign_pointer(token->map, map); + return map; +} + +static void blk_mq_dma_map_remove(struct blk_mq_dma_token *token) +{ + struct dma_buf *dmabuf =3D token->dmabuf; + struct blk_mq_dma_map *map; + int ret; + + dma_resv_assert_held(dmabuf->resv); + + ret =3D dma_resv_reserve_fences(dmabuf->resv, 1); + if (WARN_ON_ONCE(ret)) + return; + + map =3D rcu_dereference_protected(token->map, + dma_resv_held(dmabuf->resv)); + if (!map) + return; + rcu_assign_pointer(token->map, NULL); + + dma_resv_add_fence(dmabuf->resv, &map->fence->base, + DMA_RESV_USAGE_KERNEL); + percpu_ref_kill(&map->refs); +} + +blk_status_t blk_rq_assign_dma_map(struct request *rq, + struct blk_mq_dma_token *token) +{ + struct blk_mq_dma_map *map; + + map =3D blk_mq_get_token_map(token); + if (map) + goto complete; + + if (rq->cmd_flags & REQ_NOWAIT) + return BLK_STS_AGAIN; + + map =3D blk_mq_create_dma_map(token); + if (IS_ERR(map)) + return BLK_STS_RESOURCE; +complete: + rq->dma_map =3D map; + return BLK_STS_OK; +} + +void blk_mq_dma_map_move_notify(struct blk_mq_dma_token *token) +{ + blk_mq_dma_map_remove(token); +} + +static void blk_mq_release_dma_mapping(struct dma_token *base_token) +{ + struct blk_mq_dma_token *token =3D dma_token_to_blk_mq(base_token); + struct dma_buf *dmabuf =3D token->dmabuf; + + dma_resv_lock(dmabuf->resv, NULL); + blk_mq_dma_map_remove(token); + dma_resv_unlock(dmabuf->resv); + + blk_mq_dma_token_put(token); +} + +struct dma_token *blk_mq_dma_map(struct request_queue *q, + struct dma_token_params *params) +{ + struct dma_buf *dmabuf =3D params->dmabuf; + struct blk_mq_dma_token *token; + int ret; + + if (!q->mq_ops->dma_map || !q->mq_ops->dma_unmap || + !q->mq_ops->init_dma_token || !q->mq_ops->clean_dma_token) + return ERR_PTR(-EINVAL); + + token =3D kzalloc(sizeof(*token), 
GFP_KERNEL); + if (!token) + return ERR_PTR(-ENOMEM); + + get_dma_buf(dmabuf); + token->fence_ctx =3D dma_fence_context_alloc(1); + token->dmabuf =3D dmabuf; + token->dir =3D params->dir; + token->base.release =3D blk_mq_release_dma_mapping; + token->q =3D q; + refcount_set(&token->refs, 1); + mutex_init(&token->mapping_lock); + + if (!blk_get_queue(q)) { + kfree(token); + return ERR_PTR(-EFAULT); + } + + ret =3D token->q->mq_ops->init_dma_token(token->q, token); + if (ret) { + kfree(token); + blk_put_queue(q); + return ERR_PTR(ret); + } + return &token->base; +} diff --git a/block/blk-mq.c b/block/blk-mq.c index f2650c97a75e..1ff3a7e3191b 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -29,6 +29,7 @@ #include #include #include +#include =20 #include =20 @@ -439,6 +440,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq= _alloc_data *data, rq->nr_integrity_segments =3D 0; rq->end_io =3D NULL; rq->end_io_data =3D NULL; + rq->dma_map =3D NULL; =20 blk_crypto_rq_set_defaults(rq); INIT_LIST_HEAD(&rq->queuelist); @@ -794,6 +796,7 @@ static void __blk_mq_free_request(struct request *rq) blk_pm_mark_last_busy(rq); rq->mq_hctx =3D NULL; =20 + blk_rq_drop_dma_map(rq); if (rq->tag !=3D BLK_MQ_NO_TAG) { blk_mq_dec_active_requests(hctx); blk_mq_put_tag(hctx->tags, ctx, rq->tag); @@ -3214,6 +3217,23 @@ void blk_mq_submit_bio(struct bio *bio) =20 blk_mq_bio_to_request(rq, bio, nr_segs); =20 + if (bio_flagged(bio, BIO_DMA_TOKEN)) { + struct blk_mq_dma_token *token; + blk_status_t ret; + + token =3D dma_token_to_blk_mq(bio->dma_token); + ret =3D blk_rq_assign_dma_map(rq, token); + if (ret) { + if (ret =3D=3D BLK_STS_AGAIN) { + bio_wouldblock_error(bio); + } else { + bio->bi_status =3D BLK_STS_RESOURCE; + bio_endio(bio); + } + goto queue_exit; + } + } + ret =3D blk_crypto_rq_get_keyslot(rq); if (ret !=3D BLK_STS_OK) { bio->bi_status =3D ret; diff --git a/block/fops.c b/block/fops.c index 41f8795874a9..ac52fe1a4b8d 100644 --- a/block/fops.c +++ b/block/fops.c @@ -973,6 
+973,7 @@ const struct file_operations def_blk_fops =3D { .fallocate =3D blkdev_fallocate, .uring_cmd =3D blkdev_uring_cmd, .fop_flags =3D FOP_BUFFER_RASYNC, + .dma_map =3D blkdev_dma_map, }; =20 static __init int blkdev_init(void) diff --git a/include/linux/blk-mq-dma-token.h b/include/linux/blk-mq-dma-to= ken.h new file mode 100644 index 000000000000..4a8d84addc06 --- /dev/null +++ b/include/linux/blk-mq-dma-token.h @@ -0,0 +1,60 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef BLK_MQ_DMA_TOKEN_H +#define BLK_MQ_DMA_TOKEN_H + +#include +#include +#include + +struct blk_mq_dma_token; +struct blk_mq_dma_fence; + +struct blk_mq_dma_map { + void *private; + + struct percpu_ref refs; + struct sg_table *sgt; + struct blk_mq_dma_token *token; + struct blk_mq_dma_fence *fence; + struct work_struct free_work; +}; + +struct blk_mq_dma_token { + struct dma_token base; + enum dma_data_direction dir; + + void *private; + + struct dma_buf *dmabuf; + struct blk_mq_dma_map __rcu *map; + struct request_queue *q; + + struct mutex mapping_lock; + refcount_t refs; + + atomic_t fence_seq; + u64 fence_ctx; +}; + +static inline +struct blk_mq_dma_token *dma_token_to_blk_mq(struct dma_token *token) +{ + return container_of(token, struct blk_mq_dma_token, base); +} + +blk_status_t blk_rq_assign_dma_map(struct request *req, + struct blk_mq_dma_token *token); + +static inline void blk_rq_drop_dma_map(struct request *rq) +{ + if (rq->dma_map) { + percpu_ref_put(&rq->dma_map->refs); + rq->dma_map =3D NULL; + } +} + +void blk_mq_dma_map_move_notify(struct blk_mq_dma_token *token); +struct dma_token *blk_mq_dma_map(struct request_queue *q, + struct dma_token_params *params); + +#endif /* BLK_MQ_DMA_TOKEN_H */ diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index b54506b3b76d..4745d1e183f2 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -94,6 +94,9 @@ enum mq_rq_state { MQ_RQ_COMPLETE =3D 2, }; =20 +struct blk_mq_dma_map; +struct blk_mq_dma_token; + /* * Try 
to put the fields that are referenced together in the same cachelin= e. * @@ -170,6 +173,8 @@ struct request { =20 unsigned long deadline; =20 + struct blk_mq_dma_map *dma_map; + /* * The hash is used inside the scheduler, and killed once the * request reaches the dispatch list. The ipi_list is only used @@ -675,6 +680,21 @@ struct blk_mq_ops { */ void (*map_queues)(struct blk_mq_tag_set *set); =20 + /** + * @dma_map: Allows drivers to pre-map a dmabuf. The resulting driver + * specific mapping will be wrapped into dma_token and passed to the + * read / write path in an iterator. + */ + int (*dma_map)(struct request_queue *q, struct blk_mq_dma_map *); + void (*dma_unmap)(struct request_queue *q, struct blk_mq_dma_map *); + int (*init_dma_token)(struct request_queue *q, + struct blk_mq_dma_token *token); + void (*clean_dma_token)(struct request_queue *q, + struct blk_mq_dma_token *token); + + struct dma_buf_attachment *(*dma_attach)(struct request_queue *q, + struct dma_token_params *params); + #ifdef CONFIG_BLK_DEBUG_FS /** * @show_rq: Used by the debugfs implementation to show driver-specific @@ -946,6 +966,7 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tag= set, void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset); void blk_mq_freeze_queue_nomemsave(struct request_queue *q); void blk_mq_unfreeze_queue_nomemrestore(struct request_queue *q); + static inline unsigned int __must_check blk_mq_freeze_queue(struct request_queue *q) { diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index cb4ba09959ee..dec75348f8dc 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1777,6 +1777,9 @@ struct block_device *file_bdev(struct file *bdev_file= ); bool disk_live(struct gendisk *disk); unsigned int block_size(struct block_device *bdev); =20 +struct dma_token *blkdev_dma_map(struct file *file, + struct dma_token_params *params); + #ifdef CONFIG_BLOCK void invalidate_bdev(struct block_device *bdev); int 
sync_blockdev(struct block_device *bdev); --=20 2.52.0 From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wr1-f45.google.com (mail-wr1-f45.google.com [209.85.221.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7258928850B for ; Sun, 23 Nov 2025 22:51:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938318; cv=none; b=smV7Ix+OxKghMoZfLPv08Sk472gjf3XTRXD4t1u4glC+sHwYDk0FAMRVAcXMYZKqnDj3//apndMezVQiJEPvvQA2ySv2nFvraZnQS21rxdi6lGSg4zcP1ln0PdUcz2Zw/IRfaJvYHS6vlBlsmkpcx8bBig/U4ybC05Pl54bZXBg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938318; c=relaxed/simple; bh=yxSa78i6WGSB+rLN2wnYPcV3pbfkXhbApdZHHrVnd/4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bXCbZrLOxJtzsuaplYgwBcEiJhedrxyLWiFZ1TXD5gu+qQLM45qhTIc4PcQHkHl7MDwz8VYod52wS8/avEoi0gpgbNlATW8GusCsO4kcYNI2pKq5jIw6LliMj0TKrEeD3ZMTXZMW8uY7ovxMBAHOHO6wyUYcsYAsvnmG/rqTQEw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=SWC3WtzW; arc=none smtp.client-ip=209.85.221.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="SWC3WtzW" Received: by mail-wr1-f45.google.com with SMTP id ffacd0b85a97d-42b2e9ac45aso2137531f8f.0 for ; Sun, 23 Nov 2025 14:51:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938314; x=1764543114; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=o6XYP6M4yTt1qmMNVTJGvjFxmCyRCFcZp21DmV1BJvo=; b=SWC3WtzWunmIoxQcpPcBFLPUYEJWym14GO7iRr1/3Ad36ZEWp8V202BrOiW3ZOAtA5 KF/ACbm/UiESvf5tIAV7eC+onWTD648+asz9J3Rkemi5EhydytFOTuKqqrfvQ83ejP5U Btu4HwACa5/mz0Bue4XOYVliW7GSFFcJh4oYeP+FNHk3rNqnrIk6gRNf7w/sgWtTk2E3 I2c18V+67kavlYo67p746Vre5fBSG5MgrMElJ44JJfzsV1/MDhsb60CZwIrd0ksvFdVL 2JUPt/jnoGcO9oi84wVxtL1RAgxVlVOK4WosWmrIh/zJE6wkdIgzhom/JhIed3BpmqAV +Thw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938314; x=1764543114; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=o6XYP6M4yTt1qmMNVTJGvjFxmCyRCFcZp21DmV1BJvo=; b=T434ark0PmT7hj8YIrGbtZQFun5Wu7Jn1+UXjcref3QXVZiSvpbf9YpQthSd1UBu2Z OhuAsY4hrmed2IavSHgLo+yKRJ/xT0N846VIj5tAzUxugfV5AUEXnL9roSAMp4YFtSuH Y0OwLU3+qGasMsV918OD/opsZMtKe7z5rB8W10o37qFBF5GU7kt7q7k6aKvtoZ0Tjdu0 XlGIp1ZH76N7TaPc+PbVa50XvPEwH1/is505egA9V9UghBMOJX7/JwEHsrlEsCafZYdg 24LJKJl0xKwkQNrF9H+k/ZU3bEa1zEwK64Ag/3f8kV/tlCyhq3f5zs9AmQhIza9k8NQr LIqg== X-Forwarded-Encrypted: i=1; AJvYcCX38Zrw3CjV2nIzuLIQ6DJ0hh6qh4UGY3iM2u9YLmzz/41+yv5qYa8bejMP05IKGpXtcAVqFG/Jk1Omq/U=@vger.kernel.org X-Gm-Message-State: AOJu0Yx7Sz3KKkiPLjmOsWeEeoJFJxpZ/NWpSGeYiI7w3cyhP2iwFK4p TM8GgXiqMWJ39qK/oRs8zy8z+CYc4eHbhhAQAha6xWInioFQ1PLdMjH9 X-Gm-Gg: ASbGnct/WiRZpVOwRw3ZSYR0QUYS/it4XS/b2hj4LEBfq5lAHLOSNCe7Wyeyseh39EY YOU4P4sEYArNZXMISOZPuiQNzlpMJ6WViEG0orsEEJaM2dBKtJlibIrRQul7bC33JvByRFftKlh KsWq48W6VgFs+lqvcX3q++DxkF1FYG7xm8poIAP7skI4AUgu1OnJOVIT+ZJmhJgHK8gJhXRWktO JF+x1rwidJGptQKYZiuiLP/pgfa0qUlLYChPHHoViF2IsLd9+TuNtQKC42gGB6eh1O0pZGuU0ap qrWK5G7igzj0KhfxWspK7O8Bng5eH0kjha9fFX22IUIZGlOGYBgYowGnTEvOVFpofEhXFRluf6d rct/9e6VRs88A75W/Twn0HNqm4QU2EVgKWqjHIZy4g1XYNoTw0R3bd7yHyINYmGc31BMm0lDDin xpwAH4IfcI9R3CP/lCJnTPx4q3 
X-Google-Smtp-Source: AGHT+IGE0ZjscP742EdyKh9uzu23uLZ3l6gi0agALTuRuYrf+rNDfvZsYNVo2p4Uu5MAT2Id6+WoIQ== X-Received: by 2002:a05:6000:1447:b0:3ec:dd12:54d3 with SMTP id ffacd0b85a97d-42cc1d0c37dmr9157966f8f.35.1763938313681; Sun, 23 Nov 2025 14:51:53 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:52 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Subject: [RFC v2 06/11] nvme-pci: add support for dmabuf registration Date: Sun, 23 Nov 2025 22:51:26 +0000 Message-ID: <9bc25f46d2116436d73140cd8e8554576de2caca.1763725388.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Implement dma-token related callbacks for nvme block devices. 
Signed-off-by: Pavel Begunkov --- drivers/nvme/host/pci.c | 95 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 95 insertions(+) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index e5ca8301bb8b..63e03c3dc044 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -27,6 +27,7 @@ #include #include #include +#include =20 #include "trace.h" #include "nvme.h" @@ -482,6 +483,92 @@ static void nvme_release_descriptor_pools(struct nvme_= dev *dev) } } =20 +static void nvme_dmabuf_move_notify(struct dma_buf_attachment *attach) +{ + blk_mq_dma_map_move_notify(attach->importer_priv); +} + +const struct dma_buf_attach_ops nvme_dmabuf_importer_ops =3D { + .move_notify =3D nvme_dmabuf_move_notify, + .allow_peer2peer =3D true, +}; + +static int nvme_init_dma_token(struct request_queue *q, + struct blk_mq_dma_token *token) +{ + struct dma_buf_attachment *attach; + struct nvme_ns *ns =3D q->queuedata; + struct nvme_dev *dev =3D to_nvme_dev(ns->ctrl); + struct dma_buf *dmabuf =3D token->dmabuf; + + if (dmabuf->size % NVME_CTRL_PAGE_SIZE) + return -EINVAL; + + attach =3D dma_buf_dynamic_attach(dmabuf, dev->dev, + &nvme_dmabuf_importer_ops, token); + if (IS_ERR(attach)) + return PTR_ERR(attach); + + token->private =3D attach; + return 0; +} + +static void nvme_clean_dma_token(struct request_queue *q, + struct blk_mq_dma_token *token) +{ + struct dma_buf_attachment *attach =3D token->private; + + dma_buf_detach(token->dmabuf, attach); +} + +static int nvme_dma_map(struct request_queue *q, struct blk_mq_dma_map *ma= p) +{ + struct blk_mq_dma_token *token =3D map->token; + struct dma_buf_attachment *attach =3D token->private; + unsigned nr_entries; + unsigned long tmp, i =3D 0; + struct scatterlist *sg; + struct sg_table *sgt; + dma_addr_t *dma_list; + + nr_entries =3D token->dmabuf->size / NVME_CTRL_PAGE_SIZE; + dma_list =3D kmalloc_array(nr_entries, sizeof(dma_list[0]), GFP_KERNEL); + if (!dma_list) + return -ENOMEM; + + sgt =3D 
dma_buf_map_attachment(attach, token->dir); + if (IS_ERR(sgt)) { + kfree(dma_list); + return PTR_ERR(sgt); + } + map->sgt =3D sgt; + + for_each_sgtable_dma_sg(sgt, sg, tmp) { + dma_addr_t dma =3D sg_dma_address(sg); + unsigned long sg_len =3D sg_dma_len(sg); + + while (sg_len) { + dma_list[i++] =3D dma; + dma +=3D NVME_CTRL_PAGE_SIZE; + sg_len -=3D NVME_CTRL_PAGE_SIZE; + } + } + + map->private =3D dma_list; + return 0; +} + +static void nvme_dma_unmap(struct request_queue *q, struct blk_mq_dma_map = *map) +{ + struct blk_mq_dma_token *token =3D map->token; + struct dma_buf_attachment *attach =3D token->private; + dma_addr_t *dma_list =3D map->private; + + dma_buf_unmap_attachment_unlocked(attach, map->sgt, token->dir); + map->sgt =3D NULL; + kfree(dma_list); +} + static int nvme_init_hctx_common(struct blk_mq_hw_ctx *hctx, void *data, unsigned qid) { @@ -1067,6 +1154,9 @@ static blk_status_t nvme_map_data(struct request *req) struct blk_dma_iter iter; blk_status_t ret; =20 + if (req->bio && bio_flagged(req->bio, BIO_DMA_TOKEN)) + return BLK_STS_RESOURCE; + /* * Try to skip the DMA iterator for single segment requests, as that * significantly improves performances for small I/O sizes. 
@@ -2093,6 +2183,11 @@ static const struct blk_mq_ops nvme_mq_ops =3D { .map_queues =3D nvme_pci_map_queues, .timeout =3D nvme_timeout, .poll =3D nvme_poll, + + .dma_map =3D nvme_dma_map, + .dma_unmap =3D nvme_dma_unmap, + .init_dma_token =3D nvme_init_dma_token, + .clean_dma_token =3D nvme_clean_dma_token, }; =20 static void nvme_dev_remove_admin(struct nvme_dev *dev) --=20 2.52.0 From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wr1-f52.google.com (mail-wr1-f52.google.com [209.85.221.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EAA8128DF07 for ; Sun, 23 Nov 2025 22:51:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938319; cv=none; b=Vbz5h3mGjZqIeeZ3GoJbGcgmZLdZMIY9Avcb3cK/U4dI1K49KFulTM0pxvTLUTUwbfUtXeYYGGsMbD8gcY4sp8/WoEAilahqaenpxLjmbwQJZAlvTdRR/tXrVYNzz8ywM40EkmtCLZ1tANitfGaA344/PfdsMruTpVDmv/adF0E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938319; c=relaxed/simple; bh=UXMDuvciehfdM/JpQpcfknw3TaZdhJajBPgetayGTPI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=s9Kb2ieIg0OCkWqFFV2DOqN1bYoPFUPxxkuJvDUMPrd7GAMXH9k9YOBUo5HptdB9W7OphKz0AZ5+wvdB5p0GK4Y32x+4ZLcOWKZLORtL+udsOt/Ngi591y7aufc9c4+XTlqMCk/2wVjroWn/zyikH7GtOSlYKqfhcbeqdASRNJ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=mSdY/1KA; arc=none smtp.client-ip=209.85.221.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass 
(2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mSdY/1KA" Received: by mail-wr1-f52.google.com with SMTP id ffacd0b85a97d-429c8632fcbso2222120f8f.1 for ; Sun, 23 Nov 2025 14:51:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938316; x=1764543116; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=aczXTymWXQPIGDaXnf/nBr9xYQNzFFU48NeAsCHNOPA=; b=mSdY/1KA049Dl+KwP552oSPQcLPG8jtb1WMtsdNe3I0iIAxyVIukExEx1SRjVM5Ier tRSAL8ACfC7NdWhEI7NrX6jJbBxKyNwSJfElCoUTdQ9n8bo1SRQcznuakYQ4oLioPzix hTzAgWjv9vd0knKSstQ2hUFCkz0ocA2SDp32HjBS07ZBP9XbElMN0+3/PEEIm/GP6OCp Azcye8vZnEgeF7XBgJQHariw/VvI90NBbIja/657GD7/clhJE2t5KIzbfVtFbEplbBb7 nndgyeDZ4n67flVCYVvn6WC/n9J3lwvQFUwJkSZZaxpozVFeUnfNPAWEOvfJM6tx0W7p C7/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938316; x=1764543116; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=aczXTymWXQPIGDaXnf/nBr9xYQNzFFU48NeAsCHNOPA=; b=jo8NuAtbmQGgXVVjrxpuxb8mbwo4yFpP9ciLIZNdPSHFQ4z3YIJ+Vs7C++48Wsc0Qb YAfo3YLWc6gTuSA8HG1LXSGYUqpVwe6nf0Tmd87A7GQRWPJqoR2gc7kpndhzXz7cJHTC Y9d/amghIa0+4SADlV75TQtOGfOjQduyfuJRU6GzefiUToSr/il6+jBNSP6RlpCQSYhY nMtvwCQbWmMGRQTE1p3a01BUQY7Ida99RiJy6d7UZP58ClxfnRRlz6Pxjq8xsaAmgtM5 Hp0xW9hDwmC58RJS+ghOB58MmhBsJ5gxDL4IxcOYjmFYXx9/CljKztwrwQf4NGHB44dI eQLw== X-Forwarded-Encrypted: i=1; AJvYcCXttLEC4vUKLKjx73IZ+57W7yn3mU3yCUkHilU9pXNdPFztr0FpC66Lq9B8wOlhe5yg72n/SsA8U3gtIpI=@vger.kernel.org X-Gm-Message-State: AOJu0YxOkJw1f6iCqCHXkUPALpinWfz3PyO0jsnMTPdGranzb0+IHROZ X8XpOZV6+DDEhD1+Oj0unnYOkrDcKgS7CS9VIMdk2iCsjl2MXRy1awYc X-Gm-Gg: ASbGnct6Gb+LwqOU+QA4if6bUoJdrS0YcZOQbpn93VVjHkU3lfyvKxNqYPsf30wtzSq HBv91hOUJmuexMeTduEYm9vth1mVXFXBbXfFBNWwEPSbgz35y9fJy+Zhp/r0ebl/6R+mQlRIuKS 
Fb9Zlon+zb90y9cTKGnxt7pkOq7Yhha+byKi2BaNMHZYH1GPjW0rEWIErnqGAC9nGFSMta/EccM Zco0dKSMv1Tp5E6gX7e62PP4CXnO1tzjOvoDsJyc6PAoHeF6XhUJhDivy0pjgLHiKjIsHCsoVs5 +6vanS0JB3wYq/DFMkQHTSCnRSUgsCz38Q0kxgeIdrrB4q77JwkmSjWi6hJfCqJLz1V+yb+7xYd UyPZVD09UKGG3Onk7FL+Lgcr5e2CFjUMUOxT/R1vcFtFycR9pYr+idLvD/lYVpgB99x2OMxDi+6 ndO78z0APuflCZww== X-Google-Smtp-Source: AGHT+IEhsmDeavDDsXXTdPeMYswvjoeFYMEcGIVgrZUZIxbXidFIxZRmwhVPCPHioGiavOVUtI6phA== X-Received: by 2002:a05:6000:2893:b0:42b:55f3:6196 with SMTP id ffacd0b85a97d-42cc1ab89b3mr10647235f8f.4.1763938316187; Sun, 23 Nov 2025 14:51:56 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:54 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Subject: [RFC v2 07/11] nvme-pci: implement dma_token backed requests Date: Sun, 23 Nov 2025 22:51:27 +0000 Message-ID: X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Enable BIO_DMA_TOKEN backed requests. This requires special handling to set up the nvme request from the mapping prepared in advance, to tear it down, and to sync the buffers. 
Suggested-by: Keith Busch Signed-off-by: Pavel Begunkov --- drivers/nvme/host/pci.c | 126 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 124 insertions(+), 2 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 63e03c3dc044..ac377416b088 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -797,6 +797,123 @@ static void nvme_free_descriptors(struct request *req) } } =20 +static void nvme_sync_dma(struct nvme_dev *nvme_dev, struct request *req, + enum dma_data_direction dir) +{ + struct blk_mq_dma_map *map =3D req->dma_map; + int length =3D blk_rq_payload_bytes(req); + bool for_cpu =3D dir =3D=3D DMA_FROM_DEVICE; + struct device *dev =3D nvme_dev->dev; + dma_addr_t *dma_list =3D map->private; + struct bio *bio =3D req->bio; + int offset, map_idx; + + offset =3D bio->bi_iter.bi_bvec_done; + map_idx =3D offset / NVME_CTRL_PAGE_SIZE; + length +=3D offset & (NVME_CTRL_PAGE_SIZE - 1); + + while (length > 0) { + u64 dma_addr =3D dma_list[map_idx++]; + + if (for_cpu) + __dma_sync_single_for_cpu(dev, dma_addr, + NVME_CTRL_PAGE_SIZE, dir); + else + __dma_sync_single_for_device(dev, dma_addr, + NVME_CTRL_PAGE_SIZE, dir); + length -=3D NVME_CTRL_PAGE_SIZE; + } +} + +static void nvme_unmap_premapped_data(struct nvme_dev *dev, + struct request *req) +{ + struct nvme_iod *iod =3D blk_mq_rq_to_pdu(req); + + if (rq_data_dir(req) =3D=3D READ) + nvme_sync_dma(dev, req, DMA_FROM_DEVICE); + if (!(iod->flags & IOD_SINGLE_SEGMENT)) + nvme_free_descriptors(req); +} + +static blk_status_t nvme_dma_premapped(struct request *req, + struct nvme_queue *nvmeq) +{ + struct nvme_iod *iod =3D blk_mq_rq_to_pdu(req); + int length =3D blk_rq_payload_bytes(req); + struct blk_mq_dma_map *map =3D req->dma_map; + u64 dma_addr, prp1_dma, prp2_dma; + struct bio *bio =3D req->bio; + dma_addr_t *dma_list; + dma_addr_t prp_dma; + __le64 *prp_list; + int i, map_idx; + int offset; + + dma_list =3D map->private; + + if (rq_data_dir(req) =3D=3D WRITE) + 
nvme_sync_dma(nvmeq->dev, req, DMA_TO_DEVICE); + + offset =3D bio->bi_iter.bi_bvec_done; + map_idx =3D offset / NVME_CTRL_PAGE_SIZE; + offset &=3D (NVME_CTRL_PAGE_SIZE - 1); + + prp1_dma =3D dma_list[map_idx++] + offset; + + length -=3D (NVME_CTRL_PAGE_SIZE - offset); + if (length <=3D 0) { + prp2_dma =3D 0; + goto done; + } + + if (length <=3D NVME_CTRL_PAGE_SIZE) { + prp2_dma =3D dma_list[map_idx]; + goto done; + } + + if (DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE) <=3D + NVME_SMALL_POOL_SIZE / sizeof(__le64)) + iod->flags |=3D IOD_SMALL_DESCRIPTOR; + + prp_list =3D dma_pool_alloc(nvme_dma_pool(nvmeq, iod), GFP_ATOMIC, + &prp_dma); + if (!prp_list) + return BLK_STS_RESOURCE; + + iod->descriptors[iod->nr_descriptors++] =3D prp_list; + prp2_dma =3D prp_dma; + i =3D 0; + for (;;) { + if (i =3D=3D NVME_CTRL_PAGE_SIZE >> 3) { + __le64 *old_prp_list =3D prp_list; + + prp_list =3D dma_pool_alloc(nvmeq->descriptor_pools.large, + GFP_ATOMIC, &prp_dma); + if (!prp_list) + goto free_prps; + iod->descriptors[iod->nr_descriptors++] =3D prp_list; + prp_list[0] =3D old_prp_list[i - 1]; + old_prp_list[i - 1] =3D cpu_to_le64(prp_dma); + i =3D 1; + } + + dma_addr =3D dma_list[map_idx++]; + prp_list[i++] =3D cpu_to_le64(dma_addr); + + length -=3D NVME_CTRL_PAGE_SIZE; + if (length <=3D 0) + break; + } +done: + iod->cmd.common.dptr.prp1 =3D cpu_to_le64(prp1_dma); + iod->cmd.common.dptr.prp2 =3D cpu_to_le64(prp2_dma); + return BLK_STS_OK; +free_prps: + nvme_free_descriptors(req); + return BLK_STS_RESOURCE; +} + static void nvme_free_prps(struct request *req, unsigned int attrs) { struct nvme_iod *iod =3D blk_mq_rq_to_pdu(req); @@ -875,6 +992,11 @@ static void nvme_unmap_data(struct request *req) struct device *dma_dev =3D nvmeq->dev->dev; unsigned int attrs =3D 0; =20 + if (req->bio && bio_flagged(req->bio, BIO_DMA_TOKEN)) { + nvme_unmap_premapped_data(nvmeq->dev, req); + return; + } + if (iod->flags & IOD_SINGLE_SEGMENT) { static_assert(offsetof(union nvme_data_ptr, prp1) =3D=3D 
offsetof(union nvme_data_ptr, sgl.addr)); @@ -1154,8 +1276,8 @@ static blk_status_t nvme_map_data(struct request *req) struct blk_dma_iter iter; blk_status_t ret; =20 - if (req->bio && bio_flagged(req->bio, BIO_DMA_TOKEN)) - return BLK_STS_RESOURCE; + if (req->dma_map) + return nvme_dma_premapped(req, nvmeq); =20 /* * Try to skip the DMA iterator for single segment requests, as that --=20 2.52.0 From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wr1-f52.google.com (mail-wr1-f52.google.com [209.85.221.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7960C299AAC for ; Sun, 23 Nov 2025 22:51:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938323; cv=none; b=T8tAqJGdRkUxbtuMc6ALP9ftD/SwmpgNYz4543Zvja3OoUAJUhrEyPNN+B+a/UbXgAU8BK1UBoAvudijxRMH1ePy64c0qfRwIjvlREXCNkWLBFbDGOh7V7XpH67WkRgX2LjdN0S7efr8gFZqqDrmVxF4RebP8hzJJ2XtbrK4HaU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938323; c=relaxed/simple; bh=mm+6YwU1gOqG0s75Ml2JJlbE3SsFVnXZaGl2Nl7+2l0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=DOok8cw/fvq8+BbnBgMS0hnzamOWMh2fA0PNPG77zoybv972bUThSsC8gmgP90mIbfFEQtuMBWCtKeZfsAmrHPr/Qj+E+wLOddd3gGZ3GMe5cW7yQ43E+3oqwIkuKaLClsaJd/4OWI1aF3FsyQErYGTuvRRvhMpp5KiFfnxygf0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=IzQjW9LC; arc=none smtp.client-ip=209.85.221.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; 
dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="IzQjW9LC" Received: by mail-wr1-f52.google.com with SMTP id ffacd0b85a97d-42b2dc17965so3606999f8f.3 for ; Sun, 23 Nov 2025 14:51:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938318; x=1764543118; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Cr1T3mHouFLI2INVVeu1NU8Ohycgb7ruAKm6sUveqNE=; b=IzQjW9LCSDhXQJlfox4jXG+GkY63YhHN88jNMB+XzG+H1qhx1pSDgvIoTqUNrGk6QM 8nRVgqSC9uSjklve7fATB/SRCTvXE0VD4XnfHEXLjzKh9H0ZVDv3xZ+Fa6F5sIo3P7Sz SdXUEmnHXi/QPiFgx0GBEvP2+wr6f8feICOmVCu91WYN/Q5XCXY8zU7BoOjQpyw+bx75 21wjbMcEO90dLjAqAsOVW/N0G25gmPPFzZ+e6xSa/Jf+zhKETP2AQsE2s0+nlGIWCA3n YSlOk8K8Y8ga28F/HnQExNZgqD+B9JI09Yg1bSNfcjaW6wk77bxzjUrsAhSEp1/DcYT+ COug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938318; x=1764543118; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=Cr1T3mHouFLI2INVVeu1NU8Ohycgb7ruAKm6sUveqNE=; b=abE9PebTHCrU38RF4Ud8RyD7DdDpKzOZqs814Pv5vKClOH97SHDG6xfz3J8JLaDRAy /2hkWb4utmUGPrx/U7OkV8x4wDaIwXZ9Vh6yxzczobqGWlvKp9tzqWB/4XV7jr+uzMQw yz0IEjI1eJ7y9VR7ZxClUeBMKQgt7y0HM9e3sZlpQznXH9/b0hTbeZbaZVn/TuZ1NrmV RaoRp6xo62OAUTnz0y/04fGgoeRcmVx/qsFm4yNxUV1Hr6jHPzwsdoPECrMdhoD1j9po oiihIXlRuHWhdgmFSjQu5nLdwxf8HuP0R7qb8bhF1Oh/eI2zsBZ36Lm0zmzJL+ORSPNR 5MeA== X-Forwarded-Encrypted: i=1; AJvYcCXtLpCPs0z6lUnG6M3MQxfL2Wh9xFn99Qx9fO17sFA7BdtWHyCzZwgwg5+tq2ViD2JLaNlCaIXpt2Nrv6w=@vger.kernel.org X-Gm-Message-State: AOJu0YwD//xmQnxV0lwLQAOJ3FqEfHBVx/Nj84M/Tl+oFeQFdaFacbCt ixmC/6eC2wY9PGS+qe40D+3hN5k5RTlQeIRWNsZW9Czu4UuFx28BvH8P X-Gm-Gg: ASbGnct57mTbIHAzmg98KBI74con6QH2EdMxjM1RGS/antObqQxSdHtR0U6ZAMdEuJW 
jWm/eTf8JlG2L1e377dXRRm8/aZmUMBNGUNOMHbVYC8pPv82XC3SAphyg+kZhMeSHE/Jxk006is IQkNj/OFBql+e8fXTcemRq6APvlVRA7w7d/40CHf78RhuLKWxpsy9a1uCAwpsuT3dAcUKVu9zFR Y8ClkpLyI5GHU0GxG6EMP7YzFUmT7+/RwwQyIHUYqUUPj2EwSamOpfLNqWpdNsfrDvpQHMcSGuK IElWvCHzvhuqVmnGWCP0ZO1MxiGABxqvj+0AxjC7LU8S5hD8jq7My7tsjGLhMLCX2OTbnmFAzwJ yObKoHtrlCAqt/rfo3yJSBABGroJLrZpJiPnXfmpTp6I5RySDGxxHPrZaBG5OK8oY8ELabAcGv8 sK3dfyRDoagf2ong== X-Google-Smtp-Source: AGHT+IFWQMi7JjnPU4iNqrFrw/2a7As2OTjK90JEPF3SEC3P6oOl1QIY7pfuaXnoytjEpQNeSrnvkg== X-Received: by 2002:a05:6000:200c:b0:42b:4139:579e with SMTP id ffacd0b85a97d-42cc1d1983fmr10129976f8f.43.1763938317609; Sun, 23 Nov 2025 14:51:57 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:56 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Subject: [RFC v2 08/11] io_uring/rsrc: add imu flags Date: Sun, 23 Nov 2025 22:51:28 +0000 Message-ID: <25a416c7f2673d39ae31bfe8bddcfc7eef710e71.1763725388.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Replace is_kbuf with a flags field in io_mapped_ubuf. 
There will be new flags shortly, and bit fields are often not as convenient to work with. Signed-off-by: Pavel Begunkov --- io_uring/rsrc.c | 12 ++++++------ io_uring/rsrc.h | 6 +++++- io_uring/rw.c | 3 ++- 3 files changed, 13 insertions(+), 8 deletions(-) diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 3765a50329a8..21548942e80d 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -828,7 +828,7 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, imu->folio_shift =3D PAGE_SHIFT; imu->release =3D io_release_ubuf; imu->priv =3D imu; - imu->is_kbuf =3D false; + imu->flags =3D 0; imu->dir =3D IO_IMU_DEST | IO_IMU_SOURCE; if (coalesced) imu->folio_shift =3D data.folio_shift; @@ -985,7 +985,7 @@ int io_buffer_register_bvec(struct io_uring_cmd *cmd, struct request *rq, refcount_set(&imu->refs, 1); imu->release =3D release; imu->priv =3D rq; - imu->is_kbuf =3D true; + imu->flags =3D IO_IMU_F_KBUF; imu->dir =3D 1 << rq_data_dir(rq); =20 rq_for_each_bvec(bv, rq, rq_iter) @@ -1020,7 +1020,7 @@ int io_buffer_unregister_bvec(struct io_uring_cmd *cmd, unsigned int index, ret =3D -EINVAL; goto unlock; } - if (!node->buf->is_kbuf) { + if (!(node->buf->flags & IO_IMU_F_KBUF)) { ret =3D -EBUSY; goto unlock; } @@ -1086,7 +1086,7 @@ static int io_import_fixed(int ddir, struct iov_iter *iter, =20 offset =3D buf_addr - imu->ubuf; =20 - if (imu->is_kbuf) + if (imu->flags & IO_IMU_F_KBUF) return io_import_kbuf(ddir, iter, imu, len, offset); =20 /* @@ -1511,7 +1511,7 @@ int io_import_reg_vec(int ddir, struct iov_iter *iter, iovec_off =3D vec->nr - nr_iovs; iov =3D vec->iovec + iovec_off; =20 - if (imu->is_kbuf) { + if (imu->flags & IO_IMU_F_KBUF) { int ret =3D io_kern_bvec_size(iov, nr_iovs, imu, &nr_segs); =20 if (unlikely(ret)) @@ -1549,7 +1549,7 @@ int io_import_reg_vec(int ddir, struct iov_iter *iter, req->flags |=3D REQ_F_NEED_CLEANUP; } =20 - if (imu->is_kbuf) + if (imu->flags & IO_IMU_F_KBUF) return io_vec_fill_kern_bvec(ddir, iter, imu, iov, 
nr_iovs, vec); =20 return io_vec_fill_bvec(ddir, iter, imu, iov, nr_iovs, vec); diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h index d603f6a47f5e..7c1128a856ec 100644 --- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -28,6 +28,10 @@ enum { IO_IMU_SOURCE =3D 1 << ITER_SOURCE, }; =20 +enum { + IO_IMU_F_KBUF =3D 1, +}; + struct io_mapped_ubuf { u64 ubuf; unsigned int len; @@ -37,7 +41,7 @@ struct io_mapped_ubuf { unsigned long acct_pages; void (*release)(void *); void *priv; - bool is_kbuf; + u8 flags; u8 dir; struct bio_vec bvec[] __counted_by(nr_bvecs); }; diff --git a/io_uring/rw.c b/io_uring/rw.c index a7b568c3dfe8..a3eb4e7bf992 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -706,7 +706,8 @@ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter) if ((kiocb->ki_flags & IOCB_NOWAIT) && !(kiocb->ki_filp->f_flags & O_NONBLOCK)) return -EAGAIN; - if ((req->flags & REQ_F_BUF_NODE) && req->buf_node->buf->is_kbuf) + if ((req->flags & REQ_F_BUF_NODE) && + (req->buf_node->buf->flags & IO_IMU_F_KBUF)) return -EFAULT; =20 ppos =3D io_kiocb_ppos(kiocb); --=20 2.52.0 From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wr1-f41.google.com (mail-wr1-f41.google.com [209.85.221.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F385429B766 for ; Sun, 23 Nov 2025 22:52:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938323; cv=none; b=CmMFG7NsoctpNMUIgb2e2YGP7d0iPRm/60izsVXUn3kl1YgcnzPenIrzD9X8FrqjnE7fz12SEEGPX1CfFrzz1ng2jcDJHvzwCYunqfCnV07Zw6NhIERceOnvvpRouFuttPREqpicsjLLRAitTvJJl2T5XBlU4tdm/JKeVjZKVNE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938323; c=relaxed/simple; bh=8BZ+n8Zf5MA70rF0FsKoS6ud0wPoj8V1K06OCtwyg2w=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=BJFMmaQ44uZXgZDOt4b3mgA13QMSytGuJWEuCzQZxb4yIyIvcmbJLSgLzfUlJTU4Em6gJUQxVxPFo1eNi4z/+PE/8hdEoRRqzUWkmQs052EtXVsJmHx5Q6MuuwazS0YdfqEEItwfklUIzr1r7mBZGU4Xwq5UsypnL4noT0eKgdM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=ZoisclpE; arc=none smtp.client-ip=209.85.221.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ZoisclpE" Received: by mail-wr1-f41.google.com with SMTP id ffacd0b85a97d-42b3720e58eso2850888f8f.3 for ; Sun, 23 Nov 2025 14:52:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938319; x=1764543119; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=iDOUZTVJgMkXOYK0w7myV3C8dq9FH3xgeb/1ORThyV0=; b=ZoisclpEPqMwbvYeJR7vAl5JgYNZBpN/+Fb8ZFq19Co7tvf3uA+FqEVf0llvimM4k9 ouWkXf7Y/kwgelFIa3W3JwwVDXl0vUceXrij5CgdGaantsvOe0UTBrgjEOFipWPTyKy6 IwX9r7ChX+Hdl3U8HYzO7RVqvLqeI9JSkBoKf6PoV/Jhy+VS2aFvX2vYDgcS+rD3Za8T z8c+xX+PWRKxQ3vmOBnKqWzwSQIxK6tmsSghO2MAdFztHuiP9c6uAE/Ku3lzXx2CIbgH AaJ7csUPK6RGrJDQjPnLKGJKJm5i3GPaIw0BrUC96Cebs3yFC/1njAYOL/XMiXF6CnRx e+vg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938319; x=1764543119; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=iDOUZTVJgMkXOYK0w7myV3C8dq9FH3xgeb/1ORThyV0=; 
b=k9LSE9XNa7MFBFz/Z46JxNnWuHsMOVKiqEhcI6KXYKYyO4KUWyY+/stzkYVUZUUQ7d 0gPJmtbVlqALSGF3rieCaJ8IhmzFW3yVtDo+bIuBxRFbWDgdsiCGlejCCrQCpGjAqqKl gKpSatCc2e2MYi5I4h8A2uF5+OX4kaHs+WJBihSkibEb6Lx81heyM81UAOzYGF+tWfwz szIQAmIgip/d5Em6dE/guP2Qss066bdqErXcX8wuJvvVdFtP2+iEeE8NReYvMyo7+UMF HLnGt3l5c0ayQMe0l2VQtOK3Cf2LizSkhGvuAM530GoyAFQlYL3Z+UZMWJZAoEbfwXra CK3Q== X-Forwarded-Encrypted: i=1; AJvYcCWothYaxKn2Nv3J6hU9va0h5AtChyVkxywVzbtxFSFSKdNWLUIq0h6rYlElI+bTMhOUlLHJF632tADkUC8=@vger.kernel.org X-Gm-Message-State: AOJu0Yxfn3G1ig8eYzKSgm4DeoXThbGMHZJJQmewnGieUxE0mgMOBwx+ Gfx0LY/U5yOiB9m722HKKLLLvzs+cYFBEmxTA3bYPwdMgLzRszORx0rh X-Gm-Gg: ASbGncsz2b434CMSxvtrCBr8vgGb3FoU7gS3J0S8tgL3QwXvKqaQ7UBeEjwOaTnJBXp Bjdz30ENbQHCS+8dBwTqANXB/Oyut02Cznnbzq6n1P40N9vQepS1LhTvfOe4zqd1yscG9vrt+ZE NLRUqGqFfg1hV2LK5hNJ0zbEczdNKuwFKjtoettppkfCb3qYKstfy0VlhjPmntsiqib2pyazpiS ZxSwFf1xtx2ChbnRjbzJeDJ2cXcUoEj6JpAHwPUZRplZpE1GPH0OCxPQmt1IaGzene/NPw7Od56 S4e1ZIa3S7e0Hrl11ulRzAR5fhWpBi1oEM1zVs3KAIyvgLLzoNHbpKGjNaj9rQ4CWzKHpUPzebC eJmLDa9ZwLjQ8Nqh5/ZmNAvMVjnpPont+JlwpZEN0TH+yWjuxws4Eyxuhu0jyCWttW6onVlhnv9 gsPISqkulH/LVfxQ== X-Google-Smtp-Source: AGHT+IEVGZmbP6f6xRHNG0YOhugv/nk9aBAFhUarY2FYYdxDanQ12tfVqCx9OnxuzVxk1JyVJaIagQ== X-Received: by 2002:a5d:588c:0:b0:42b:4069:428a with SMTP id ffacd0b85a97d-42cc1cd5d0bmr10157686f8f.12.1763938319012; Sun, 23 Nov 2025 14:51:59 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:51:58 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, 
linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org Subject: [RFC v2 09/11] io_uring/rsrc: extended reg buffer registration Date: Sun, 23 Nov 2025 22:51:29 +0000 Message-ID: X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We'll need to pass extra information for buffer registration apart from the iovec. Add a flag to struct io_uring_rsrc_update2 indicating that its data field points to an extended registration structure, i.e. struct io_uring_reg_buffer. For normal registration the user has to set the target_fd and dmabuf_fd fields to -1; any other combination is currently rejected. Signed-off-by: Pavel Begunkov --- include/uapi/linux/io_uring.h | 13 ++++++++- io_uring/rsrc.c | 53 +++++++++++++++++++++++++++-------- 2 files changed, 54 insertions(+), 12 deletions(-) diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index deb772222b6d..f64d1f246b93 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -765,15 +765,26 @@ struct io_uring_rsrc_update { __aligned_u64 data; }; =20 +/* struct io_uring_rsrc_update2::flags */ +enum io_uring_rsrc_reg_flags { + IORING_RSRC_F_EXTENDED_UPDATE =3D 1, +}; + struct io_uring_rsrc_update2 { __u32 offset; - __u32 resv; + __u32 flags; __aligned_u64 data; __aligned_u64 tags; __u32 nr; __u32 resv2; }; =20 +struct io_uring_reg_buffer { + __aligned_u64 iov_uaddr; + __s32 target_fd; + __s32 dmabuf_fd; +}; + /* Skip updating fd indexes set to this value in the fd table */ #define IORING_REGISTER_FILES_SKIP (-2) =20 diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 21548942e80d..691f9645d04c 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -27,7 +27,8 @@ struct io_rsrc_update { u32 offset; }; =20 
-static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, +static struct io_rsrc_node * +io_sqe_buffer_register(struct io_ring_ctx *ctx, struct io_uring_reg_buffer *rb, struct iovec *iov, struct page **last_hpage); =20 /* only define max */ @@ -234,6 +235,8 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx, =20 if (!ctx->file_table.data.nr) return -ENXIO; + if (up->flags) + return -EINVAL; if (up->offset + nr_args > ctx->file_table.data.nr) return -EINVAL; =20 @@ -288,10 +291,18 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx, return done ? done : err; } =20 +static inline void io_default_reg_buf(struct io_uring_reg_buffer *rb) +{ + memset(rb, 0, sizeof(*rb)); + rb->target_fd =3D -1; + rb->dmabuf_fd =3D -1; +} + static int __io_sqe_buffers_update(struct io_ring_ctx *ctx, struct io_uring_rsrc_update2 *up, unsigned int nr_args) { + bool extended_entry =3D up->flags & IORING_RSRC_F_EXTENDED_UPDATE; u64 __user *tags =3D u64_to_user_ptr(up->tags); struct iovec fast_iov, *iov; struct page *last_hpage =3D NULL; @@ -302,14 +313,32 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *ctx, =20 if (!ctx->buf_table.nr) return -ENXIO; + if (up->flags & ~IORING_RSRC_F_EXTENDED_UPDATE) + return -EINVAL; if (up->offset + nr_args > ctx->buf_table.nr) return -EINVAL; =20 for (done =3D 0; done < nr_args; done++) { + struct io_uring_reg_buffer rb; struct io_rsrc_node *node; u64 tag =3D 0; =20 - uvec =3D u64_to_user_ptr(user_data); + if (extended_entry) { + if (copy_from_user(&rb, u64_to_user_ptr(user_data), + sizeof(rb))) + return -EFAULT; + user_data +=3D sizeof(rb); + } else { + io_default_reg_buf(&rb); + rb.iov_uaddr =3D user_data; + + if (ctx->compat) + user_data +=3D sizeof(struct compat_iovec); + else + user_data +=3D sizeof(struct iovec); + } + + uvec =3D u64_to_user_ptr(rb.iov_uaddr); iov =3D iovec_from_user(uvec, 1, 1, &fast_iov, ctx->compat); if (IS_ERR(iov)) { err =3D PTR_ERR(iov); @@ -322,7 +351,7 @@ static int 
__io_sqe_buffers_update(struct io_ring_ctx *ctx, err =3D io_buffer_validate(iov); if (err) break; - node =3D io_sqe_buffer_register(ctx, iov, &last_hpage); + node =3D io_sqe_buffer_register(ctx, &rb, iov, &last_hpage); if (IS_ERR(node)) { err =3D PTR_ERR(node); break; @@ -337,10 +366,6 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *ctx, i =3D array_index_nospec(up->offset + done, ctx->buf_table.nr); io_reset_rsrc_node(ctx, &ctx->buf_table, i); ctx->buf_table.nodes[i] =3D node; - if (ctx->compat) - user_data +=3D sizeof(struct compat_iovec); - else - user_data +=3D sizeof(struct iovec); } return done ? done : err; } @@ -375,7 +400,7 @@ int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg, memset(&up, 0, sizeof(up)); if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update))) return -EFAULT; - if (up.resv || up.resv2) + if (up.resv2) return -EINVAL; return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args); } @@ -389,7 +414,7 @@ int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg, return -EINVAL; if (copy_from_user(&up, arg, sizeof(up))) return -EFAULT; - if (!up.nr || up.resv || up.resv2) + if (!up.nr || up.resv2) return -EINVAL; return __io_register_rsrc_update(ctx, type, &up, up.nr); } @@ -493,7 +518,7 @@ int io_files_update(struct io_kiocb *req, unsigned int issue_flags) up2.data =3D up->arg; up2.nr =3D 0; up2.tags =3D 0; - up2.resv =3D 0; + up2.flags =3D 0; up2.resv2 =3D 0; =20 if (up->offset =3D=3D IORING_FILE_INDEX_ALLOC) { @@ -778,6 +803,7 @@ bool io_check_coalesce_buffer(struct page **page_array, int nr_pages, } =20 static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, + struct io_uring_reg_buffer *rb, struct iovec *iov, struct page **last_hpage) { @@ -790,6 +816,9 @@ static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, struct io_imu_folio_data data; bool coalesced =3D false; =20 + if (rb->dmabuf_fd !=3D -1 || rb->target_fd !=3D -1) + 
return NULL; + if (!iov->iov_base) return NULL; =20 @@ -887,6 +916,7 @@ int io_sqe_buffers_register(struct io_ring_ctx *ctx, void __user *arg, memset(iov, 0, sizeof(*iov)); =20 for (i =3D 0; i < nr_args; i++) { + struct io_uring_reg_buffer rb; struct io_rsrc_node *node; u64 tag =3D 0; =20 @@ -913,7 +943,8 @@ int io_sqe_buffers_register(struct io_ring_ctx *ctx, void __user *arg, } } =20 - node =3D io_sqe_buffer_register(ctx, iov, &last_hpage); + io_default_reg_buf(&rb); + node =3D io_sqe_buffer_register(ctx, &rb, iov, &last_hpage); if (IS_ERR(node)) { ret =3D PTR_ERR(node); break; --=20 2.52.0 From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wr1-f47.google.com (mail-wr1-f47.google.com [209.85.221.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D565E27BF93 for ; Sun, 23 Nov 2025 22:52:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.221.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938326; cv=none; b=nDhGGjLmY/XFg3RsLreDb/DNU24YywVI2N6gV1GSsfS1ZO4h/x3g1RV5ajlMgJ1VaLcdhe+PpjRRIUwhvRRC3TXcymjWrAwEJU+ZqUq81MbQz7VW5pHAJM4rU+90MJr04PF8zSVmIwei61MaHkWkdx7EFGtqeMVOvbIwLJMyMKA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938326; c=relaxed/simple; bh=3Yw5gnORHPUkwdVq0k9YaTZDeI1SsW0ivfxZGTZP3Cc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cJ2lChIAQqeFXD3sVA9qdMlMUKhaZPR30/uUv56aeJpm//S9KDq0V1W2qLTLvqNRJtieRKT4J/JcX7KmRnv7x0X7/uVl64i2PtuoUuYcn8yg8HSoyqwyZeOuo8l7O0zC1TOm/G9rnBNBOY4GA6vR4uCjgohk11eARY8D3yOEVyg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=g7doDkH8; arc=none smtp.client-ip=209.85.221.47 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="g7doDkH8" Received: by mail-wr1-f47.google.com with SMTP id ffacd0b85a97d-42b39d51dcfso2146411f8f.2 for ; Sun, 23 Nov 2025 14:52:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938323; x=1764543123; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=I7U8WjZVUJLujU3iF7ZJF1k1wei+IEPh96Vfasq7vRA=; b=g7doDkH8grU2riP18oRUs8e7YFeVqVJgtvrGYpXTDCrFt7l5LfFbbFQV7UU8gCdyGp MR6NYn3HA3mxPoK15qLVKYWUrVqa2sfDDOUxXfAGmwHbFMC18bSE8YvzGj5tDWtryPks 8UjKawvGHRUpM86YqWkTWDXnmohwApBm1LnKVh3jC1QLu2SO9zJQF3JDzKeLlqccltMn yxQ8bnOZ3aomzXJKG1yB8F/V0ziKDadgldKdoc8BkUl0YoTBEp6N1E1oUQC4rlfmCYD2 uEJYUbT5DBWpVja9JCdof68MIXMc9ZEXqxDvtxe2FrU50U4bMf6S5XVfC7o8b74QdMq6 LbeA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938323; x=1764543123; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=I7U8WjZVUJLujU3iF7ZJF1k1wei+IEPh96Vfasq7vRA=; b=foNeiH/+N4KyLZDadHAbv/hA/UWdGQ7qElNkHlXQ3gLVO560jz93kjE/lclfMTNHqP hpDk8RJEh+xl0r3me6rkUcHJ+UKdcZ9qWuRhpFjCPnnZJ1zZHD1RR/PrR1aYqofjiMMw qxvWL6JXgE0z+U3I7bkwZpX5dLLNEDVJLH5Bnxr97obiRbQ7yW63asXtdgPmeRMm/PYg viiatvC+XdzIat5Cr7X1lO+Oq/rp2z/rXuRXU4G2N8iUG5Oo9mXfBom3OKUjwepssGP4 SaMIs8HidkevyhMCkJND8iPRYDkXNHlzgxy6Hjnccm5H5e0Spkc21r2AhDwEQUzgwEgi wqeg== X-Forwarded-Encrypted: i=1; AJvYcCWOCrvJV5Dn65ARuycUP8lOmrWgrPz9gczaGjVADXc5zaeFmjMB1NlBCSjDm+9b3RigWda7kdigWP63H6E=@vger.kernel.org X-Gm-Message-State: 
AOJu0Yx5stgqVTKGYOxPPxj3HzpINytn7p7nfGURXsxLGlYofWVijz0V iDOWYFgLznqe9/yMVY5qQkXaSQd5R8lv25Ov8P3x2WMqQKtoW2H+zTr+ X-Gm-Gg: ASbGncugE5mTdBFid27gS2uibkVylZvzpB2GvFrLodUCn2VG9sr1hyBSdjatNbufipt w8GtTlL5WDR3DkpqTTe4Z8KmZgyumUXQbw0i1Pa+qykVhNfUm0BSPI8r1CPtPU8PBL2cAD4Nn36 U6yAXzR1Ib59n5dSbaLH9PpIy24qv5cHdUASu61X9nmjkpaNuSGleVat/BUqUjl9YKruBmrYvK4 QDaiOJJBGCIK6UdpLwbzHdetvd34tLiNgJG4SnXhKR5OfWncpTWMiXZ40DSppZZGlnU7YR8HOF9 EUxWJQY0ekPL8l+oMQz/4vea0oYVtFlyagrpYVkbAyfT4ZGk2M4pEmf/biucH9SrUt/n/+xxkcM Mxcsq+ZJl0byBXts2XFuXuYZFeiTtWU7WeeoGq1IR161YrY0hGjy5+QufjKJkSg9vg7NeWPm9Rh MKJJsM421v/Httug== X-Google-Smtp-Source: AGHT+IHP6prALX51tQ6Iw6dY96wePGFkgrZhEc+sNQdx04+GPwBJksTSC7+04SaYBCnud+M1CKHhfg== X-Received: by 2002:a05:6000:430e:b0:42b:2e94:5a94 with SMTP id ffacd0b85a97d-42cc1cf4540mr9370759f8f.29.1763938323162; Sun, 23 Nov 2025 14:52:03 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.51.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:52:01 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, David Wei Subject: [RFC v2 10/11] io_uring/rsrc: add dmabuf-backed buffer registration Date: Sun, 23 Nov 2025 22:51:30 +0000 Message-ID: X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
Content-Type: text/plain; 
charset="utf-8" Add an ability to register a dmabuf backed io_uring buffer. It also needs to know which device to use for attachment; for that, it takes target_fd and extracts the device through the new file op. Unlike normal buffers, it also retains the target file so that any imports from ineligible requests can be rejected in the next patches. Suggested-by: Vishal Verma Suggested-by: David Wei Signed-off-by: Pavel Begunkov --- io_uring/rsrc.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++- io_uring/rsrc.h | 1 + 2 files changed, 106 insertions(+), 1 deletion(-) diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 691f9645d04c..7dfebf459dd0 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -10,6 +10,8 @@ #include #include #include +#include +#include =20 #include =20 @@ -802,6 +804,106 @@ bool io_check_coalesce_buffer(struct page **page_arra= y, int nr_pages, return true; } =20 +struct io_regbuf_dma { + struct dma_token *token; + struct file *target_file; + struct dma_buf *dmabuf; +}; + +static void io_release_reg_dmabuf(void *priv) +{ + struct io_regbuf_dma *db =3D priv; + + dma_token_release(db->token); + dma_buf_put(db->dmabuf); + fput(db->target_file); + kfree(db); +} + +static struct io_rsrc_node *io_register_dmabuf(struct io_ring_ctx *ctx, + struct io_uring_reg_buffer *rb, + struct iovec *iov) +{ + struct dma_token_params params =3D {}; + struct io_rsrc_node *node =3D NULL; + struct io_mapped_ubuf *imu =3D NULL; + struct io_regbuf_dma *regbuf =3D NULL; + struct file *target_file =3D NULL; + struct dma_buf *dmabuf =3D NULL; + struct dma_token *token; + int ret; + + if (iov->iov_base || iov->iov_len) + return ERR_PTR(-EFAULT); + + node =3D io_rsrc_node_alloc(ctx, IORING_RSRC_BUFFER); + if (!node) { + ret =3D -ENOMEM; + goto err; + } + + imu =3D io_alloc_imu(ctx, 0); + if (!imu) { + ret =3D -ENOMEM; + goto err; + } + + regbuf =3D kzalloc(sizeof(*regbuf), GFP_KERNEL); + if (!regbuf) { + ret =3D -ENOMEM; + goto err; + } + + target_file =3D fget(rb->target_fd); 
+ if (!target_file) { + ret =3D -EBADF; + goto err; + } + + dmabuf =3D dma_buf_get(rb->dmabuf_fd); + if (IS_ERR(dmabuf)) { + ret =3D PTR_ERR(dmabuf); + dmabuf =3D NULL; + goto err; + } + + params.dmabuf =3D dmabuf; + params.dir =3D DMA_BIDIRECTIONAL; + token =3D dma_token_create(target_file, &params); + if (IS_ERR(token)) { + ret =3D PTR_ERR(token); + goto err; + } + + regbuf->target_file =3D target_file; + regbuf->token =3D token; + regbuf->dmabuf =3D dmabuf; + + imu->nr_bvecs =3D 1; + imu->ubuf =3D 0; + imu->len =3D dmabuf->size; + imu->folio_shift =3D 0; + imu->release =3D io_release_reg_dmabuf; + imu->priv =3D regbuf; + imu->flags =3D IO_IMU_F_DMA; + imu->dir =3D IO_IMU_DEST | IO_IMU_SOURCE; + refcount_set(&imu->refs, 1); + node->buf =3D imu; + return node; +err: + if (regbuf) + kfree(regbuf); + if (imu) + io_free_imu(ctx, imu); + if (node) + io_cache_free(&ctx->node_cache, node); + if (target_file) + fput(target_file); + if (dmabuf) + dma_buf_put(dmabuf); + return ERR_PTR(ret); +} + static struct io_rsrc_node *io_sqe_buffer_register(struct io_ring_ctx *ctx, struct io_uring_reg_buffer *rb, struct iovec *iov, @@ -817,7 +919,7 @@ static struct io_rsrc_node *io_sqe_buffer_register(stru= ct io_ring_ctx *ctx, bool coalesced =3D false; =20 if (rb->dmabuf_fd !=3D -1 || rb->target_fd !=3D -1) - return NULL; + return io_register_dmabuf(ctx, rb, iov); =20 if (!iov->iov_base) return NULL; @@ -1117,6 +1219,8 @@ static int io_import_fixed(int ddir, struct iov_iter = *iter, =20 offset =3D buf_addr - imu->ubuf; =20 + if (imu->flags & IO_IMU_F_DMA) + return -EOPNOTSUPP; if (imu->flags & IO_IMU_F_KBUF) return io_import_kbuf(ddir, iter, imu, len, offset); =20 diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h index 7c1128a856ec..280d3988abf3 100644 --- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -30,6 +30,7 @@ enum { =20 enum { IO_IMU_F_KBUF =3D 1, + IO_IMU_F_DMA =3D 2, }; =20 struct io_mapped_ubuf { --=20 2.52.0 From nobody Tue Dec 2 01:22:05 2025 Received: from mail-wm1-f54.google.com 
(mail-wm1-f54.google.com [209.85.128.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5FF6E2BE7B6 for ; Sun, 23 Nov 2025 22:52:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938328; cv=none; b=YuwoKctRasfVyX3dYN/dpJg1IFG8v+Sh43JH1+V1vZTQhfY947aRR49pxfpGZUiTVuXPug79hI+p6q/CVsJ2+J+H4+5hCsDTpIMMRYsNA9KAzAIAvgjuWSAVVQvS6w8M+6K9aa+ymsNOEULEFwdAl6ZDbcdGGymB4eP7YEQRE08= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1763938328; c=relaxed/simple; bh=EXYOheJb4cgW5yVlz0jhAvYk6zmI9HBqPvYTaeMxyhc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WutuEjW/jx34kvs2MAxCtLFCLwr4X8kGurkPZmUgryAu7dyO+dw9wcQSNZYMB73E0y+Td6G41K+xzgckkecvIlANsziMwow2f8y3tQwLbF9c3kxZDZfQ14P4DeiyNY/4lNwwc9a/MxtSUQjuDq/ZYFPfxu6gxISpQyIFKy6YHOo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=P4NZQWfe; arc=none smtp.client-ip=209.85.128.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="P4NZQWfe" Received: by mail-wm1-f54.google.com with SMTP id 5b1f17b1804b1-47778b23f64so19064375e9.0 for ; Sun, 23 Nov 2025 14:52:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1763938325; x=1764543125; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date 
:message-id:reply-to; bh=IjKpkoDBMFXqwI53pRDJnxFi7xwhurS0a3ahU0JbCgI=; b=P4NZQWfeyO45qY/ul+V8nOpW0TduYosMmKvc9jP8E//7n1Kuz9XEHBR4MP35H2QEL5 sQRRFXi6orkEra+c606COFw3gtGmyj+6yzk5qdWcWgLrcu7x0zgXa+NzdmCe1X5WBxd5 2TQg4+n65KIoUGwQsJv1jNvtjWulO90MfvFO+JXnPVSvGvILHUjAwEhKzhHOkaVdbdeS e04WvpRe3TwvvPvNrTEMEhZNSNtaZR2OkwX3c5bhUE0WbdPku1toldN1q+KJgEb9N77S PJlAGJjHMosHvFnc+GocB72N9HRjxpKmslvSu2l9TXToP1nIWZQ2KQCLN8jmO+0NsQx4 7bOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1763938325; x=1764543125; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-gg:x-gm-message-state:from :to:cc:subject:date:message-id:reply-to; bh=IjKpkoDBMFXqwI53pRDJnxFi7xwhurS0a3ahU0JbCgI=; b=dtdxpH8/Cf+3OTwWUrg6x5pzG+8it3DiQKyzb+2tNc9I73HoCeT7yKj8VcIZd5YGyi LzNGX0dz1w8IzfU0gZDE12Fc0xRbHOa1NwRA1TvsymZPFbbpgCZigL/f5nDxBq0CLydB GV9JMv8Cp/jWc8eJWxGy0PysEMgPolu8efrgNjXzq63P/DHH0QV34Eh7GjD+1l4xgm0v 0U3/rmukRycVqt2bo1BEbspaHs54GsDUyVDtp1Nm6I0hQbut/SsXK67n31CRr7eUazhf 1lrHcOOlwxjWGith1NxIixelksAIRTamwHXv9Hw6qzdhQoRjPfCHJofDnzfH1LLFQh2v GDWw== X-Forwarded-Encrypted: i=1; AJvYcCWGixTNZXbEDY+R+yr0HwqZWIWJTFO6/HiWScBobzY+NJ66xMi58YwrDhXVqg1D9WeoDEip/UVkBjudSmQ=@vger.kernel.org X-Gm-Message-State: AOJu0YxpR/9sbIjkaKUQppVjBHkwUjU2odNiq36SnGNoY43rf6m/2xgg SGBMM2QEWfmrk6DezWfkJwcw6AEvIDRo8jy00+plwfO6G/JzGetWVyfb X-Gm-Gg: ASbGnct8wrnODwFrFDbC4G6vMpkCwAH6iAzoZT33O7KfZW/cuUR3CYD1BUYKxJnVFCg Iwi/sxPMrOcFFwm9pPg+fOLhir8Va8+8kXaJpyacB2x8KshmidLGqjR7XuasB2U58xYzUu0Apgx 5gT6gtL+ZkhGoBKdXSXyXucOSDHt2VFGirXa58VnNdu9bsotOMPtUtHLFHakIxjOyvN6Y0g/0qv G8NcGM5PezA44NUVR7d76RL8bijRGHIhHdsQS5bOlxDDO02zDepSb86741cXaoHgheFNgbEj5ti WwqO3EZb6DMFezHiNycDVaL+ilsQDzfdmqS6aeVF1pHHBm+JOEqokL3+lDmaOVw20ol94YgyQ6t qNMun9q606rYFnnQb+82DHTPSD7kw4xpgxucwJ7RkjAY4wfym9a9Viz3O3l8Tc4sbqKMwK+MFD8 vHRuxmKbZxj96ZPw== X-Google-Smtp-Source: AGHT+IG0SGMWMdVe66Rksvn6v5FBe6opsohs/EM0p8y743GggDbgal5u09Oo75v6mEVGyS+Z85JlHQ== X-Received: by 
2002:a05:600c:8b35:b0:477:832c:86ae with SMTP id 5b1f17b1804b1-477c111b94fmr113406075e9.12.1763938324588; Sun, 23 Nov 2025 14:52:04 -0800 (PST) Received: from 127.mynet ([2a01:4b00:bd21:4f00:7cc6:d3ca:494:116c]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-42cb7fb9190sm24849064f8f.33.2025.11.23.14.52.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 23 Nov 2025 14:52:03 -0800 (PST) From: Pavel Begunkov To: linux-block@vger.kernel.org, io-uring@vger.kernel.org Cc: Vishal Verma , tushar.gohad@intel.com, Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Alexander Viro , Christian Brauner , Andrew Morton , Sumit Semwal , =?UTF-8?q?Christian=20K=C3=B6nig?= , Pavel Begunkov , linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org, David Wei Subject: [RFC v2 11/11] io_uring/rsrc: implement dmabuf regbuf import Date: Sun, 23 Nov 2025 22:51:31 +0000 Message-ID: <44e4ad8c4bd72856379c368e4303090c44c9e98e.1763725388.git.asml.silence@gmail.com> X-Mailer: git-send-email 2.52.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allow importing dmabuf backed registered buffers. It's an opt-in feature for requests and they need to pass a flag allowing it. Furthermore, the import will fail if the request's file doesn't match the file for which the buffer was registered. This way, it's also limited to files that support the feature by implementing the corresponding file op. Enable it for read/write requests. 
Suggested-by: David Wei Suggested-by: Vishal Verma Signed-off-by: Pavel Begunkov --- io_uring/rsrc.c | 36 +++++++++++++++++++++++++++++------- io_uring/rsrc.h | 16 +++++++++++++++- io_uring/rw.c | 4 ++-- 3 files changed, 46 insertions(+), 10 deletions(-) diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 7dfebf459dd0..a5d88dae536e 100644 --- a/io_uring/rsrc.c +++ b/io_uring/rsrc.c @@ -1201,9 +1201,27 @@ static int io_import_kbuf(int ddir, struct iov_iter = *iter, return 0; } =20 -static int io_import_fixed(int ddir, struct iov_iter *iter, +static int io_import_dmabuf(struct io_kiocb *req, + int ddir, struct iov_iter *iter, struct io_mapped_ubuf *imu, - u64 buf_addr, size_t len) + size_t len, size_t offset) +{ + struct io_regbuf_dma *db =3D imu->priv; + + if (!len) + return -EFAULT; + if (req->file !=3D db->target_file) + return -EBADF; + + iov_iter_dma_token(iter, ddir, db->token, offset, len); + return 0; +} + +static int io_import_fixed(struct io_kiocb *req, + int ddir, struct iov_iter *iter, + struct io_mapped_ubuf *imu, + u64 buf_addr, size_t len, + unsigned import_flags) { const struct bio_vec *bvec; size_t folio_mask; @@ -1219,8 +1237,11 @@ static int io_import_fixed(int ddir, struct iov_iter= *iter, =20 offset =3D buf_addr - imu->ubuf; =20 - if (imu->flags & IO_IMU_F_DMA) - return -EOPNOTSUPP; + if (imu->flags & IO_IMU_F_DMA) { + if (!(import_flags & IO_REGBUF_IMPORT_ALLOW_DMA)) + return -EFAULT; + return io_import_dmabuf(req, ddir, iter, imu, len, offset); + } if (imu->flags & IO_IMU_F_KBUF) return io_import_kbuf(ddir, iter, imu, len, offset); =20 @@ -1274,16 +1295,17 @@ inline struct io_rsrc_node *io_find_buf_node(struct= io_kiocb *req, return NULL; } =20 -int io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter, +int __io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter, u64 buf_addr, size_t len, int ddir, - unsigned issue_flags) + unsigned issue_flags, unsigned import_flags) { struct io_rsrc_node *node; =20 node =3D 
io_find_buf_node(req, issue_flags); if (!node) return -EFAULT; - return io_import_fixed(ddir, iter, node->buf, buf_addr, len); + return io_import_fixed(req, ddir, iter, node->buf, buf_addr, len, + import_flags); } =20 /* Lock two rings at once. The rings must be different! */ diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h index 280d3988abf3..e0eafce976f3 100644 --- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -33,6 +33,10 @@ enum { IO_IMU_F_DMA =3D 2, }; =20 +enum { + IO_REGBUF_IMPORT_ALLOW_DMA =3D 1, +}; + struct io_mapped_ubuf { u64 ubuf; unsigned int len; @@ -66,9 +70,19 @@ int io_rsrc_data_alloc(struct io_rsrc_data *data, unsign= ed nr); =20 struct io_rsrc_node *io_find_buf_node(struct io_kiocb *req, unsigned issue_flags); +int __io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter, + u64 buf_addr, size_t len, int ddir, + unsigned issue_flags, unsigned import_flags); + +static inline int io_import_reg_buf(struct io_kiocb *req, struct iov_iter *iter, u64 buf_addr, size_t len, int ddir, - unsigned issue_flags); + unsigned issue_flags) +{ + return __io_import_reg_buf(req, iter, buf_addr, len, ddir, + issue_flags, 0); +} + int io_import_reg_vec(int ddir, struct iov_iter *iter, struct io_kiocb *req, struct iou_vec *vec, unsigned nr_iovs, unsigned issue_flags); diff --git a/io_uring/rw.c b/io_uring/rw.c index a3eb4e7bf992..0d9d99695801 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -374,8 +374,8 @@ static int io_init_rw_fixed(struct io_kiocb *req, unsig= ned int issue_flags, if (io->bytes_done) return 0; =20 - ret =3D io_import_reg_buf(req, &io->iter, rw->addr, rw->len, ddir, - issue_flags); + ret =3D __io_import_reg_buf(req, &io->iter, rw->addr, rw->len, ddir, + issue_flags, IO_REGBUF_IMPORT_ALLOW_DMA); iov_iter_save_state(&io->iter, &io->iter_state); return ret; } --=20 2.52.0
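For readers following the series, the DMA branch that this patch adds to io_import_fixed() can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: the two flag values are copied from the patch, while struct model_file, struct model_regbuf and model_import() are hypothetical stand-ins invented for illustration. The range and direction checks that precede the branch in the kernel are not modelled, and neither is the iov_iter setup.

```c
#include <errno.h>
#include <stddef.h>

/* Flag values as they appear in the patch (io_uring/rsrc.h). */
enum { IO_IMU_F_KBUF = 1, IO_IMU_F_DMA = 2 };
enum { IO_REGBUF_IMPORT_ALLOW_DMA = 1 };

/* Hypothetical stand-in for struct file; only pointer identity matters. */
struct model_file { int id; };

/* Hypothetical stand-in for io_mapped_ubuf plus its io_regbuf_dma priv. */
struct model_regbuf {
	unsigned flags;
	const struct model_file *target_file;
};

/*
 * Model of the decision made for a registered buffer import:
 *  - a DMA-backed buffer is refused unless the caller opted in,
 *  - zero-length imports are refused,
 *  - the request's file must be the file the buffer was registered for.
 * Returns 0 where the kernel would go on to set up the iterator.
 */
static int model_import(const struct model_file *req_file,
			const struct model_regbuf *buf,
			size_t len, unsigned import_flags)
{
	if (buf->flags & IO_IMU_F_DMA) {
		if (!(import_flags & IO_REGBUF_IMPORT_ALLOW_DMA))
			return -EFAULT;
		if (!len)
			return -EFAULT;
		if (req_file != buf->target_file)
			return -EBADF;
		return 0;
	}
	/* Non-DMA buffers (kbuf, bvec) are outside this sketch. */
	return 0;
}
```

In this model a caller that goes through the plain io_import_reg_buf() wrapper corresponds to import_flags == 0 and gets -EFAULT for a DMA-backed buffer, while the read/write path corresponds to passing IO_REGBUF_IMPORT_ALLOW_DMA and succeeds only for the file the buffer was registered against.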