From: Pavel Begunkov
To: linux-block@vger.kernel.org, io-uring@vger.kernel.org
Cc: Vishal Verma, tushar.gohad@intel.com, Keith Busch, Jens Axboe,
	Christoph Hellwig, Sagi Grimberg, Alexander Viro, Christian Brauner,
	Andrew Morton, Sumit Semwal, Christian König, Pavel Begunkov,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, linux-media@vger.kernel.org,
	dri-devel@lists.freedesktop.org, linaro-mm-sig@lists.linaro.org
Subject: [RFC v2 07/11] nvme-pci: implement dma_token backed requests
Date: Sun, 23 Nov 2025 22:51:27 +0000
X-Mailer: git-send-email 2.52.0

Enable BIO_DMA_TOKEN backed requests. They require special handling to
set up the nvme request from the mapping prepared in advance, to tear
it down afterwards, and to sync the buffers.
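Not part of the patch, but for readers new to NVMe PRPs: below is a
minimal user-space sketch of the arithmetic nvme_dma_premapped()
performs with bio->bi_iter.bi_bvec_done, turning the I/O's byte offset
within the premapped buffer into an index into the token's per-page
DMA address list plus an in-page offset. All values are stand-ins.

/* Illustration only; compiles with any C compiler. */
#include <stdio.h>

#define NVME_CTRL_PAGE_SIZE 4096

int main(void)
{
	/* Pretend premapped token: one DMA address per controller page. */
	unsigned long long dma_list[] = {
		0x80000000ULL, 0x80001000ULL, 0x80002000ULL, 0x80003000ULL,
	};
	unsigned int offset = 6000;	/* like bio->bi_iter.bi_bvec_done */
	unsigned int map_idx = offset / NVME_CTRL_PAGE_SIZE;	/* -> 1 */

	offset &= NVME_CTRL_PAGE_SIZE - 1;			/* -> 1904 */

	/* PRP1 may point into the middle of a controller page; all
	 * subsequent PRP entries must be page aligned. */
	printf("prp1 = 0x%llx\n", dma_list[map_idx] + offset);
	return 0;
}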
Suggested-by: Keith Busch
Signed-off-by: Pavel Begunkov
---
 drivers/nvme/host/pci.c | 126 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 124 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 63e03c3dc044..ac377416b088 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -797,6 +797,123 @@ static void nvme_free_descriptors(struct request *req)
 	}
 }
 
+static void nvme_sync_dma(struct nvme_dev *nvme_dev, struct request *req,
+			  enum dma_data_direction dir)
+{
+	struct blk_mq_dma_map *map = req->dma_map;
+	int length = blk_rq_payload_bytes(req);
+	bool for_cpu = dir == DMA_FROM_DEVICE;
+	struct device *dev = nvme_dev->dev;
+	dma_addr_t *dma_list = map->private;
+	struct bio *bio = req->bio;
+	int offset, map_idx;
+
+	offset = bio->bi_iter.bi_bvec_done;
+	map_idx = offset / NVME_CTRL_PAGE_SIZE;
+	length += offset & (NVME_CTRL_PAGE_SIZE - 1);
+
+	while (length > 0) {
+		u64 dma_addr = dma_list[map_idx++];
+
+		if (for_cpu)
+			__dma_sync_single_for_cpu(dev, dma_addr,
+						  NVME_CTRL_PAGE_SIZE, dir);
+		else
+			__dma_sync_single_for_device(dev, dma_addr,
+						     NVME_CTRL_PAGE_SIZE, dir);
+		length -= NVME_CTRL_PAGE_SIZE;
+	}
+}
+
+static void nvme_unmap_premapped_data(struct nvme_dev *dev,
+				      struct request *req)
+{
+	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+
+	if (rq_data_dir(req) == READ)
+		nvme_sync_dma(dev, req, DMA_FROM_DEVICE);
+	if (!(iod->flags & IOD_SINGLE_SEGMENT))
+		nvme_free_descriptors(req);
+}
+
+static blk_status_t nvme_dma_premapped(struct request *req,
+				       struct nvme_queue *nvmeq)
+{
+	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+	int length = blk_rq_payload_bytes(req);
+	struct blk_mq_dma_map *map = req->dma_map;
+	u64 dma_addr, prp1_dma, prp2_dma;
+	struct bio *bio = req->bio;
+	dma_addr_t *dma_list;
+	dma_addr_t prp_dma;
+	__le64 *prp_list;
+	int i, map_idx;
+	int offset;
+
+	dma_list = map->private;
+
+	if (rq_data_dir(req) == WRITE)
+		nvme_sync_dma(nvmeq->dev, req, DMA_TO_DEVICE);
+
+	offset = bio->bi_iter.bi_bvec_done;
+	map_idx = offset / NVME_CTRL_PAGE_SIZE;
+	offset &= (NVME_CTRL_PAGE_SIZE - 1);
+
+	prp1_dma = dma_list[map_idx++] + offset;
+
+	length -= (NVME_CTRL_PAGE_SIZE - offset);
+	if (length <= 0) {
+		prp2_dma = 0;
+		goto done;
+	}
+
+	if (length <= NVME_CTRL_PAGE_SIZE) {
+		prp2_dma = dma_list[map_idx];
+		goto done;
+	}
+
+	if (DIV_ROUND_UP(length, NVME_CTRL_PAGE_SIZE) <=
+	    NVME_SMALL_POOL_SIZE / sizeof(__le64))
+		iod->flags |= IOD_SMALL_DESCRIPTOR;
+
+	prp_list = dma_pool_alloc(nvme_dma_pool(nvmeq, iod), GFP_ATOMIC,
+				  &prp_dma);
+	if (!prp_list)
+		return BLK_STS_RESOURCE;
+
+	iod->descriptors[iod->nr_descriptors++] = prp_list;
+	prp2_dma = prp_dma;
+	i = 0;
+	for (;;) {
+		if (i == NVME_CTRL_PAGE_SIZE >> 3) {
+			__le64 *old_prp_list = prp_list;
+
+			prp_list = dma_pool_alloc(nvmeq->descriptor_pools.large,
+						  GFP_ATOMIC, &prp_dma);
+			if (!prp_list)
+				goto free_prps;
+			iod->descriptors[iod->nr_descriptors++] = prp_list;
+			prp_list[0] = old_prp_list[i - 1];
+			old_prp_list[i - 1] = cpu_to_le64(prp_dma);
+			i = 1;
+		}
+
+		dma_addr = dma_list[map_idx++];
+		prp_list[i++] = cpu_to_le64(dma_addr);
+
+		length -= NVME_CTRL_PAGE_SIZE;
+		if (length <= 0)
+			break;
+	}
+done:
+	iod->cmd.common.dptr.prp1 = cpu_to_le64(prp1_dma);
+	iod->cmd.common.dptr.prp2 = cpu_to_le64(prp2_dma);
+	return BLK_STS_OK;
+free_prps:
+	nvme_free_descriptors(req);
+	return BLK_STS_RESOURCE;
+}
+
 static void nvme_free_prps(struct request *req, unsigned int attrs)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -875,6 +992,11 @@ static void nvme_unmap_data(struct request *req)
 	struct device *dma_dev = nvmeq->dev->dev;
 	unsigned int attrs = 0;
 
+	if (req->bio && bio_flagged(req->bio, BIO_DMA_TOKEN)) {
+		nvme_unmap_premapped_data(nvmeq->dev, req);
+		return;
+	}
+
 	if (iod->flags & IOD_SINGLE_SEGMENT) {
 		static_assert(offsetof(union nvme_data_ptr, prp1) ==
 				offsetof(union nvme_data_ptr, sgl.addr));
@@ -1154,8 +1276,8 @@ static blk_status_t nvme_map_data(struct request *req)
 	struct blk_dma_iter iter;
 	blk_status_t ret;
 
-	if (req->bio && bio_flagged(req->bio, BIO_DMA_TOKEN))
-		return BLK_STS_RESOURCE;
+	if (req->dma_map)
+		return nvme_dma_premapped(req, nvmeq);
 
 	/*
 	 * Try to skip the DMA iterator for single segment requests, as that
-- 
2.52.0
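The least obvious part of nvme_dma_premapped() above is the PRP-list
chaining in the for (;;) loop: a list holds NVME_CTRL_PAGE_SIZE / 8 =
512 entries, and when one fills up, its last slot is turned into a
pointer to a freshly allocated list, with the displaced address moved
to slot 0 of the new list, mirroring the driver's existing PRP setup
path. Here is a user-space sketch of just that rule, with malloc()
standing in for dma_pool_alloc(), a fake bus address, and an invented
helper name (prp_append); error handling and endianness conversion
are omitted.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define ENTRIES_PER_LIST 512	/* NVME_CTRL_PAGE_SIZE >> 3 */

struct prp_chain {
	uint64_t *list;		/* list currently being filled */
	int i;			/* next free slot */
};

/* Append one page-aligned DMA address, chaining a new list when full. */
static void prp_append(struct prp_chain *c, uint64_t page_dma)
{
	if (c->i == ENTRIES_PER_LIST) {
		uint64_t *nl = calloc(ENTRIES_PER_LIST, sizeof(*nl));
		/* The kernel uses the bus address returned by
		 * dma_pool_alloc(); the virtual address stands in here. */
		uint64_t nl_dma = (uint64_t)(uintptr_t)nl;

		nl[0] = c->list[c->i - 1];	/* displaced entry moves over */
		c->list[c->i - 1] = nl_dma;	/* old slot becomes chain ptr */
		c->list = nl;
		c->i = 1;
	}
	c->list[c->i++] = page_dma;
}

int main(void)
{
	struct prp_chain c = { calloc(ENTRIES_PER_LIST, sizeof(uint64_t)), 0 };

	for (int n = 0; n < 600; n++)	/* 600 pages -> two chained lists */
		prp_append(&c, 0x80000000ULL + n * 4096ULL);
	printf("next free slot in second list: %d\n", c.i);
	return 0;
}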