From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Caleb Sander Mateos, Uday Shankar, Stefani Seibold, Andrew Morton,
    linux-kernel@vger.kernel.org, Ming Lei
Subject: [PATCH V4 25/27] selftests: ublk: handle UBLK_U_IO_FETCH_IO_CMDS
Date: Fri, 21 Nov 2025 09:58:47 +0800
Message-ID: <20251121015851.3672073-26-ming.lei@redhat.com>
In-Reply-To: <20251121015851.3672073-1-ming.lei@redhat.com>
References: <20251121015851.3672073-1-ming.lei@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add support for UBLK_U_IO_FETCH_IO_CMDS to enable efficient batch fetching
of I/O commands using multishot io_uring operations.

Key improvements:
- Implement multishot UBLK_U_IO_FETCH_IO_CMDS for continuous command fetching
- Add fetch buffer management with page-aligned, mlocked buffers
- Process fetched I/O command tags from kernel-provided buffers
- Integrate fetch operations with existing batch I/O infrastructure
- Significantly reduce uring_cmd issuing overhead through batching

The implementation uses two fetch buffers per thread with automatic
requeuing to maintain continuous I/O command flow. Each fetch operation
retrieves multiple command tags in a single syscall, dramatically
improving performance compared to individual command fetching.

Technical details:
- Fetch buffers are page-aligned and mlocked for optimal performance
- Uses IORING_URING_CMD_MULTISHOT for continuous operation
- Automatic buffer management and requeuing on completion
- Enhanced CQE handling for fetch command completions

Signed-off-by: Ming Lei
---
 tools/testing/selftests/ublk/batch.c | 134 ++++++++++++++++++++++++++-
 tools/testing/selftests/ublk/kublk.c |  14 ++-
 tools/testing/selftests/ublk/kublk.h |  13 +++
 3 files changed, 157 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/ublk/batch.c b/tools/testing/selftests/ublk/batch.c
index e240d4decedf..7db91f910944 100644
--- a/tools/testing/selftests/ublk/batch.c
+++ b/tools/testing/selftests/ublk/batch.c
@@ -140,15 +140,63 @@ void ublk_batch_prepare(struct ublk_thread *t)
 		      t->nr_bufs);
 }
 
+static void free_batch_fetch_buf(struct ublk_thread *t)
+{
+	int i;
+
+	for (i = 0; i < UBLKS_T_NR_FETCH_BUF; i++) {
+		io_uring_free_buf_ring(&t->ring, t->fetch[i].br, 1, i);
+		munlock(t->fetch[i].fetch_buf, t->fetch[i].fetch_buf_size);
+		free(t->fetch[i].fetch_buf);
+	}
+}
+
+static int alloc_batch_fetch_buf(struct ublk_thread *t)
+{
+	/* page aligned fetch buffer, and it is mlocked for speedup delivery */
+	unsigned pg_sz = getpagesize();
+	unsigned buf_size = round_up(t->dev->dev_info.queue_depth * 2, pg_sz);
+	int ret;
+	int i = 0;
+
+	for (i = 0; i < UBLKS_T_NR_FETCH_BUF; i++) {
+		t->fetch[i].fetch_buf_size = buf_size;
+
+		if (posix_memalign((void **)&t->fetch[i].fetch_buf, pg_sz,
+				   t->fetch[i].fetch_buf_size))
+			return -ENOMEM;
+
+		/* lock fetch buffer page for fast fetching */
+		if (mlock(t->fetch[i].fetch_buf, t->fetch[i].fetch_buf_size))
+			ublk_err("%s: can't lock fetch buffer %s\n", __func__,
+				 strerror(errno));
+		t->fetch[i].br = io_uring_setup_buf_ring(&t->ring, 1,
+				i, IOU_PBUF_RING_INC, &ret);
+		if (!t->fetch[i].br) {
+			ublk_err("Buffer ring register failed %d\n", ret);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
 int ublk_batch_alloc_buf(struct ublk_thread *t)
 {
+	int ret;
+
 	ublk_assert(t->nr_commit_buf < 16);
-	return alloc_batch_commit_buf(t);
+
+	ret = alloc_batch_commit_buf(t);
+	if (ret)
+		return ret;
+	return alloc_batch_fetch_buf(t);
 }
 
 void ublk_batch_free_buf(struct ublk_thread *t)
 {
 	free_batch_commit_buf(t);
+	free_batch_fetch_buf(t);
 }
 
 static void ublk_init_batch_cmd(struct ublk_thread *t, __u16 q_id,
@@ -199,6 +247,76 @@ static void ublk_setup_commit_sqe(struct ublk_thread *t,
 	cmd->flags |= t->cmd_flags;
 }
 
+static void ublk_batch_queue_fetch(struct ublk_thread *t,
+				   struct ublk_queue *q,
+				   unsigned short buf_idx)
+{
+	unsigned short nr_elem = t->fetch[buf_idx].fetch_buf_size / 2;
+	struct io_uring_sqe *sqe;
+
+	io_uring_buf_ring_add(t->fetch[buf_idx].br, t->fetch[buf_idx].fetch_buf,
+			      t->fetch[buf_idx].fetch_buf_size,
+			      0, 0, 0);
+	io_uring_buf_ring_advance(t->fetch[buf_idx].br, 1);
+
+	ublk_io_alloc_sqes(t, &sqe, 1);
+
+	ublk_init_batch_cmd(t, q->q_id, sqe, UBLK_U_IO_FETCH_IO_CMDS, 2, nr_elem,
+			    buf_idx);
+
+	sqe->rw_flags = IORING_URING_CMD_MULTISHOT;
+	sqe->buf_group = buf_idx;
+	sqe->flags |= IOSQE_BUFFER_SELECT;
+
+	t->fetch[buf_idx].fetch_buf_off = 0;
+}
+
+void ublk_batch_start_fetch(struct ublk_thread *t,
+			    struct ublk_queue *q)
+{
+	int i;
+
+	for (i = 0; i < UBLKS_T_NR_FETCH_BUF; i++)
+		ublk_batch_queue_fetch(t, q, i);
+}
+
+static unsigned short ublk_compl_batch_fetch(struct ublk_thread *t,
+					     struct ublk_queue *q,
+					     const struct io_uring_cqe *cqe)
+{
+	unsigned short buf_idx = user_data_to_tag(cqe->user_data);
+	unsigned start = t->fetch[buf_idx].fetch_buf_off;
+	unsigned end = start + cqe->res;
+	void *buf = t->fetch[buf_idx].fetch_buf;
+	int i;
+
+	if (cqe->res < 0)
+		return buf_idx;
+
+	if ((end - start) / 2 > q->q_depth) {
+		ublk_err("%s: fetch duplicated ios offset %u count %u\n", __func__, start, cqe->res);
+
+		for (i = start; i < end; i += 2) {
+			unsigned short tag = *(unsigned short *)(buf + i);
+
+			ublk_err("%u ", tag);
+		}
+		ublk_err("\n");
+	}
+
+	for (i = start; i < end; i += 2) {
+		unsigned short tag = *(unsigned short *)(buf + i);
+
+		if (tag >= q->q_depth)
+			ublk_err("%s: bad tag %u\n", __func__, tag);
+
+		if (q->tgt_ops->queue_io)
+			q->tgt_ops->queue_io(t, q, tag);
+	}
+	t->fetch[buf_idx].fetch_buf_off = end;
+	return buf_idx;
+}
+
 int ublk_batch_queue_prep_io_cmds(struct ublk_thread *t, struct ublk_queue *q)
 {
 	unsigned short nr_elem = q->q_depth;
@@ -258,12 +376,26 @@ void ublk_batch_compl_cmd(struct ublk_thread *t,
 			  const struct io_uring_cqe *cqe)
 {
 	unsigned op = user_data_to_op(cqe->user_data);
+	struct ublk_queue *q;
+	unsigned buf_idx;
+	unsigned q_id;
 
 	if (op == _IOC_NR(UBLK_U_IO_PREP_IO_CMDS) ||
 	    op == _IOC_NR(UBLK_U_IO_COMMIT_IO_CMDS)) {
 		ublk_batch_compl_commit_cmd(t, cqe, op);
 		return;
 	}
+
+	/* FETCH command is per queue */
+	q_id = user_data_to_q_id(cqe->user_data);
+	q = &t->dev->q[q_id];
+	buf_idx = ublk_compl_batch_fetch(t, q, cqe);
+
+	if (cqe->res < 0 && cqe->res != -ENOBUFS) {
+		t->state |= UBLKS_T_STOPPING;
+	} else if (!(cqe->flags & IORING_CQE_F_MORE) || cqe->res == -ENOBUFS) {
+		ublk_batch_queue_fetch(t, q, buf_idx);
+	}
 }
 
 void ublk_batch_commit_io_cmds(struct ublk_thread *t)
diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c
index 6565e804679c..cb329c7aebc4 100644
--- a/tools/testing/selftests/ublk/kublk.c
+++ b/tools/testing/selftests/ublk/kublk.c
@@ -493,6 +493,10 @@ static int ublk_thread_init(struct ublk_thread *t, unsigned long long extra_flag
 	int ring_depth = dev->tgt.sq_depth, cq_depth = dev->tgt.cq_depth;
 	int ret;
 
+	/* FETCH_IO_CMDS is multishot, so increase cq depth for BATCH_IO */
+	if (ublk_dev_batch_io(dev))
+		cq_depth += dev->dev_info.queue_depth;
+
 	ret = ublk_setup_ring(&t->ring, ring_depth, cq_depth,
 			      IORING_SETUP_COOP_TASKRUN |
 			      IORING_SETUP_SINGLE_ISSUER |
@@ -797,7 +801,7 @@ static void ublk_handle_cqe(struct ublk_thread *t,
 	unsigned q_id = user_data_to_q_id(cqe->user_data);
 	unsigned cmd_op = user_data_to_op(cqe->user_data);
 
-	if (cqe->res < 0 && cqe->res != -ENODEV)
+	if (cqe->res < 0 && cqe->res != -ENODEV && cqe->res != -ENOBUFS)
 		ublk_err("%s: res %d userdata %llx thread state %x\n",
 			 __func__, cqe->res, cqe->user_data, t->state);
 
@@ -922,9 +926,13 @@ static __attribute__((noinline)) int __ublk_io_handler_fn(struct ublk_thread_inf
 	if (!ublk_thread_batch_io(&t)) {
 		/* submit all io commands to ublk driver */
 		ublk_submit_fetch_commands(&t);
-	} else if (!t.idx) {
+	} else {
+		struct ublk_queue *q = &t.dev->q[t.idx];
+
 		/* prepare all io commands in the 1st thread context */
-		ublk_batch_setup_queues(&t);
+		if (!t.idx)
+			ublk_batch_setup_queues(&t);
+		ublk_batch_start_fetch(&t, q);
 	}
 
 	do {
diff --git a/tools/testing/selftests/ublk/kublk.h b/tools/testing/selftests/ublk/kublk.h
index 0a355653d64c..222501048c24 100644
--- a/tools/testing/selftests/ublk/kublk.h
+++ b/tools/testing/selftests/ublk/kublk.h
@@ -190,6 +190,13 @@ struct batch_commit_buf {
 	unsigned short count;
 };
 
+struct batch_fetch_buf {
+	struct io_uring_buf_ring *br;
+	void *fetch_buf;
+	unsigned int fetch_buf_size;
+	unsigned int fetch_buf_off;
+};
+
 struct ublk_thread {
 	struct ublk_dev *dev;
 	unsigned idx;
@@ -216,6 +223,9 @@ struct ublk_thread {
 #define UBLKS_T_COMMIT_BUF_INV_IDX	((unsigned short)-1)
 	struct allocator commit_buf_alloc;
 	struct batch_commit_buf commit;
+	/* FETCH_IO_CMDS buffer */
+#define UBLKS_T_NR_FETCH_BUF	2
+	struct batch_fetch_buf fetch[UBLKS_T_NR_FETCH_BUF];
 
 	struct io_uring ring;
 };
@@ -468,6 +478,9 @@ static inline unsigned short ublk_batch_io_buf_idx(
 
 /* Queue UBLK_U_IO_PREP_IO_CMDS for a specific queue with batch elements */
 int ublk_batch_queue_prep_io_cmds(struct ublk_thread *t, struct ublk_queue *q);
+/* Start fetching I/O commands using multishot UBLK_U_IO_FETCH_IO_CMDS */
+void ublk_batch_start_fetch(struct ublk_thread *t,
+			    struct ublk_queue *q);
 /* Handle completion of batch I/O commands (prep/commit) */
 void ublk_batch_compl_cmd(struct ublk_thread *t,
 			  const struct io_uring_cqe *cqe);
-- 
2.47.0
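
[Editorial note, not part of the patch: the sketch below illustrates the tag-decoding step
that ublk_compl_batch_fetch() performs on a completed multishot fetch CQE. In that path,
cqe->res reports how many bytes of back-to-back 16-bit tags the kernel wrote into the
selected fetch buffer, and the handler walks that byte range two bytes at a time starting
at the per-buffer offset. The handle_tag() helper here is a hypothetical stand-in for
q->tgt_ops->queue_io(); the buffer contents in main() are made-up sample data.]

#include <stdio.h>

/* stand-in for q->tgt_ops->queue_io(t, q, tag) in the selftest */
static void handle_tag(unsigned short tag)
{
	printf("queue io for tag %hu\n", tag);
}

/*
 * Consume one fetch completion: bytes [buf_off, buf_off + res) of the
 * fetch buffer hold consecutive 16-bit tags.  Returns the new offset,
 * mirroring how fetch_buf_off is advanced after each CQE.
 */
static unsigned consume_fetch_cqe(const void *fetch_buf, unsigned buf_off,
				  int res, unsigned queue_depth)
{
	unsigned end = buf_off + res;
	unsigned i;

	if (res < 0)
		return buf_off;		/* error completion: nothing fetched */

	for (i = buf_off; i < end; i += 2) {
		unsigned short tag =
			*(const unsigned short *)((const char *)fetch_buf + i);

		if (tag < queue_depth)	/* tags beyond queue depth are bogus */
			handle_tag(tag);
	}
	return end;
}

int main(void)
{
	/* two fetched tags (0 and 3) as they would appear in the buffer */
	unsigned short buf[2] = { 0, 3 };

	consume_fetch_cqe(buf, 0, (int)sizeof(buf), 8);
	return 0;
}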