From: Ming Lei
To: Jens Axboe, linux-block@vger.kernel.org
Cc: Caleb Sander Mateos, Uday Shankar, Stefani Seibold, Andrew Morton, linux-kernel@vger.kernel.org, Ming Lei
Subject: [PATCH V4 16/27] ublk: add new feature UBLK_F_BATCH_IO
Date: Fri, 21 Nov 2025 09:58:38 +0800
Message-ID: <20251121015851.3672073-17-ming.lei@redhat.com>
In-Reply-To: <20251121015851.3672073-1-ming.lei@redhat.com>
References: <20251121015851.3672073-1-ming.lei@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain;
 charset="utf-8"

Add new feature UBLK_F_BATCH_IO, which replaces the following two per-io
commands:

	- UBLK_U_IO_FETCH_REQ
	- UBLK_U_IO_COMMIT_AND_FETCH_REQ

with three per-queue batch io uring_cmds:

	- UBLK_U_IO_PREP_IO_CMDS
	- UBLK_U_IO_COMMIT_IO_CMDS
	- UBLK_U_IO_FETCH_IO_CMDS

ublk can then deliver a batch of io commands to the ublk server in a single
multishot uring_cmd, and the server can prepare & commit multiple commands
in batch style via a single uring_cmd, so communication cost is reduced
a lot.

This feature also no longer restricts the task context for any supported
command, so any allowed uring_cmd can be issued from any task context.
This makes the ublk server implementation much easier.

Meanwhile, load balancing becomes much easier to support with this feature.
The command `UBLK_U_IO_FETCH_IO_CMDS` can be issued from multiple task
contexts, so each task can adjust this command's buffer length or its
number of inflight commands to control how much load it handles.

Later, a priority parameter will be added to `UBLK_U_IO_FETCH_IO_CMDS` to
improve load balance support.

UBLK_U_IO_GET_DATA isn't supported in batch io yet, but it may be enabled
in the future via a batch counterpart.
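Editorial illustration, not part of the patch: the command split described
above can be sketched as a small userspace lookup. This is only a model under
stated assumptions. The enum values are hypothetical stand-ins (the real
opcodes live in include/uapi/linux/ublk_cmd.h and are not reproduced here),
and model_batch_cmd_supported() is an invented helper name. The accepted set
follows the UAPI comment added by this patch, which also lists the two buffer
registration commands.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the uring_cmd opcodes named in the commit
 * message. These are NOT the real UAPI values.
 */
enum model_cmd {
	MODEL_FETCH_REQ,
	MODEL_COMMIT_AND_FETCH_REQ,
	MODEL_GET_DATA,
	MODEL_PREP_IO_CMDS,
	MODEL_COMMIT_IO_CMDS,
	MODEL_FETCH_IO_CMDS,
	MODEL_REGISTER_IO_BUF,
	MODEL_UNREGISTER_IO_BUF,
};

/*
 * Returns true if the patch describes the command as usable on a
 * UBLK_F_BATCH_IO device. Invented helper, for illustration only.
 */
static bool model_batch_cmd_supported(enum model_cmd cmd)
{
	switch (cmd) {
	case MODEL_PREP_IO_CMDS:
	case MODEL_COMMIT_IO_CMDS:
	case MODEL_FETCH_IO_CMDS:
	case MODEL_REGISTER_IO_BUF:
	case MODEL_UNREGISTER_IO_BUF:
		return true;
	default:
		/*
		 * FETCH_REQ and COMMIT_AND_FETCH_REQ are replaced by the
		 * batch commands; GET_DATA is not supported yet.
		 */
		return false;
	}
}
```

A server written against this model would call
model_batch_cmd_supported(MODEL_GET_DATA) and get false, matching the last
paragraph of the commit message.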
Signed-off-by: Ming Lei
---
 drivers/block/ublk_drv.c      | 58 ++++++++++++++++++++++++++++++++---
 include/uapi/linux/ublk_cmd.h | 16 ++++++++++
 2 files changed, 69 insertions(+), 5 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 849199771f86..90cd1863bc83 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -74,7 +74,8 @@
 		| UBLK_F_AUTO_BUF_REG \
 		| UBLK_F_QUIESCE \
 		| UBLK_F_PER_IO_DAEMON \
-		| UBLK_F_BUF_REG_OFF_DAEMON)
+		| UBLK_F_BUF_REG_OFF_DAEMON \
+		| UBLK_F_BATCH_IO)
 
 #define UBLK_F_ALL_RECOVERY_FLAGS (UBLK_F_USER_RECOVERY \
 		| UBLK_F_USER_RECOVERY_REISSUE \
@@ -320,12 +321,12 @@ static void ublk_batch_dispatch(struct ublk_queue *ubq,
 
 static inline bool ublk_dev_support_batch_io(const struct ublk_device *ub)
 {
-	return false;
+	return ub->dev_info.flags & UBLK_F_BATCH_IO;
 }
 
 static inline bool ublk_support_batch_io(const struct ublk_queue *ubq)
 {
-	return false;
+	return ubq->flags & UBLK_F_BATCH_IO;
 }
 
 static inline void ublk_io_lock(struct ublk_io *io)
@@ -3450,6 +3451,41 @@ static int ublk_validate_batch_fetch_cmd(struct ublk_batch_io_data *data,
 	return 0;
 }
 
+static int ublk_handle_non_batch_cmd(struct io_uring_cmd *cmd,
+				     unsigned int issue_flags)
+{
+	const struct ublksrv_io_cmd *ub_cmd = io_uring_sqe_cmd(cmd->sqe);
+	struct ublk_device *ub = cmd->file->private_data;
+	unsigned tag = READ_ONCE(ub_cmd->tag);
+	unsigned q_id = READ_ONCE(ub_cmd->q_id);
+	unsigned index = READ_ONCE(ub_cmd->addr);
+	struct ublk_queue *ubq;
+	struct ublk_io *io;
+	int ret = -EINVAL;
+
+	if (!ub)
+		return ret;
+
+	if (q_id >= ub->dev_info.nr_hw_queues)
+		return ret;
+
+	ubq = ublk_get_queue(ub, q_id);
+	if (tag >= ubq->q_depth)
+		return ret;
+
+	io = &ubq->ios[tag];
+
+	switch (cmd->cmd_op) {
+	case UBLK_U_IO_REGISTER_IO_BUF:
+		return ublk_register_io_buf(cmd, ub, q_id, tag, io, index,
+				issue_flags);
+	case UBLK_U_IO_UNREGISTER_IO_BUF:
+		return ublk_unregister_io_buf(cmd, ub, index, issue_flags);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
 static int ublk_ch_batch_io_uring_cmd(struct io_uring_cmd *cmd,
 		unsigned int issue_flags)
 {
@@ -3497,7 +3533,8 @@ static int ublk_ch_batch_io_uring_cmd(struct io_uring_cmd *cmd,
 		ret = ublk_handle_batch_fetch_cmd(&data);
 		break;
 	default:
-		ret = -EOPNOTSUPP;
+		ret = ublk_handle_non_batch_cmd(cmd, issue_flags);
+		break;
 	}
 out:
 	return ret;
@@ -4163,9 +4200,13 @@ static int ublk_ctrl_add_dev(const struct ublksrv_ctrl_cmd *header)
 
 	ub->dev_info.flags |= UBLK_F_CMD_IOCTL_ENCODE |
 		UBLK_F_URING_CMD_COMP_IN_TASK |
-		UBLK_F_PER_IO_DAEMON |
+		(ublk_dev_support_batch_io(ub) ? 0 : UBLK_F_PER_IO_DAEMON) |
 		UBLK_F_BUF_REG_OFF_DAEMON;
 
+	/* So far, UBLK_F_PER_IO_DAEMON won't be exposed for BATCH_IO */
+	if (ublk_dev_support_batch_io(ub))
+		ub->dev_info.flags &= ~UBLK_F_PER_IO_DAEMON;
+
 	/* GET_DATA isn't needed any more with USER_COPY or ZERO COPY */
 	if (ub->dev_info.flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY |
 				UBLK_F_AUTO_BUF_REG))
@@ -4518,6 +4559,13 @@ static int ublk_wait_for_idle_io(struct ublk_device *ub,
 	unsigned int elapsed = 0;
 	int ret;
 
+	/*
+	 * For UBLK_F_BATCH_IO ublk server can get notified with existing
+	 * or new fetch command, so needn't wait any more
+	 */
+	if (ublk_dev_support_batch_io(ub))
+		return 0;
+
 	while (elapsed < timeout_ms && !signal_pending(current)) {
 		unsigned int queues_cancelable = 0;
 		int i;
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index cd894c1d188e..5e8b1211b7f4 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -335,6 +335,22 @@
  */
 #define UBLK_F_BUF_REG_OFF_DAEMON (1ULL << 14)
 
+
+/*
+ * Support the following commands for delivering & committing io command
+ * in batch.
+ *
+ *	- UBLK_U_IO_PREP_IO_CMDS
+ *	- UBLK_U_IO_COMMIT_IO_CMDS
+ *	- UBLK_U_IO_FETCH_IO_CMDS
+ *	- UBLK_U_IO_REGISTER_IO_BUF
+ *	- UBLK_U_IO_UNREGISTER_IO_BUF
+ *
+ * The existing UBLK_U_IO_FETCH_REQ, UBLK_U_IO_COMMIT_AND_FETCH_REQ and
+ * UBLK_U_IO_GET_DATA uring_cmd are not supported for this feature.
+ */
+#define UBLK_F_BATCH_IO		(1ULL << 15)
+
 /* device state */
 #define UBLK_S_DEV_DEAD	0
 #define UBLK_S_DEV_LIVE	1
-- 
2.47.0
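
Editorial illustration, not part of the patch: two behaviours in the diff
above may be easier to follow as a standalone userspace model. The first is
the flag fix-up in ublk_ctrl_add_dev(), where UBLK_F_BATCH_IO and
UBLK_F_PER_IO_DAEMON end up mutually exclusive. The second is the check order
in ublk_handle_non_batch_cmd(), where queue id and tag are validated before
the opcode is examined. Everything prefixed model_ or MODEL_ is invented for
this sketch. The bit positions 14 and 15 come from the patch, bit 13 for
PER_IO_DAEMON is an assumption, the other flags OR-ed in by the kernel are
left out, and the single q_depth per device is a simplification of the
per-queue depth used by the driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 14 and 15 appear in the patch; bit 13 is an assumption. */
#define MODEL_F_PER_IO_DAEMON		(1ULL << 13)
#define MODEL_F_BUF_REG_OFF_DAEMON	(1ULL << 14)
#define MODEL_F_BATCH_IO		(1ULL << 15)

/*
 * Model of the flag fix-up: BUF_REG_OFF_DAEMON is always set, and
 * PER_IO_DAEMON is set only when BATCH_IO is absent (and cleared when
 * BATCH_IO is present, even if userspace asked for it).
 */
static uint64_t model_fixup_flags(uint64_t flags)
{
	flags |= MODEL_F_BUF_REG_OFF_DAEMON;
	if (flags & MODEL_F_BATCH_IO)
		flags &= ~MODEL_F_PER_IO_DAEMON;
	else
		flags |= MODEL_F_PER_IO_DAEMON;
	return flags;
}

/* Simplified device: one depth for all queues (assumption). */
struct model_dev {
	unsigned int nr_hw_queues;
	unsigned int q_depth;
};

/* Hypothetical opcodes, not the real UAPI values. */
enum model_op {
	MODEL_OP_REGISTER_IO_BUF,
	MODEL_OP_UNREGISTER_IO_BUF,
	MODEL_OP_OTHER,
};

/*
 * Model of the check order: device, queue id, tag, then opcode. Returns 0
 * where the driver would call the buffer (un)register helper.
 */
static int model_handle_non_batch_cmd(const struct model_dev *dev,
				      enum model_op op,
				      unsigned int q_id, unsigned int tag)
{
	if (!dev)
		return -EINVAL;
	if (q_id >= dev->nr_hw_queues)
		return -EINVAL;
	if (tag >= dev->q_depth)
		return -EINVAL;

	switch (op) {
	case MODEL_OP_REGISTER_IO_BUF:
	case MODEL_OP_UNREGISTER_IO_BUF:
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

One consequence of this ordering in the model: an unknown opcode carrying an
out-of-range q_id or tag is rejected with -EINVAL, and -EOPNOTSUPP is only
returned once both indices are in range.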