[PATCH v3] nvme-tcp: fix connect failure on receiving partial ICResp PDU

Caleb Sander Mateos posted 1 patch 3 weeks, 6 days ago
There is a newer version of this series
drivers/nvme/host/tcp.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
[PATCH v3] nvme-tcp: fix connect failure on receiving partial ICResp PDU
Posted by Caleb Sander Mateos 3 weeks, 6 days ago
nvme_tcp_init_connection() attempts to receive an ICResp PDU but only
checks that the return value from recvmsg() is non-negative. If the
sender closes the TCP connection or sends fewer than 128 bytes, this
check will pass even though the full PDU wasn't received.

Ensure the full ICResp PDU is received by checking that recvmsg()
returns the expected 128 bytes.

Additionally set the MSG_WAITALL flag for recvmsg(), as a sender could
split the ICResp over multiple TCP frames. Without MSG_WAITALL,
recvmsg() could return prematurely with only part of the PDU.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver")
---
v3: fix return value to indicate error
v2: add Fixes tag

 drivers/nvme/host/tcp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index dc5bbca58c6d..8cd1735772c6 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1447,15 +1447,17 @@ static int nvme_tcp_init_connection(struct nvme_tcp_queue *queue)
 	iov.iov_len = sizeof(*icresp);
 	if (nvme_tcp_queue_tls(queue)) {
 		msg.msg_control = cbuf;
 		msg.msg_controllen = sizeof(cbuf);
 	}
+	msg.msg_flags = MSG_WAITALL;
 	ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
 			iov.iov_len, msg.msg_flags);
-	if (ret < 0) {
+	if (ret < sizeof(*icresp)) {
 		pr_warn("queue %d: failed to receive icresp, error %d\n",
 			nvme_tcp_queue_id(queue), ret);
+		ret = -ECONNRESET;
 		goto free_icresp;
 	}
 	ret = -ENOTCONN;
 	if (nvme_tcp_queue_tls(queue)) {
 		ctype = tls_get_record_type(queue->sock->sk,
-- 
2.45.2
Re: [PATCH v3] nvme-tcp: fix connect failure on receiving partial ICResp PDU
Posted by Keith Busch 3 weeks, 6 days ago
On Fri, Jan 24, 2025 at 11:03:41AM -0700, Caleb Sander Mateos wrote:
> +	msg.msg_flags = MSG_WAITALL;
>  	ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
>  			iov.iov_len, msg.msg_flags);
> -	if (ret < 0) {
> +	if (ret < sizeof(*icresp)) {
>  		pr_warn("queue %d: failed to receive icresp, error %d\n",
>  			nvme_tcp_queue_id(queue), ret);
> +		ret = -ECONNRESET;

If kernel_recvmsg() returns an error, don't we want to preserve and
return that instead of unconditionally overriding to -ECONNRESET? I
think the suggestion was to set ret to that only if ret >= 0.
Re: [PATCH v3] nvme-tcp: fix connect failure on receiving partial ICResp PDU
Posted by Caleb Sander 3 weeks, 6 days ago
Indeed. Time for v4...

On Fri, Jan 24, 2025 at 10:38 AM Keith Busch <kbusch@kernel.org> wrote:
>
> On Fri, Jan 24, 2025 at 11:03:41AM -0700, Caleb Sander Mateos wrote:
> > +     msg.msg_flags = MSG_WAITALL;
> >       ret = kernel_recvmsg(queue->sock, &msg, &iov, 1,
> >                       iov.iov_len, msg.msg_flags);
> > -     if (ret < 0) {
> > +     if (ret < sizeof(*icresp)) {
> >               pr_warn("queue %d: failed to receive icresp, error %d\n",
> >                       nvme_tcp_queue_id(queue), ret);
> > +             ret = -ECONNRESET;
>
> If kernel_recvmsg() returns an error, don't we want to preserve and
> return that instead of unconditionally overriding to -ECONNRESET? I
> think the suggestion was to set ret to that only if ret >= 0.