[PATCH 08/52] migration/rdma: Fix qemu_rdma_accept() to return failure on errors

Markus Armbruster posted 52 patches 11 months, 3 weeks ago
Maintainers: Juan Quintela <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Leonardo Bras <leobras@redhat.com>
There is a newer version of this series
[PATCH 08/52] migration/rdma: Fix qemu_rdma_accept() to return failure on errors
Posted by Markus Armbruster 11 months, 3 weeks ago
qemu_rdma_accept() returns 0 in some cases even when it didn't
complete its job due to errors.  Impact is not obvious.  I figure the
caller will soon fail again with a misleading error message.

Fix it to return -1 on any failure.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 migration/rdma.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 177d73a2ba..4339fcc7b2 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -3352,6 +3352,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
 
     if (cm_event->event != RDMA_CM_EVENT_CONNECT_REQUEST) {
         rdma_ack_cm_event(cm_event);
+        ret = -1;
         goto err_rdma_dest_wait;
     }
 
@@ -3364,6 +3365,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
         rdma_return_path = qemu_rdma_data_init(rdma->host_port, NULL);
         if (rdma_return_path == NULL) {
             rdma_ack_cm_event(cm_event);
+            ret = -1;
             goto err_rdma_dest_wait;
         }
 
@@ -3375,10 +3377,11 @@ static int qemu_rdma_accept(RDMAContext *rdma)
     network_to_caps(&cap);
 
     if (cap.version < 1 || cap.version > RDMA_CONTROL_VERSION_CURRENT) {
-            error_report("Unknown source RDMA version: %d, bailing...",
-                            cap.version);
-            rdma_ack_cm_event(cm_event);
-            goto err_rdma_dest_wait;
+        error_report("Unknown source RDMA version: %d, bailing...",
+                     cap.version);
+        rdma_ack_cm_event(cm_event);
+        ret = -1;
+        goto err_rdma_dest_wait;
     }
 
     /*
@@ -3408,9 +3411,10 @@ static int qemu_rdma_accept(RDMAContext *rdma)
     if (!rdma->verbs) {
         rdma->verbs = verbs;
     } else if (rdma->verbs != verbs) {
-            error_report("ibv context not matching %p, %p!", rdma->verbs,
-                         verbs);
-            goto err_rdma_dest_wait;
+        error_report("ibv context not matching %p, %p!", rdma->verbs,
+                     verbs);
+        ret = -1;
+        goto err_rdma_dest_wait;
     }
 
     qemu_rdma_dump_id("dest_init", verbs);
@@ -3467,6 +3471,7 @@ static int qemu_rdma_accept(RDMAContext *rdma)
     if (cm_event->event != RDMA_CM_EVENT_ESTABLISHED) {
         error_report("rdma_accept not event established");
         rdma_ack_cm_event(cm_event);
+        ret = -1;
         goto err_rdma_dest_wait;
     }
 
-- 
2.41.0
Re: [PATCH 08/52] migration/rdma: Fix qemu_rdma_accept() to return failure on errors
Posted by Zhijian Li (Fujitsu) 11 months, 3 weeks ago

On 18/09/2023 22:41, Markus Armbruster wrote:
> qemu_rdma_accept() returns 0 in some cases even when it didn't
> complete its job due to errors.  Impact is not obvious.  I figure the
> caller will soon fail again with a misleading error message.
> 
> Fix it to return -1 on any failure.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

I noticed that ret initialization is also meaningless in qemu_rdma_accept()

3354 static int qemu_rdma_accept(RDMAContext *rdma)
3355 {
3356     RDMACapabilities cap;
3357     struct rdma_conn_param conn_param = {
3358                                             .responder_resources = 2,
3359                                             .private_data = &cap,
3360                                             .private_data_len = sizeof(cap),
3361                                          };
3362     RDMAContext *rdma_return_path = NULL;
3363     struct rdma_cm_event *cm_event;
3364     struct ibv_context *verbs;
3365     int ret = -EINVAL;     <<<<< drop it ?
3366     int idx;


Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Re: [PATCH 08/52] migration/rdma: Fix qemu_rdma_accept() to return failure on errors
Posted by Markus Armbruster 11 months, 2 weeks ago
"Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com> writes:

> On 18/09/2023 22:41, Markus Armbruster wrote:
>> qemu_rdma_accept() returns 0 in some cases even when it didn't
>> complete its job due to errors.  Impact is not obvious.  I figure the
>> caller will soon fail again with a misleading error message.
>> 
>> Fix it to return -1 on any failure.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>
> I noticed that ret initialization is also meaningless in qemu_rdma_accept()
>
> 3354 static int qemu_rdma_accept(RDMAContext *rdma)
> 3355 {
> 3356     RDMACapabilities cap;
> 3357     struct rdma_conn_param conn_param = {
> 3358                                             .responder_resources = 2,
> 3359                                             .private_data = &cap,
> 3360                                             .private_data_len = sizeof(cap),
> 3361                                          };
> 3362     RDMAContext *rdma_return_path = NULL;
> 3363     struct rdma_cm_event *cm_event;
> 3364     struct ibv_context *verbs;
> 3365     int ret = -EINVAL;     <<<<< drop it ?
> 3366     int idx;

PATCH 27 will drop it.

> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>

Thanks!
Re: [PATCH 08/52] migration/rdma: Fix qemu_rdma_accept() to return failure on errors
Posted by Fabiano Rosas 11 months, 3 weeks ago
Markus Armbruster <armbru@redhat.com> writes:

> qemu_rdma_accept() returns 0 in some cases even when it didn't
> complete its job due to errors.  Impact is not obvious.  I figure the
> caller will soon fail again with a misleading error message.
>
> Fix it to return -1 on any failure.
>
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>