[Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry

Vladimir Sementsov-Ogievskiy posted 7 patches 8 years, 4 months ago
There is a newer version of this series
Posted by Vladimir Sementsov-Ogievskiy 8 years, 4 months ago
Refactor the nbd client to not yield from nbd_read_reply_entry. This is
possible now that all reading is done in nbd_read_reply_entry and all
request-related data is stored in the per-request s->requests[i].

Some additional handling of s->requests[i].ret and s->quit is needed so
that requests are not failed and read errors are not reported on a
normal disconnect.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/nbd-client.c | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)
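The per-request bookkeeping this patch relies on can be sketched as a
minimal standalone model (hypothetical types and helper names, not the
actual QEMU code; only the field names follow block/nbd-client.c):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Minimal model of the per-request slot; hypothetical, not the real
 * QEMU structures. */
typedef struct {
    bool receiving;
    int ret;                /* pre-set to -EIO in send_request() */
} RequestSlot;

typedef struct {
    bool quit;
    bool ioc;               /* stands in for the QIOChannel pointer */
    RequestSlot requests[16];
} ClientModel;

/* Mirrors nbd_co_send_request(): the slot's result defaults to -EIO,
 * so a request the reply coroutine never completes fails cleanly. */
static void send_request(ClientModel *s, int i)
{
    s->requests[i].receiving = false;
    s->requests[i].ret = -EIO;
}

/* Mirrors one iteration of the new nbd_read_reply_entry() loop: store
 * the result, clear receiving, wake the request -- and do not yield. */
static void reply_entry_step(ClientModel *s, int i, int reply_ret)
{
    s->requests[i].ret = reply_ret;
    s->requests[i].receiving = false;
}

/* Mirrors nbd_co_receive_reply(): only a missing channel forces -EIO;
 * everything else comes from the per-request slot. */
static int receive_reply(const ClientModel *s, int i)
{
    return !s->ioc ? -EIO : s->requests[i].ret;
}
```

Because each slot carries its own result, the reply coroutine never has
to hand control back to a request before reading the next reply.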

diff --git a/block/nbd-client.c b/block/nbd-client.c
index f80a4c5564..bf20d5d5e6 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -79,7 +79,11 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
     while (!s->quit) {
         ret = nbd_receive_reply(s->ioc, &reply, &local_err);
         if (ret < 0) {
-            error_report_err(local_err);
+            if (s->quit) {
+                error_free(local_err);
+            } else {
+                error_report_err(local_err);
+            }
         }
         if (ret <= 0 || s->quit) {
             break;
@@ -112,19 +116,8 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
             }
         }
 
-        /* We're woken up again by the request itself.  Note that there
-         * is no race between yielding and reentering read_reply_co.  This
-         * is because:
-         *
-         * - if the request runs on the same AioContext, it is only
-         *   entered after we yield
-         *
-         * - if the request runs on a different AioContext, reentering
-         *   read_reply_co happens through a bottom half, which can only
-         *   run after we yield.
-         */
+        s->requests[i].receiving = false;
         aio_co_wake(s->requests[i].coroutine);
-        qemu_coroutine_yield();
     }
 
     s->quit = true;
@@ -157,6 +150,7 @@ static int nbd_co_send_request(BlockDriverState *bs,
 
     s->requests[i].coroutine = qemu_coroutine_self();
     s->requests[i].receiving = false;
+    s->requests[i].ret = -EIO;
 
     request->handle = INDEX_TO_HANDLE(s, i);
     s->requests[i].request = request;
@@ -210,7 +204,7 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
     s->requests[i].receiving = true;
     qemu_coroutine_yield();
     s->requests[i].receiving = false;
-    if (!s->ioc || s->quit) {
+    if (!s->ioc) {
         ret = -EIO;
     } else {
         ret = s->requests[i].ret;
@@ -218,11 +212,6 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
 
     s->requests[i].coroutine = NULL;
 
-    /* Kick the read_reply_co to get the next reply.  */
-    if (s->read_reply_co) {
-        aio_co_wake(s->read_reply_co);
-    }
-
     qemu_co_mutex_lock(&s->send_mutex);
     s->in_flight--;
     qemu_co_queue_next(&s->free_sema);
@@ -364,6 +353,8 @@ void nbd_client_close(BlockDriverState *bs)
 
     nbd_send_request(client->ioc, &request);
 
+    client->quit = true;
+
     nbd_teardown_connection(bs);
 }
 
-- 
2.11.1


Re: [Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry
Posted by Eric Blake 8 years, 4 months ago
On 09/18/2017 08:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> Refactor the nbd client to not yield from nbd_read_reply_entry. This is
> possible now that all reading is done in nbd_read_reply_entry and all
> request-related data is stored in the per-request s->requests[i].
> 
> Some additional handling of s->requests[i].ret and s->quit is needed so
> that requests are not failed and read errors are not reported on a
> normal disconnect.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/nbd-client.c | 29 ++++++++++-------------------
>  1 file changed, 10 insertions(+), 19 deletions(-)

The diffstat may have merit, but I'm wondering:

> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index f80a4c5564..bf20d5d5e6 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -79,7 +79,11 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
>      while (!s->quit) {
>          ret = nbd_receive_reply(s->ioc, &reply, &local_err);
>          if (ret < 0) {
> -            error_report_err(local_err);
> +            if (s->quit) {
> +                error_free(local_err);

This discards errors on all remaining coroutines after we detect an
early exit. Are we sure the coroutine that set s->quit will have
reported an appropriate error, which would explain why we can discard
all secondary errors?  A comment in the code would be helpful.

> +            } else {
> +                error_report_err(local_err);
> +            }
>          }

> @@ -210,7 +204,7 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
>      s->requests[i].receiving = true;
>      qemu_coroutine_yield();
>      s->requests[i].receiving = false;
> -    if (!s->ioc || s->quit) {
> +    if (!s->ioc) {
>          ret = -EIO;

Why don't we need to check s->quit any more?  That was just added
earlier in the series; why the churn?

>      } else {
>          ret = s->requests[i].ret;
> @@ -218,11 +212,6 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
>  
>      s->requests[i].coroutine = NULL;
>  
> -    /* Kick the read_reply_co to get the next reply.  */
> -    if (s->read_reply_co) {
> -        aio_co_wake(s->read_reply_co);
> -    }
> -
>      qemu_co_mutex_lock(&s->send_mutex);
>      s->in_flight--;
>      qemu_co_queue_next(&s->free_sema);
> @@ -364,6 +353,8 @@ void nbd_client_close(BlockDriverState *bs)
>  
>      nbd_send_request(client->ioc, &request);
>  
> +    client->quit = true;

Previously, client->quit was only set when detecting a broken server,
now it is also set for a clean exit.  Do we need to change any
documentation of the field?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry
Posted by Vladimir Sementsov-Ogievskiy 8 years, 4 months ago
19.09.2017 01:36, Eric Blake wrote:
> On 09/18/2017 08:59 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Refactor the nbd client to not yield from nbd_read_reply_entry. This is
>> possible now that all reading is done in nbd_read_reply_entry and all
>> request-related data is stored in the per-request s->requests[i].
>>
>> Some additional handling of s->requests[i].ret and s->quit is needed so
>> that requests are not failed and read errors are not reported on a
>> normal disconnect.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/nbd-client.c | 29 ++++++++++-------------------
>>   1 file changed, 10 insertions(+), 19 deletions(-)
> The diffstat may have merit, but I'm wondering:
>
>> diff --git a/block/nbd-client.c b/block/nbd-client.c
>> index f80a4c5564..bf20d5d5e6 100644
>> --- a/block/nbd-client.c
>> +++ b/block/nbd-client.c
>> @@ -79,7 +79,11 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
>>       while (!s->quit) {
>>           ret = nbd_receive_reply(s->ioc, &reply, &local_err);
>>           if (ret < 0) {
>> -            error_report_err(local_err);
>> +            if (s->quit) {
>> +                error_free(local_err);
> This discards errors on all remaining coroutines after we detect an
> early exit. Are we sure the coroutine that set s->quit will have
> reported an appropriate error, and thus explaining why we can discard
> all secondary errors?  A comment in the code would be helpful.

I'll think about it.

>
>> +            } else {
>> +                error_report_err(local_err);
>> +            }
>>           }
>> @@ -210,7 +204,7 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
>>       s->requests[i].receiving = true;
>>       qemu_coroutine_yield();
>>       s->requests[i].receiving = false;
>> -    if (!s->ioc || s->quit) {
>> +    if (!s->ioc) {
>>           ret = -EIO;
> Why don't we need to check s->quit any more?  That was just added
> earlier in the series; why the churn?

It is for "to not fail requests on normal disconnect": the reply_entry
coroutine may already have finished by then. Initializing
"+    s->requests[i].ret = -EIO;" is enough to cover requests that never
received a reply.
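The race being described here can be checked with a small model
(hypothetical names, not the actual QEMU code): on a clean disconnect,
s->quit may be set after a request's reply has already been received, so
the old check would wrongly fail a completed request, while a request
that never got a reply still fails through the -EIO the slot was
initialized with.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one request around a clean disconnect. */
typedef struct {
    bool quit;
    bool ioc;             /* stands in for the QIOChannel pointer */
    int ret;              /* per-request result, pre-set to -EIO */
} Model;

/* The old check: any request still in flight when quit is set fails,
 * even if its reply was already received. */
static int receive_reply_old(const Model *s)
{
    return (!s->ioc || s->quit) ? -EIO : s->ret;
}

/* The new check: only a missing channel forces -EIO; an unfinished
 * request still fails via the -EIO default in its slot. */
static int receive_reply_new(const Model *s)
{
    return !s->ioc ? -EIO : s->ret;
}
```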

>
>>       } else {
>>           ret = s->requests[i].ret;
>> @@ -218,11 +212,6 @@ static int nbd_co_receive_reply(NBDClientSession *s, NBDRequest *request)
>>   
>>       s->requests[i].coroutine = NULL;
>>   
>> -    /* Kick the read_reply_co to get the next reply.  */
>> -    if (s->read_reply_co) {
>> -        aio_co_wake(s->read_reply_co);
>> -    }
>> -
>>       qemu_co_mutex_lock(&s->send_mutex);
>>       s->in_flight--;
>>       qemu_co_queue_next(&s->free_sema);
>> @@ -364,6 +353,8 @@ void nbd_client_close(BlockDriverState *bs)
>>   
>>       nbd_send_request(client->ioc, &request);
>>   
>> +    client->quit = true;
> Previously, client->quit was only set when detecting a broken server,
> now it is also set for a clean exit.  Do we need to change any
> documentation of the field?

It has documentation?

>

-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH v2 7/7] block/nbd-client: do not yield from nbd_read_reply_entry
Posted by Eric Blake 8 years, 4 months ago
On 09/19/2017 05:00 AM, Vladimir Sementsov-Ogievskiy wrote:
> 19.09.2017 01:36, Eric Blake wrote:
>> On 09/18/2017 08:59 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Refactor the nbd client to not yield from nbd_read_reply_entry. This is
>>> possible now that all reading is done in nbd_read_reply_entry and all
>>> request-related data is stored in the per-request s->requests[i].
>>>
>>> Some additional handling of s->requests[i].ret and s->quit is needed so
>>> that requests are not failed and read errors are not reported on a
>>> normal disconnect.
>>>

>>> @@ -364,6 +353,8 @@ void nbd_client_close(BlockDriverState *bs)
>>>         nbd_send_request(client->ioc, &request);
>>>   +    client->quit = true;
>> Previously, client->quit was only set when detecting a broken server,
>> now it is also set for a clean exit.  Do we need to change any
>> documentation of the field?
> 
> It has documentation?

Touche.  But it probably should have some, especially if we are changing
its semantics, so that the expected behavior is clear just from looking
at the struct.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org