[PATCH] daemon: fix wrong request count for sparse stream

Vincent Vanlaer posted 1 patch 2 months, 1 week ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20240219222402.426515-1-libvirt-e6954efa@volkihar.be
src/remote/remote_daemon_stream.c | 1 +
1 file changed, 1 insertion(+)
[PATCH] daemon: fix wrong request count for sparse stream
Posted by Vincent Vanlaer 2 months, 1 week ago
Similar to when actual data is being written to the stream, it is
necessary to acknowledge handling of the client request when a hole is
encountered. This is done later in daemonStreamHandleWrite by sending a
fake zero-length reply if the status variable is set to
VIR_STREAM_CONTINUE. It seems that setting status from the message
header was missed for holes in the introduction of the sparse stream
feature.

Signed-off-by: Vincent Vanlaer <libvirt-e6954efa@volkihar.be>
---
 src/remote/remote_daemon_stream.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/remote/remote_daemon_stream.c b/src/remote/remote_daemon_stream.c
index 1a89ff822c..453728a66b 100644
--- a/src/remote/remote_daemon_stream.c
+++ b/src/remote/remote_daemon_stream.c
@@ -747,6 +747,7 @@ daemonStreamHandleWrite(virNetServerClient *client,
              * Otherwise just carry on with processing stream
              * data. */
             ret = daemonStreamHandleHole(client, stream, msg);
+            status = msg->header.status;
         } else if (msg->header.type == VIR_NET_STREAM) {
             status = msg->header.status;
             switch (status) {
-- 
2.42.0
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: [PATCH] daemon: fix wrong request count for sparse stream
Posted by Michal Prívozník 2 months ago
On 2/19/24 23:24, Vincent Vanlaer wrote:
> Similar to when actual data is being written to the stream, it is
> necessary to acknowledge handling of the client request when a hole is
> encountered. This is done later in daemonStreamHandleWrite by sending a
> fake zero-length reply if the status variable is set to
> VIR_STREAM_CONTINUE. It seems that setting status from the message
> header was missed for holes in the introduction of the sparse stream
> feature.
> 
> Signed-off-by: Vincent Vanlaer <libvirt-e6954efa@volkihar.be>
> ---
>  src/remote/remote_daemon_stream.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/remote/remote_daemon_stream.c b/src/remote/remote_daemon_stream.c
> index 1a89ff822c..453728a66b 100644
> --- a/src/remote/remote_daemon_stream.c
> +++ b/src/remote/remote_daemon_stream.c
> @@ -747,6 +747,7 @@ daemonStreamHandleWrite(virNetServerClient *client,
>               * Otherwise just carry on with processing stream
>               * data. */
>              ret = daemonStreamHandleHole(client, stream, msg);
> +            status = msg->header.status;
>          } else if (msg->header.type == VIR_NET_STREAM) {
>              status = msg->header.status;
>              switch (status) {

I'm wondering why is this needed. I mean - is there a bug and what are
the steps to reproduce? It's been a while since I touched this part of
the codebase.

Michal
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: [PATCH] daemon: fix wrong request count for sparse stream
Posted by Vincent Vanlaer 2 months ago
On 22/02/2024 16:22, Michal Prívozník wrote:
> On 2/19/24 23:24, Vincent Vanlaer wrote:
>> Similar to when actual data is being written to the stream, it is
>> necessary to acknowledge handling of the client request when a hole is
>> encountered. This is done later in daemonStreamHandleWrite by sending a
>> fake zero-length reply if the status variable is set to
>> VIR_STREAM_CONTINUE. It seems that setting status from the message
>> header was missed for holes in the introduction of the sparse stream
>> feature.
>>
>> Signed-off-by: Vincent Vanlaer <libvirt-e6954efa@volkihar.be>
>> ---
>>   src/remote/remote_daemon_stream.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/src/remote/remote_daemon_stream.c b/src/remote/remote_daemon_stream.c
>> index 1a89ff822c..453728a66b 100644
>> --- a/src/remote/remote_daemon_stream.c
>> +++ b/src/remote/remote_daemon_stream.c
>> @@ -747,6 +747,7 @@ daemonStreamHandleWrite(virNetServerClient *client,
>>                * Otherwise just carry on with processing stream
>>                * data. */
>>               ret = daemonStreamHandleHole(client, stream, msg);
>> +            status = msg->header.status;
>>           } else if (msg->header.type == VIR_NET_STREAM) {
>>               status = msg->header.status;
>>               switch (status) {
> 
> I'm wondering why is this needed. I mean - is there a bug and what are
> the steps to reproduce? It's been a while since I touched this part of
> the codebase.
> 
> Michal
> 

What turns this into a visible bug is that libvirt tracks how many 
requests from a certain connection are being executed, to prevent one 
client from occupying all threads of the libvirt daemon (this is 
configurable by the max_client_requests option). From what I can gather, 
libvirt considers a request handled when a reply is sent to the client. 
In the case of streams, no replies need to be sent to the client. 
Instead, an fake reply is sent just for bookkeeping later in 
daemonStreamHandleWrite:

>         /* 'CONTINUE' messages don't send a reply (unless error
>          * occurred), so to release the 'msg' object we need to
>          * send a fake zero-length reply. Nothing actually gets
>          * onto the wire, but this causes the client to reset
>          * its active request count / throttling
>          */
>         if (status == VIR_NET_CONTINUE) {
>             virNetMessageClear(msg);
>             msg->header.type = VIR_NET_REPLY;
>             if (virNetServerClientSendMessage(client, msg) < 0) {
>                 virNetMessageFree(msg);
>                 virNetServerClientImmediateClose(client);
>                 return -1;
>             }
>         }

At the start, "status" is set to "VIR_NET_OK", which in the current 
version of the code means that if a hole is received, no fake reply is 
sent back. If this happens enough for a certain connection (the default 
limit of max_client_requests is 5), you get the following warning:

> virNetServerClientDispatchRead:1266 : Client hit max requests limit 50. This may result in keep-alive timeouts. Consider tuning the max_client_requests server parameter

At this point your connection is effectively dead, as libvirt will no 
longer process requests, even those unrelated to the stream.

This patch fixes that by actually updating "status" in the case a hole 
is received.

The steps to reproduce this bug would be to try to call 
virStreamSendHole in a loop, and notice that at some point the 
connection goes stale. Here is some C code that should reproduce the bug 
(provided you have pool with the name images, and no volume test in it):

> #include <stdio.h>
> #include <libvirt/libvirt.h>
> 
> void main() {
>     virConnectPtr conn = virConnectOpen("qemu:///system");
>     virStoragePoolPtr pool = virStoragePoolLookupByName(conn, "images");
>     const char *vol_xml = "<volume><name>test</name><capacity>1048576</capacity><target><format type='qcow2'/></target></volume>";
>     virStorageVolPtr vol = virStorageVolCreateXML(pool, vol_xml, 0);
> 
>     virStreamPtr stream = virStreamNew(conn, 0);
>     virStorageVolUpload(vol, stream, 0, 1024 * 1024, VIR_STORAGE_VOL_UPLOAD_SPARSE_STREAM);
> 
>     int i = 0;
>     while (1) {
>         printf("Sending hole %d\n", i);
>         i++;
>         virStreamSendHole(stream, 1, 0);
>     }
> }

With the bug present, this will hang relatively quickly (after 120 
iterations on my system with max_client_requests set to 50, the 
difference is presumably due to the socket buffers)

Cheers,
Vincent
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org
Re: [PATCH] daemon: fix wrong request count for sparse stream
Posted by Michal Prívozník 2 months ago
On 2/22/24 21:22, Vincent Vanlaer wrote:
> On 22/02/2024 16:22, Michal Prívozník wrote:
>> On 2/19/24 23:24, Vincent Vanlaer wrote:
>>> Similar to when actual data is being written to the stream, it is
>>> necessary to acknowledge handling of the client request when a hole is
>>> encountered. This is done later in daemonStreamHandleWrite by sending a
>>> fake zero-length reply if the status variable is set to
>>> VIR_STREAM_CONTINUE. It seems that setting status from the message
>>> header was missed for holes in the introduction of the sparse stream
>>> feature.
>>>
>>> Signed-off-by: Vincent Vanlaer <libvirt-e6954efa@volkihar.be>
>>> ---
>>>   src/remote/remote_daemon_stream.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/src/remote/remote_daemon_stream.c
>>> b/src/remote/remote_daemon_stream.c
>>> index 1a89ff822c..453728a66b 100644
>>> --- a/src/remote/remote_daemon_stream.c
>>> +++ b/src/remote/remote_daemon_stream.c
>>> @@ -747,6 +747,7 @@ daemonStreamHandleWrite(virNetServerClient *client,
>>>                * Otherwise just carry on with processing stream
>>>                * data. */
>>>               ret = daemonStreamHandleHole(client, stream, msg);
>>> +            status = msg->header.status;
>>>           } else if (msg->header.type == VIR_NET_STREAM) {
>>>               status = msg->header.status;
>>>               switch (status) {
>>
>> I'm wondering why is this needed. I mean - is there a bug and what are
>> the steps to reproduce? It's been a while since I touched this part of
>> the codebase.
>>
>> Michal
>>
> 
> What turns this into a visible bug is that libvirt tracks how many
> requests from a certain connection are being executed, to prevent one
> client from occupying all threads of the libvirt daemon (this is
> configurable by the max_client_requests option). From what I can gather,
> libvirt considers a request handled when a reply is sent to the client.
> In the case of streams, no replies need to be sent to the client.
> Instead, an fake reply is sent just for bookkeeping later in
> daemonStreamHandleWrite:
> 
>>         /* 'CONTINUE' messages don't send a reply (unless error
>>          * occurred), so to release the 'msg' object we need to
>>          * send a fake zero-length reply. Nothing actually gets
>>          * onto the wire, but this causes the client to reset
>>          * its active request count / throttling
>>          */
>>         if (status == VIR_NET_CONTINUE) {
>>             virNetMessageClear(msg);
>>             msg->header.type = VIR_NET_REPLY;
>>             if (virNetServerClientSendMessage(client, msg) < 0) {
>>                 virNetMessageFree(msg);
>>                 virNetServerClientImmediateClose(client);
>>                 return -1;
>>             }
>>         }
> 
> At the start, "status" is set to "VIR_NET_OK", which in the current
> version of the code means that if a hole is received, no fake reply is
> sent back. If this happens enough for a certain connection (the default
> limit of max_client_requests is 5), you get the following warning:
> 
>> virNetServerClientDispatchRead:1266 : Client hit max requests limit
>> 50. This may result in keep-alive timeouts. Consider tuning the
>> max_client_requests server parameter
> 
> At this point your connection is effectively dead, as libvirt will no
> longer process requests, even those unrelated to the stream.
> 
> This patch fixes that by actually updating "status" in the case a hole
> is received.
> 
> The steps to reproduce this bug would be to try to call
> virStreamSendHole in a loop, and notice that at some point the
> connection goes stale. Here is some C code that should reproduce the bug
> (provided you have pool with the name images, and no volume test in it):
> 
>> #include <stdio.h>
>> #include <libvirt/libvirt.h>
>>
>> void main() {
>>     virConnectPtr conn = virConnectOpen("qemu:///system");
>>     virStoragePoolPtr pool = virStoragePoolLookupByName(conn, "images");
>>     const char *vol_xml =
>> "<volume><name>test</name><capacity>1048576</capacity><target><format
>> type='qcow2'/></target></volume>";
>>     virStorageVolPtr vol = virStorageVolCreateXML(pool, vol_xml, 0);
>>
>>     virStreamPtr stream = virStreamNew(conn, 0);
>>     virStorageVolUpload(vol, stream, 0, 1024 * 1024,
>> VIR_STORAGE_VOL_UPLOAD_SPARSE_STREAM);
>>
>>     int i = 0;
>>     while (1) {
>>         printf("Sending hole %d\n", i);
>>         i++;
>>         virStreamSendHole(stream, 1, 0);
>>     }
>> }
> 
> With the bug present, this will hang relatively quickly (after 120
> iterations on my system with max_client_requests set to 50, the
> difference is presumably due to the socket buffers)

Perfect! This was the piece I was missing. Thanks.

Reviewed-by: Michal Privoznik <mprivozn@redhat.com>

and merged. Congratulations on your first libvirt contribution!

Michal
_______________________________________________
Devel mailing list -- devel@lists.libvirt.org
To unsubscribe send an email to devel-leave@lists.libvirt.org