block/curl.c | 133 +++++++++++++++++++++++---------------------------- 1 file changed, 59 insertions(+), 74 deletions(-)
Hi,

As reported in https://bugzilla.redhat.com/show_bug.cgi?id=1740193, our
curl block driver can spontaneously hang.  This becomes visible e.g.
when reading compressed qcow2 images:

$ qemu-img convert -p -O raw -n \
    https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \
    null-co://

(Hangs at 74.21 %, usually.)

A more direct way is:

$ qemu-img bench -f raw http://download.qemu.org/qemu-4.1.0.tar.xz \
    -d 1 -S 524288 -c 2

(Which simply performs two requests, and the second one hangs.  You can
use any HTTP resource (probably FTP, too) you’d like that is at least
1 MB in size.)

It turns out that this is because cURL 7.59.0 has added a protective
feature against some misuse we had in our code: curl_multi_add_handle()
must not be called from within a cURL callback, but in some cases we
did.  As of 7.59.0, this fails, our new request is not registered and
the I/O request stalls.  This is fixed by patch 6.

Patch 7 makes us check curl_multi_add_handle()’s return code, because
if we had done that before, debugging would have been much simpler.


On the way to fixing it, I had a look over the whole cURL code and found
a suspicious QLIST_FOREACH_SAFE() loop that actually does not seem very
safe at all.  I think this may lead to crashes, although I have never
seen any myself.  https://bugzilla.redhat.com/show_bug.cgi?id=1744602#c5
shows one in exactly the function in question, so I think it actually is
a problem.

This is fixed by patch 5; patches 1, 2, and 4 prepare for it.

(Patch 3 is kind of a misc patch that should ensure that we always end
up calling curl_multi_check_completion() whenever a request might have
been completed.)
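To illustrate the hang described above, here is a toy model in plain C.
It is only a sketch and does not use libcurl: ToyMulti, toy_add_handle(),
toy_run_callback() and the Driver struct are all hypothetical names made
up for this example, and it is not the code from patches 6 and 7.  It
assumes a library that refuses to register a request while one of its
callbacks is running (libcurl documents the error code
CURLM_RECURSIVE_API_CALL for that situation), and contrasts ignoring the
failure with deferring the registration until the callback has returned
and checking the return code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a multi handle; NOT libcurl's real type. */
typedef struct {
    bool in_callback;   /* true while a "library callback" is running */
    int registered;     /* number of requests currently registered */
} ToyMulti;

typedef enum { TOY_OK = 0, TOY_RECURSIVE_CALL } ToyCode;

/* Refuses to register a request while a callback is running. */
static ToyCode toy_add_handle(ToyMulti *m)
{
    if (m->in_callback) {
        return TOY_RECURSIVE_CALL;
    }
    m->registered++;
    return TOY_OK;
}

/* Runs cb as if it were invoked from inside the library. */
static void toy_run_callback(ToyMulti *m,
                             void (*cb)(ToyMulti *, void *), void *opaque)
{
    assert(!m->in_callback);
    m->in_callback = true;
    cb(m, opaque);
    m->in_callback = false;
}

/* Hypothetical driver state: is a follow-up request waiting? */
typedef struct {
    bool pending;
} Driver;

/* Problematic pattern: register from within the callback and ignore
 * the return code, so the failure goes unnoticed. */
static void cb_buggy(ToyMulti *m, void *opaque)
{
    (void)opaque;
    (void)toy_add_handle(m);
}

/* Deferred pattern: the callback only records that work is pending. */
static void cb_deferred(ToyMulti *m, void *opaque)
{
    (void)m;
    ((Driver *)opaque)->pending = true;
}

/* Called after the callback has returned; checks the return code. */
static ToyCode driver_flush(ToyMulti *m, Driver *d)
{
    ToyCode rc;

    if (!d->pending) {
        return TOY_OK;
    }
    rc = toy_add_handle(m);
    if (rc != TOY_OK) {
        fprintf(stderr, "toy_add_handle() failed: %d\n", (int)rc);
        return rc;
    }
    d->pending = false;
    return TOY_OK;
}

/* Returns how many requests end up registered with the buggy pattern. */
int run_buggy(void)
{
    ToyMulti m = { false, 0 };

    toy_run_callback(&m, cb_buggy, NULL);
    return m.registered;
}

/* Returns how many requests end up registered with the deferred pattern. */
int run_deferred(void)
{
    ToyMulti m = { false, 0 };
    Driver d = { false };

    toy_run_callback(&m, cb_deferred, &d);
    if (driver_flush(&m, &d) != TOY_OK) {
        return -1;
    }
    return m.registered;
}
```

Calling run_buggy() from a main() leaves the follow-up request
unregistered without any diagnostic, which corresponds to the stalled
I/O request; run_deferred() registers it once the callback is done, and
a failure would at least be reported.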
v2:
- Patch 2: Remove the socket from the list only at the end of the
           function (yielding a nicer 5+/5- diff stat)
- Patch 3: Added
- Patch 4: Rebased on patch 3, and s/socket/ready_socket/ in one place
- Patch 5: Rebased on the changed patch 4


git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/7:[----] [--] 'curl: Keep pointer to the CURLState in CURLSocket'
002/7:[0007] [FC] 'curl: Keep *socket until the end of curl_sock_cb()'
003/7:[down] 'curl: Check completion in curl_multi_do()'
004/7:[0019] [FC] 'curl: Pass CURLSocket to curl_multi_{do,read}()'
005/7:[0002] [FC] 'curl: Report only ready sockets'
006/7:[----] [--] 'curl: Handle success in multi_check_completion'
007/7:[----] [--] 'curl: Check curl_multi_add_handle()'s return code'


Max Reitz (7):
  curl: Keep pointer to the CURLState in CURLSocket
  curl: Keep *socket until the end of curl_sock_cb()
  curl: Check completion in curl_multi_do()
  curl: Pass CURLSocket to curl_multi_do()
  curl: Report only ready sockets
  curl: Handle success in multi_check_completion
  curl: Check curl_multi_add_handle()'s return code

 block/curl.c | 133 +++++++++++++++++++++++----------------------------
 1 file changed, 59 insertions(+), 74 deletions(-)

-- 
2.21.0
On 9/10/19 8:41 AM, Max Reitz wrote:
> [...]

And for 4-7:

Reviewed-by: John Snow <jsnow@redhat.com>
On 10.09.19 14:41, Max Reitz wrote:
> [...]

Thanks for the review, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max