[Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash

Max Reitz posted 7 patches 4 years, 6 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20190910124136.10565-1-mreitz@redhat.com
Maintainers: Max Reitz <mreitz@redhat.com>, Kevin Wolf <kwolf@redhat.com>
block/curl.c | 133 +++++++++++++++++++++++----------------------------
1 file changed, 59 insertions(+), 74 deletions(-)
[Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash
Posted by Max Reitz 4 years, 6 months ago
Hi,

As reported in https://bugzilla.redhat.com/show_bug.cgi?id=1740193, our
curl block driver can spontaneously hang.  This becomes visible e.g.
when reading compressed qcow2 images:

$ qemu-img convert -p -O raw -n \
  https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \
  null-co://

(Hangs at 74.21 %, usually.)

A more direct way is:

$ qemu-img bench -f raw http://download.qemu.org/qemu-4.1.0.tar.xz \
    -d 1 -S 524288 -c 2

(This simply performs two requests, and the second one hangs.  You can
use any HTTP resource you’d like that is at least 1 MB in size; FTP
probably works, too.)

It turns out that this is because cURL 7.59.0 has added a protective
feature against some misuse we had in our code: curl_multi_add_handle()
must not be called from within a cURL callback, but in some cases we
did so.  As of 7.59.0, such a call fails, so our new request is not
registered and the I/O request stalls.  This is fixed by patch 6.
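
To make the failure mode concrete, here is a toy model in plain C.  It
is only a sketch: ToyMulti, toy_multi_add() and toy_multi_dispatch()
are made-up stand-ins, not libcurl or QEMU code, and deferring the add
until the callback has returned is just one way to stay out of the
callback, not necessarily what patch 6 does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a multi handle; NOT libcurl's real data structure. */
typedef struct ToyMulti {
    bool in_callback;   /* set while a user callback is running */
    int registered;     /* number of transfers successfully added */
    int pending;        /* adds deferred until the callback has returned */
} ToyMulti;

enum { TOY_OK = 0, TOY_RECURSIVE_CALL = 1 };

/* Models the guard: adding a transfer from inside a callback is refused. */
static int toy_multi_add(ToyMulti *m)
{
    if (m->in_callback) {
        return TOY_RECURSIVE_CALL;
    }
    m->registered++;
    return TOY_OK;
}

typedef void ToyCallback(ToyMulti *m);

/* Runs a callback the way a library would, then flushes deferred adds. */
static void toy_multi_dispatch(ToyMulti *m, ToyCallback *cb)
{
    m->in_callback = true;
    cb(m);
    m->in_callback = false;

    /* We are outside of the callback again, so adding is allowed. */
    assert(!m->in_callback);
    while (m->pending > 0) {
        if (toy_multi_add(m) == TOY_OK) {
            m->pending--;
        }
    }
}

/* Problematic pattern: start the next request from within the callback
 * and ignore the result.  The request is silently dropped. */
static void cb_add_directly(ToyMulti *m)
{
    (void)toy_multi_add(m);
}

/* Safer pattern: only note that a request should be started later. */
static void cb_defer_add(ToyMulti *m)
{
    m->pending++;
}
```

Dispatching cb_add_directly() leaves m.registered at 0, which is the
analogue of the stalled I/O request; dispatching cb_defer_add() ends
with m.registered == 1 and nothing pending.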

Patch 7 makes us check curl_multi_add_handle()’s return code,
because if we had done that before, debugging would have been much
simpler.
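
As a rough illustration of why the check helps, the sketch below
contrasts an unchecked and a checked start of a request.  All names
(SketchRequest, sketch_register(), start_checked(), ...) are
hypothetical and exist only for this example; sketch_register() merely
stands in for a registration call that can be refused, and -EIO is an
assumed choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical request states for this sketch. */
typedef enum { REQ_QUEUED, REQ_IN_FLIGHT, REQ_FAILED } ReqStatus;

typedef struct SketchRequest {
    ReqStatus status;
    int error;          /* negative errno once the request has failed */
} SketchRequest;

/* Stand-in for a registration call that may be refused.
 * Returns 0 on success, non-zero on failure. */
static int sketch_register(bool refuse)
{
    return refuse ? 1 : 0;
}

/* Unchecked variant: the request is assumed to be in flight no matter
 * what, so a refused registration leaves it waiting forever. */
static void start_unchecked(SketchRequest *req, bool refuse)
{
    assert(req != NULL);
    (void)sketch_register(refuse);
    req->status = REQ_IN_FLIGHT;    /* wrong if registration was refused */
}

/* Checked variant: a refused registration fails the request right away
 * and leaves a message pointing at the cause. */
static int start_checked(SketchRequest *req, bool refuse)
{
    int rc;

    assert(req != NULL);
    rc = sketch_register(refuse);
    if (rc != 0) {
        fprintf(stderr, "registering request failed (code %d)\n", rc);
        req->status = REQ_FAILED;
        req->error = -EIO;
        return -EIO;
    }

    req->status = REQ_IN_FLIGHT;
    req->error = 0;
    return 0;
}
```

With a refused registration, start_unchecked() still reports the
request as in flight, while start_checked() returns -EIO and marks it
as failed, so the caller sees an error instead of a hang.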


On the way to fixing it, I had a look over the whole cURL code and found
a suspicious QLIST_FOREACH_SAFE() loop that actually does not seem very
safe at all.  I think this may lead to crashes, although I have never
seen any myself.  https://bugzilla.redhat.com/show_bug.cgi?id=1744602#c5
shows one in exactly the function in question, so I think it actually is
a problem.

This is fixed by patch 5; patches 1, 2, and 4 prepare for it.
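
For illustration, here is one general way in which a *_FOREACH_SAFE
style loop can be unsafe: the macro only protects against removal of
the current element, not of the cached successor.  This is a
self-contained toy (Node, TOY_FOREACH_SAFE, toy_remove() and toy_walk()
are invented for the example), and it is an assumption that this
matches the exact problem in block/curl.c.  Removal only sets a flag
instead of calling free(), so that the stale access can be counted
without undefined behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Node {
    struct Node *next;
    int id;
    bool removed;   /* stands in for free(): must not be touched anymore */
} Node;

/* Same idea as a *_FOREACH_SAFE macro: fetch the successor before the
 * loop body runs, so that the body may remove the current element. */
#define TOY_FOREACH_SAFE(var, head, next_var)                   \
    for ((var) = (head);                                        \
         (var) && ((next_var) = (var)->next, 1);                \
         (var) = (next_var))

/* Unlinks victim from the list and marks it as removed. */
static void toy_remove(Node **head, Node *victim)
{
    for (Node **p = head; *p; p = &(*p)->next) {
        if (*p == victim) {
            *p = victim->next;
            victim->removed = true;
            return;
        }
    }
}

/* Walks the list; while visiting the node with ID trigger_id, the loop
 * body removes victim.  Returns how many already-removed nodes were
 * visited (each one would be a use-after-free with a real free()). */
static int toy_walk(Node **head, int trigger_id, Node *victim)
{
    Node *n, *next;
    int stale_visits = 0;

    TOY_FOREACH_SAFE(n, *head, next) {
        if (n->removed) {
            stale_visits++;
            continue;
        }
        if (n->id == trigger_id) {
            toy_remove(head, victim);
        }
    }

    return stale_visits;
}
```

On a list 1 -> 2 -> 3, removing node 2 while visiting node 2 is fine
(no stale visit), but removing node 2 while visiting node 1 makes the
loop step onto the removed node, because it was already cached as the
successor.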

(Patch 3 is kind of a misc patch that should ensure that we always end
up calling curl_multi_check_completion() whenever a request might have
been completed.)


v2:
- Patch 2: Remove the socket from the list only at the end of the
           function (yielding a nicer 5+/5- diff stat)
- Patch 3: Added
- Patch 4: Rebased on patch 3, and s/socket/ready_socket/ in one place
- Patch 5: Rebased on the changed patch 4


git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/7:[----] [--] 'curl: Keep pointer to the CURLState in CURLSocket'
002/7:[0007] [FC] 'curl: Keep *socket until the end of curl_sock_cb()'
003/7:[down] 'curl: Check completion in curl_multi_do()'
004/7:[0019] [FC] 'curl: Pass CURLSocket to curl_multi_{do,read}()'
005/7:[0002] [FC] 'curl: Report only ready sockets'
006/7:[----] [--] 'curl: Handle success in multi_check_completion'
007/7:[----] [--] 'curl: Check curl_multi_add_handle()'s return code'


Max Reitz (7):
  curl: Keep pointer to the CURLState in CURLSocket
  curl: Keep *socket until the end of curl_sock_cb()
  curl: Check completion in curl_multi_do()
  curl: Pass CURLSocket to curl_multi_do()
  curl: Report only ready sockets
  curl: Handle success in multi_check_completion
  curl: Check curl_multi_add_handle()'s return code

 block/curl.c | 133 +++++++++++++++++++++++----------------------------
 1 file changed, 59 insertions(+), 74 deletions(-)

-- 
2.21.0


Re: [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash
Posted by John Snow 4 years, 6 months ago

On 9/10/19 8:41 AM, Max Reitz wrote:
> [...]

And for 4-7:

Reviewed-by: John Snow <jsnow@redhat.com>

Re: [Qemu-devel] [PATCH v2 0/7] block/curl: Fix hang and potential crash
Posted by Max Reitz 4 years, 6 months ago
On 10.09.19 14:41, Max Reitz wrote:
> [...]

Thanks for the review, applied to my block branch:

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max