[PATCH v4 00/23] backup performance: block_status + async

Vladimir Sementsov-Ogievskiy posted 23 patches 3 years, 2 months ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20210116214705.822267-1-vsementsov@virtuozzo.com
Maintainers: Xie Changlong <xiechanglong.d@gmail.com>, Kevin Wolf <kwolf@redhat.com>, Markus Armbruster <armbru@redhat.com>, Eric Blake <eblake@redhat.com>, Max Reitz <mreitz@redhat.com>, Wen Congyang <wencongyang2@huawei.com>, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>, John Snow <jsnow@redhat.com>
[PATCH v4 00/23] backup performance: block_status + async
Posted by Vladimir Sementsov-Ogievskiy 3 years, 2 months ago
Hi Max!
I applied my series on top of your 129-fixing series and found that 129
fails for backup. And setting a small max-chunk, and even max-workers to 1,
doesn't help! (Setting speed as in v3 still helps.)

And I found that the problem is that the whole backup job really runs
during the drain, because in the new architecture we just do job_yield()
during the whole background block-copy.

This led me to modify the existing patch in the series that does
job_enter() from job_user_pause(): we just need to call job_enter() from
job_pause() to cover not only user pauses but also drained_begin().

So now I don't need any additional fix for 129.
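
For reference, the change has roughly this shape (an illustrative sketch
of the idea behind patch 09, not the exact code):

    void job_pause(Job *job)
    {
        job->pause_count++;
        /* Wake the job so it reaches a pause point even while it is
         * yielding for the whole background block-copy. */
        if (!job->paused) {
            job_enter(job);
        }
    }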

Changes in v4:
- add a lot of Max's r-b's, thanks!

03: fix over-80 line (in comment), add r-b
09: was "[PATCH v3 10/25] job: call job_enter from job_user_pause",
    now changed to finally fix 129 iotest, drop r-b

10: squash-in additional wording on max-chunk, fix error message, keep r-b
17: drop extra include, assert job_is_cancelled() instead of check, add r-b
18: adjust commit message, add r-b
23: add comments and an assertion about the fact that the test doesn't
    support paths with a colon inside
    fix s/disable-copy-range/use-copy-range/

Vladimir Sementsov-Ogievskiy (23):
  qapi: backup: add perf.use-copy-range parameter
  block/block-copy: More explicit call_state
  block/block-copy: implement block_copy_async
  block/block-copy: add max_chunk and max_workers parameters
  block/block-copy: add list of all call-states
  block/block-copy: add ratelimit to block-copy
  block/block-copy: add block_copy_cancel
  blockjob: add set_speed to BlockJobDriver
  job: call job_enter from job_pause
  qapi: backup: add max-chunk and max-workers to x-perf struct
  iotests: 56: prepare for backup over block-copy
  iotests: 185: prepare for backup over block-copy
  iotests: 219: prepare for backup over block-copy
  iotests: 257: prepare for backup over block-copy
  block/block-copy: make progress_bytes_callback optional
  block/backup: drop extra gotos from backup_run()
  backup: move to block-copy
  qapi: backup: disable copy_range by default
  block/block-copy: drop unused block_copy_set_progress_callback()
  block/block-copy: drop unused argument of block_copy()
  simplebench/bench_block_job: use correct shebang line with python3
  simplebench: bench_block_job: add cmd_options argument
  simplebench: add bench-backup.py

 qapi/block-core.json                   |  28 ++-
 block/backup-top.h                     |   1 +
 include/block/block-copy.h             |  61 ++++-
 include/block/block_int.h              |   3 +
 include/block/blockjob_int.h           |   2 +
 block/backup-top.c                     |   6 +-
 block/backup.c                         | 233 ++++++++++++-------
 block/block-copy.c                     | 227 +++++++++++++++---
 block/replication.c                    |   2 +
 blockdev.c                             |  14 ++
 blockjob.c                             |   6 +
 job.c                                  |   3 +
 scripts/simplebench/bench-backup.py    | 167 ++++++++++++++
 scripts/simplebench/bench-example.py   |   2 +-
 scripts/simplebench/bench_block_job.py |  13 +-
 tests/qemu-iotests/056                 |   9 +-
 tests/qemu-iotests/109.out             |  24 ++
 tests/qemu-iotests/185                 |   3 +-
 tests/qemu-iotests/185.out             |   3 +-
 tests/qemu-iotests/219                 |  13 +-
 tests/qemu-iotests/257                 |   1 +
 tests/qemu-iotests/257.out             | 306 ++++++++++++-------------
 22 files changed, 830 insertions(+), 297 deletions(-)
 create mode 100755 scripts/simplebench/bench-backup.py

-- 
2.29.2


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 16.01.21 22:46, Vladimir Sementsov-Ogievskiy wrote:
> Hi Max!
> I applied my series on top of your 129-fixing series and found that 129
> fails for backup. And setting a small max-chunk, and even max-workers to
> 1, doesn't help! (Setting speed as in v3 still helps.)
> 
> And I found that the problem is that the whole backup job really runs
> during the drain, because in the new architecture we just do job_yield()
> during the whole background block-copy.
> 
> This led me to modify the existing patch in the series that does
> job_enter() from job_user_pause(): we just need to call job_enter() from
> job_pause() to cover not only user pauses but also drained_begin().
> 
> So now I don't need any additional fix for 129.
> 
> Changes in v4:
> - add a lot of Max's r-b's, thanks!
> 
> 03: fix over-80 line (in comment), add r-b
> 09: was "[PATCH v3 10/25] job: call job_enter from job_user_pause",
>     now changed to finally fix 129 iotest, drop r-b
> 
> 10: squash-in additional wording on max-chunk, fix error message, keep r-b
> 17: drop extra include, assert job_is_cancelled() instead of check, add r-b
> 18: adjust commit message, add r-b
> 23: add comments and an assertion about the fact that the test doesn't
>     support paths with a colon inside
>     fix s/disable-copy-range/use-copy-range/

Hmmm, for me, 129 sometimes fails still, because it completes too 
quickly...  (The error then is that 'return[0]' does not exist in 
query-block-jobs’s result, because the job is already gone.)

When I insert a print(result) after the query-block-jobs, I can see that 
the job has always progressed really far, even if it's still running. 
(Like, generally the offset is just one MB shy of 1G.)

I suspect the problem is that block-copy just copies too much from the 
start (by default); i.e., it starts 64 workers with, hm, well, 1 MB of 
chunk size?  Shouldn’t fill the 128 MB immediately...

Anyway, limiting the number of workers (to 1) and the chunk size (to 
64k) with x-perf does ensure that the backup job’s progress is limited 
to 1 MB or so, which looks fine to me.

I suppose we should do that, then (in 129), before patch 17?

(PS: I can also see a MacOS failure in iotest 256.  I suspect it’s 
related to this series, because 256 is a backup test (with iothreads), 
but I’m not sure yet.  The log is here:

https://cirrus-ci.com/task/5276331753603072
)

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Vladimir Sementsov-Ogievskiy 3 years, 2 months ago
19.01.2021 21:40, Max Reitz wrote:
[...]
> 
> Hmmm, for me, 129 sometimes fails still, because it completes too quickly...  (The error then is that 'return[0]' does not exist in query-block-jobs’s result, because the job is already gone.)
> 
> When I insert a print(result) after the query-block-jobs, I can see that the job has always progressed really far, even if it's still running. (Like, generally the offset is just one MB shy of 1G.)
> 
> I suspect the problem is that block-copy just copies too much from the start (by default); i.e., it starts 64 workers with, hm, well, 1 MB of chunk size?  Shouldn’t fill the 128 MB immediately...
> 
> Anyway, limiting the number of workers (to 1) and the chunk size (to 64k) with x-perf does ensure that the backup job’s progress is limited to 1 MB or so, which looks fine to me.
> 
> I suppose we should do that, then (in 129), before patch 17?

Yes, that sounds reasonable

> 
> (PS: I can also see a MacOS failure in iotest 256.  I suspect it’s related to this series, because 256 is a backup test (with iothreads), but I’m not sure yet.  The log is here:
> 
> https://cirrus-ci.com/task/5276331753603072
> )
> 

qemu received signal 31?

googling for MacOS...

  31    SIGUSR2      terminate process    User defined signal 2


And what does it mean? Something interesting may be in the qemu log...


-- 
Best regards,
Vladimir

Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Vladimir Sementsov-Ogievskiy 3 years, 2 months ago
19.01.2021 22:22, Vladimir Sementsov-Ogievskiy wrote:
[...]
>>
>> Anyway, limiting the number of workers (to 1) and the chunk size (to 64k) with x-perf does ensure that the backup job’s progress is limited to 1 MB or so, which looks fine to me.
>>
>> I suppose we should do that, then (in 129), before patch 17?
> 
> Yes, that sounds reasonable

Still, maybe keeping the number of workers >1 is good too, to test the new architecture. Just make it 4 or 8.

-- 
Best regards,
Vladimir

Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 19.01.21 20:22, Vladimir Sementsov-Ogievskiy wrote:
> 19.01.2021 21:40, Max Reitz wrote:
[...]
>>
>> (PS: I can also see a MacOS failure in iotest 256.  I suspect it’s 
>> related to this series, because 256 is a backup test (with iothreads), 
>> but I’m not sure yet.  The log is here:
>>
>> https://cirrus-ci.com/task/5276331753603072
>> )
>>
> 
> qemu received signal 31 ?
> 
> googling for MacOS...
> 
>   31    SIGUSR2      terminate process    User defined signal 2

coroutine-sigaltstack uses SIGUSR2 to set up new coroutines.  Perhaps 
it’s unrelated to backup?  Guess I’ll just run the test one more time. O:)

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 20.01.21 11:39, Max Reitz wrote:
[...]
> coroutine-sigaltstack uses SIGUSR2 to set up new coroutines.  Perhaps 
> it’s unrelated to backup?  Guess I’ll just run the test one more time. O:)

I ran it again, got the same error.  There is no error on master, or 
before backup uses block_copy.

I’m trying to run a test directly on the “move to block-copy” commit, 
but so far Cirrus doesn’t seem to want me to do another test run right now.

(Though I’m pretty sure if there is no error before the block-copy 
commit, then using block-copy must be the problem.  The remaining 
patches in my block branch are just disabling copy_range, some clean-up, 
the simplebench patches, the locking code error reporting change, and a 
new iotest.)

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 20.01.21 14:50, Max Reitz wrote:
[...]
> 
> I ran it again, got the same error.  There is no error on master, or 
> before backup uses block_copy.
> 
> I’m trying to run a test directly on the “move to block-copy” commit, 
> but so far Cirrus doesn’t seem to want me to do another test run right now.
> 
> (Though I’m pretty sure if there is no error before the block-copy 
> commit, then using block-copy must be the problem.  The remaining 
> patches in my block branch are just disabling copy_range, some clean-up, 
> the simplebench patches, the locking code error reporting change, and a 
> new iotest.)

I was able to reproduce the signal on Linux, by passing
--with-coroutine=sigaltstack to configure (sometimes takes like 10 runs 
to fail, though).  “move to block-copy” is indeed the first failing commit.

Letting qemu_coroutine_new() set up an aborting signal handler for
SIGUSR2, instead of restoring the default action (which terminates the
process without a core dump), yields this backtrace:

Thread 1:

#0  0x00007f61870ba9d5 in raise () at /lib64/libc.so.6
#1  0x00007f61870a38a4 in abort () at /lib64/libc.so.6
#2  0x0000562cd2be7350 in sigusr2_handler (s=<optimized out>) at 
../util/coroutine-sigaltstack.c:151
#3  0x00007f6187cee1e0 in <signal handler called> () at 
/lib64/libpthread.so.0
#4  0x00007f61870bad8a in sigsuspend () at /lib64/libc.so.6
#5  0x0000562cd30d9e48 in qemu_coroutine_new () at 
../util/coroutine-sigaltstack.c:220
#6  0x0000562cd3106206 in qemu_coroutine_create
     (entry=entry@entry=0x562cd303c950 <block_copy_async_co_entry>, 
opaque=opaque@entry=0x7f6170002c00)
     at ../util/qemu-coroutine.c:75
#7  0x0000562cd303cd0e in block_copy_async
     (s=0x562cd4f94400, offset=offset@entry=0, bytes=67108864, 
max_workers=64, max_chunk=0, cb=cb@entry=0x562cd301dfe0 
<backup_block_copy_callback>, cb_opaque=0x562cd6256940) at 
../block/block-copy.c:752
#8  0x0000562cd301dd4e in backup_loop (job=<optimized out>) at 
../block/backup.c:156
#9  backup_run (job=0x562cd6256940, errp=<optimized out>) at 
../block/backup.c:299
#10 0x0000562cd2fc84d2 in job_co_entry (opaque=0x562cd6256940) at 
../job.c:911
#11 0x0000562cd30da04b in coroutine_bootstrap 
(self=self@entry=0x562cd6262ef0, co=co@entry=0x562cd6262ef0)
     at ../util/coroutine-sigaltstack.c:105
#12 0x0000562cd30da0e1 in coroutine_trampoline (signal=<optimized out>) 
at ../util/coroutine-sigaltstack.c:146
#13 0x00007f6187cee1e0 in <signal handler called> () at 
/lib64/libpthread.so.0
#14 0x00007f61870bad8a in sigsuspend () at /lib64/libc.so.6

Thread 2:

(gdb) bt
#0  0x00007f6187cee2b6 in __libc_sigaction () at /lib64/libpthread.so.0
#1  0x0000562cd30d9dc7 in qemu_coroutine_new () at 
../util/coroutine-sigaltstack.c:194
#2  0x0000562cd3106206 in qemu_coroutine_create
     (entry=entry@entry=0x562cd3016c20 <aio_task_co>, 
opaque=opaque@entry=0x7f616c0063d0)
     at ../util/qemu-coroutine.c:75
#3  0x0000562cd3016dd4 in aio_task_pool_start_task (pool=0x7f616c0067d0, 
task=0x7f616c0063d0)
     at ../block/aio_task.c:94
#4  0x0000562cd303c193 in block_copy_task_run (task=<optimized out>, 
pool=<optimized out>)
     at ../block/block-copy.c:330
#5  block_copy_dirty_clusters (call_state=0x7f616c002c00) at 
../block/block-copy.c:646
#6  block_copy_common (call_state=0x7f616c002c00) at 
../block/block-copy.c:696
#7  0x0000562cd30da04b in coroutine_bootstrap 
(self=self@entry=0x7f616c003660, co=co@entry=0x7f616c003660)
     at ../util/coroutine-sigaltstack.c:105
#8  0x0000562cd30da0e1 in coroutine_trampoline (signal=<optimized out>) 
at ../util/coroutine-sigaltstack.c:146
#9  0x00007f6187cee1e0 in <signal handler called> () at 
/lib64/libpthread.so.0
#10 0x00007f61870bad8a in sigsuspend () at /lib64/libc.so.6
#11 0x0000562cd30d9f61 in qemu_coroutine_new () at 
../util/coroutine-sigaltstack.c:254
#12 0x0000562cd61b40d0 in  ()
#13 0x00007f6163fff930 in  ()
#14 0x00007f6187cecc5f in annobin_pt_longjmp.c_end () at 
/lib64/libpthread.so.0
#15 0x0000562cd30da00d in qemu_coroutine_switch
     (from_=0x562cd5cc3400, to_=0x7f616c003108, 
action=action@entry=COROUTINE_YIELD)
     at ../util/coroutine-sigaltstack.c:284
#16 0x0000562cd31065f0 in qemu_coroutine_yield () at 
../util/qemu-coroutine.c:193
#17 0x0000562cd2fc751d in job_do_yield (job=0x562cd5cc3400, 
ns=18446744073709551615) at ../job.c:478
#18 0x0000562cd2fc855f in job_yield (job=0x562cd5cc3400) at ../job.c:525
#19 0x0000562cd301dd74 in backup_loop (job=<optimized out>) at 
../block/backup.c:164
#20 backup_run (job=0x562cd5cc3400, errp=<optimized out>) at 
../block/backup.c:299
#21 0x0000562cd2fc84d2 in job_co_entry (opaque=0x562cd5cc3400) at 
../job.c:911
#22 0x0000562cd30da04b in coroutine_bootstrap 
(self=self@entry=0x562cd61b40d0, co=co@entry=0x562cd61b40d0)
     at ../util/coroutine-sigaltstack.c:105
#23 0x0000562cd30da0e1 in coroutine_trampoline (signal=<optimized out>) 
at ../util/coroutine-sigaltstack.c:146
#24 0x00007f6187cee1e0 in <signal handler called> () at 
/lib64/libpthread.so.0
#25 0x00007f61870bad8a in sigsuspend () at /lib64/libc.so.6
#26 0x0000000000000001 in  ()
#27 0x0000000000000000 in  ()

From a glance, it looks to me like two coroutines are created 
simultaneously in two threads, and so one thread sets up a special 
SIGUSR2 action, then another reverts SIGUSR2 to the default, and then 
the first one kills itself with SIGUSR2.

Not sure what this has to do with backup, though it is interesting that 
backup_loop() runs in two threads.  So perhaps some AioContext problem.

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 20.01.21 15:34, Max Reitz wrote:
[...]
> From a glance, it looks to me like two coroutines are created 
> simultaneously in two threads, and so one thread sets up a special 
> SIGUSR2 action, then another reverts SIGUSR2 to the default, and then 
> the first one kills itself with SIGUSR2.
> 
> Not sure what this has to do with backup, though it is interesting that 
> backup_loop() runs in two threads.  So perhaps some AioContext problem.

Oh, 256 runs two backups concurrently.  So it isn’t that interesting, 
but perhaps part of the problem still.  (I have no idea, still looking.)

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 20.01.21 15:44, Max Reitz wrote:
> On 20.01.21 15:34, Max Reitz wrote:

[...]

>> From a glance, it looks to me like two coroutines are created 
>> simultaneously in two threads, and so one thread sets up a special 
>> SIGUSR2 action, then another reverts SIGUSR2 to the default, and then 
>> the first one kills itself with SIGUSR2.
>>
>> Not sure what this has to do with backup, though it is interesting 
>> that backup_loop() runs in two threads.  So perhaps some AioContext 
>> problem.
> 
> Oh, 256 runs two backups concurrently.  So it isn’t that interesting, 
> but perhaps part of the problem still.  (I have no idea, still looking.)

So this is what I found out:

coroutine-sigaltstack, when creating a new coroutine, sets up a signal 
handler for SIGUSR2, then kills itself with SIGUSR2, then uses the 
signal handler context (with a sigaltstack) for the new coroutine, and 
then (the signal handler returns after a sigsetjmp()) the old SIGUSR2 
behavior is restored.

What I fail to understand is how this is thread-safe.  Setting up signal 
handlers is a process-wide action.  When one thread changes what SIGUSR2 
does, this will affect all threads immediately, so when two threads run 
coroutine-sigaltstack’s qemu_coroutine_new() concurrently, and one 
thread reverts to the default action before the other has SIGUSR2’ed 
itself, that later SIGUSR2 will kill the whole process.

(I suppose it gets even more interesting when one thread has set up the 
sigaltstack, then the other sets up its own sigaltstack, and then both 
kill themselves with SIGUSR2, so both coroutines get the same stack...)
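
To make the race concrete, here is a standalone model of that pattern
(my own simplification, not QEMU code): two threads repeatedly install
a SIGUSR2 handler, signal themselves, and restore the previous action.
Without the mutex, one thread can restore the default action right
before the other delivers SIGUSR2 to itself, terminating the whole
process:

    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>

    static pthread_mutex_t sigusr2_lock = PTHREAD_MUTEX_INITIALIZER;

    static void handler(int sig)
    {
        (void)sig;  /* stand-in for the coroutine trampoline */
    }

    static void *spawner(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 100000; i++) {
            struct sigaction sa, osa;

            memset(&sa, 0, sizeof(sa));
            sa.sa_handler = handler;
            sigemptyset(&sa.sa_mask);

            /* comment out the lock/unlock pair to reproduce the crash */
            pthread_mutex_lock(&sigusr2_lock);
            sigaction(SIGUSR2, &sa, &osa);   /* process-wide change */
            pthread_kill(pthread_self(), SIGUSR2);
            sigaction(SIGUSR2, &osa, NULL);  /* restore previous action */
            pthread_mutex_unlock(&sigusr2_lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;

        pthread_create(&a, NULL, spawner, NULL);
        pthread_create(&b, NULL, spawner, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        puts("survived");
        return 0;
    }

Built with cc -pthread, this survives as written; with the lock/unlock
pair commented out, it eventually dies from SIGUSR2 (“User defined
signal 2”), just like the failing test run.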

I have no idea why this has never been hit before, but it makes sense 
why block-copy backup makes it apparent: It creates 64+x coroutines in a 
very short time span, and 256 makes it do so in two threads concurrently 
(thanks to launching two backups in two AioContexts in a transaction).

So...  Looks to me like a bug in coroutine-sigaltstack.  Not sure what 
to do now, though.  I don’t think we can use block-copy for backup 
before that backend is fixed.  (And that is assuming that it’s indeed 
coroutine-sigaltstack’s fault.)

I’ll try to add some locking, see what it does, and send a mail 
concerning coroutine-sigaltstack to qemu-devel.

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 20.01.21 16:53, Max Reitz wrote:
[...]
> I’ll try to add some locking, see what it does, and send a mail 
> concerning coroutine-sigaltstack to qemu-devel.

Yeah, adding a mutex around the section where SIGUSR2 handling and the 
sigaltstack are non-default makes the problem go away.

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Daniel P. Berrangé 3 years, 2 months ago
On Wed, Jan 20, 2021 at 04:53:26PM +0100, Max Reitz wrote:
[...]
> So...  Looks to me like a bug in coroutine-sigaltstack.  Not sure what to do
> now, though.  I don’t think we can use block-copy for backup before that
> backend is fixed.  (And that is assuming that it’s indeed
> coroutine-sigaltstack’s fault.)
> 
> I’ll try to add some locking, see what it does, and send a mail concerning
> coroutine-sigaltstack to qemu-devel.

I'm wondering if we should simply remove the sigaltstack impl and use
ucontext on MacOS too.

MacOS has ucontext marked as deprecated by default, seemingly because
this functionality was deprecated by POSIX. The functionality is still
available without deprecation warnings if you set _XOPEN_SOURCE.

IOW, it is trivial to make the ucontext impl work on MacOS simply by
adding

 #define _XOPEN_SOURCE 600

before including ucontext.h in coroutine-ucontext.c, and removing the
restrictions in configure



diff --git a/configure b/configure
index 881af4b6be..a58bdf70f3 100755
--- a/configure
+++ b/configure
@@ -4822,8 +4822,9 @@ fi
 # specific one.
 
 ucontext_works=no
-if test "$darwin" != "yes"; then
+
   cat > $TMPC << EOF
+#define _XOPEN_SOURCE 600
 #include <ucontext.h>
 #ifdef __stub_makecontext
 #error Ignoring glibc stub makecontext which will always fail
@@ -4833,7 +4834,6 @@ EOF
   if compile_prog "" "" ; then
     ucontext_works=yes
   fi
-fi
 
 if test "$coroutine" = ""; then
   if test "$mingw32" = "yes"; then
diff --git a/util/coroutine-ucontext.c b/util/coroutine-ucontext.c
index 904b375192..9c0a2cf85c 100644
--- a/util/coroutine-ucontext.c
+++ b/util/coroutine-ucontext.c
@@ -22,6 +22,7 @@
 #ifdef _FORTIFY_SOURCE
 #undef _FORTIFY_SOURCE
 #endif
+#define _XOPEN_SOURCE 600
 #include "qemu/osdep.h"
 #include <ucontext.h>
 #include "qemu/coroutine_int.h"



Furthermore, for iOS there was a proposal to add support for using
libucontext, which provides a clean impl of the ucontext APIs for x86
and aarch64 hosts.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 20.01.21 17:04, Daniel P. Berrangé wrote:
[...]
> I'm wondering if we should simply remove the sigaltstack impl and use
> ucontext on MacOS too.
> 
> MacOS has ucontext marked as deprecated by default, seemingly because
> this functionality was deprecated by POSIX. The functionality is still
> available without deprecation warnings if you set _XOPEN_SOURCE.

From my outside point of view (on coroutine backends), everything you 
wrote sounds like a very reasonable thing to do.  So perhaps we 
should (I’m not the right person to decide that, though).

However, for me, the immediate question is what I can do now.  I naively 
believe that dropping the sigaltstack implementation would require a 
deprecation cycle.  (If it doesn’t and we can get it out in, say, a 
week, great.)

I need something that helps right now, because I have Vladimir’s series 
in my block branch, the failure doesn’t seem to be its fault, but I 
can’t send a pull request as long as concurrent backups in two iothreads 
make qemu effectively crash when using a specific coroutine backend. 
(And I don’t see configure giving me a warning that choosing sigaltstack 
might be a bad idea.)

I suppose I hope that the prospect of wanting to drop sigaltstack 
altogether may lessen the resistance to just wrapping most of its 
qemu_coroutine_new() implementation in a lock until then...  (i.e., what 
the RFC does that I’ve attached to
https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg05164.html)

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Max Reitz 3 years, 2 months ago
On 16.01.21 22:46, Vladimir Sementsov-Ogievskiy wrote:
> Hi Max!
> I applied my series on top of your 129-fixing series and found that 129
> fails for backup. And setting a small max-chunk, and even max-workers to
> 1, doesn't help! (Setting speed as in v3 still helps.)
> 
> And I found that the problem is that the whole backup job really runs
> during the drain, because in the new architecture we just do job_yield()
> during the whole background block-copy.

OK, so as it was in v3, the job was drained, but since it was already 
yielding while block-copy was running in the background, nothing 
happened; the block-copy completed and only then was the job woken (and 
then there was no reason to pause, because it was done already).

So now the job is entered on drain, too (not only user pauses), which 
means that it gets a chance to pause background requests.

In backup’s case, that means invoking job_yield() again, which sets a 
job_pause_point(), which will cancel the block-copy.  Once the job is 
unpaused (re-entered by job_resume()), backup sees block-copy is 
cancelled (and finished), leaves the loop, and retries with a new 
block-copy call.
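
In code, I picture the loop roughly like this (a sketch from my reading
of the series, using its public block-copy calls; the pause check and
the error handling are illustrative, not the exact backup.c):

    static int coroutine_fn backup_loop(BackupBlockJob *job)
    {
        BlockCopyCallState *s;
        bool cancelled;
        int ret;

        for (;;) {
            /* Start the whole copy as one background call... */
            s = block_copy_async(job->bcs, 0, job->len,
                                 job->perf.max_workers, job->perf.max_chunk,
                                 backup_block_copy_callback, job);

            /* ...and yield while it runs.  job_pause() now enters the
             * job, so a drain wakes us up here. */
            while (!block_copy_call_finished(s)) {
                job_yield(&job->common.job);
                if (job->common.job.pause_count) {   /* illustrative */
                    block_copy_call_cancel(s);
                }
            }

            cancelled = block_copy_call_cancelled(s);
            ret = block_copy_call_succeeded(s) ? 0 : -EIO;  /* simplified */
            block_copy_call_free(s);

            if (!cancelled) {
                return ret;
            }

            /* Actually pause here, then retry with a new call. */
            job_pause_point(&job->common.job);
        }
    }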

I think I got it now.


So all that’s left is issuing a thanks to you – thanks! – and announcing 
that I’ve applied this series to my block branch (with s/not 
unsupported/not supported/ in patch 23):

https://git.xanclic.moe/XanClic/qemu/commits/branch/block

Max


Re: [PATCH v4 00/23] backup performance: block_status + async
Posted by Vladimir Sementsov-Ogievskiy 3 years, 2 months ago
18.01.2021 18:07, Max Reitz wrote:
[...]
> So all that’s left is issuing a thanks to you – thanks! – and announcing that I’ve applied this series to my block branch (with s/not unsupported/not supported/ in patch 23):
> 
> https://git.xanclic.moe/XanClic/qemu/commits/branch/block
> 

Great! Thanks!


-- 
Best regards,
Vladimir