On 4/8/21 12:20 PM, Max Reitz wrote:
> Hi,
>
> See patch 1 for a detailed explanation of the problem.
>
> The gist is: Draining a READY job makes it transition to STANDBY, and
> jobs on STANDBY cannot be completed. Ending the drained section will
> schedule the job (so it is then resumed), but not wait until it is
> actually running again.
>
> Therefore, it can happen that issuing block-job-complete fails when you
> issue it right after some draining operation.
>
> I tried to come up with an iotest reproducer, but in the end I only got
> something that reproduced the issue like 2/10 times, and it required
> heavy I/O, so it is nothing I would like to have as part of the iotests.
> Instead, I opted for a unit test, which allows me to cheat a bit
> (specifically, locking the job IO thread before ending the drained
> section).
>
>
> Max Reitz (3):
>   job: Add job_wait_unpaused() for block-job-complete
>   test-blockjob: Test job_wait_unpaused()
>   iotests/041: block-job-complete on user-paused job
>
>  include/qemu/job.h         |  15 ++++
>  blockdev.c                 |   3 +
>  job.c                      |  42 +++++++++++
>  tests/unit/test-blockjob.c | 140 +++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/041     |  13 +++-
>  5 files changed, 212 insertions(+), 1 deletion(-)
>
I left review comments on patch 1 and skimmed patches 2 and 3. I'm not
sure yet whether this is appropriate for 6.0; that may depend on the
responses to my comments and on what other reviewers say.
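
For anyone skimming the thread, the race from the cover letter can be
modelled with a small toy in Python. This is only a sketch and not QEMU
code: ToyJob, drain_begin(), drain_end(), complete(), wait_unpaused() and
the gate event are all hypothetical names made up for illustration, and
the real job_wait_unpaused() in patch 1 may well work differently. The
gate plays a role similar to the trick in the unit test: it holds back
the resume so that the failure is reproduced deterministically.

```python
import threading
from enum import Enum


class Status(Enum):
    READY = "ready"
    STANDBY = "standby"


class ToyJob:
    """Toy stand-in for a block job; not the QEMU Job API."""

    def __init__(self):
        self._cond = threading.Condition()
        self.status = Status.READY

    def drain_begin(self):
        # Draining a READY job moves it to STANDBY.
        with self._cond:
            self.status = Status.STANDBY

    def drain_end(self, gate):
        # Only *schedules* the resume and returns immediately.
        # The gate models the delay until the job actually runs again.
        def resume():
            gate.wait()
            with self._cond:
                self.status = Status.READY
                self._cond.notify_all()

        worker = threading.Thread(target=resume)
        worker.start()
        return worker

    def complete(self):
        # Completion is refused while the job is on STANDBY.
        with self._cond:
            if self.status is not Status.READY:
                raise RuntimeError("job is not ready")
            return "completed"

    def wait_unpaused(self, timeout=5.0):
        # Block until the scheduled resume has really happened.
        with self._cond:
            return self._cond.wait_for(
                lambda: self.status is Status.READY, timeout)


def demo():
    job = ToyJob()
    gate = threading.Event()

    job.drain_begin()
    worker = job.drain_end(gate)   # resume scheduled, not yet run

    try:
        early = job.complete()     # issued right after the drain
    except RuntimeError:
        early = "failed"

    gate.set()                     # let the job run again
    resumed = job.wait_unpaused()  # wait before retrying
    late = job.complete()
    worker.join()
    return early, resumed, late


if __name__ == "__main__":
    print(demo())
```

In this toy, completing right after the drained section fails, while
waiting for the resume first lets the same call succeed.
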
Acked-by: John Snow <jsnow@redhat.com>