[Qemu-devel] [PATCH] blockjob: fix user pause in block_job_error_action

Vladimir Sementsov-Ogievskiy posted 1 patch 5 years, 1 month ago
Test docker-clang@ubuntu passed
Test asan passed
Test checkpatch passed
Test docker-mingw@fedora passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20190319092442.14335-1-vsementsov@virtuozzo.com
Maintainers: Kevin Wolf <kwolf@redhat.com>, John Snow <jsnow@redhat.com>, Max Reitz <mreitz@redhat.com>
blockjob.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
[Qemu-devel] [PATCH] blockjob: fix user pause in block_job_error_action
Posted by Vladimir Sementsov-Ogievskiy 5 years, 1 month ago
Job (especially mirror) may call block_job_error_action several
times before actual pause if it has several in-flight requests.

block_job_error_action will call job_pause more than once in this case,
which lead to following block-job-resume qmp command can't actually
resume the job.

Fix it by do not increase pause level in block_job_error_action if
user_paused already set.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

Hi all!

I have a reproducer, however, I don't know how to make a test from it,
as I don't know a way to create a size-limited directory without root
access.

So here is it:

    #!/usr/bin/bash

    set -e

    MNT=/mirror-test
    A=$MNT/a
    B=$MNT/b
    STONE=$MNT/stone

    mkdir $MNT
    mount -t tmpfs -o size=11M tmpfs $MNT
    dd if=/dev/urandom of=$A bs=1M count=5
    dd if=/dev/urandom of=$STONE bs=1M count=5

    (
    cat <<QMP
    {"execute": "qmp_capabilities"} 
    {"execute": "drive-mirror", "arguments": {"device": "drive0", "sync": "full", "target": "$B", "on-target-error": "enospc"}} 
    QMP

    sleep 2
    rm -rf $STONE

    cat <<QMP
    {"execute": "query-block-jobs"} 
    {"execute": "block-job-resume", "arguments": {"device": "drive0"}} 
    QMP

    sleep 2
    cat <<QMP
    {"execute": "query-block-jobs"} 
    {"execute": "quit"} 
    QMP
    ) | ./x86_64-softmmu/qemu-system-x86_64 -nographic -qmp stdio -nodefaults -serial none -drive file=$A,id=drive0

    umount $MNT
    rmdir $MNT


 blockjob.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 58de8cb024..730101d282 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -501,9 +501,11 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
                                         action);
     }
     if (action == BLOCK_ERROR_ACTION_STOP) {
-        job_pause(&job->job);
-        /* make the pause user visible, which will be resumed from QMP. */
-        job->job.user_paused = true;
+        if (!job->job.user_paused) {
+            job_pause(&job->job);
+            /* make the pause user visible, which will be resumed from QMP. */
+            job->job.user_paused = true;
+        }
         block_job_iostatus_set_err(job, error);
     }
     return action;
-- 
2.18.0


Re: [Qemu-devel] [PATCH] blockjob: fix user pause in block_job_error_action
Posted by Kevin Wolf 5 years, 1 month ago
Am 19.03.2019 um 10:24 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Job (especially mirror) may call block_job_error_action several
> times before actual pause if it has several in-flight requests.
> 
> block_job_error_action will call job_pause more than once in this case,
> which lead to following block-job-resume qmp command can't actually
> resume the job.
> 
> Fix it by do not increase pause level in block_job_error_action if
> user_paused already set.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Thanks, applied to the block branch. (But please follow up with a test
case anyway.)

> Hi all!
> 
> I have a reproducer, however, I don't know how to make a test from it,
> as I don't know a way to create a size-limited directory without root
> access.

Maybe we can use the 'size' option of the raw format driver below a
qcow2 layer to achieve something like this? You can update the size at
runtime using 'reopen -o size=...' in HMP qemu-io.

file <- raw [size=11M] <- qcow2 [size=1G]

This should return an error after writing the first 11M (including
metadata), and once you resize the raw layer, you should be able to
resume the job.

Kevin

Re: [Qemu-devel] [PATCH] blockjob: fix user pause in block_job_error_action
Posted by Vladimir Sementsov-Ogievskiy 5 years, 1 month ago
19.03.2019 14:01, Kevin Wolf wrote:
> Am 19.03.2019 um 10:24 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Job (especially mirror) may call block_job_error_action several
>> times before actual pause if it has several in-flight requests.
>>
>> block_job_error_action will call job_pause more than once in this case,
>> which lead to following block-job-resume qmp command can't actually
>> resume the job.
>>
>> Fix it by do not increase pause level in block_job_error_action if
>> user_paused already set.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> 
> Thanks, applied to the block branch. (But please follow up with a test
> case anyway.)
> 
>> Hi all!
>>
>> I have a reproducer, however, I don't know how to make a test from it,
>> as I don't know a way to create a size-limited directory without root
>> access.
> 
> Maybe we can use the 'size' option of the raw format driver below a
> qcow2 layer to achieve something like this? You can update the size at
> runtime using 'reopen -o size=...' in HMP qemu-io.

Cool that works, thanks!

> 
> file <- raw [size=11M] <- qcow2 [size=1G]
> 
> This should return an error after writing the first 11M (including
> metadata), and once you resize the raw layer, you should be able to
> resume the job.
> 
> Kevin
> 


-- 
Best regards,
Vladimir
Re: [Qemu-devel] [PATCH] blockjob: fix user pause in block_job_error_action
Posted by John Snow 5 years, 1 month ago

On 3/19/19 5:24 AM, Vladimir Sementsov-Ogievskiy wrote:
> Job (especially mirror) may call block_job_error_action several
> times before actual pause if it has several in-flight requests.
> 
> block_job_error_action will call job_pause more than once in this case,
> which lead to following block-job-resume qmp command can't actually
> resume the job.
> 
> Fix it by do not increase pause level in block_job_error_action if
> user_paused already set.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Not that you need it now, but:

Reviewed-by: John Snow <jsnow@redhat.com>

Good find.

--js