Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1367369#c8 It's possible to wedge QEMU if the guest tries to reset a virtio-pci device as QEMU is also using the drive for a blockjob. This patchset aims to allow us to safely pause/resume jobs attached to individual nodes in a manner similar to how bdrv_drain_all_begin/end do. John Snow (3): blockjob: add block_job_start_shim block-backend: add drained_begin / drained_end ops blockjob: add devops to blockjob backends block/block-backend.c | 24 ++++++++++++++++-- blockjob.c | 55 +++++++++++++++++++++++++++++++++--------- include/sysemu/block-backend.h | 8 ++++++ 3 files changed, 73 insertions(+), 14 deletions(-) -- 2.9.3
Hi, This series seems to have some coding style problems. See output below for more information: Type: series Subject: [Qemu-devel] [PATCH v2 0/3] block: pause block jobs for bdrv_drain_begin/end Message-id: 20170316212351.13797-1-jsnow@redhat.com === TEST SCRIPT BEGIN === #!/bin/bash BASE=base n=1 total=$(git log --oneline $BASE.. | wc -l) failed=0 # Useful git options git config --local diff.renamelimit 0 git config --local diff.renames True commits="$(git log --format=%H --reverse $BASE..)" for c in $commits; do echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then failed=1 echo fi n=$((n+1)) done exit $failed === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu * [new tag] patchew/20170316212351.13797-1-jsnow@redhat.com -> patchew/20170316212351.13797-1-jsnow@redhat.com Switched to a new branch 'test' 1cca6f3 blockjob: add devops to blockjob backends 864d906 block-backend: add drained_begin / drained_end ops 5e4f22d blockjob: add block_job_start_shim === OUTPUT BEGIN === Checking PATCH 1/3: blockjob: add block_job_start_shim... Checking PATCH 2/3: block-backend: add drained_begin / drained_end ops... ERROR: suspect code indent for conditional statements (8, 14) #70: FILE: block/block-backend.c:1903: + if (blk->dev_ops && blk->dev_ops->drained_end) { + blk->dev_ops->drained_end(blk->dev_opaque); total: 1 errors, 0 warnings, 67 lines checked Your patch has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. Checking PATCH 3/3: blockjob: add devops to blockjob backends... === OUTPUT END === Test command exited with code: 1 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-devel@freelists.org
ping, is this the only issue? Any feedback? If this can hit 2.9 that would be good. --js On 03/16/2017 05:28 PM, no-reply@patchew.org wrote: > Hi, > > This series seems to have some coding style problems. See output below for > more information: > > Type: series > Subject: [Qemu-devel] [PATCH v2 0/3] block: pause block jobs for bdrv_drain_begin/end > Message-id: 20170316212351.13797-1-jsnow@redhat.com > > === TEST SCRIPT BEGIN === > #!/bin/bash > > BASE=base > n=1 > total=$(git log --oneline $BASE.. | wc -l) > failed=0 > > # Useful git options > git config --local diff.renamelimit 0 > git config --local diff.renames True > > commits="$(git log --format=%H --reverse $BASE..)" > for c in $commits; do > echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." > if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then > failed=1 > echo > fi > n=$((n+1)) > done > > exit $failed > === TEST SCRIPT END === > > Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 > From https://github.com/patchew-project/qemu > * [new tag] patchew/20170316212351.13797-1-jsnow@redhat.com -> patchew/20170316212351.13797-1-jsnow@redhat.com > Switched to a new branch 'test' > 1cca6f3 blockjob: add devops to blockjob backends > 864d906 block-backend: add drained_begin / drained_end ops > 5e4f22d blockjob: add block_job_start_shim > > === OUTPUT BEGIN === > Checking PATCH 1/3: blockjob: add block_job_start_shim... > Checking PATCH 2/3: block-backend: add drained_begin / drained_end ops... > ERROR: suspect code indent for conditional statements (8, 14) > #70: FILE: block/block-backend.c:1903: > + if (blk->dev_ops && blk->dev_ops->drained_end) { > + blk->dev_ops->drained_end(blk->dev_opaque); > > total: 1 errors, 0 warnings, 67 lines checked > > Your patch has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. > > Checking PATCH 3/3: blockjob: add devops to blockjob backends... > === OUTPUT END === > > Test command exited with code: 1 > > > --- > Email generated automatically by Patchew [http://patchew.org/]. > Please send your feedback to patchew-devel@freelists.org >
On Wed, Mar 22, 2017 at 11:37:15AM -0400, John Snow wrote: > ping, is this the only issue? Any feedback? If this can hit 2.9 that > would be good. > The series looks fine to me, and I can patch up the nit from patchew when applying. But do you happen to have a qemu-iotest for this case, or is it not very feasible to create one? > > On 03/16/2017 05:28 PM, no-reply@patchew.org wrote: > > Hi, > > > > This series seems to have some coding style problems. See output below for > > more information: > > > > Type: series > > Subject: [Qemu-devel] [PATCH v2 0/3] block: pause block jobs for bdrv_drain_begin/end > > Message-id: 20170316212351.13797-1-jsnow@redhat.com > > > > === TEST SCRIPT BEGIN === > > #!/bin/bash > > > > BASE=base > > n=1 > > total=$(git log --oneline $BASE.. | wc -l) > > failed=0 > > > > # Useful git options > > git config --local diff.renamelimit 0 > > git config --local diff.renames True > > > > commits="$(git log --format=%H --reverse $BASE..)" > > for c in $commits; do > > echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." > > if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then > > failed=1 > > echo > > fi > > n=$((n+1)) > > done > > > > exit $failed > > === TEST SCRIPT END === > > > > Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 > > From https://github.com/patchew-project/qemu > > * [new tag] patchew/20170316212351.13797-1-jsnow@redhat.com -> patchew/20170316212351.13797-1-jsnow@redhat.com > > Switched to a new branch 'test' > > 1cca6f3 blockjob: add devops to blockjob backends > > 864d906 block-backend: add drained_begin / drained_end ops > > 5e4f22d blockjob: add block_job_start_shim > > > > === OUTPUT BEGIN === > > Checking PATCH 1/3: blockjob: add block_job_start_shim... > > Checking PATCH 2/3: block-backend: add drained_begin / drained_end ops... > > ERROR: suspect code indent for conditional statements (8, 14) > > #70: FILE: block/block-backend.c:1903: > > + if (blk->dev_ops && blk->dev_ops->drained_end) { > > + blk->dev_ops->drained_end(blk->dev_opaque); > > > > total: 1 errors, 0 warnings, 67 lines checked > > > > Your patch has style problems, please review. If any of these errors > > are false positives report them to the maintainer, see > > CHECKPATCH in MAINTAINERS. > > > > Checking PATCH 3/3: blockjob: add devops to blockjob backends... > > === OUTPUT END === > > > > Test command exited with code: 1 > > > > > > --- > > Email generated automatically by Patchew [http://patchew.org/]. > > Please send your feedback to patchew-devel@freelists.org > >
On 03/22/2017 12:01 PM, Jeff Cody wrote: > On Wed, Mar 22, 2017 at 11:37:15AM -0400, John Snow wrote: >> ping, is this the only issue? Any feedback? If this can hit 2.9 that >> would be good. >> > > The series looks fine to me, and I can patch up the nit from patchew when > applying. But do you happen to have a qemu-iotest for this case, or is it > not very feasible to create one? > I might need a hint from Paolo on how; my reproducer ATM is to literally boot a Fedora VM and issue reboots/QMP commands and manually observe a hang. (Which is pretty subjective ...) An iotest version would probably involve using the qtest socket to issue a PCI reset of some sort inbetween QMP commands as necessary, but testing for a hang in iotests seems race-prone. I have no idea how long this hang would last on other machines, for instance. There might be a more fool-proof automated testing method, but at the second I'm drawing a blank. >> >> On 03/16/2017 05:28 PM, no-reply@patchew.org wrote: >>> Hi, >>> >>> This series seems to have some coding style problems. See output below for >>> more information: >>> >>> Type: series >>> Subject: [Qemu-devel] [PATCH v2 0/3] block: pause block jobs for bdrv_drain_begin/end >>> Message-id: 20170316212351.13797-1-jsnow@redhat.com >>> >>> === TEST SCRIPT BEGIN === >>> #!/bin/bash >>> >>> BASE=base >>> n=1 >>> total=$(git log --oneline $BASE.. | wc -l) >>> failed=0 >>> >>> # Useful git options >>> git config --local diff.renamelimit 0 >>> git config --local diff.renames True >>> >>> commits="$(git log --format=%H --reverse $BASE..)" >>> for c in $commits; do >>> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." >>> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then >>> failed=1 >>> echo >>> fi >>> n=$((n+1)) >>> done >>> >>> exit $failed >>> === TEST SCRIPT END === >>> >>> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 >>> From https://github.com/patchew-project/qemu >>> * [new tag] patchew/20170316212351.13797-1-jsnow@redhat.com -> patchew/20170316212351.13797-1-jsnow@redhat.com >>> Switched to a new branch 'test' >>> 1cca6f3 blockjob: add devops to blockjob backends >>> 864d906 block-backend: add drained_begin / drained_end ops >>> 5e4f22d blockjob: add block_job_start_shim >>> >>> === OUTPUT BEGIN === >>> Checking PATCH 1/3: blockjob: add block_job_start_shim... >>> Checking PATCH 2/3: block-backend: add drained_begin / drained_end ops... >>> ERROR: suspect code indent for conditional statements (8, 14) >>> #70: FILE: block/block-backend.c:1903: >>> + if (blk->dev_ops && blk->dev_ops->drained_end) { >>> + blk->dev_ops->drained_end(blk->dev_opaque); >>> >>> total: 1 errors, 0 warnings, 67 lines checked >>> >>> Your patch has style problems, please review. If any of these errors >>> are false positives report them to the maintainer, see >>> CHECKPATCH in MAINTAINERS. >>> >>> Checking PATCH 3/3: blockjob: add devops to blockjob backends... >>> === OUTPUT END === >>> >>> Test command exited with code: 1 >>> >>> >>> --- >>> Email generated automatically by Patchew [http://patchew.org/]. >>> Please send your feedback to patchew-devel@freelists.org >>>
On Wed, Mar 22, 2017 at 12:05:26PM -0400, John Snow wrote: > > > On 03/22/2017 12:01 PM, Jeff Cody wrote: > > On Wed, Mar 22, 2017 at 11:37:15AM -0400, John Snow wrote: > >> ping, is this the only issue? Any feedback? If this can hit 2.9 that > >> would be good. > >> > > > > The series looks fine to me, and I can patch up the nit from patchew when > > applying. But do you happen to have a qemu-iotest for this case, or is it > > not very feasible to create one? > > > > I might need a hint from Paolo on how; my reproducer ATM is to literally > boot a Fedora VM and issue reboots/QMP commands and manually observe a > hang. (Which is pretty subjective ...) > > An iotest version would probably involve using the qtest socket to issue > a PCI reset of some sort inbetween QMP commands as necessary, but > testing for a hang in iotests seems race-prone. I have no idea how long > this hang would last on other machines, for instance. > > There might be a more fool-proof automated testing method, but at the > second I'm drawing a blank. > OK, I figured as much. We'll chalk that up as "not very feasible", at least for the moment :) > >> > >> On 03/16/2017 05:28 PM, no-reply@patchew.org wrote: > >>> Hi, > >>> > >>> This series seems to have some coding style problems. See output below for > >>> more information: > >>> > >>> Type: series > >>> Subject: [Qemu-devel] [PATCH v2 0/3] block: pause block jobs for bdrv_drain_begin/end > >>> Message-id: 20170316212351.13797-1-jsnow@redhat.com > >>> > >>> === TEST SCRIPT BEGIN === > >>> #!/bin/bash > >>> > >>> BASE=base > >>> n=1 > >>> total=$(git log --oneline $BASE.. | wc -l) > >>> failed=0 > >>> > >>> # Useful git options > >>> git config --local diff.renamelimit 0 > >>> git config --local diff.renames True > >>> > >>> commits="$(git log --format=%H --reverse $BASE..)" > >>> for c in $commits; do > >>> echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." > >>> if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then > >>> failed=1 > >>> echo > >>> fi > >>> n=$((n+1)) > >>> done > >>> > >>> exit $failed > >>> === TEST SCRIPT END === > >>> > >>> Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 > >>> From https://github.com/patchew-project/qemu > >>> * [new tag] patchew/20170316212351.13797-1-jsnow@redhat.com -> patchew/20170316212351.13797-1-jsnow@redhat.com > >>> Switched to a new branch 'test' > >>> 1cca6f3 blockjob: add devops to blockjob backends > >>> 864d906 block-backend: add drained_begin / drained_end ops > >>> 5e4f22d blockjob: add block_job_start_shim > >>> > >>> === OUTPUT BEGIN === > >>> Checking PATCH 1/3: blockjob: add block_job_start_shim... > >>> Checking PATCH 2/3: block-backend: add drained_begin / drained_end ops... > >>> ERROR: suspect code indent for conditional statements (8, 14) > >>> #70: FILE: block/block-backend.c:1903: > >>> + if (blk->dev_ops && blk->dev_ops->drained_end) { > >>> + blk->dev_ops->drained_end(blk->dev_opaque); > >>> > >>> total: 1 errors, 0 warnings, 67 lines checked > >>> > >>> Your patch has style problems, please review. If any of these errors > >>> are false positives report them to the maintainer, see > >>> CHECKPATCH in MAINTAINERS. > >>> > >>> Checking PATCH 3/3: blockjob: add devops to blockjob backends... > >>> === OUTPUT END === > >>> > >>> Test command exited with code: 1 > >>> > >>> > >>> --- > >>> Email generated automatically by Patchew [http://patchew.org/]. > >>> Please send your feedback to patchew-devel@freelists.org > >>>
On Thu, Mar 16, 2017 at 02:28:47PM -0700, no-reply@patchew.org wrote: > Hi, > > This series seems to have some coding style problems. See output below for > more information: > > Type: series > Subject: [Qemu-devel] [PATCH v2 0/3] block: pause block jobs for bdrv_drain_begin/end > Message-id: 20170316212351.13797-1-jsnow@redhat.com > > === TEST SCRIPT BEGIN === > #!/bin/bash > > BASE=base > n=1 > total=$(git log --oneline $BASE.. | wc -l) > failed=0 > > # Useful git options > git config --local diff.renamelimit 0 > git config --local diff.renames True > > commits="$(git log --format=%H --reverse $BASE..)" > for c in $commits; do > echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..." > if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then > failed=1 > echo > fi > n=$((n+1)) > done > > exit $failed > === TEST SCRIPT END === > > Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 > From https://github.com/patchew-project/qemu > * [new tag] patchew/20170316212351.13797-1-jsnow@redhat.com -> patchew/20170316212351.13797-1-jsnow@redhat.com > Switched to a new branch 'test' > 1cca6f3 blockjob: add devops to blockjob backends > 864d906 block-backend: add drained_begin / drained_end ops > 5e4f22d blockjob: add block_job_start_shim > > === OUTPUT BEGIN === > Checking PATCH 1/3: blockjob: add block_job_start_shim... > Checking PATCH 2/3: block-backend: add drained_begin / drained_end ops... > ERROR: suspect code indent for conditional statements (8, 14) > #70: FILE: block/block-backend.c:1903: > + if (blk->dev_ops && blk->dev_ops->drained_end) { > + blk->dev_ops->drained_end(blk->dev_opaque); > > total: 1 errors, 0 warnings, 67 lines checked > > Your patch has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. > > Checking PATCH 3/3: blockjob: add devops to blockjob backends... > === OUTPUT END === > > Test command exited with code: 1 > > > --- > Email generated automatically by Patchew [http://patchew.org/]. > Please send your feedback to patchew-devel@freelists.org Fixed the patchew nit, and: Thanks, Applied to my block branch: git://github.com/codyprime/qemu-kvm-jtc.git block -Jeff
On Thu, Mar 16, 2017 at 9:23 PM, John Snow <jsnow@redhat.com> wrote: > Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1367369#c8 > > It's possible to wedge QEMU if the guest tries to reset a virtio-pci > device as QEMU is also using the drive for a blockjob. This patchset > aims to allow us to safely pause/resume jobs attached to individual > nodes in a manner similar to how bdrv_drain_all_begin/end do. Weird, I thought the 0 nanosecond sleep that block jobs do in their main loop allows aio_poll() loops to finish. Maybe this bug is a regression? I've CCed Fam because I think he has also tackled this issue in the past. Stefan
On 23/03/2017 18:44, Stefan Hajnoczi wrote: >> It's possible to wedge QEMU if the guest tries to reset a virtio-pci >> device as QEMU is also using the drive for a blockjob. This patchset >> aims to allow us to safely pause/resume jobs attached to individual >> nodes in a manner similar to how bdrv_drain_all_begin/end do. > > Weird, I thought the 0 nanosecond sleep that block jobs do in their > main loop allows aio_poll() loops to finish. The 0 nanosecond sleep is now done in the BDS AioContext rather than in the "non-aio_poll-aware" main loop: commit 0b9caf9b3166c8deb3c4f3a774c2384b069dc29c Author: Fam Zheng <famz@redhat.com> Date: Tue Aug 26 15:15:43 2014 +0800 coroutine: Drop co_sleep_ns block_job_sleep_ns is the only user. Since we are moving towards AioContext aware code, it's better to use the explicit version and drop the old one. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > I've CCed Fam because I think he has also tackled this issue in the past. Yes, he fixed the same issue for bdrv_drain_all. Paolo
On Thu, Mar 23, 2017 at 06:57:14PM +0100, Paolo Bonzini wrote: > > > On 23/03/2017 18:44, Stefan Hajnoczi wrote: > >> It's possible to wedge QEMU if the guest tries to reset a virtio-pci > >> device as QEMU is also using the drive for a blockjob. This patchset > >> aims to allow us to safely pause/resume jobs attached to individual > >> nodes in a manner similar to how bdrv_drain_all_begin/end do. > > > > Weird, I thought the 0 nanosecond sleep that block jobs do in their > > main loop allows aio_poll() loops to finish. > > The 0 nanosecond sleep is now done in the BDS AioContext rather than in > the "non-aio_poll-aware" main loop: > > commit 0b9caf9b3166c8deb3c4f3a774c2384b069dc29c > Author: Fam Zheng <famz@redhat.com> > Date: Tue Aug 26 15:15:43 2014 +0800 > > coroutine: Drop co_sleep_ns > > block_job_sleep_ns is the only user. Since we are moving towards > AioContext aware code, it's better to use the explicit version and drop > the old one. > > Signed-off-by: Fam Zheng <famz@redhat.com> > Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> But we hold the AioContext lock and are calling aio_poll(), so I would expect our loop to terminate. The blockjob coroutine should still be leaving this little gap in activity during which the aio_poll() loop finishes. Stefan
On Fri, 03/24 15:27, Stefan Hajnoczi wrote: > On Thu, Mar 23, 2017 at 06:57:14PM +0100, Paolo Bonzini wrote: > > > > > > On 23/03/2017 18:44, Stefan Hajnoczi wrote: > > >> It's possible to wedge QEMU if the guest tries to reset a virtio-pci > > >> device as QEMU is also using the drive for a blockjob. This patchset > > >> aims to allow us to safely pause/resume jobs attached to individual > > >> nodes in a manner similar to how bdrv_drain_all_begin/end do. > > > > > > Weird, I thought the 0 nanosecond sleep that block jobs do in their > > > main loop allows aio_poll() loops to finish. > > > > The 0 nanosecond sleep is now done in the BDS AioContext rather than in > > the "non-aio_poll-aware" main loop: > > > > commit 0b9caf9b3166c8deb3c4f3a774c2384b069dc29c > > Author: Fam Zheng <famz@redhat.com> > > Date: Tue Aug 26 15:15:43 2014 +0800 > > > > coroutine: Drop co_sleep_ns > > > > block_job_sleep_ns is the only user. Since we are moving towards > > AioContext aware code, it's better to use the explicit version and drop > > the old one. > > > > Signed-off-by: Fam Zheng <famz@redhat.com> > > Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > > But we hold the AioContext lock and are calling aio_poll(), so I would > expect our loop to terminate. The blockjob coroutine should still be > leaving this little gap in activity during which the aio_poll() loop > finishes. I am not sure, but at each gap, aio_poll() is free to fire the 0 nanosecond sleep timer cb already, which will generate more I/O. The correct thing is, like in this series, set the pause flag. Fam
© 2016 - 2024 Red Hat, Inc.