block/mirror.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
Some Proxmox users reported an occasional assertion failure [0][1] in
busy VMs when using drive mirror with active mode. In particular, the
failure may occur for zero writes shorter than the job granularity:
> #0 0x00007b421154b507 in abort ()
> #1 0x00007b421154b420 in ?? ()
> #2 0x0000641c582e061f in bitmap_set (map=0x7b4204014e00, start=14, nr=-1)
> #3 0x0000641c58062824 in do_sync_target_write (job=0x641c7e73d1e0,
> method=MIRROR_METHOD_ZERO, offset=852480, bytes=4096, qiov=0x0, flags=0)
> #4 0x0000641c58062250 in bdrv_mirror_top_do_write (bs=0x641c7e62e1f0,
method=MIRROR_METHOD_ZERO, copy_to_target=true, offset=852480,
bytes=4096, qiov=0x0, flags=0)
> #5 0x0000641c58061f31 in bdrv_mirror_top_pwrite_zeroes (bs=0x641c7e62e1f0,
offset=852480, bytes=4096, flags=0)
The range for the dirty bitmap described by dirty_bitmap_offset and
dirty_bitmap_end is narrower than the original range and in fact,
dirty_bitmap_end might be smaller than dirty_bitmap_offset. There
already is a check for 'dirty_bitmap_offset < dirty_bitmap_end' before
resetting the dirty bitmap. Add such a check for setting the zero
bitmap too, which uses the same narrower range.
[0]: https://forum.proxmox.com/threads/177981/
[1]: https://bugzilla.proxmox.com/show_bug.cgi?id=7222
Cc: qemu-stable@nongnu.org
Fixes: 7e277545b9 ("mirror: Skip writing zeroes when target is already zero")
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
block/mirror.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/block/mirror.c b/block/mirror.c
index b344182c74..bc982cb99a 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -1514,9 +1514,12 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
assert(!qiov);
ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags);
if (job->zero_bitmap && ret >= 0) {
- bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
- (dirty_bitmap_end - dirty_bitmap_offset) /
- job->granularity);
+ if (dirty_bitmap_offset < dirty_bitmap_end) {
+ bitmap_set(job->zero_bitmap,
+ dirty_bitmap_offset / job->granularity,
+ (dirty_bitmap_end - dirty_bitmap_offset) /
+ job->granularity);
+ }
}
break;
--
2.47.3
On Mon, Jan 12, 2026 at 04:23:51PM +0100, Fiona Ebner wrote:
> Some Proxmox users reported an occasional assertion failure [0][1] in
> busy VMs when using drive mirror with active mode. In particular, the
> failure may occur for zero writes shorter than the job granularity:
>
> > #0 0x00007b421154b507 in abort ()
> > #1 0x00007b421154b420 in ?? ()
> > #2 0x0000641c582e061f in bitmap_set (map=0x7b4204014e00, start=14, nr=-1)
> > #3 0x0000641c58062824 in do_sync_target_write (job=0x641c7e73d1e0,
> > method=MIRROR_METHOD_ZERO, offset=852480, bytes=4096, qiov=0x0, flags=0)
> > #4 0x0000641c58062250 in bdrv_mirror_top_do_write (bs=0x641c7e62e1f0,
> method=MIRROR_METHOD_ZERO, copy_to_target=true, offset=852480,
> bytes=4096, qiov=0x0, flags=0)
> > #5 0x0000641c58061f31 in bdrv_mirror_top_pwrite_zeroes (bs=0x641c7e62e1f0,
> offset=852480, bytes=4096, flags=0)
>
> The range for the dirty bitmap described by dirty_bitmap_offset and
> dirty_bitmap_end is narrower than the original range and in fact,
> dirty_bitmap_end might be smaller than dirty_bitmap_offset. There
> already is a check for 'dirty_bitmap_offset < dirty_bitmap_end' before
> resetting the dirty bitmap. Add such a check for setting the zero
> bitmap too, which uses the same narrower range.
>
> [0]: https://forum.proxmox.com/threads/177981/
> [1]: https://bugzilla.proxmox.com/show_bug.cgi?id=7222
>
> Cc: qemu-stable@nongnu.org
> Fixes: 7e277545b9 ("mirror: Skip writing zeroes when target is already zero")
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
> block/mirror.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/block/mirror.c b/block/mirror.c
> index b344182c74..bc982cb99a 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -1514,9 +1514,12 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
> assert(!qiov);
> ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags);
> if (job->zero_bitmap && ret >= 0) {
> - bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
> - (dirty_bitmap_end - dirty_bitmap_offset) /
> - job->granularity);
> + if (dirty_bitmap_offset < dirty_bitmap_end) {
> + bitmap_set(job->zero_bitmap,
> + dirty_bitmap_offset / job->granularity,
> + (dirty_bitmap_end - dirty_bitmap_offset) /
> + job->granularity);
> + }
Why does this case clause use dirty_bitmap_offset and dirty_bitmap_end
instead of zero_bitmap_offset and zero_bitmap_end like the other
zero_bitmap operations in this switch statement?
if (job->zero_bitmap && ret >= 0) {
- bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
- (dirty_bitmap_end - dirty_bitmap_offset) /
- job->granularity);
+ bitmap_set(job->zero_bitmap, zero_bitmap_offset,
+ zero_bitmap_end - zero_bitmap_offset);
}
I'm probably missing something, but it's not obvious to me :).
Stefan
Am 19.01.26 um 9:10 PM schrieb Stefan Hajnoczi:
> On Mon, Jan 12, 2026 at 04:23:51PM +0100, Fiona Ebner wrote:
>> Some Proxmox users reported an occasional assertion failure [0][1] in
>> busy VMs when using drive mirror with active mode. In particular, the
>> failure may occur for zero writes shorter than the job granularity:
>>
>>> #0 0x00007b421154b507 in abort ()
>>> #1 0x00007b421154b420 in ?? ()
>>> #2 0x0000641c582e061f in bitmap_set (map=0x7b4204014e00, start=14, nr=-1)
>>> #3 0x0000641c58062824 in do_sync_target_write (job=0x641c7e73d1e0,
>>> method=MIRROR_METHOD_ZERO, offset=852480, bytes=4096, qiov=0x0, flags=0)
>>> #4 0x0000641c58062250 in bdrv_mirror_top_do_write (bs=0x641c7e62e1f0,
>> method=MIRROR_METHOD_ZERO, copy_to_target=true, offset=852480,
>> bytes=4096, qiov=0x0, flags=0)
>>> #5 0x0000641c58061f31 in bdrv_mirror_top_pwrite_zeroes (bs=0x641c7e62e1f0,
>> offset=852480, bytes=4096, flags=0)
>>
>> The range for the dirty bitmap described by dirty_bitmap_offset and
>> dirty_bitmap_end is narrower than the original range and in fact,
>> dirty_bitmap_end might be smaller than dirty_bitmap_offset. There
>> already is a check for 'dirty_bitmap_offset < dirty_bitmap_end' before
>> resetting the dirty bitmap. Add such a check for setting the zero
>> bitmap too, which uses the same narrower range.
>>
>> [0]: https://forum.proxmox.com/threads/177981/
>> [1]: https://bugzilla.proxmox.com/show_bug.cgi?id=7222
>>
>> Cc: qemu-stable@nongnu.org
>> Fixes: 7e277545b9 ("mirror: Skip writing zeroes when target is already zero")
>> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> ---
>> block/mirror.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/mirror.c b/block/mirror.c
>> index b344182c74..bc982cb99a 100644
>> --- a/block/mirror.c
>> +++ b/block/mirror.c
>> @@ -1514,9 +1514,12 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
>> assert(!qiov);
>> ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags);
>> if (job->zero_bitmap && ret >= 0) {
>> - bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
>> - (dirty_bitmap_end - dirty_bitmap_offset) /
>> - job->granularity);
>> + if (dirty_bitmap_offset < dirty_bitmap_end) {
>> + bitmap_set(job->zero_bitmap,
>> + dirty_bitmap_offset / job->granularity,
>> + (dirty_bitmap_end - dirty_bitmap_offset) /
>> + job->granularity);
>> + }
>
> Why does this case clause use dirty_bitmap_offset and dirty_bitmap_end
> instead of zero_bitmap_offset and zero_bitmap_end like the other
> zero_bitmap operations in this switch statement?
>
> if (job->zero_bitmap && ret >= 0) {
> - bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
> - (dirty_bitmap_end - dirty_bitmap_offset) /
> - job->granularity);
> + bitmap_set(job->zero_bitmap, zero_bitmap_offset,
> + zero_bitmap_end - zero_bitmap_offset);
> }
>
> I'm probably missing something, but it's not obvious to me :).
Sorry, I could've mentioned this in the commit message. There is a
comment explaining it further up in the function:
> /*
> * Tails are either clean or shrunk, so for dirty bitmap resetting
> * we safely align the range narrower. But for zero bitmap, round
> * range wider for checking or clearing, and narrower for setting.
> */
> dirty_bitmap_offset = QEMU_ALIGN_UP(offset, job->granularity);
> dirty_bitmap_end = QEMU_ALIGN_DOWN(offset + bytes, job->granularity);
> if (dirty_bitmap_offset < dirty_bitmap_end) {
> bdrv_reset_dirty_bitmap(job->dirty_bitmap, dirty_bitmap_offset,
> dirty_bitmap_end - dirty_bitmap_offset);
> }
> zero_bitmap_offset = offset / job->granularity;
> zero_bitmap_end = DIV_ROUND_UP(offset + bytes, job->granularity);
On Tue, Jan 20, 2026 at 09:57:42AM +0100, Fiona Ebner wrote:
> Am 19.01.26 um 9:10 PM schrieb Stefan Hajnoczi:
> > On Mon, Jan 12, 2026 at 04:23:51PM +0100, Fiona Ebner wrote:
> >> Some Proxmox users reported an occasional assertion failure [0][1] in
> >> busy VMs when using drive mirror with active mode. In particular, the
> >> failure may occur for zero writes shorter than the job granularity:
> >>
> >>> #0 0x00007b421154b507 in abort ()
> >>> #1 0x00007b421154b420 in ?? ()
> >>> #2 0x0000641c582e061f in bitmap_set (map=0x7b4204014e00, start=14, nr=-1)
> >>> #3 0x0000641c58062824 in do_sync_target_write (job=0x641c7e73d1e0,
> >>> method=MIRROR_METHOD_ZERO, offset=852480, bytes=4096, qiov=0x0, flags=0)
> >>> #4 0x0000641c58062250 in bdrv_mirror_top_do_write (bs=0x641c7e62e1f0,
> >> method=MIRROR_METHOD_ZERO, copy_to_target=true, offset=852480,
> >> bytes=4096, qiov=0x0, flags=0)
> >>> #5 0x0000641c58061f31 in bdrv_mirror_top_pwrite_zeroes (bs=0x641c7e62e1f0,
> >> offset=852480, bytes=4096, flags=0)
> >>
> >> The range for the dirty bitmap described by dirty_bitmap_offset and
> >> dirty_bitmap_end is narrower than the original range and in fact,
> >> dirty_bitmap_end might be smaller than dirty_bitmap_offset. There
> >> already is a check for 'dirty_bitmap_offset < dirty_bitmap_end' before
> >> resetting the dirty bitmap. Add such a check for setting the zero
> >> bitmap too, which uses the same narrower range.
> >>
> >> [0]: https://forum.proxmox.com/threads/177981/
> >> [1]: https://bugzilla.proxmox.com/show_bug.cgi?id=7222
> >>
> >> Cc: qemu-stable@nongnu.org
> >> Fixes: 7e277545b9 ("mirror: Skip writing zeroes when target is already zero")
> >> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> >> ---
> >> block/mirror.c | 9 ++++++---
> >> 1 file changed, 6 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/block/mirror.c b/block/mirror.c
> >> index b344182c74..bc982cb99a 100644
> >> --- a/block/mirror.c
> >> +++ b/block/mirror.c
> >> @@ -1514,9 +1514,12 @@ do_sync_target_write(MirrorBlockJob *job, MirrorMethod method,
> >> assert(!qiov);
> >> ret = blk_co_pwrite_zeroes(job->target, offset, bytes, flags);
> >> if (job->zero_bitmap && ret >= 0) {
> >> - bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
> >> - (dirty_bitmap_end - dirty_bitmap_offset) /
> >> - job->granularity);
> >> + if (dirty_bitmap_offset < dirty_bitmap_end) {
> >> + bitmap_set(job->zero_bitmap,
> >> + dirty_bitmap_offset / job->granularity,
> >> + (dirty_bitmap_end - dirty_bitmap_offset) /
> >> + job->granularity);
> >> + }
> >
> > Why does this case clause use dirty_bitmap_offset and dirty_bitmap_end
> > instead of zero_bitmap_offset and zero_bitmap_end like the other
> > zero_bitmap operations in this switch statement?
> >
> > if (job->zero_bitmap && ret >= 0) {
> > - bitmap_set(job->zero_bitmap, dirty_bitmap_offset / job->granularity,
> > - (dirty_bitmap_end - dirty_bitmap_offset) /
> > - job->granularity);
> > + bitmap_set(job->zero_bitmap, zero_bitmap_offset,
> > + zero_bitmap_end - zero_bitmap_offset);
> > }
> >
> > I'm probably missing something, but it's not obvious to me :).
>
> Sorry, I could've mentioned this in the commit message. There is a
> comment explaining it further up in the function:
>
> > /*
> > * Tails are either clean or shrunk, so for dirty bitmap resetting
> > * we safely align the range narrower. But for zero bitmap, round
> > * range wider for checking or clearing, and narrower for setting.
> > */
Thanks!
Stefan
On 12.01.26 18:23, Fiona Ebner wrote:
> Some Proxmox users reported an occasional assertion failure [0][1] in
> busy VMs when using drive mirror with active mode. In particular, the
> failure may occur for zero writes shorter than the job granularity:
>
>> #0 0x00007b421154b507 in abort ()
>> #1 0x00007b421154b420 in ?? ()
>> #2 0x0000641c582e061f in bitmap_set (map=0x7b4204014e00, start=14, nr=-1)
>> #3 0x0000641c58062824 in do_sync_target_write (job=0x641c7e73d1e0,
>> method=MIRROR_METHOD_ZERO, offset=852480, bytes=4096, qiov=0x0, flags=0)
>> #4 0x0000641c58062250 in bdrv_mirror_top_do_write (bs=0x641c7e62e1f0,
> method=MIRROR_METHOD_ZERO, copy_to_target=true, offset=852480,
> bytes=4096, qiov=0x0, flags=0)
>> #5 0x0000641c58061f31 in bdrv_mirror_top_pwrite_zeroes (bs=0x641c7e62e1f0,
> offset=852480, bytes=4096, flags=0)
>
> The range for the dirty bitmap described by dirty_bitmap_offset and
> dirty_bitmap_end is narrower than the original range and in fact,
> dirty_bitmap_end might be smaller than dirty_bitmap_offset. There
> already is a check for 'dirty_bitmap_offset < dirty_bitmap_end' before
> resetting the dirty bitmap. Add such a check for setting the zero
> bitmap too, which uses the same narrower range.
>
> [0]: https://forum.proxmox.com/threads/177981/
> [1]: https://bugzilla.proxmox.com/show_bug.cgi?id=7222
>
> Cc: qemu-stable@nongnu.org
> Fixes: 7e277545b9 ("mirror: Skip writing zeroes when target is already zero")
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Thanks for fixing! Adding a test would be great still.
Would be good if someone take it into larger PULL request, but if not I'll do.
Queued in my block branch.
--
Best regards,
Vladimir
This tests the scenario fixed by "block/mirror: check range
when setting zero bitmap for sync write" [0].
[0] https://lore.kernel.org/qemu-devel/20260112152544.261923-1-f.ebner@proxmox.com/
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
tests/qemu-iotests/151 | 20 ++++++++++++++++++++
tests/qemu-iotests/151.out | 4 ++--
2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/tests/qemu-iotests/151 b/tests/qemu-iotests/151
index 06ee3585db..9b9c815db5 100755
--- a/tests/qemu-iotests/151
+++ b/tests/qemu-iotests/151
@@ -191,6 +191,26 @@ class TestActiveMirror(iotests.QMPTestCase):
self.potential_writes_in_flight = False
+ def testUnalignedSmallerThanGranularityWriteZeroes(self):
+ # Fill the source image
+ self.vm.hmp_qemu_io('source', 'write -P 1 0 %i' % self.image_len);
+
+ # Start the block job
+ self.vm.cmd('blockdev-mirror',
+ job_id='mirror',
+ filter_node_name='mirror-node',
+ device='source-node',
+ target='target-node',
+ sync='full',
+ copy_mode='write-blocking')
+
+ # Wait for the READY event
+ self.wait_ready(drive='mirror')
+
+ for offset in range(6 * self.image_len // 8, 7 * self.image_len // 8, 1024 * 1024):
+ self.vm.hmp_qemu_io('source', 'aio_write -z %i 512' % (offset + 512))
+
+ self.complete_and_wait(drive='mirror', wait_ready=False)
class TestThrottledWithNbdExportBase(iotests.QMPTestCase):
image_len = 128 * 1024 * 1024 # MB
diff --git a/tests/qemu-iotests/151.out b/tests/qemu-iotests/151.out
index 3f8a935a08..2f7d3902f2 100644
--- a/tests/qemu-iotests/151.out
+++ b/tests/qemu-iotests/151.out
@@ -1,5 +1,5 @@
-......
+.......
----------------------------------------------------------------------
-Ran 6 tests
+Ran 7 tests
OK
--
2.47.3
On 20.01.26 14:38, Fiona Ebner wrote: > This tests the scenario fixed by "block/mirror: check range > when setting zero bitmap for sync write" [0]. > > [0]https://lore.kernel.org/qemu-devel/20260112152544.261923-1-f.ebner@proxmox.com/ > > Signed-off-by: Fiona Ebner<f.ebner@proxmox.com> Thanks! Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> and queued. -- Best regards, Vladimir
© 2016 - 2026 Red Hat, Inc.