[PATCH] block-backend: protect setting block root to NULL with block graph write lock

Fiona Ebner posted 1 patch 2 months, 3 weeks ago
block/block-backend.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Fiona Ebner 2 months, 3 weeks ago
Setting blk->root is a graph change operation and thus needs to be
protected by the block graph write lock in blk_remove_bs(). The
assignment to blk->root in blk_insert_bs() is already protected by
the block graph write lock.

In particular, the graph read lock in blk_co_do_flush() could
previously not ensure that blk_bs(blk) would always return the same
value during the locked section, which could lead to a segfault [0] in
combination with migration [1].

From the user-provided backtraces in the forum thread [1], it seems
like blk_co_do_flush() managed to get past the
blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
non-NULL value during the check, but then, when calling
bdrv_co_flush(), blk_bs(blk) returned NULL.

[0]:

> 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
> 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901

[1]: https://forum.proxmox.com/threads/158072

Cc: qemu-stable@nongnu.org
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
---
 block/block-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index c93a7525ad..9678615318 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
      */
     blk_drain(blk);
     root = blk->root;
-    blk->root = NULL;
 
     bdrv_graph_wrlock();
+    blk->root = NULL;
     bdrv_root_unref_child(root);
     bdrv_graph_wrunlock();
 }
-- 
2.39.5
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Kevin Wolf 2 months, 3 weeks ago
Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
> Setting blk->root is a graph change operation and thus needs to be
> protected by the block graph write lock in blk_remove_bs(). The
> assignment to blk->root in blk_insert_bs() is already protected by
> the block graph write lock.

Hm, if that's the case, then we should also enforce this in the
declaration of BlockBackend:

    BdrvChild * GRAPH_RDLOCK_PTR root;

However, this results in more compiler failures that we need to fix. You
caught the only remaining writer, but the lock is only fully effective
if all readers take it, too.

> In particular, the graph read lock in blk_co_do_flush() could
> previously not ensure that blk_bs(blk) would always return the same
> value during the locked section, which could lead to a segfault [0] in
> combination with migration [1].
> 
> From the user-provided backtraces in the forum thread [1], it seems
> like blk_co_do_flush() managed to get past the
> blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
> non-NULL value during the check, but then, when calling
> bdrv_co_flush(), blk_bs(blk) returned NULL.
> 
> [0]:
> 
> > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
> > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901
> 
> [1]: https://forum.proxmox.com/threads/158072
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> ---
>  block/block-backend.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index c93a7525ad..9678615318 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>       */
>      blk_drain(blk);
>      root = blk->root;
> -    blk->root = NULL;
>  
>      bdrv_graph_wrlock();
> +    blk->root = NULL;
>      bdrv_root_unref_child(root);
>      bdrv_graph_wrunlock();
>  }

I think the 'root = blk->root' needs to be inside the locked section,
too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
has a nested event loop) and root would be stale. I assume clang would
complain about this with the added GRAPH_RDLOCK_PTR.

Kevin
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Fiona Ebner 2 months, 3 weeks ago
Am 09.01.25 um 11:47 schrieb Kevin Wolf:
> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
>> Setting blk->root is a graph change operation and thus needs to be
>> protected by the block graph write lock in blk_remove_bs(). The
>> assignment to blk->root in blk_insert_bs() is already protected by
>> the block graph write lock.
> 
> Hm, if that's the case, then we should also enforce this in the
> declaration of BlockBackend:
> 
>     BdrvChild * GRAPH_RDLOCK_PTR root;
> 
> However, this results in more compiler failures that we need to fix. You
> caught the only remaining writer, but the lock is only fully effective
> if all readers take it, too.

I started giving this a try, but quickly ran into some issues/questions:

1. For global state code, is it preferred to use
GRAPH_RDLOCK_GUARD_MAINLOOP() to cover the whole function or better to
use bdrv_graph_rd(un)lock_main_loop() to keep the locked section as
small as necessary? I feel like the former can be more readable, e.g. in
blk_insert_bs(), blk_new_open(), where blk->root is used in conditionals.

2. In particular, protecting blk->root means that blk_bs() needs to have
the read lock. In fact, blk_bs() is reading blk->root twice in a row, so
it seems like it could suffer from a potential NULL pointer dereference
(or I guess after compiler optimization a potential use-after-free)?

Since blk_bs() is IO_CODE() and not a coroutine, I tried to mark it
GRAPH_RDLOCK and move on to the callers.

However, one caller is blk_nb_sectors() which itself is called by
blk_get_geometry(). Both of these are manually-written coroutine wrappers:

> commit 81f730d4d0e8af9c0211c3fedf406df0046341a9
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Fri Apr 7 17:33:03 2023 +0200
> 
>     block, block-backend: write some hot coroutine wrappers by hand
>     
>     The introduction of the graph lock is causing blk_get_geometry, a hot function
>     used in the I/O path, to create a coroutine.  However, the only part that really
>     needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors,
>     which in turn only happens in the rare case of host CD-ROM devices.
>     
>     So, write by hand the three wrappers on the path from blk_co_get_geometry to
>     bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created
>     if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.

Both the blk_bs() and blk_nb_sectors() functions are IO_CODE(), but not
coroutines, and callers of blk_get_geometry are already in the device
code. I'm not sure how to proceed here, happy to hear suggestions :)

---snip---

>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index c93a7525ad..9678615318 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>>       */
>>      blk_drain(blk);
>>      root = blk->root;
>> -    blk->root = NULL;
>>  
>>      bdrv_graph_wrlock();
>> +    blk->root = NULL;
>>      bdrv_root_unref_child(root);
>>      bdrv_graph_wrunlock();
>>  }
> 
> I think the 'root = blk->root' needs to be inside the locked section,
> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
> has a nested event loop) and root would be stale. I assume clang would
> complain about this with the added GRAPH_RDLOCK_PTR.

Oh I see, good catch!

Best Regards,
Fiona
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Kevin Wolf 2 months, 2 weeks ago
Am 10.01.2025 um 17:37 hat Fiona Ebner geschrieben:
> Am 09.01.25 um 11:47 schrieb Kevin Wolf:
> > Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
> >> Setting blk->root is a graph change operation and thus needs to be
> >> protected by the block graph write lock in blk_remove_bs(). The
> >> assignment to blk->root in blk_insert_bs() is already protected by
> >> the block graph write lock.
> > 
> > Hm, if that's the case, then we should also enforce this in the
> > declaration of BlockBackend:
> > 
> >     BdrvChild * GRAPH_RDLOCK_PTR root;
> > 
> > However, this results in more compiler failures that we need to fix. You
> > caught the only remaining writer, but the lock is only fully effective
> > if all readers take it, too.
> 
> I started giving this a try, but quickly ran into some issues/questions:
> 
> 1. For global state code, is it preferred to use
> GRAPH_RDLOCK_GUARD_MAINLOOP() to cover the whole function or better to
> use bdrv_graph_rd(un)lock_main_loop() to keep the locked section as
> small as necessary? I feel like the former can be more readable, e.g. in
> blk_insert_bs(), blk_new_open(), where blk->root is used in conditionals.

Yes, I think we generally use GRAPH_RDLOCK_GUARD_MAINLOOP() if there is
no read not to (e.g. if you need to take a writer lock in another part
of the same function). It's essentially only an assertion anyway, so it
doesn't even actually hold a real lock for longer than necessary.

> 2. In particular, protecting blk->root means that blk_bs() needs to have
> the read lock. In fact, blk_bs() is reading blk->root twice in a row, so
> it seems like it could suffer from a potential NULL pointer dereference
> (or I guess after compiler optimization a potential use-after-free)?
> 
> Since blk_bs() is IO_CODE() and not a coroutine, I tried to mark it
> GRAPH_RDLOCK and move on to the callers.

It looks like blk_bs() is tricky in general... blk_bs() is used in
iothreads in devices. Not sure how easy it would be to trigger a bug
there, but from the locking alone, it seems entirely possible to have
the device iothread race with the main thread changing the root node.

> However, one caller is blk_nb_sectors() which itself is called by
> blk_get_geometry(). Both of these are manually-written coroutine wrappers:
> 
> > commit 81f730d4d0e8af9c0211c3fedf406df0046341a9
> > Author: Paolo Bonzini <pbonzini@redhat.com>
> > Date:   Fri Apr 7 17:33:03 2023 +0200
> > 
> >     block, block-backend: write some hot coroutine wrappers by hand
> >     
> >     The introduction of the graph lock is causing blk_get_geometry, a hot function
> >     used in the I/O path, to create a coroutine.  However, the only part that really
> >     needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors,
> >     which in turn only happens in the rare case of host CD-ROM devices.
> >     
> >     So, write by hand the three wrappers on the path from blk_co_get_geometry to
> >     bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created
> >     if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.
> 
> Both the blk_bs() and blk_nb_sectors() functions are IO_CODE(), but not
> coroutines, and callers of blk_get_geometry are already in the device
> code. I'm not sure how to proceed here, happy to hear suggestions :)

Being in device code is not necessarily a problem, we could be doing
stuff already there if we wanted.

The important question is, what would be the right behaviour? Can any of
the calls run while the writer lock is held?

If not, we can probably assert that and just take a reader lock
(bdrv_graph_co_rdlock() has to run in coroutine context only because it
might have to wait for a writer - if we know that there is none, we
could do it outside of a coroutine).

But if yes, then we either have a real bug that needs to be solved or
the protection is provided by something other than the graph lock. Maybe
the answer is that because of how devices access the field, the graph
lock isn't the right mechanism for protecting BlockBackend.root and we
should rely on draining instead?

We already try to do this:

    blk_drain(blk);
    root = blk->root;
    blk->root = NULL;

It would make more sense for this to be a drained_begin/end section that
quiesces devices while we're changing the root node, otherwise we can
get new requests before blk_drain() returns.

Do you know if the I/O request dereferencing a NULL pointer came from a
device? (The stack trace in your commit message is shortened too much to
tell me a lot, and the link to your forum doesn't allow me to view the
logs without registering an account. Please include more information in
the commit message in v2.)

Kevin
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Fiona Ebner 2 months, 1 week ago
Am 16.01.25 um 16:52 schrieb Kevin Wolf:
> Am 10.01.2025 um 17:37 hat Fiona Ebner geschrieben:
>> Am 09.01.25 um 11:47 schrieb Kevin Wolf:
>>> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
>>>> Setting blk->root is a graph change operation and thus needs to be
>>>> protected by the block graph write lock in blk_remove_bs(). The
>>>> assignment to blk->root in blk_insert_bs() is already protected by
>>>> the block graph write lock.
>>>
>>> Hm, if that's the case, then we should also enforce this in the
>>> declaration of BlockBackend:
>>>
>>>     BdrvChild * GRAPH_RDLOCK_PTR root;
>>>
>>> However, this results in more compiler failures that we need to fix. You
>>> caught the only remaining writer, but the lock is only fully effective
>>> if all readers take it, too.
>>
>> I started giving this a try, but quickly ran into some issues/questions:
>>
>> 1. For global state code, is it preferred to use
>> GRAPH_RDLOCK_GUARD_MAINLOOP() to cover the whole function or better to
>> use bdrv_graph_rd(un)lock_main_loop() to keep the locked section as
>> small as necessary? I feel like the former can be more readable, e.g. in
>> blk_insert_bs(), blk_new_open(), where blk->root is used in conditionals.
> 
> Yes, I think we generally use GRAPH_RDLOCK_GUARD_MAINLOOP() if there is
> no read not to (e.g. if you need to take a writer lock in another part
> of the same function). It's essentially only an assertion anyway, so it
> doesn't even actually hold a real lock for longer than necessary.

Okay, thanks!

>> 2. In particular, protecting blk->root means that blk_bs() needs to have
>> the read lock. In fact, blk_bs() is reading blk->root twice in a row, so
>> it seems like it could suffer from a potential NULL pointer dereference
>> (or I guess after compiler optimization a potential use-after-free)?
>>
>> Since blk_bs() is IO_CODE() and not a coroutine, I tried to mark it
>> GRAPH_RDLOCK and move on to the callers.
> 
> It looks like blk_bs() is tricky in general... blk_bs() is used in
> iothreads in devices. Not sure how easy it would be to trigger a bug
> there, but from the locking alone, it seems entirely possible to have
> the device iothread race with the main thread changing the root node.
> 
>> However, one caller is blk_nb_sectors() which itself is called by
>> blk_get_geometry(). Both of these are manually-written coroutine wrappers:
>>
>>> commit 81f730d4d0e8af9c0211c3fedf406df0046341a9
>>> Author: Paolo Bonzini <pbonzini@redhat.com>
>>> Date:   Fri Apr 7 17:33:03 2023 +0200
>>>
>>>     block, block-backend: write some hot coroutine wrappers by hand
>>>     
>>>     The introduction of the graph lock is causing blk_get_geometry, a hot function
>>>     used in the I/O path, to create a coroutine.  However, the only part that really
>>>     needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors,
>>>     which in turn only happens in the rare case of host CD-ROM devices.
>>>     
>>>     So, write by hand the three wrappers on the path from blk_co_get_geometry to
>>>     bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created
>>>     if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.
>>
>> Both the blk_bs() and blk_nb_sectors() functions are IO_CODE(), but not
>> coroutines, and callers of blk_get_geometry are already in the device
>> code. I'm not sure how to proceed here, happy to hear suggestions :)
> 
> Being in device code is not necessarily a problem, we could be doing
> stuff already there if we wanted.
> 
> The important question is, what would be the right behaviour? Can any of
> the calls run while the writer lock is held?
> 
> If not, we can probably assert that and just take a reader lock
> (bdrv_graph_co_rdlock() has to run in coroutine context only because it
> might have to wait for a writer - if we know that there is none, we
> could do it outside of a coroutine).
> 
> But if yes, then we either have a real bug that needs to be solved or
> the protection is provided by something other than the graph lock. Maybe
> the answer is that because of how devices access the field, the graph
> lock isn't the right mechanism for protecting BlockBackend.root and we
> should rely on draining instead?
> 
> We already try to do this:
> 
>     blk_drain(blk);
>     root = blk->root;
>     blk->root = NULL;
> 
> It would make more sense for this to be a drained_begin/end section that
> quiesces devices while we're changing the root node, otherwise we can
> get new requests before blk_drain() returns.

Good idea!

> Do you know if the I/O request dereferencing a NULL pointer came from a
> device? (The stack trace in your commit message is shortened too much to
> tell me a lot, and the link to your forum doesn't allow me to view the
> logs without registering an account. Please include more information in
> the commit message in v2.)

Unfortunately, I'm not sure. I've included the GDB script and resulting
log (with our downstream based on QEMU 9.0.2) below. Interestingly, the
problematic call happens in the main thread, while the previous ones
were in the live migration thread or, IIUC, in the IO thread. But it's
the same opaque pointer address as in the calls in the live migration
thread. Another detail to note is that the live migration thread did not
exit yet, but I don't know if that is important.

Best Regards,
Fiona

GDB script:

> handle SIGUSR1 noprint nostop
> handle SIGPIPE noprint nostop
> break bdrv_flush
> break bdrv_co_flush
> commands 1 2
> bt
> c
> end
> c

Log:

> Thread 3 "kvm" hit Breakpoint 2, bdrv_co_flush (bs=0x61f3aaaf0f80) at ../block/io.c:2947
> 2947    in ../block/io.c
> #0  bdrv_co_flush (bs=0x61f3aaaf0f80) at ../block/io.c:2947
> #1  0x000061f395266bf7 in blk_co_do_flush (blk=0x61f3aaafd870) at ../block/block-backend.c:1833
> #2  blk_aio_flush_entry (opaque=0x792088035380) at ../block/block-backend.c:1841
> #3  0x000061f3953baf7b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:175
> #4  0x000079209a6a69c0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #5  0x0000792096bfa5a0 in ?? ()
> #6  0x0000000000000000 in ?? ()
> 
> Thread 3 "kvm" hit Breakpoint 2, bdrv_co_flush (bs=0x61f3aaaf82e0) at ../block/io.c:2947
> 2947    in ../block/io.c
> #0  bdrv_co_flush (bs=0x61f3aaaf82e0) at ../block/io.c:2947
> #1  0x000061f395274911 in bdrv_co_flush (bs=0x61f3aaaf0f80) at ../block/io.c:3047
> #2  0x000061f395266bf7 in blk_co_do_flush (blk=0x61f3aaafd870) at ../block/block-backend.c:1833
> #3  blk_aio_flush_entry (opaque=0x792088035380) at ../block/block-backend.c:1841
> #4  0x000061f3953baf7b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:175
> #5  0x000079209a6a69c0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #6  0x0000792096bfa5a0 in ?? ()
> #7  0x0000000000000000 in ?? ()
> [New Thread 0x79206b0006c0 (LWP 103520)]
> [Thread 0x79206b0006c0 (LWP 103520) exited]
> [New Thread 0x79206b0006c0 (LWP 103521)]
> [Switching to Thread 0x79206b0006c0 (LWP 103521)]
> 
> Thread 24 "live_migration" hit Breakpoint 1, bdrv_flush (bs=0x61f3aaaf0f80) at block/block-gen.c:909
> #0  bdrv_flush (bs=0x61f3aaaf0f80) at block/block-gen.c:909
> #1  0x000061f395273cfd in bdrv_flush_all () at ../block/io.c:2348
> #2  0x000061f394fc8a9d in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE, send_stop=<optimized out>) at ../system/cpus.c:297
> #3  0x000061f394ff41c9 in migration_stop_vm (s=s@entry=0x61f3aaaf0790, state=state@entry=RUN_STATE_FINISH_MIGRATE) at ../migration/migration.c:202
> #4  0x000061f394ffa88d in migration_completion_precopy (current_active_state=0x79206affaffc, s=0x61f3aaaf0790) at ../migration/migration.c:2732
> #5  migration_completion (s=0x61f3aaaf0790) at ../migration/migration.c:2814
> #6  migration_iteration_run (s=0x61f3aaaf0790) at ../migration/migration.c:3238
> #7  migration_thread (opaque=opaque@entry=0x61f3aaaf0790) at ../migration/migration.c:3490
> #8  0x000061f3953a35c8 in qemu_thread_start (args=0x61f3aacf3560) at ../util/qemu-thread-posix.c:541
> #9  0x000079209a6de1c4 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
> #10 0x000079209a75e85c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> 
> Thread 24 "live_migration" hit Breakpoint 2, bdrv_co_flush (bs=0x61f3aaaf0f80) at ../block/io.c:2947
> #0  bdrv_co_flush (bs=0x61f3aaaf0f80) at ../block/io.c:2947
> #1  0x000061f395236882 in bdrv_co_flush_entry (opaque=0x79206affae90) at block/block-gen.c:901
> #2  0x000061f3953baf7b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:175
> #3  0x000079209a6a69c0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x000079206affa6b0 in ?? ()
> #5  0x0000000000000000 in ?? ()
> 
> Thread 24 "live_migration" hit Breakpoint 2, bdrv_co_flush (bs=0x61f3aaaf82e0) at ../block/io.c:2947
> 2947    in ../block/io.c
> #0  bdrv_co_flush (bs=0x61f3aaaf82e0) at ../block/io.c:2947
> #1  0x000061f395274911 in bdrv_co_flush (bs=0x61f3aaaf0f80) at ../block/io.c:3047
> #2  0x000061f395236882 in bdrv_co_flush_entry (opaque=0x79206affae90) at block/block-gen.c:901
> #3  0x000061f3953baf7b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:175
> #4  0x000079209a6a69c0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #5  0x000079206affa6b0 in ?? ()
> #6  0x0000000000000000 in ?? ()
> [Switching to Thread 0x792097a7d500 (LWP 23325)]
> 
> Thread 1 "kvm" hit Breakpoint 2, bdrv_co_flush (bs=0x0) at ../block/io.c:2947
> 2947    in ../block/io.c
> #0  bdrv_co_flush (bs=0x0) at ../block/io.c:2947
> #1  0x000061f395236882 in bdrv_co_flush_entry (opaque=0x79206affae90) at block/block-gen.c:901
> #2  0x000061f3953baf7b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at ../util/coroutine-ucontext.c:175
> #3  0x000079209a6a69c0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x000079206affa6b0 in ?? ()
> #5  0x0000000000000000 in ?? ()
> 
> Thread 1 "kvm" received signal SIGSEGV, Segmentation fault.
> bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> [Inferior 1 (process 23325) detached]
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Daniel Wielandt 2 months, 3 weeks ago
Suggestions hold pattern.. programmers obviously understand control logic.
Systems built with operational flaws. Will freeze up when u start repairing
it while it's running.

On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kwolf@redhat.com> wrote:

> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
> > Setting blk->root is a graph change operation and thus needs to be
> > protected by the block graph write lock in blk_remove_bs(). The
> > assignment to blk->root in blk_insert_bs() is already protected by
> > the block graph write lock.
>
> Hm, if that's the case, then we should also enforce this in the
> declaration of BlockBackend:
>
>     BdrvChild * GRAPH_RDLOCK_PTR root;
>
> However, this results in more compiler failures that we need to fix. You
> caught the only remaining writer, but the lock is only fully effective
> if all readers take it, too.
>
> > In particular, the graph read lock in blk_co_do_flush() could
> > previously not ensure that blk_bs(blk) would always return the same
> > value during the locked section, which could lead to a segfault [0] in
> > combination with migration [1].
> >
> > From the user-provided backtraces in the forum thread [1], it seems
> > like blk_co_do_flush() managed to get past the
> > blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
> > non-NULL value during the check, but then, when calling
> > bdrv_co_flush(), blk_bs(blk) returned NULL.
> >
> > [0]:
> >
> > > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
> > > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
> > > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at block/block-gen.c:901
> >
> > [1]: https://forum.proxmox.com/threads/158072
> >
> > Cc: qemu-stable@nongnu.org
> > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
> > ---
> >  block/block-backend.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/block/block-backend.c b/block/block-backend.c
> > index c93a7525ad..9678615318 100644
> > --- a/block/block-backend.c
> > +++ b/block/block-backend.c
> > @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
> >       */
> >      blk_drain(blk);
> >      root = blk->root;
> > -    blk->root = NULL;
> >
> >      bdrv_graph_wrlock();
> > +    blk->root = NULL;
> >      bdrv_root_unref_child(root);
> >      bdrv_graph_wrunlock();
> >  }
>
> I think the 'root = blk->root' needs to be inside the locked section,
> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
> has a nested event loop) and root would be stale. I assume clang would
> complain about this with the added GRAPH_RDLOCK_PTR.
>
> Kevin
>
>
>
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Daniel Wielandt 2 months, 3 weeks ago
I feel like your too focused on new problems when you  gave up repairing to
old os.. be ause at one time it was real slick.. now.theres parts that are
problems that can be eliminated easy and provide  a less bloated function
with simple code deletion.. fix the original is down its it's primary parts
.then u can add your fancy automation and setting get real com0lucated

On Thu, Jan 9, 2025, 4:52 AM Daniel Wielandt <wielandtdan@gmail.com> wrote:

> Suggestions hold pattern.. programmers obviously understand control logic.
> Systems built with operational flaws. Will freeze up when u start repairing
> it while it's running.
>
> On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kwolf@redhat.com> wrote:
>
>> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
>> > Setting blk->root is a graph change operation and thus needs to be
>> > protected by the block graph write lock in blk_remove_bs(). The
>> > assignment to blk->root in blk_insert_bs() is already protected by
>> > the block graph write lock.
>>
>> Hm, if that's the case, then we should also enforce this in the
>> declaration of BlockBackend:
>>
>>     BdrvChild * GRAPH_RDLOCK_PTR root;
>>
>> However, this results in more compiler failures that we need to fix. You
>> caught the only remaining writer, but the lock is only fully effective
>> if all readers take it, too.
>>
>> > In particular, the graph read lock in blk_co_do_flush() could
>> > previously not ensure that blk_bs(blk) would always return the same
>> > value during the locked section, which could lead to a segfault [0] in
>> > combination with migration [1].
>> >
>> > From the user-provided backtraces in the forum thread [1], it seems
>> > like blk_co_do_flush() managed to get past the
>> > blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
>> > non-NULL value during the check, but then, when calling
>> > bdrv_co_flush(), blk_bs(blk) returned NULL.
>> >
>> > [0]:
>> >
>> > > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
>> > > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
>> > > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at
>> block/block-gen.c:901
>> >
>> > [1]: https://forum.proxmox.com/threads/158072
>> >
>> > Cc: qemu-stable@nongnu.org
>> > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>> > ---
>> >  block/block-backend.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/block/block-backend.c b/block/block-backend.c
>> > index c93a7525ad..9678615318 100644
>> > --- a/block/block-backend.c
>> > +++ b/block/block-backend.c
>> > @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>> >       */
>> >      blk_drain(blk);
>> >      root = blk->root;
>> > -    blk->root = NULL;
>> >
>> >      bdrv_graph_wrlock();
>> > +    blk->root = NULL;
>> >      bdrv_root_unref_child(root);
>> >      bdrv_graph_wrunlock();
>> >  }
>>
>> I think the 'root = blk->root' needs to be inside the locked section,
>> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
>> has a nested event loop) and root would be stale. I assume clang would
>> complain about this with the added GRAPH_RDLOCK_PTR.
>>
>> Kevin
>>
>>
>>
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Daniel Wielandt 2 months, 3 weeks ago
We all know how the thing works and operates it does t take a genius to
know what's working and what's not.. I mean. I.notsaying take out tge
things that made us like to use it.. but I mean command should be
understood logic should be what we use as a am
Standard.. and trust me depending on where your at with structure and
scripts the logic can even be ...codeing..

On Thu, Jan 9, 2025, 4:55 AM Daniel Wielandt <wielandtdan@gmail.com> wrote:

> I feel like your too focused on new problems when you  gave up repairing
> to old os.. be ause at one time it was real slick.. now.theres parts that
> are problems that can be eliminated easy and provide  a less bloated
> function with simple code deletion.. fix the original is down its it's
> primary parts
> .then u can add your fancy automation and setting get real com0lucated
>
> On Thu, Jan 9, 2025, 4:52 AM Daniel Wielandt <wielandtdan@gmail.com>
> wrote:
>
>> Suggestions hold pattern.. programmers obviously understand control
>> logic. Systems built with operational flaws. Will freeze up when u start
>> repairing it while it's running.
>>
>> On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kwolf@redhat.com> wrote:
>>
>>> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
>>> > Setting blk->root is a graph change operation and thus needs to be
>>> > protected by the block graph write lock in blk_remove_bs(). The
>>> > assignment to blk->root in blk_insert_bs() is already protected by
>>> > the block graph write lock.
>>>
>>> Hm, if that's the case, then we should also enforce this in the
>>> declaration of BlockBackend:
>>>
>>>     BdrvChild * GRAPH_RDLOCK_PTR root;
>>>
>>> However, this results in more compiler failures that we need to fix. You
>>> caught the only remaining writer, but the lock is only fully effective
>>> if all readers take it, too.
>>>
>>> > In particular, the graph read lock in blk_co_do_flush() could
>>> > previously not ensure that blk_bs(blk) would always return the same
>>> > value during the locked section, which could lead to a segfault [0] in
>>> > combination with migration [1].
>>> >
>>> > From the user-provided backtraces in the forum thread [1], it seems
>>> > like blk_co_do_flush() managed to get past the
>>> > blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
>>> > non-NULL value during the check, but then, when calling
>>> > bdrv_co_flush(), blk_bs(blk) returned NULL.
>>> >
>>> > [0]:
>>> >
>>> > > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
>>> > > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
>>> > > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at
>>> block/block-gen.c:901
>>> >
>>> > [1]: https://forum.proxmox.com/threads/158072
>>> >
>>> > Cc: qemu-stable@nongnu.org
>>> > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>>> > ---
>>> >  block/block-backend.c | 2 +-
>>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>>> >
>>> > diff --git a/block/block-backend.c b/block/block-backend.c
>>> > index c93a7525ad..9678615318 100644
>>> > --- a/block/block-backend.c
>>> > +++ b/block/block-backend.c
>>> > @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>>> >       */
>>> >      blk_drain(blk);
>>> >      root = blk->root;
>>> > -    blk->root = NULL;
>>> >
>>> >      bdrv_graph_wrlock();
>>> > +    blk->root = NULL;
>>> >      bdrv_root_unref_child(root);
>>> >      bdrv_graph_wrunlock();
>>> >  }
>>>
>>> I think the 'root = blk->root' needs to be inside the locked section,
>>> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
>>> has a nested event loop) and root would be stale. I assume clang would
>>> complain about this with the added GRAPH_RDLOCK_PTR.
>>>
>>> Kevin
>>>
>>>
>>>
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Daniel Wielandt 2 months, 3 weeks ago
U just need someone whose .og personally invested to tell youwhay needs to
go. As a third party it'd obvious..remember the kernel and drivers ate not
the sibstricy..
Also kernel was written  for free but drivers use a very high overhead..so
which system deserves world domination.

None stupid it about the people not the crap we argue over at our nightjobs

On Thu, Jan 9, 2025, 4:59 AM Daniel Wielandt <wielandtdan@gmail.com> wrote:

> We all know how the thing works and operates it does t take a genius to
> know what's working and what's not.. I mean. I.notsaying take out tge
> things that made us like to use it.. but I mean command should be
> understood logic should be what we use as a am
> Standard.. and trust me depending on where your at with structure and
> scripts the logic can even be ...codeing..
>
> On Thu, Jan 9, 2025, 4:55 AM Daniel Wielandt <wielandtdan@gmail.com>
> wrote:
>
>> I feel like your too focused on new problems when you  gave up repairing
>> to old os.. be ause at one time it was real slick.. now.theres parts that
>> are problems that can be eliminated easy and provide  a less bloated
>> function with simple code deletion.. fix the original is down its it's
>> primary parts
>> .then u can add your fancy automation and setting get real com0lucated
>>
>> On Thu, Jan 9, 2025, 4:52 AM Daniel Wielandt <wielandtdan@gmail.com>
>> wrote:
>>
>>> Suggestions hold pattern.. programmers obviously understand control
>>> logic. Systems built with operational flaws. Will freeze up when u start
>>> repairing it while it's running.
>>>
>>> On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kwolf@redhat.com> wrote:
>>>
>>>> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
>>>> > Setting blk->root is a graph change operation and thus needs to be
>>>> > protected by the block graph write lock in blk_remove_bs(). The
>>>> > assignment to blk->root in blk_insert_bs() is already protected by
>>>> > the block graph write lock.
>>>>
>>>> Hm, if that's the case, then we should also enforce this in the
>>>> declaration of BlockBackend:
>>>>
>>>>     BdrvChild * GRAPH_RDLOCK_PTR root;
>>>>
>>>> However, this results in more compiler failures that we need to fix. You
>>>> caught the only remaining writer, but the lock is only fully effective
>>>> if all readers take it, too.
>>>>
>>>> > In particular, the graph read lock in blk_co_do_flush() could
>>>> > previously not ensure that blk_bs(blk) would always return the same
>>>> > value during the locked section, which could lead to a segfault [0] in
>>>> > combination with migration [1].
>>>> >
>>>> > From the user-provided backtraces in the forum thread [1], it seems
>>>> > like blk_co_do_flush() managed to get past the
>>>> > blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
>>>> > non-NULL value during the check, but then, when calling
>>>> > bdrv_co_flush(), blk_bs(blk) returned NULL.
>>>> >
>>>> > [0]:
>>>> >
>>>> > > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
>>>> > > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
>>>> > > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at
>>>> block/block-gen.c:901
>>>> >
>>>> > [1]: https://forum.proxmox.com/threads/158072
>>>> >
>>>> > Cc: qemu-stable@nongnu.org
>>>> > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>>>> > ---
>>>> >  block/block-backend.c | 2 +-
>>>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>>>> >
>>>> > diff --git a/block/block-backend.c b/block/block-backend.c
>>>> > index c93a7525ad..9678615318 100644
>>>> > --- a/block/block-backend.c
>>>> > +++ b/block/block-backend.c
>>>> > @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>>>> >       */
>>>> >      blk_drain(blk);
>>>> >      root = blk->root;
>>>> > -    blk->root = NULL;
>>>> >
>>>> >      bdrv_graph_wrlock();
>>>> > +    blk->root = NULL;
>>>> >      bdrv_root_unref_child(root);
>>>> >      bdrv_graph_wrunlock();
>>>> >  }
>>>>
>>>> I think the 'root = blk->root' needs to be inside the locked section,
>>>> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
>>>> has a nested event loop) and root would be stale. I assume clang would
>>>> complain about this with the added GRAPH_RDLOCK_PTR.
>>>>
>>>> Kevin
>>>>
>>>>
>>>>
Re: [PATCH] block-backend: protect setting block root to NULL with block graph write lock
Posted by Daniel Wielandt 2 months, 3 weeks ago
Just my opinion.. also I want kitty .terminal..red header.. don't worry I
just started this morning..

On Thu, Jan 9, 2025, 5:03 AM Daniel Wielandt <wielandtdan@gmail.com> wrote:

> U just need someone whose .og personally invested to tell youwhay needs to
> go. As a third party it'd obvious..remember the kernel and drivers ate not
> the sibstricy..
> Also kernel was written  for free but drivers use a very high overhead..so
> which system deserves world domination.
>
> None stupid it about the people not the crap we argue over at our nightjobs
>
> On Thu, Jan 9, 2025, 4:59 AM Daniel Wielandt <wielandtdan@gmail.com>
> wrote:
>
>> We all know how the thing works and operates it does t take a genius to
>> know what's working and what's not.. I mean. I.notsaying take out tge
>> things that made us like to use it.. but I mean command should be
>> understood logic should be what we use as a am
>> Standard.. and trust me depending on where your at with structure and
>> scripts the logic can even be ...codeing..
>>
>> On Thu, Jan 9, 2025, 4:55 AM Daniel Wielandt <wielandtdan@gmail.com>
>> wrote:
>>
>>> I feel like your too focused on new problems when you  gave up repairing
>>> to old os.. be ause at one time it was real slick.. now.theres parts that
>>> are problems that can be eliminated easy and provide  a less bloated
>>> function with simple code deletion.. fix the original is down its it's
>>> primary parts
>>> .then u can add your fancy automation and setting get real com0lucated
>>>
>>> On Thu, Jan 9, 2025, 4:52 AM Daniel Wielandt <wielandtdan@gmail.com>
>>> wrote:
>>>
>>>> Suggestions hold pattern.. programmers obviously understand control
>>>> logic. Systems built with operational flaws. Will freeze up when u start
>>>> repairing it while it's running.
>>>>
>>>> On Thu, Jan 9, 2025, 4:48 AM Kevin Wolf <kwolf@redhat.com> wrote:
>>>>
>>>>> Am 08.01.2025 um 13:46 hat Fiona Ebner geschrieben:
>>>>> > Setting blk->root is a graph change operation and thus needs to be
>>>>> > protected by the block graph write lock in blk_remove_bs(). The
>>>>> > assignment to blk->root in blk_insert_bs() is already protected by
>>>>> > the block graph write lock.
>>>>>
>>>>> Hm, if that's the case, then we should also enforce this in the
>>>>> declaration of BlockBackend:
>>>>>
>>>>>     BdrvChild * GRAPH_RDLOCK_PTR root;
>>>>>
>>>>> However, this results in more compiler failures that we need to fix.
>>>>> You
>>>>> caught the only remaining writer, but the lock is only fully effective
>>>>> if all readers take it, too.
>>>>>
>>>>> > In particular, the graph read lock in blk_co_do_flush() could
>>>>> > previously not ensure that blk_bs(blk) would always return the same
>>>>> > value during the locked section, which could lead to a segfault [0]
>>>>> in
>>>>> > combination with migration [1].
>>>>> >
>>>>> > From the user-provided backtraces in the forum thread [1], it seems
>>>>> > like blk_co_do_flush() managed to get past the
>>>>> > blk_co_is_available(blk) check, meaning that blk_bs(blk) returned a
>>>>> > non-NULL value during the check, but then, when calling
>>>>> > bdrv_co_flush(), blk_bs(blk) returned NULL.
>>>>> >
>>>>> > [0]:
>>>>> >
>>>>> > > 0  bdrv_primary_child (bs=bs@entry=0x0) at ../block.c:8287
>>>>> > > 1  bdrv_co_flush (bs=0x0) at ../block/io.c:2948
>>>>> > > 2  bdrv_co_flush_entry (opaque=0x7a610affae90) at
>>>>> block/block-gen.c:901
>>>>> >
>>>>> > [1]: https://forum.proxmox.com/threads/158072
>>>>> >
>>>>> > Cc: qemu-stable@nongnu.org
>>>>> > Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
>>>>> > ---
>>>>> >  block/block-backend.c | 2 +-
>>>>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>> >
>>>>> > diff --git a/block/block-backend.c b/block/block-backend.c
>>>>> > index c93a7525ad..9678615318 100644
>>>>> > --- a/block/block-backend.c
>>>>> > +++ b/block/block-backend.c
>>>>> > @@ -887,9 +887,9 @@ void blk_remove_bs(BlockBackend *blk)
>>>>> >       */
>>>>> >      blk_drain(blk);
>>>>> >      root = blk->root;
>>>>> > -    blk->root = NULL;
>>>>> >
>>>>> >      bdrv_graph_wrlock();
>>>>> > +    blk->root = NULL;
>>>>> >      bdrv_root_unref_child(root);
>>>>> >      bdrv_graph_wrunlock();
>>>>> >  }
>>>>>
>>>>> I think the 'root = blk->root' needs to be inside the locked section,
>>>>> too. Otherwise blk->root could change during bdrv_graph_wrlock() (which
>>>>> has a nested event loop) and root would be stale. I assume clang would
>>>>> complain about this with the added GRAPH_RDLOCK_PTR.
>>>>>
>>>>> Kevin
>>>>>
>>>>>
>>>>>