[PATCH v2] dump-guest-memory: Use BQL to protect dump finalize process

Peter Xu posted 1 patch 2 years, 5 months ago
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20211116135441.7711-1-peterx@redhat.com
dump/dump.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
[PATCH v2] dump-guest-memory: Use BQL to protect dump finalize process
Posted by Peter Xu 2 years, 5 months ago
When finalizing the dump-guest-memory with detached mode, we'll first set dump
status to either FAIL or COMPLETE before doing the cleanup, however right after
the dump status change it's possible that another dump-guest-memory qmp command
is sent so both the main thread and dump thread (which is during cleanup) could
be accessing dump state in parallel.  That's racy.

Fix it by protecting the finalizing phase of dump-guest-memory using BQL as
well, as qmp_dump_guest_memory() (which is the QEMU main thread) requires BQL.
To do that, we expand the BQL from dump_cleanup() into dump_process(), so we
will take the BQL longer than before.  The critical section must cover the
status switch from ACTIVE->{FAIL|COMPLETE} so as to make sure there's no race
any more.

We can also just introduce a specific mutex to serialize the dump process, but
BQL should be enough for now so far, not to mention vm_start() in dump_cleanup
will need BQL anyway, so maybe easier to just use the same mutex.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
v2:
- Fix parallel spelling [Marc-Andre]
- Add r-b for Marc-Andre
- Mention that qmp_dump_guest_memory() is with BQL held [Laszlo]
---
 dump/dump.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/dump/dump.c b/dump/dump.c
index 662d0a62cd..196b7b8ab9 100644
--- a/dump/dump.c
+++ b/dump/dump.c
@@ -96,13 +96,7 @@ static int dump_cleanup(DumpState *s)
     g_free(s->guest_note);
     s->guest_note = NULL;
     if (s->resume) {
-        if (s->detached) {
-            qemu_mutex_lock_iothread();
-        }
         vm_start();
-        if (s->detached) {
-            qemu_mutex_unlock_iothread();
-        }
     }
     migrate_del_blocker(dump_migration_blocker);
 
@@ -1873,6 +1867,11 @@ static void dump_process(DumpState *s, Error **errp)
     Error *local_err = NULL;
     DumpQueryResult *result = NULL;
 
+    /*
+     * When running with detached mode, these operations are not run with BQL.
+     * It's still safe, because it's protected by setting s->state to ACTIVE,
+     * so dump_in_progress() check will block yet another dump-guest-memory.
+     */
     if (s->has_format && s->format == DUMP_GUEST_MEMORY_FORMAT_WIN_DMP) {
 #ifdef TARGET_X86_64
         create_win_dump(s, &local_err);
@@ -1883,6 +1882,15 @@ static void dump_process(DumpState *s, Error **errp)
         create_vmcore(s, &local_err);
     }
 
+    /*
+     * Serialize the finalizing of dump process using BQL to make sure no
+     * concurrent access to DumpState is allowed.  BQL is also required for
+     * dump_cleanup as vm_start() needs it.
+     */
+    if (s->detached) {
+        qemu_mutex_lock_iothread();
+    }
+
     /* make sure status is written after written_size updates */
     smp_wmb();
     qatomic_set(&s->status,
@@ -1898,6 +1906,10 @@ static void dump_process(DumpState *s, Error **errp)
 
     error_propagate(errp, local_err);
     dump_cleanup(s);
+
+    if (s->detached) {
+        qemu_mutex_unlock_iothread();
+    }
 }
 
 static void *dump_thread(void *data)
-- 
2.32.0


Re: [PATCH v2] dump-guest-memory: Use BQL to protect dump finalize process
Posted by Laszlo Ersek 2 years, 5 months ago
On 11/16/21 14:54, Peter Xu wrote:
> When finalizing the dump-guest-memory with detached mode, we'll first set dump
> status to either FAIL or COMPLETE before doing the cleanup, however right after
> the dump status change it's possible that another dump-guest-memory qmp command
> is sent so both the main thread and dump thread (which is during cleanup) could
> be accessing dump state in parallel.  That's racy.
> 
> Fix it by protecting the finalizing phase of dump-guest-memory using BQL as
> well, as qmp_dump_guest_memory() (which is the QEMU main thread) requires BQL.
> To do that, we expand the BQL from dump_cleanup() into dump_process(), so we
> will take the BQL longer than before.  The critical section must cover the
> status switch from ACTIVE->{FAIL|COMPLETE} so as to make sure there's no race
> any more.
> 
> We can also just introduce a specific mutex to serialize the dump process, but
> BQL should be enough for now so far, not to mention vm_start() in dump_cleanup
> will need BQL anyway, so maybe easier to just use the same mutex.
> 
> Reported-by: Laszlo Ersek <lersek@redhat.com>
> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> v2:
> - Fix parallel spelling [Marc-Andre]
> - Add r-b for Marc-Andre
> - Mention that qmp_dump_guest_memory() is with BQL held [Laszlo]
> ---
>  dump/dump.c | 24 ++++++++++++++++++------
>  1 file changed, 18 insertions(+), 6 deletions(-)

Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thank you!
Laszlo


Re: [PATCH v2] dump-guest-memory: Use BQL to protect dump finalize process
Posted by Peter Xu 2 years, 5 months ago
On Wed, Nov 17, 2021 at 11:43:42AM +0100, Laszlo Ersek wrote:
> On 11/16/21 14:54, Peter Xu wrote:
> > When finalizing the dump-guest-memory with detached mode, we'll first set dump
> > status to either FAIL or COMPLETE before doing the cleanup, however right after
> > the dump status change it's possible that another dump-guest-memory qmp command
> > is sent so both the main thread and dump thread (which is during cleanup) could
> > be accessing dump state in parallel.  That's racy.
> > 
> > Fix it by protecting the finalizing phase of dump-guest-memory using BQL as
> > well, as qmp_dump_guest_memory() (which is the QEMU main thread) requires BQL.
> > To do that, we expand the BQL from dump_cleanup() into dump_process(), so we
> > will take the BQL longer than before.  The critical section must cover the
> > status switch from ACTIVE->{FAIL|COMPLETE} so as to make sure there's no race
> > any more.
> > 
> > We can also just introduce a specific mutex to serialize the dump process, but
> > BQL should be enough for now so far, not to mention vm_start() in dump_cleanup
> > will need BQL anyway, so maybe easier to just use the same mutex.
> > 
> > Reported-by: Laszlo Ersek <lersek@redhat.com>
> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> > v2:
> > - Fix parallel spelling [Marc-Andre]
> > - Add r-b for Marc-Andre
> > - Mention that qmp_dump_guest_memory() is with BQL held [Laszlo]
> > ---
> >  dump/dump.c | 24 ++++++++++++++++++------
> >  1 file changed, 18 insertions(+), 6 deletions(-)
> 
> Reviewed-by: Laszlo Ersek <lersek@redhat.com>

Thanks (again) for the review (and spot/analyze the issue)!

-- 
Peter Xu