Date: Wed, 27 Feb 2019 12:10:52 +0000
From: "Dr. David Alan Gilbert"
To: alex.bennee@linaro.org
Cc: qemu-devel@nongnu.org, peterx@redhat.com, quintela@redhat.com
Message-ID: <20190227121052.GD2602@work-vm>
Subject: [Qemu-devel] possible ahci/migrate fix

Hi Alex,
  Can you see if the attached patch fixes the ahci/migrate failure you
see; it won't fail for me, however mean I am to it.

From f4c327d14d656d1c0f0e694d0efc6165493416f0 Mon Sep 17 00:00:00 2001
From: "Dr. David Alan Gilbert"
Date: Wed, 27 Feb 2019 12:04:37 +0000
Subject: [PATCH] migration: Cleanup during exit

Currently we clean up the migration object as we exit main after the
main_loop finishes; however, if there's a migration running, things
get messy and we can end up with the migration thread still trying to
access freed structures.

We now take a ref to the object around the migration thread itself,
so the act of dropping the ref during exit doesn't cause us to lose
the state until the thread quits.

Cancelling the migration during exit also tries to get the thread to
quit.

We do this a bit earlier, so hopefully migration gets out of the way
before all the devices etc. are freed.

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
Tested-by: Alex Bennée
---
 migration/migration.c | 8 ++++++++
 vl.c                  | 7 ++++++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index e44f77af02..f612313d09 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -126,6 +126,7 @@ static bool migration_object_check(MigrationState *ms, Error **errp);
 static int migration_maybe_pause(MigrationState *s,
                                  int *current_active_state,
                                  int new_state);
+static void migrate_fd_cancel(MigrationState *s);
 
 void migration_object_init(void)
 {
@@ -169,6 +170,11 @@ void migration_object_init(void)
 
 void migration_object_finalize(void)
 {
+    /*
+     * Cancel the current migration - that will (eventually)
+     * stop the migration using this structure
+     */
+    migrate_fd_cancel(current_migration);
     object_unref(OBJECT(current_migration));
 }
 
@@ -3134,6 +3140,7 @@ static void *migration_thread(void *opaque)
 
     rcu_register_thread();
 
+    object_ref(OBJECT(s));
     s->iteration_start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 
     qemu_savevm_state_header(s->to_dst_file);
@@ -3230,6 +3237,7 @@ static void *migration_thread(void *opaque)
 
     trace_migration_thread_after_loop();
     migration_iteration_finish(s);
+    object_unref(OBJECT(s));
     rcu_unregister_thread();
     return NULL;
 }
diff --git a/vl.c b/vl.c
index 2f340686a7..c1920165f3 100644
--- a/vl.c
+++ b/vl.c
@@ -4579,6 +4579,12 @@ int main(int argc, char **argv, char **envp)
 
     gdbserver_cleanup();
 
+    /*
+     * cleaning up the migration object cancels any existing migration
+     * try to do this early so that it also stops using devices.
+     */
+    migration_object_finalize();
+
     /* No more vcpu or device emulation activity beyond this point */
     vm_shutdown();
 
@@ -4594,7 +4600,6 @@ int main(int argc, char **argv, char **envp)
     monitor_cleanup();
     qemu_chr_cleanup();
     user_creatable_cleanup();
-    migration_object_finalize();
     /* TODO: unref root container, check all devices are ok */
 
     return 0;
-- 
2.20.1

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
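
For anyone wanting to poke at the idea outside of QEMU, here is a rough
standalone sketch of the pattern the commit message describes: the worker
thread holds its own reference for its whole lifetime, and the exit path
requests cancellation before dropping the creator's reference. It is not
QEMU code. DemoState, demo_ref(), demo_unref(), demo_worker() and
demo_exit_with_running_worker() are hypothetical names, and plain C11
atomics with pthreads stand in for QOM's object_ref()/object_unref() and
the migration thread. The sketch also waits for the worker to take its
reference before the exit path runs, purely to keep the ordering
deterministic; that wait is an assumption of the sketch, not something
the patch does.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a ref-counted object such as MigrationState. */
typedef struct DemoState {
    atomic_int refcount;
    atomic_bool cancel;   /* set by the exit path to ask the worker to quit */
    atomic_bool running;  /* set by the worker once it holds its own ref */
} DemoState;

static atomic_int finalize_count;  /* how many times a state was freed */

static void demo_ref(DemoState *s)
{
    atomic_fetch_add(&s->refcount, 1);
}

static void demo_unref(DemoState *s)
{
    /* Free only when the last reference goes away. */
    if (atomic_fetch_sub(&s->refcount, 1) == 1) {
        atomic_fetch_add(&finalize_count, 1);
        free(s);
    }
}

static void *demo_worker(void *opaque)
{
    DemoState *s = opaque;

    demo_ref(s);                      /* mirrors object_ref() at thread start */
    atomic_store(&s->running, true);
    while (!atomic_load(&s->cancel)) {
        sched_yield();                /* placeholder for the real work loop */
    }
    demo_unref(s);                    /* mirrors object_unref() before return */
    return NULL;
}

/* Returns how many times the state was finalized, or -1 on setup failure. */
int demo_exit_with_running_worker(void)
{
    DemoState *s = malloc(sizeof(*s));
    pthread_t tid;

    assert(s != NULL);
    atomic_init(&s->refcount, 1);     /* the creator's reference */
    atomic_init(&s->cancel, false);
    atomic_init(&s->running, false);
    atomic_store(&finalize_count, 0);

    if (pthread_create(&tid, NULL, demo_worker, s) != 0) {
        free(s);
        return -1;
    }

    /* Sketch-only: wait until the worker holds its own reference. */
    while (!atomic_load(&s->running)) {
        sched_yield();
    }

    /* Exit path: cancel first, then drop the creator's reference. */
    atomic_store(&s->cancel, true);
    demo_unref(s);                    /* s must not be touched after this */

    pthread_join(tid, NULL);
    return atomic_load(&finalize_count);
}
```

Whichever side drops its reference last does the free, so the state
outlives the worker's final access no matter how the two unrefs race.
Calling demo_exit_with_running_worker() from a small main() and building
with -pthread should return 1, i.e. exactly one finalization.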