From nobody Sun Feb 8 11:22:07 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1644204572324535.3515695934286; Sun, 6 Feb 2022 19:29:32 -0800 (PST) Received: from localhost ([::1]:41962 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nGuiY-0007uM-Os for importer@patchew.org; Sun, 06 Feb 2022 22:29:30 -0500 Received: from eggs.gnu.org ([209.51.188.92]:43222) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nGug4-0006iV-S3 for qemu-devel@nongnu.org; Sun, 06 Feb 2022 22:27:00 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:37209) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nGufx-0001GU-U2 for qemu-devel@nongnu.org; Sun, 06 Feb 2022 22:26:55 -0500 Received: from mail-pj1-f71.google.com (mail-pj1-f71.google.com [209.85.216.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-495-oMrHLI3KNkulb7tDH8854w-1; Sun, 06 Feb 2022 22:26:32 -0500 Received: by mail-pj1-f71.google.com with SMTP id 62-20020a17090a09c400b001b80b0742b0so8298054pjo.8 for ; Sun, 06 Feb 2022 19:26:31 -0800 (PST) Received: from localhost.localdomain ([94.177.118.121]) by smtp.gmail.com with ESMTPSA id x33sm10720001pfh.178.2022.02.06.19.26.26 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Sun, 06 Feb 2022 19:26:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644204393; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=KnS0qOz1rV+B93dD6Ieg/TKmwRyk86bMON+oLZACjMc=; b=UqbvopGtem0B2NlIakptciHaKn5D0DtawxK5wqIokQ5mw8xejfIoqsZzI2yhfOnumSNF6u sQ11CAu83J3kQ+zBjF55gFeTrs3/5CO547fnL660scfzqdMq4+clH38W5ibYwHjWrgHSSc vJreHwwVn4UEjnDTs49Z4tXd27xWKYY= X-MC-Unique: oMrHLI3KNkulb7tDH8854w-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=KnS0qOz1rV+B93dD6Ieg/TKmwRyk86bMON+oLZACjMc=; b=H93RbJy0vABZByuPS36q7AtNFMe9X2OwDvsqDSUrUy/kGaqg5jKpP5ul9cQEDkz59G Z+7/BnGmvZT0KRaJl3i5OImLCgdU9Yo/u9wEJcYcN/bLNsdac89/vSyWkF7WydAmdku7 p1vkEmj3xOdhXL2i3Bp5X3Z5fHkmlvi56X3dQQsRFqxbmArfG/dJyBt8WskFcaFCYGz0 uyU/KYVdUyJuBz4+u4KL7X5lOmcLBcyVlGnWRBAVhXRD5Xi82Xg6swhZE54Tp1ECgUym /ienVKUnBuI4VFY0zw4yyhaefJ6fKvbbYsqxaxCmlwotT+2i6Dx9kvDUeMWW7Y6gilCa //Og== X-Gm-Message-State: AOAM532g04tP9+t4BaxOYxnrfNPwz7ChbUkxmA9SsmdNriIvfjtc+FRj nyYyWOFAIzrINjLZwb91Ptird3kklSOF8/h9uSEpY56SUkV49YsDbO2+FnAYhmcqlBNIprPQpUb YO+KjEtHZi/z+MKY8Foxb/P/DqFKGbYhjp++yDy5B/GExZvi91zoE6jZhU9kQ1Z2N X-Received: by 2002:a63:601:: with SMTP id 1mr7881799pgg.106.1644204390780; Sun, 06 Feb 2022 19:26:30 -0800 (PST) X-Google-Smtp-Source: ABdhPJwTOW0HhCRYN7hMYfdznoWiqbcx7uZlO+SlxLJUOZ5Lh+uyFaFG0xorURhhEW7xWYZOlXpWOA== X-Received: by 2002:a63:601:: with SMTP id 1mr7881776pgg.106.1644204390342; Sun, 06 Feb 2022 19:26:30 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Subject: [PATCH] memory: Fix qemu crash on starting dirty log twice with stopped VM Date: Mon, 7 Feb 2022 11:26:22 +0800 Message-Id: <20220207032622.19931-1-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Juan Quintela , David Hildenbrand , Hyman Huang , "Dr . David Alan Gilbert" , peterx@redhat.com, =?UTF-8?q?Philippe=20Mathieu-Daud=C3=A9?= , Leonardo Bras Soares Passos , Paolo Bonzini Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1644204572639100001 Content-Type: text/plain; charset="utf-8" QEMU can now easily crash with two continuous migration carried out: (qemu) migrate -d exec:cat>out (qemu) migrate_cancel (qemu) migrate -d exec:cat>out [crash] ../softmmu/memory.c:2782: memory_global_dirty_log_start: Assertion `!(global_dirty_tracking & flags)' failed. It's because memory API provides a way to postpone dirty log stop if the VM= is stopped, and that'll be re-done until the next VM start. It was added in 2= 017 with commit 1931076077 ("migration: optimize the downtime", 2017-08-01). However the recent work on allowing dirty tracking to be bitmask broke it, which is commit 63b41db4bc ("memory: make global_dirty_tracking a bitmask", 2021-11-01). The fix proposed in this patch contains two things: (1) Instead of passing over the flags to postpone stop dirty track, we ad= d a global variable (along with current vmstate_change variable) to record what flags to stop dirty tracking. (2) When start dirty tracking, instead if remove the vmstate hook directl= y, we also execute the postponed stop process so that we make sure all t= he starts and stops will be paired. This procedure is overlooked in the bitmask-ify work in 2021. Actually with the postponed stop flags, a smarter thing to do is when start dirty logging with specific flag, we can ignore the start if the flag is contained in the "postponed to stop" flags, however that'll slightly compli= cate the code, and it shouldn't be a major use case for QEMU. Considering that = this should also copy stable, keep it simple while fixing the crash. Cc: Hyman Huang Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=3D2044818 Fixes: 63b41db4bc ("memory: make global_dirty_tracking a bitmask") Signed-off-by: Peter Xu --- softmmu/memory.c | 53 ++++++++++++++++++++++++++++++++++-------------- 1 file changed, 38 insertions(+), 15 deletions(-) diff --git a/softmmu/memory.c b/softmmu/memory.c index 678dc62f06..cbb9b241ea 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -2790,16 +2790,25 @@ void memory_global_after_dirty_log_sync(void) MEMORY_LISTENER_CALL_GLOBAL(log_global_after_sync, Forward); } =20 +/* + * Dirty track stop flags that are postponed due to VM being stopped. Sho= uld + * only be used within vmstate_change hook. + */ +static unsigned int postponed_stop_flags; static VMChangeStateEntry *vmstate_change; +static void memory_global_dirty_log_stop_postponed_run(void); =20 void memory_global_dirty_log_start(unsigned int flags) { unsigned int old_flags =3D global_dirty_tracking; =20 - if (vmstate_change) { - qemu_del_vm_change_state_handler(vmstate_change); - vmstate_change =3D NULL; - } + /* + * If there is postponed dirty log stop flags, do it, so that start/st= op + * will always be paired. A smarter thing to do is ignore start proce= ss if + * the same flag has got postponed on stop, but it shouldn't matter a = lot + * in major user scenarios, so keep the code simple for now. + */ + memory_global_dirty_log_stop_postponed_run(); =20 assert(flags && !(flags & (~GLOBAL_DIRTY_MASK))); assert(!(global_dirty_tracking & flags)); @@ -2830,29 +2839,43 @@ static void memory_global_dirty_log_do_stop(unsigne= d int flags) } } =20 +/* + * Execute the postponed dirty log stop operations if there is, then reset + * everything (including the flags and the vmstate change hook). + */ +static void memory_global_dirty_log_stop_postponed_run(void) +{ + if (!vmstate_change) { + /* It means nothing is postponed */ + return; + } + + memory_global_dirty_log_do_stop(postponed_stop_flags); + postponed_stop_flags =3D 0; + qemu_del_vm_change_state_handler(vmstate_change); + vmstate_change =3D NULL; +} + static void memory_vm_change_state_handler(void *opaque, bool running, RunState state) { - unsigned int flags =3D (unsigned int)(uintptr_t)opaque; if (running) { - memory_global_dirty_log_do_stop(flags); - - if (vmstate_change) { - qemu_del_vm_change_state_handler(vmstate_change); - vmstate_change =3D NULL; - } + memory_global_dirty_log_stop_postponed_run(); } } =20 void memory_global_dirty_log_stop(unsigned int flags) { if (!runstate_is_running()) { + /* Postpone the dirty log stop, e.g., to when VM starts again */ if (vmstate_change) { - return; + /* Batch with previous postponed flags */ + postponed_stop_flags |=3D flags; + } else { + postponed_stop_flags =3D flags; + vmstate_change =3D qemu_add_vm_change_state_handler( + memory_vm_change_state_handler, NULL); } - vmstate_change =3D qemu_add_vm_change_state_handler( - memory_vm_change_state_handler, - (void *)(uintptr_t)flags); return; } =20 --=20 2.32.0