From: David Gibson
To: peter.maydell@linaro.org
Date: Wed, 29 Mar 2017 14:34:15 +1100
Message-Id: <20170329033415.891-3-david@gibson.dropbear.id.au>
In-Reply-To: <20170329033415.891-1-david@gibson.dropbear.id.au>
References: <20170329033415.891-1-david@gibson.dropbear.id.au>
Subject: [Qemu-devel] [PULL 2/2] spapr: fix memory hot-unplugging
Cc: lvivier@redhat.com, agraf@suse.de, thuth@redhat.com, aik@ozlabs.ru, qemu-devel@nongnu.org, mdroth@linux.vnet.ibm.com, qemu-ppc@nongnu.org, David Gibson

From: Laurent Vivier

If, once the kernel has booted, we try to remove memory that was
hotplugged before the kernel had started, QEMU crashes on an assert:

    qemu-system-ppc64: hw/virtio/vhost.c:651: vhost_commit: Assertion `r >= 0' failed.
    ...
    #4  in vhost_commit
    #5  in memory_region_transaction_commit
    #6  in pc_dimm_memory_unplug
    #7  in spapr_memory_unplug
    #8  spapr_machine_device_unplug
    #9  in hotplug_handler_unplug
    #10 in spapr_lmb_release
    #11 in detach
    #12 in set_allocation_state
    #13 in rtas_set_indicator
    ...

If we take a closer look at the guest kernel log, we can see the
following message when we try to unplug the memory:

    pseries-hotplug-mem: Attempting to hot-add 4 LMB(s)

What happens:

1- The kernel has ignored the memory hotplug event because it was not
   started when the event was generated.
2- When we hot-unplug the memory, QEMU starts to remove the memory,
   generates a hot-unplug event, and signals the new incoming event
   to the kernel.

3- As the kernel is now started, on the QEMU signal it reads the event
   list, decodes the hotplug event and tries to finish the hotplugging.

4- QEMU receives the hotplug notification while it is trying to
   hot-unplug the memory. This moves the memory DRC to an invalid
   state.

This patch prevents this by not allowing the allocation state to be
set to USABLE while the DRC is awaiting release.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1432382

Signed-off-by: Laurent Vivier
Signed-off-by: David Gibson
---
 hw/ppc/spapr_drc.c         | 20 +++++++++++++++++---
 include/hw/ppc/spapr_drc.h |  1 +
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 150f6bf..a1cdc87 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -135,6 +135,17 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc,
         if (!drc->dev) {
             return RTAS_OUT_NO_SUCH_INDICATOR;
         }
+        if (drc->awaiting_release && drc->awaiting_allocation) {
+            /* kernel is acknowledging a previous hotplug event
+             * while we are already removing it.
+             * it's safe to ignore awaiting_allocation here since we know the
+             * situation is predicated on the guest either already having done
+             * so (boot-time hotplug), or never being able to acquire in the
+             * first place (hotplug followed by immediate unplug).
+             */
+            drc->awaiting_allocation_skippable = true;
+            return RTAS_OUT_NO_SUCH_INDICATOR;
+        }
     }
 
     if (drc->type != SPAPR_DR_CONNECTOR_TYPE_PCI) {
@@ -436,9 +447,11 @@ static void detach(sPAPRDRConnector *drc, DeviceState *d,
     }
 
     if (drc->awaiting_allocation) {
-        drc->awaiting_release = true;
-        trace_spapr_drc_awaiting_allocation(get_index(drc));
-        return;
+        if (!drc->awaiting_allocation_skippable) {
+            drc->awaiting_release = true;
+            trace_spapr_drc_awaiting_allocation(get_index(drc));
+            return;
+        }
     }
 
     drc->indicator_state = SPAPR_DR_INDICATOR_STATE_INACTIVE;
@@ -448,6 +461,7 @@ static void detach(sPAPRDRConnector *drc, DeviceState *d,
     }
 
     drc->awaiting_release = false;
+    drc->awaiting_allocation_skippable = false;
     g_free(drc->fdt);
     drc->fdt = NULL;
     drc->fdt_start_offset = 0;
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index fa531d5..5524247 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -154,6 +154,7 @@ typedef struct sPAPRDRConnector {
     bool awaiting_release;
     bool signalled;
     bool awaiting_allocation;
+    bool awaiting_allocation_skippable;
 
     /* device pointer, via link property */
     DeviceState *dev;
-- 
2.9.3
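
For readers who want to trace the flag handling in this patch outside of
QEMU, here is a minimal standalone sketch of the three flags and the two
code paths it touches. It is a toy model, not the QEMU code: ToyDRC,
ToyStatus, toy_set_usable() and toy_detach() are hypothetical names, has_dev
stands in for the drc->dev check, and the assumption that a successful
USABLE request clears awaiting_allocation comes from the surrounding QEMU
code rather than from this diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for sPAPRDRConnector: only the fields the patch
 * reads or writes are modelled. */
typedef struct ToyDRC {
    bool awaiting_release;              /* unplug requested, not yet completed */
    bool awaiting_allocation;           /* guest has not yet acked the hotplug */
    bool awaiting_allocation_skippable; /* flag added by the patch */
    bool has_dev;                       /* stands in for drc->dev != NULL */
} ToyDRC;

/* Hypothetical status codes; the real RTAS return values are not modelled. */
typedef enum ToyStatus {
    TOY_SUCCESS,
    TOY_NO_SUCH_INDICATOR
} ToyStatus;

/* Guest asks to set the allocation state to USABLE (steps 3 and 4). */
ToyStatus toy_set_usable(ToyDRC *drc)
{
    if (!drc->has_dev) {
        return TOY_NO_SUCH_INDICATOR;
    }
    if (drc->awaiting_release && drc->awaiting_allocation) {
        /* The guest is acknowledging a hotplug that is already being
         * removed: refuse, and let a later detach skip the wait. */
        drc->awaiting_allocation_skippable = true;
        return TOY_NO_SUCH_INDICATOR;
    }
    /* Assumption: a successful USABLE request acks the allocation. */
    drc->awaiting_allocation = false;
    return TOY_SUCCESS;
}

/* Returns true when the release completed, false when it was deferred. */
bool toy_detach(ToyDRC *drc)
{
    if (drc->awaiting_allocation && !drc->awaiting_allocation_skippable) {
        drc->awaiting_release = true;   /* wait for the guest to react */
        return false;
    }
    drc->awaiting_release = false;
    drc->awaiting_allocation_skippable = false;
    drc->has_dev = false;               /* simplified: the device is gone */
    return true;
}
```

Starting from a connector with awaiting_allocation and has_dev set, a first
toy_detach() defers the release, the late toy_set_usable() is refused and
marks the wait as skippable, and a second toy_detach() then completes. This
mirrors the sequence in steps 1 to 4 of the commit message under the
assumptions stated above.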