From nobody Fri Mar 29 05:08:01 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
    dkim=fail;
    spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com
    with SMTPS id 1543986540986154.74802913217752; Tue, 4 Dec 2018 21:09:00 -0800 (PST)
Received: from localhost ([::1]:60053 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPQY-0002bp-JU for importer@patchew.org; Wed, 05 Dec 2018 00:08:50 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49202) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPOh-0008KE-TX for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1gUPOf-0004OA-Tc for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:55 -0500
Received: from ozlabs.org ([2401:3900:2:1::2]:51467) by eggs.gnu.org
    with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
    (envelope-from ) id 1gUPOd-0004Hq-Sx; Wed, 05 Dec 2018 00:06:53 -0500
Received: by ozlabs.org (Postfix, from userid 1007) id 438mtF2vzHz9sBh; Wed, 5 Dec 2018 16:06:45 +1100 (AEDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gibson.dropbear.id.au; s=201602; t=1543986405;
    bh=Hmgz2C3e02dM+6nnhY1rjxhj7Ync54zO9jHua+B3uTo=;
    h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
    b=T/2ur89kLbSKihF/9YktgCDELO5QQh7YYfA9O8y0YfW8J8XiQD37oUBDoDI2DSY5I
    LM0NA3BG3ZWvKe6F7zQpl0ScLzj/L/hnMkt4ys6hME29rbSgGtDeeWpEW4rOw/hJ7F
    63JWKAYd9JF8DAkVe9Qlc2ZvjbNUm6TGhNx9yHuc=
From: David Gibson
To: dhildenb@redhat.com, imammedo@redhat.com, ehabkost@redhat.com
Date: Wed, 5 Dec 2018 16:06:37 +1100
Message-Id: <20181205050641.864-2-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.19.2
In-Reply-To: <20181205050641.864-1-david@gibson.dropbear.id.au>
References: <20181205050641.864-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2401:3900:2:1::2
Subject: [Qemu-devel] [RFCv2 for-4.0 1/5] virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: pbonzini@redhat.com, David Gibson , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, mst@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Type: text/plain; charset="utf-8"

When the balloon is inflated, we discard memory placed in it using
madvise() with MADV_DONTNEED.  And when we deflate it we use
MADV_WILLNEED, which sounds like it makes sense but is actually
unnecessary.

The misleadingly named MADV_DONTNEED just discards the memory in
question; it doesn't set any persistent state on it in-kernel, so all
that's necessary to bring the memory back is to touch it.  MADV_WILLNEED
in contrast specifically says that the memory will be used soon and
faults it in.

This patch simplifies the balloon operation by dropping the madvise()
on deflate.  This might have an impact on performance - it moves a
delay from deflate time to the point where that memory is actually
touched, which might be more latency sensitive.  However:

  * Memory that's being given back to the guest by deflating the
    balloon *might* be used soon, but it equally could just sit around
    in the guest's pools until needed (or even be faulted out again if
    the host is under memory pressure).

  * Usually, the timescale over which you'll be adjusting the balloon
    is long enough that a few extra faults after deflation aren't
    going to make a difference.

Signed-off-by: David Gibson
Reviewed-by: David Hildenbrand
Reviewed-by: Michael S. Tsirkin
---
 hw/virtio/virtio-balloon.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 1728e4f83a..6ec4bcf4e1 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -35,9 +35,8 @@
 
 static void balloon_page(void *addr, int deflate)
 {
-    if (!qemu_balloon_is_inhibited()) {
-        qemu_madvise(addr, BALLOON_PAGE_SIZE,
-                deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
+    if (!qemu_balloon_is_inhibited() && !deflate) {
+        qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
     }
 }
 
-- 
2.19.2

From nobody Fri Mar 29 05:08:01 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
    dkim=fail;
    spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com
    with SMTPS id 1543986537543910.579653324056; Tue, 4 Dec 2018 21:08:57 -0800 (PST)
Received: from localhost ([::1]:60052 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPQW-0002Uo-Ck for importer@patchew.org; Wed, 05 Dec 2018 00:08:48 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49257) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPOj-0008LG-5C for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:58 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1gUPOf-0004Nj-QG for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:56 -0500
Received: from ozlabs.org ([203.11.71.1]:41359) by eggs.gnu.org
    with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
    (envelope-from ) id 1gUPOd-0004Hn-LB; Wed, 05 Dec 2018 00:06:53 -0500
Received: by ozlabs.org (Postfix, from userid 1007) id 438mtF3lR8z9sCh; Wed, 5 Dec 2018 16:06:45 +1100 (AEDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gibson.dropbear.id.au; s=201602; t=1543986405;
    bh=MjrkVR0uIM6WSQ6wKlw3sqVGk7q6hN6cV/L5CB5jpPs=;
    h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
    b=GDuTk5uDOFgKoKFmNgYO8fHlqI4aCDFKzLrt9I8igl0823SiKhuEoYqVciUyWY5Cx
    10i6VX/XnwwAboZ6rGrU7KXVolxPj7tMILsc++NJPHSbkYlJGWTD4tOD3FVtWznqYb
    SNGr1xrz6/5VrQaLbmTVYdfyewUGckd8utRSc+wc=
From: David Gibson
To: dhildenb@redhat.com, imammedo@redhat.com, ehabkost@redhat.com
Date: Wed, 5 Dec 2018 16:06:38 +1100
Message-Id: <20181205050641.864-3-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.19.2
In-Reply-To: <20181205050641.864-1-david@gibson.dropbear.id.au>
References: <20181205050641.864-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 203.11.71.1
Subject: [Qemu-devel] [RFCv2 for-4.0 2/5] virtio-balloon: Corrections to address verification
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: pbonzini@redhat.com, David Gibson , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, mst@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Type: text/plain; charset="utf-8"

The virtio-balloon device's verification of the address given to it by
the guest has a number of faults:

  * The addresses here are guest physical addresses, which should be
    'hwaddr' rather than 'ram_addr_t' (the distinction is admittedly
    pretty subtle and confusing)

  * We don't check for section.mr being NULL, which is the main way
    that memory_region_find() reports basic failures.  We really need
    to check that before looking at any other section fields, because
    memory_region_find() doesn't initialize them on the failure path

  * We're passing a length of '1' to memory_region_find(), but really
    the guest is requesting that we put the entire page into the
    balloon, so it makes more sense to call it with BALLOON_PAGE_SIZE

Signed-off-by: David Gibson
Reviewed-by: David Hildenbrand
Reviewed-by: Michael S. Tsirkin
---
 hw/virtio/virtio-balloon.c | 17 ++++++++++-------
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 6ec4bcf4e1..e8611aab0e 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -221,17 +221,20 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         }
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
-            ram_addr_t pa;
-            ram_addr_t addr;
+            hwaddr pa;
+            hwaddr addr;
             int p = virtio_ldl_p(vdev, &pfn);
 
-            pa = (ram_addr_t) p << VIRTIO_BALLOON_PFN_SHIFT;
+            pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
             offset += 4;
 
-            /* FIXME: remove get_system_memory(), but how? */
-            section = memory_region_find(get_system_memory(), pa, 1);
-            if (!int128_nz(section.size) ||
-                !memory_region_is_ram(section.mr) ||
+            section = memory_region_find(get_system_memory(), pa,
+                                         BALLOON_PAGE_SIZE);
+            if (!section.mr) {
+                trace_virtio_balloon_bad_addr(pa);
+                continue;
+            }
+            if (!memory_region_is_ram(section.mr) ||
                 memory_region_is_rom(section.mr) ||
                 memory_region_is_romd(section.mr)) {
                 trace_virtio_balloon_bad_addr(pa);
-- 
2.19.2

From nobody Fri Mar 29 05:08:01 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
    dkim=fail;
    spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com
    with SMTPS id 1543986661851676.8819606227306; Tue, 4 Dec 2018 21:11:01 -0800 (PST)
Received: from localhost ([::1]:60070 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPSZ-0004KM-SX for importer@patchew.org; Wed, 05 Dec 2018 00:10:55 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49256) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPOj-0008LF-56 for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:58 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1gUPOi-0004RG-0u for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:56 -0500
Received: from ozlabs.org ([203.11.71.1]:35039) by eggs.gnu.org
    with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
    (envelope-from ) id 1gUPOg-0004Mt-Ue; Wed, 05 Dec 2018 00:06:55 -0500
Received: by ozlabs.org (Postfix, from userid 1007) id 438mtF4WnSz9sCr; Wed, 5 Dec 2018 16:06:45 +1100 (AEDT)
DKIM-Signature: v=1;
    a=rsa-sha256; c=relaxed/simple; d=gibson.dropbear.id.au; s=201602; t=1543986405;
    bh=brOTXMxkhPk+ENVg+Ls2uCWUIUWYjxkrC7oP2Y5RVKU=;
    h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
    b=oXLt53bn6L92SKRbJkAR5ib+CK2rSiNybmCdKX3suka+mNEEzTicwbfuEumTXCN0I
    4xc+ofk+ryyC0OhPvhNsSpfnnAGk8SbCK0zdMfD23r5yL5Gcs9LF/I94skahG/pRRm
    5+cGWkW522W0M2zfI7M8ieoIG5LOJz7jGP8F9t/Y=
From: David Gibson
To: dhildenb@redhat.com, imammedo@redhat.com, ehabkost@redhat.com
Date: Wed, 5 Dec 2018 16:06:39 +1100
Message-Id: <20181205050641.864-4-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.19.2
In-Reply-To: <20181205050641.864-1-david@gibson.dropbear.id.au>
References: <20181205050641.864-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy]
X-Received-From: 203.11.71.1
Subject: [Qemu-devel] [RFCv2 for-4.0 3/5] virtio-balloon: Rework balloon_page() interface
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: pbonzini@redhat.com, David Gibson , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, mst@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Type: text/plain; charset="utf-8"

This replaces the balloon_page() internal interface with
balloon_inflate_page(), which takes the balloon device, the memory
region and the offset within it rather than a host address and a
deflate flag.  The new interface will make future alterations simpler.

Signed-off-by: David Gibson
Reviewed-by: David Hildenbrand
Reviewed-by: Michael S. Tsirkin
---
 hw/virtio/virtio-balloon.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index e8611aab0e..c3a19aa27d 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -33,11 +33,12 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
-static void balloon_page(void *addr, int deflate)
+static void balloon_inflate_page(VirtIOBalloon *balloon,
+                                 MemoryRegion *mr, hwaddr offset)
 {
-    if (!qemu_balloon_is_inhibited() && !deflate) {
-        qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
-    }
+    void *addr = memory_region_get_ram_ptr(mr) + offset;
+
+    qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
 }
 
 static const char *balloon_stat_names[] = {
@@ -222,7 +223,6 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
             hwaddr pa;
-            hwaddr addr;
             int p = virtio_ldl_p(vdev, &pfn);
 
             pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
@@ -244,11 +244,9 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 
             trace_virtio_balloon_handle_output(memory_region_name(section.mr),
                                                pa);
-            /* Using memory_region_get_ram_ptr is bending the rules a bit, but
-               should be OK because we only want a single page.  */
-            addr = section.offset_within_region;
-            balloon_page(memory_region_get_ram_ptr(section.mr) + addr,
-                         !!(vq == s->dvq));
+            if (!qemu_balloon_is_inhibited() && vq != s->dvq) {
+                balloon_inflate_page(s, section.mr, section.offset_within_region);
+            }
             memory_region_unref(section.mr);
         }
 
-- 
2.19.2

From nobody Fri Mar 29 05:08:01 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
    dkim=fail;
    spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com
    with SMTPS id 1543986649611374.1504510284493; Tue, 4 Dec 2018 21:10:49 -0800 (PST)
Received: from localhost ([::1]:60069 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPSJ-00048O-38 for importer@patchew.org; Wed, 05 Dec 2018 00:10:39 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49203) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPOh-0008KF-Ta for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:56 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1gUPOf-0004NM-9a for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:55 -0500
Received: from ozlabs.org ([2401:3900:2:1::2]:52277) by eggs.gnu.org
    with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
    (envelope-from ) id 1gUPOc-0004Hp-3c; Wed, 05 Dec 2018 00:06:51 -0500
Received: by ozlabs.org (Postfix, from userid 1007) id 438mtF03fsz9s9G; Wed, 5 Dec 2018 16:06:44 +1100 (AEDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gibson.dropbear.id.au; s=201602; t=1543986405;
    bh=jhqbuR3JMUgaJpujuO4fNd9rvNj4X90x6Kb+G5koxnA=;
    h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
    b=DI6Mzf6BKu75O4ogWykACd1ttjSJ2Qh0Wg+Ts5MF7L83y2THQUpjVt3/VwQxq0pCy
    uCN9HKZp2AjA+Jal5WWJeUGq1i/gKIJMtVwmTbkRwbmSbZG9K3eqkm6KPL9z7YfXXR
    tZ93t69AbNfILu43/5LSmIJ7pbCL/7Qp5ZfS8PJs=
From: David Gibson
To: dhildenb@redhat.com, imammedo@redhat.com, ehabkost@redhat.com
Date: Wed, 5 Dec 2018 16:06:40 +1100
Message-Id: <20181205050641.864-5-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.19.2
In-Reply-To: <20181205050641.864-1-david@gibson.dropbear.id.au>
References: <20181205050641.864-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2401:3900:2:1::2
Subject: [Qemu-devel] [RFCv2 for-4.0 4/5] virtio-balloon: Use ram_block_discard_range() instead of raw madvise()
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: pbonzini@redhat.com, David Gibson , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, mst@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Type: text/plain; charset="utf-8"

Currently, virtio-balloon uses madvise() with MADV_DONTNEED to actually
discard RAM pages inserted into the balloon.  This is basically a
Linux-only interface (MADV_DONTNEED exists on some other platforms, but
doesn't always have the same semantics).  It also doesn't work on
hugepages and has some other limitations.

It turns out that postcopy also needs to discard chunks of memory, and
uses a better interface for it: ram_block_discard_range().  It doesn't
cover every case, but it covers more than going directly to madvise(),
and it gives us a single place to update for more possibilities in
future.

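As a point of reference for the semantics being replaced, the following
standalone sketch (not QEMU code) shows what MADV_DONTNEED does to
private anonymous memory on Linux: the pages are dropped, and the next
touch faults in zero-filled pages.  The function name
dontneed_zeroes_page() is made up for illustration, and the result is
Linux-specific; other platforms may keep the old contents.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, Linux-only assumption.
 * Returns 1 if a private anonymous page reads back as zero after
 * MADV_DONTNEED, 0 if the old contents survived, and -1 if mmap()
 * or madvise() failed. */
int dontneed_zeroes_page(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len;
    unsigned char *p;
    int result;

    if (page <= 0) {
        return -1;
    }
    len = (size_t)page;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    memset(p, 0xAA, len);                 /* fault the page in and dirty it */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* Touching the range again is enough to get usable memory back;
     * no MADV_WILLNEED is required. */
    result = (p[0] == 0 && p[len - 1] == 0);

    munmap(p, len);
    return result;
}
```

On a Linux host the helper is expected to return 1, which is the
behaviour patch 1/5 relies on when it drops the madvise() on deflate.
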
There are some subtleties here to maintain the current balloon
behaviour:

* For now, we just ignore requests to balloon in a hugepage backed
  region.  That matches current behaviour, because MADV_DONTNEED on a
  hugepage would simply fail, and we ignore the error.

* If host page size is > BALLOON_PAGE_SIZE we can frequently call this
  on non-host-page-aligned addresses.  These would also fail in
  madvise(), which we then ignored.  ram_block_discard_range()
  error_report()s calls on unaligned addresses, so we explicitly check
  that case to avoid spamming the logs.

* We now call ram_block_discard_range() with the *host* page size,
  whereas we previously called madvise() with BALLOON_PAGE_SIZE.
  Surprisingly, this also matches existing behaviour.  Although the
  kernel fails madvise on unaligned addresses, it will round unaligned
  sizes *up* to the host page size.  Yes, this means that if
  BALLOON_PAGE_SIZE < host page size we can incorrectly discard more
  memory than the guest asked us to.  I'm planning to address that
  soon.

Errors other than the ones discussed above will now be reported by
ram_block_discard_range(), rather than silently ignored, which means we
have a much better chance of seeing when something is going wrong.

Signed-off-by: David Gibson
Reviewed-by: Michael S. Tsirkin
---
 hw/virtio/virtio-balloon.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index c3a19aa27d..4435905c87 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -37,8 +37,29 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + offset;
+    RAMBlock *rb;
+    size_t rb_page_size;
+    ram_addr_t ram_offset;
 
-    qemu_madvise(addr, BALLOON_PAGE_SIZE, QEMU_MADV_DONTNEED);
+    /* XXX is there a better way to get to the RAMBlock than via a
+     * host address? */
+    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb_page_size = qemu_ram_pagesize(rb);
+
+    /* Silently ignore hugepage RAM blocks */
+    if (rb_page_size != getpagesize()) {
+        return;
+    }
+
+    /* Silently ignore unaligned requests */
+    if (ram_offset & (rb_page_size - 1)) {
+        return;
+    }
+
+    ram_block_discard_range(rb, ram_offset, rb_page_size);
+    /* We ignore errors from ram_block_discard_range(), because it has
+     * already reported them, and failing to discard a balloon page is
+     * not fatal */
 }
 
 static const char *balloon_stat_names[] = {
-- 
2.19.2

From nobody Fri Mar 29 05:08:01 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com;
    dkim=fail;
    spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com
    with SMTPS id 1543986738492948.1666036974383; Tue, 4 Dec 2018 21:12:18 -0800 (PST)
Received: from localhost ([::1]:60077 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPTt-0005GY-6q for importer@patchew.org; Wed, 05 Dec 2018 00:12:17 -0500
Received: from eggs.gnu.org ([2001:4830:134:3::10]:49200) by lists.gnu.org with esmtp (Exim 4.71)
    (envelope-from ) id 1gUPOh-0008KD-TD for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:57 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1gUPOf-0004NG-9R for qemu-devel@nongnu.org; Wed, 05 Dec 2018 00:06:55 -0500
Received: from ozlabs.org ([2401:3900:2:1::2]:54899) by eggs.gnu.org
    with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71)
    (envelope-from ) id 1gUPOc-0004Hr-1m; Wed, 05 Dec 2018 00:06:51 -0500
Received: by ozlabs.org (Postfix, from userid 1007) id 438mtF0xxvz9s7W; Wed, 5 Dec 2018 16:06:45 +1100 (AEDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=gibson.dropbear.id.au; s=201602; t=1543986405;
    bh=1vb6G0SM1UKdiF/MiU7w8REDK0zjfu5llbTESniNOBg=;
    h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
    b=ITW4KZUqdLAcOsvQKJ8u7gDge8j0j+ZQ4xTdWe1pM498ck4B0YACBO1HLJi+9/BQ1
    dbpxJXLxSN43e36m3ZRB7NyCL17zxQC5H43hm+krOh9ocQ+QgvW0ER65Gqr5FAx+8F
    UxFh6Yqs8TG+GFVz6Zhi4ZszVf+lPJMQTS9wY7JQ=
From: David Gibson
To: dhildenb@redhat.com, imammedo@redhat.com, ehabkost@redhat.com
Date: Wed, 5 Dec 2018 16:06:41 +1100
Message-Id: <20181205050641.864-6-david@gibson.dropbear.id.au>
X-Mailer: git-send-email 2.19.2
In-Reply-To: <20181205050641.864-1-david@gibson.dropbear.id.au>
References: <20181205050641.864-1-david@gibson.dropbear.id.au>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2401:3900:2:1::2
Subject: [Qemu-devel] [RFCv2 for-4.0 5/5] virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: pbonzini@redhat.com, David Gibson , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, mst@redhat.com
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Type: text/plain; charset="utf-8"

The virtio-balloon always works in units of 4kiB (BALLOON_PAGE_SIZE),
but we can only actually discard memory in units of the host page size.

Currently, we handle this very badly: we silently ignore balloon
requests that aren't host page aligned, and for requests that are host
page aligned we discard the entire host page.  The latter can corrupt
guest memory if its page size is smaller than the host's.

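To make the failure mode concrete, here is a small model of the
pre-patch behaviour described above, assuming a 4kiB BALLOON_PAGE_SIZE
and a power-of-two host page size.  The helper names old_discard_len()
and old_excess_discard() are hypothetical and do not exist in QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BALLOON_PAGE_SIZE 4096u   /* 1 << VIRTIO_BALLOON_PFN_SHIFT */

/* Hypothetical model: bytes the pre-patch code would discard for one
 * 4kiB balloon request at ram_offset, for a given host page size. */
size_t old_discard_len(uint64_t ram_offset, size_t host_page_size)
{
    /* The model only makes sense for power-of-two page sizes. */
    assert(host_page_size != 0 &&
           (host_page_size & (host_page_size - 1)) == 0);

    if (ram_offset & (host_page_size - 1)) {
        return 0;               /* unaligned request: silently ignored */
    }
    return host_page_size;      /* aligned request: whole host page goes */
}

/* Bytes discarded beyond the 4kiB the guest actually asked for. */
size_t old_excess_discard(uint64_t ram_offset, size_t host_page_size)
{
    size_t len = old_discard_len(ram_offset, host_page_size);

    return len > BALLOON_PAGE_SIZE ? len - BALLOON_PAGE_SIZE : 0;
}
```

With a 64kiB host page, a request at offset 0x10000 discards 65536
bytes, 61440 more than the guest asked for, while a request at 0x11000
is dropped entirely.
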
The obvious choice would be to disable the balloon if the host page
size is not 4kiB.  However, that would break the special case where
host and guest have the same page size, but that's larger than 4kiB.
That case currently works by accident[1] - and is used in practice on
many production POWER systems where 64kiB has long been the Linux
default page size on both host and guest.

To make the balloon safe, without breaking that useful special case, we
need to accumulate 4kiB balloon requests until we have a whole
contiguous host page to discard.

We could in principle do that across all guest memory, but it would
require a large bitmap to track.  This patch represents a compromise:
we track ballooned subpages for a single contiguous host page at a
time.  This means that if the guest discards all 4kiB chunks of a host
page in succession, we will discard it.  This is the expected behaviour
in the (host page) == (guest page) != 4kiB case we want to support.

If the guest scatters 4kiB requests across different host pages, we
don't discard anything, and issue a warning.  Not ideal, but at least
we don't corrupt guest memory as the previous version could.

Warning reporting is kind of a compromise here.  Determining whether
we're in a problematic state at realize() time is tricky, because we'd
have to look at the host page sizes of all memory backends, but we
can't really know if some of those backends could be for special
purpose memory that's not subject to ballooning.

Reporting only when the guest tries to balloon a partial page also
isn't great, because if the guest page size happens to line up it won't
indicate that we're in a non-ideal situation.  It could also cause
alarming repeated warnings whenever a migration is attempted.

So, what we do is warn the first time the guest attempts to balloon a
partial host page, whether or not it will end up ballooning the rest of
the page immediately afterwards.

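The accumulation scheme described above can be sketched in standalone C
as follows.  This is an illustration under stated assumptions, not the
patch's code: PartialPage and track_subpage() are hypothetical names,
the RAMBlock is left out, completion is detected with a counter instead
of bitmap_full(), and the caller is expected to do the actual discard
and any warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BALLOON_PAGE_SIZE 4096u
#define WORD_BITS (8 * sizeof(unsigned long))

/* Hypothetical stand-in for the patch's PartiallyBalloonedPage. */
typedef struct PartialPage {
    uint64_t base;          /* offset of the host page being accumulated */
    size_t seen;            /* distinct 4kiB subpages ballooned so far */
    unsigned long *bitmap;  /* one bit per subpage, NULL when idle */
} PartialPage;

static void partial_reset(PartialPage *pp)
{
    free(pp->bitmap);
    pp->bitmap = NULL;
    pp->seen = 0;
}

/*
 * Record one 4kiB balloon request.  Returns true, and stores the host
 * page offset in *discard_base, once every subpage of that host page
 * has been ballooned; returns false while the page is still partial.
 */
bool track_subpage(PartialPage *pp, uint64_t ram_offset,
                   size_t host_page_size, uint64_t *discard_base)
{
    uint64_t base = ram_offset & ~((uint64_t)host_page_size - 1);
    size_t subpages = host_page_size / BALLOON_PAGE_SIZE;
    size_t bit = (size_t)((ram_offset - base) / BALLOON_PAGE_SIZE);
    unsigned long mask = 1UL << (bit % WORD_BITS);

    /* Power of two, and at least one balloon page per host page. */
    assert(subpages >= 1 && (host_page_size & (host_page_size - 1)) == 0);

    if (pp->bitmap && pp->base != base) {
        /* A different host page: give up on the old partial page. */
        partial_reset(pp);
    }
    if (!pp->bitmap) {
        /* Starting on a new host page. */
        size_t words = (subpages + WORD_BITS - 1) / WORD_BITS;

        pp->bitmap = calloc(words, sizeof(unsigned long));
        if (!pp->bitmap) {
            return false;   /* out of memory: just skip the discard */
        }
        pp->base = base;
    }
    if (!(pp->bitmap[bit / WORD_BITS] & mask)) {
        pp->bitmap[bit / WORD_BITS] |= mask;   /* mark exactly one subpage */
        pp->seen++;
    }
    if (pp->seen < subpages) {
        return false;
    }

    /* The whole host page is in the balloon; the caller may discard it. */
    *discard_base = base;
    partial_reset(pp);
    return true;
}
```

Feeding it the sixteen 4kiB offsets of one 64kiB host page in
succession returns true only on the last call; switching to a different
host page midway drops the partial state, which is the point where the
patch issues its warning.
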
[1] Because when the guest attempts to balloon a page, it will submit
    requests for each 4kiB subpage.  Most will be ignored, but the one
    which happens to be host page aligned will discard the whole lot.

Signed-off-by: David Gibson
---
 hw/virtio/virtio-balloon.c         | 67 +++++++++++++++++++++++++-----
 include/hw/virtio/virtio-balloon.h |  3 ++
 2 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 4435905c87..39573ef2e3 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -33,33 +33,80 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
+typedef struct PartiallyBalloonedPage {
+    RAMBlock *rb;
+    ram_addr_t base;
+    unsigned long bitmap[];
+} PartiallyBalloonedPage;
+
 static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset;
+    int subpages;
+    ram_addr_t ram_offset, host_page_base;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
     rb = qemu_ram_block_from_host(addr, false, &ram_offset);
     rb_page_size = qemu_ram_pagesize(rb);
+    host_page_base = ram_offset & ~(rb_page_size - 1);
+
+    if (rb_page_size == BALLOON_PAGE_SIZE) {
+        /* Easy case */
 
-    /* Silently ignore hugepage RAM blocks */
-    if (rb_page_size != getpagesize()) {
+        ram_block_discard_range(rb, ram_offset, rb_page_size);
+        /* We ignore errors from ram_block_discard_range(), because it
+         * has already reported them, and failing to discard a balloon
+         * page is not fatal */
         return;
     }
 
-    /* Silently ignore unaligned requests */
-    if (ram_offset & (rb_page_size - 1)) {
-        return;
+    /* Hard case
+     *
+     * We've put a piece of a larger host page into the balloon - we
+     * need to keep track until we have a whole host page to
+     * discard
+     */
+    subpages = rb_page_size / BALLOON_PAGE_SIZE;
+
+    if (balloon->pbp
+        && (rb != balloon->pbp->rb
+            || host_page_base != balloon->pbp->base)) {
+        /* We've partially ballooned part of a host page, but now
+         * we're trying to balloon part of a different one.  Too hard,
+         * give up on the old partial page */
+        warn_report("Unable to insert a partial page into virtio-balloon");
+        g_free(balloon->pbp);
+        balloon->pbp = NULL;
     }
 
-    ram_block_discard_range(rb, ram_offset, rb_page_size);
-    /* We ignore errors from ram_block_discard_range(), because it has
-     * already reported them, and failing to discard a balloon page is
-     * not fatal */
+    if (!balloon->pbp) {
+        /* Starting on a new host page */
+        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
+        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
+        balloon->pbp->rb = rb;
+        balloon->pbp->base = host_page_base;
+    }
+
+    bitmap_set(balloon->pbp->bitmap,
+               (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+               1);
+
+    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
+        /* We've accumulated a full host page, we can actually discard
+         * it now */
+
+        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
+        /* We ignore errors from ram_block_discard_range(), because it
+         * has already reported them, and failing to discard a balloon
+         * page is not fatal */
+
+        g_free(balloon->pbp);
+        balloon->pbp = NULL;
+    }
 }
 
 static const char *balloon_stat_names[] = {
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index e0df3528c8..99dcd6d105 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -30,6 +30,8 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
+typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
+
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
     VirtQueue *ivq, *dvq, *svq;
@@ -42,6 +44,7 @@ typedef struct VirtIOBalloon {
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;
+    PartiallyBalloonedPage *pbp;
 } VirtIOBalloon;
 
 #endif
-- 
2.19.2