From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 01/12] memory: Introduce RamDiscardMgr for RAM memory regions
Date: Mon, 22 Feb 2021 12:56:57 +0100
Message-Id: <20210222115708.7623-2-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
MIME-Version: 1.0
Cc: Pankaj Gupta, Wei Yang, "Michael S. Tsirkin", David Hildenbrand,
 Alex Williamson, Peter Xu, "Dr. David Alan Gilbert", Auger Eric, teawater,
 Igor Mammedov, Paolo Bonzini, Marek Kedzierski
Content-Type: text/plain; charset="utf-8"

We have some special RAM memory regions (managed by virtio-mem), whereby
the guest agreed to only use selected memory ranges. "unused" parts are
discarded so they won't consume memory - to logically unplug these memory
ranges. Before the VM is allowed to use such logically unplugged memory
again, coordination with the hypervisor is required.

This results in "sparse" mmaps/RAMBlocks/memory regions, whereby only
coordinated parts are valid to be used/accessed by the VM.

In most cases, we don't care about that - e.g., in KVM, we simply have a
single KVM memory slot.
However, in case of vfio, registering the whole region with the kernel
results in all pages getting pinned, and therefore an unexpectedly high
memory consumption - discarding of RAM in that context is broken.

Let's introduce a way to coordinate discarding/populating memory within a
RAM memory region with such special consumers of RAM memory regions: they
can register as listeners and get updates on memory getting discarded and
populated. Using this machinery, vfio will be able to map only the
currently populated parts, resulting in discarded parts not getting pinned
and not consuming memory.

A RamDiscardMgr has to be set for a memory region before it is getting
mapped, and cannot change while the memory region is mapped.

Note: At some point, we might want to let RAMBlock users (esp. vfio used
for nvme://) consume this interface as well. We'll need RAMBlock notifier
calls when a RAMBlock is getting mapped/unmapped (via the corresponding
memory region), so we can properly register a listener there as well.

Reviewed-by: Pankaj Gupta
Acked-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 include/exec/memory.h | 231 ++++++++++++++++++++++++++++++++++++++++++
 softmmu/memory.c      |  22 ++++
 2 files changed, 253 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index c6fb714e49..6132910767 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -42,6 +42,12 @@ typedef struct IOMMUMemoryRegionClass IOMMUMemoryRegionClass;
 DECLARE_OBJ_CHECKERS(IOMMUMemoryRegion, IOMMUMemoryRegionClass,
                      IOMMU_MEMORY_REGION, TYPE_IOMMU_MEMORY_REGION)
 
+#define TYPE_RAM_DISCARD_MGR "qemu:ram-discard-mgr"
+typedef struct RamDiscardMgrClass RamDiscardMgrClass;
+typedef struct RamDiscardMgr RamDiscardMgr;
+DECLARE_OBJ_CHECKERS(RamDiscardMgr, RamDiscardMgrClass, RAM_DISCARD_MGR,
+                     TYPE_RAM_DISCARD_MGR);
+
 #ifdef CONFIG_FUZZ
 void fuzz_dma_read_cb(size_t addr,
                       size_t len,
@@ -124,6 +130,66 @@ typedef struct IOMMUTLBEvent {
     IOMMUTLBEntry entry;
 } IOMMUTLBEvent;
 
+struct RamDiscardListener;
+typedef int (*NotifyRamPopulate)(struct RamDiscardListener *rdl,
+                                 const MemoryRegion *mr, ram_addr_t offset,
+                                 ram_addr_t size);
+typedef void (*NotifyRamDiscard)(struct RamDiscardListener *rdl,
+                                 const MemoryRegion *mr, ram_addr_t offset,
+                                 ram_addr_t size);
+typedef void (*NotifyRamDiscardAll)(struct RamDiscardListener *rdl,
+                                    const MemoryRegion *mr);
+
+typedef struct RamDiscardListener {
+    /*
+     * @notify_populate:
+     *
+     * Notification that previously discarded memory is about to get populated.
+     * Listeners are able to object. If any listener objects, already
+     * successfully notified listeners are notified about a discard again.
+     *
+     * @rdl: the #RamDiscardListener getting notified
+     * @mr: the relevant #MemoryRegion
+     * @offset: offset into the #MemoryRegion, aligned to minimum granularity of
+     *          the #RamDiscardMgr
+     * @size: the size, aligned to minimum granularity of the #RamDiscardMgr
+     *
+     * Returns 0 on success. If the notification is rejected by the listener,
+     * an error is returned.
+     */
+    NotifyRamPopulate notify_populate;
+
+    /*
+     * @notify_discard:
+     *
+     * Notification that previously populated memory was discarded successfully
+     * and listeners should drop all references to such memory and prevent
+     * new population (e.g., unmap).
+     *
+     * @rdl: the #RamDiscardListener getting notified
+     * @mr: the relevant #MemoryRegion
+     * @offset: offset into the #MemoryRegion, aligned to minimum granularity of
+     *          the #RamDiscardMgr
+     * @size: the size, aligned to minimum granularity of the #RamDiscardMgr
+     */
+    NotifyRamDiscard notify_discard;
+
+    /*
+     * @notify_discard_all:
+     *
+     * Notification that all previously populated memory was discarded
+     * successfully.
+     *
+     * Note: this callback is optional. If not set, individual notify_discard()
+     * notifications are triggered.
+     *
+     * @rdl: the #RamDiscardListener getting notified
+     * @mr: the relevant #MemoryRegion
+     */
+    NotifyRamDiscardAll notify_discard_all;
+    QLIST_ENTRY(RamDiscardListener) next;
+} RamDiscardListener;
+
 /* RAM is pre-allocated and passed into qemu_ram_alloc_from_ptr */
 #define RAM_PREALLOC   (1 << 0)
 
@@ -167,6 +233,16 @@ static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
     n->iommu_idx = iommu_idx;
 }
 
+static inline void ram_discard_listener_init(RamDiscardListener *rdl,
+                                             NotifyRamPopulate populate_fn,
+                                             NotifyRamDiscard discard_fn,
+                                             NotifyRamDiscardAll discard_all_fn)
+{
+    rdl->notify_populate = populate_fn;
+    rdl->notify_discard = discard_fn;
+    rdl->notify_discard_all = discard_all_fn;
+}
+
 /*
  * Memory region callbacks
  */
@@ -441,6 +517,126 @@ struct IOMMUMemoryRegionClass {
                                      Error **errp);
 };
 
+/*
+ * RamDiscardMgrClass:
+ *
+ * A #RamDiscardMgr coordinates which parts of specific RAM #MemoryRegion
+ * regions are currently populated to be used/accessed by the VM, notifying
+ * after parts were discarded (freeing up memory) and before parts will be
+ * populated (consuming memory), to be used/accessed by the VM.
+ *
+ * A #RamDiscardMgr can only be set for a RAM #MemoryRegion while the
+ * #MemoryRegion isn't mapped yet; it cannot change while the #MemoryRegion is
+ * mapped.
+ *
+ * The #RamDiscardMgr is intended to be used by technologies that are
+ * incompatible with discarding of RAM (e.g., VFIO, which may pin all
+ * memory inside a #MemoryRegion), and require proper coordination to only
+ * map the currently populated parts, to hinder parts that are expected to
+ * remain discarded from silently getting populated and consuming memory.
+ * Technologies that support discarding of RAM don't have to bother and can
+ * simply map the whole #MemoryRegion.
+ *
+ * An example #RamDiscardMgr is virtio-mem, which logically (un)plugs
+ * memory within an assigned RAM #MemoryRegion, coordinated with the VM.
+ * Logically unplugging memory consists of discarding RAM. The VM agreed to not
+ * access unplugged (discarded) memory - especially via DMA. virtio-mem will
+ * properly coordinate with listeners before memory is plugged (populated),
+ * and after memory is unplugged (discarded).
+ *
+ * Listeners are called in multiples of the minimum granularity and changes are
+ * aligned to the minimum granularity within the #MemoryRegion. Listeners have
+ * to prepare for memory becoming discarded in a different granularity than it
+ * was populated and the other way around.
+ */
+struct RamDiscardMgrClass {
+    /* private */
+    InterfaceClass parent_class;
+
+    /* public */
+
+    /**
+     * @get_min_granularity:
+     *
+     * Get the minimum granularity in which listeners will get notified
+     * about changes within the #MemoryRegion via the #RamDiscardMgr.
+     *
+     * @rdm: the #RamDiscardMgr
+     * @mr: the #MemoryRegion
+     *
+     * Returns the minimum granularity.
+     */
+    uint64_t (*get_min_granularity)(const RamDiscardMgr *rdm,
+                                    const MemoryRegion *mr);
+
+    /**
+     * @is_populated:
+     *
+     * Check whether the given range within the #MemoryRegion is completely
+     * populated (i.e., no parts are currently discarded). There are no
+     * alignment requirements for the range.
+     *
+     * @rdm: the #RamDiscardMgr
+     * @mr: the #MemoryRegion
+     * @offset: offset into the #MemoryRegion
+     * @size: size in the #MemoryRegion
+     *
+     * Returns whether the given range is completely populated.
+     */
+    bool (*is_populated)(const RamDiscardMgr *rdm, const MemoryRegion *mr,
+                         ram_addr_t offset, ram_addr_t size);
+
+    /**
+     * @register_listener:
+     *
+     * Register a #RamDiscardListener for a #MemoryRegion via the
+     * #RamDiscardMgr and immediately notify the #RamDiscardListener about all
+     * populated parts within the #MemoryRegion via the #RamDiscardMgr.
+     *
+     * In case any notification fails, no further notifications are triggered
+     * and an error is logged.
+     *
+     * @rdm: the #RamDiscardMgr
+     * @mr: the #MemoryRegion
+     * @rdl: the #RamDiscardListener
+     */
+    void (*register_listener)(RamDiscardMgr *rdm, const MemoryRegion *mr,
+                              RamDiscardListener *rdl);
+
+    /**
+     * @unregister_listener:
+     *
+     * Unregister a previously registered #RamDiscardListener for a
+     * #MemoryRegion via the #RamDiscardMgr after notifying the
+     * #RamDiscardListener about all populated parts becoming unpopulated
+     * within the #MemoryRegion via the #RamDiscardMgr.
+     *
+     * @rdm: the #RamDiscardMgr
+     * @mr: the #MemoryRegion
+     * @rdl: the #RamDiscardListener
+     */
+    void (*unregister_listener)(RamDiscardMgr *rdm, const MemoryRegion *mr,
+                                RamDiscardListener *rdl);
+
+    /**
+     * @replay_populated:
+     *
+     * Notify the #RamDiscardListener about all populated parts within the
+     * #MemoryRegion via the #RamDiscardMgr.
+     *
+     * In case any notification fails, no further notifications are triggered.
+     * The listener is not required to be registered.
+     *
+     * @rdm: the #RamDiscardMgr
+     * @mr: the #MemoryRegion
+     * @rdl: the #RamDiscardListener
+     *
+     * Returns 0 on success, or a negative error if any notification failed.
+     */
+    int (*replay_populated)(const RamDiscardMgr *rdm, const MemoryRegion *mr,
+                            RamDiscardListener *rdl);
+};
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
@@ -487,6 +683,7 @@ struct MemoryRegion {
     const char *name;
     unsigned ioeventfd_nb;
     MemoryRegionIoeventfd *ioeventfds;
+    RamDiscardMgr *rdm; /* Only for RAM */
 };
 
 struct IOMMUMemoryRegion {
@@ -1979,6 +2176,40 @@ bool memory_region_present(MemoryRegion *container, hwaddr addr);
  */
 bool memory_region_is_mapped(MemoryRegion *mr);
 
+/**
+ * memory_region_get_ram_discard_mgr: get the #RamDiscardMgr for a
+ * #MemoryRegion
+ *
+ * The #RamDiscardMgr cannot change while a memory region is mapped.
+ *
+ * @mr: the #MemoryRegion
+ */
+RamDiscardMgr *memory_region_get_ram_discard_mgr(MemoryRegion *mr);
+
+/**
+ * memory_region_has_ram_discard_mgr: check whether a #MemoryRegion has a
+ * #RamDiscardMgr assigned
+ *
+ * @mr: the #MemoryRegion
+ */
+static inline bool memory_region_has_ram_discard_mgr(MemoryRegion *mr)
+{
+    return !!memory_region_get_ram_discard_mgr(mr);
+}
+
+/**
+ * memory_region_set_ram_discard_mgr: set the #RamDiscardMgr for a
+ * #MemoryRegion
+ *
+ * This function must not be called for a mapped #MemoryRegion, a #MemoryRegion
+ * that does not cover RAM, or a #MemoryRegion that already has a
+ * #RamDiscardMgr assigned.
+ *
+ * @mr: the #MemoryRegion
+ * @rdm: #RamDiscardMgr to set
+ */
+void memory_region_set_ram_discard_mgr(MemoryRegion *mr, RamDiscardMgr *rdm);
+
 /**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 874a8fccde..5e7bea7661 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2029,6 +2029,21 @@ int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr)
     return imrc->num_indexes(iommu_mr);
 }
 
+RamDiscardMgr *memory_region_get_ram_discard_mgr(MemoryRegion *mr)
+{
+    if (!memory_region_is_mapped(mr) || !memory_region_is_ram(mr)) {
+        return NULL;
+    }
+    return mr->rdm;
+}
+
+void memory_region_set_ram_discard_mgr(MemoryRegion *mr, RamDiscardMgr *rdm)
+{
+    g_assert(memory_region_is_ram(mr) && !memory_region_is_mapped(mr));
+    g_assert(!rdm || !mr->rdm);
+    mr->rdm = rdm;
+}
+
 void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 {
     uint8_t mask = 1 << client;
@@ -3309,10 +3324,17 @@ static const TypeInfo iommu_memory_region_info = {
     .abstract = true,
 };
 
+static const TypeInfo ram_discard_mgr_info = {
+    .parent = TYPE_INTERFACE,
+    .name = TYPE_RAM_DISCARD_MGR,
+    .class_size = sizeof(RamDiscardMgrClass),
+};
+
 static void memory_register_types(void)
 {
     type_register_static(&memory_region_info);
     type_register_static(&iommu_memory_region_info);
+    type_register_static(&ram_discard_mgr_info);
 }
 
 type_init(memory_register_types)
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 02/12] virtio-mem: Factor out traversing unplugged ranges
Date: Mon, 22 Feb 2021 12:56:58 +0100
Message-Id: <20210222115708.7623-3-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
MIME-Version: 1.0
Cc: Pankaj Gupta, Wei Yang, "Michael S. Tsirkin", David Hildenbrand,
 Alex Williamson, Peter Xu, "Dr. David Alan Gilbert", Auger Eric, teawater,
 Igor Mammedov, Paolo Bonzini, Marek Kedzierski
Content-Type: text/plain; charset="utf-8"

Let's factor out the core logic, to be reused soon.

Reviewed-by: Pankaj Gupta
Acked-by: Michael S. Tsirkin
Reviewed-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-mem.c | 86 ++++++++++++++++++++++++------------------
 1 file changed, 49 insertions(+), 37 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 655824ff81..471e464171 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -145,6 +145,33 @@ static bool virtio_mem_is_busy(void)
     return migration_in_incoming_postcopy() || !migration_is_idle();
 }
 
+typedef int (*virtio_mem_range_cb)(const VirtIOMEM *vmem, void *arg,
+                                   uint64_t offset, uint64_t size);
+
+static int virtio_mem_for_each_unplugged_range(const VirtIOMEM *vmem, void *arg,
+                                               virtio_mem_range_cb cb)
+{
+    unsigned long first_zero_bit, last_zero_bit;
+    uint64_t offset, size;
+    int ret = 0;
+
+    first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size);
+    while (first_zero_bit < vmem->bitmap_size) {
+        offset = first_zero_bit * vmem->block_size;
+        last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                      first_zero_bit + 1) - 1;
+        size = (last_zero_bit - first_zero_bit + 1) * vmem->block_size;
+
+        ret = cb(vmem, arg, offset, size);
+        if (ret) {
+            break;
+        }
+        first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                            last_zero_bit + 2);
+    }
+    return ret;
+}
+
 static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa,
                                    uint64_t size, bool plugged)
 {
@@ -594,33 +621,27 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     ram_block_discard_require(false);
 }
 
-static int virtio_mem_restore_unplugged(VirtIOMEM *vmem)
+static int virtio_mem_discard_range_cb(const VirtIOMEM *vmem, void *arg,
+                                       uint64_t offset, uint64_t size)
 {
     RAMBlock *rb = vmem->memdev->mr.ram_block;
-    unsigned long first_zero_bit, last_zero_bit;
-    uint64_t offset, length;
     int ret;
 
-    /* Find consecutive unplugged blocks and discard the consecutive range. */
-    first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size);
-    while (first_zero_bit < vmem->bitmap_size) {
-        offset = first_zero_bit * vmem->block_size;
-        last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
-                                      first_zero_bit + 1) - 1;
-        length = (last_zero_bit - first_zero_bit + 1) * vmem->block_size;
-
-        ret = ram_block_discard_range(rb, offset, length);
-        if (ret) {
-            error_report("Unexpected error discarding RAM: %s",
-                         strerror(-ret));
-            return -EINVAL;
-        }
-        first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
-                                            last_zero_bit + 2);
+    ret = ram_block_discard_range(rb, offset, size);
+    if (ret) {
+        error_report("Unexpected error discarding RAM: %s", strerror(-ret));
+        return -EINVAL;
     }
     return 0;
 }
 
+static int virtio_mem_restore_unplugged(VirtIOMEM *vmem)
+{
+    /* Make sure all memory is really discarded after migration. */
+    return virtio_mem_for_each_unplugged_range(vmem, NULL,
+                                               virtio_mem_discard_range_cb);
+}
+
 static int virtio_mem_post_load(void *opaque, int version_id)
 {
     if (migration_in_incoming_postcopy()) {
@@ -872,28 +893,19 @@ static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name,
     vmem->block_size = value;
 }
 
-static void virtio_mem_precopy_exclude_unplugged(VirtIOMEM *vmem)
+static int virtio_mem_precopy_exclude_range_cb(const VirtIOMEM *vmem, void *arg,
+                                               uint64_t offset, uint64_t size)
 {
     void * const host = qemu_ram_get_host_addr(vmem->memdev->mr.ram_block);
-    unsigned long first_zero_bit, last_zero_bit;
-    uint64_t offset, length;
 
-    /*
-     * Find consecutive unplugged blocks and exclude them from migration.
-     *
-     * Note: Blocks cannot get (un)plugged during precopy, no locking needed.
-     */
-    first_zero_bit = find_first_zero_bit(vmem->bitmap, vmem->bitmap_size);
-    while (first_zero_bit < vmem->bitmap_size) {
-        offset = first_zero_bit * vmem->block_size;
-        last_zero_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
-                                      first_zero_bit + 1) - 1;
-        length = (last_zero_bit - first_zero_bit + 1) * vmem->block_size;
+    qemu_guest_free_page_hint(host + offset, size);
+    return 0;
+}
 
-        qemu_guest_free_page_hint(host + offset, length);
-        first_zero_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
-                                            last_zero_bit + 2);
-    }
+static void virtio_mem_precopy_exclude_unplugged(VirtIOMEM *vmem)
+{
+    virtio_mem_for_each_unplugged_range(vmem, NULL,
+                                        virtio_mem_precopy_exclude_range_cb);
 }
 
 static int virtio_mem_precopy_notify(NotifierWithReturn *n, void *data)
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 03/12] virtio-mem: Don't report errors when ram_block_discard_range() fails
Date: Mon, 22 Feb 2021 12:56:59 +0100
Message-Id: <20210222115708.7623-4-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
MIME-Version: 1.0
Cc: Pankaj Gupta, Wei Yang, "Michael S. Tsirkin", David Hildenbrand,
 "Dr. David Alan Gilbert", Peter Xu, Auger Eric, Alex Williamson, teawater,
 Igor Mammedov, Paolo Bonzini, Marek Kedzierski
Content-Type: text/plain; charset="utf-8"

Any errors are unexpected and ram_block_discard_range() already properly
prints errors. Let's stop manually reporting errors.

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-mem.c | 20 ++++----------------
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 471e464171..bbe42ad83b 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -246,17 +246,14 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
                                       uint64_t size, bool plug)
 {
     const uint64_t offset = start_gpa - vmem->addr;
-    int ret;
+    RAMBlock *rb = vmem->memdev->mr.ram_block;
 
     if (virtio_mem_is_busy()) {
         return -EBUSY;
     }
 
     if (!plug) {
-        ret = ram_block_discard_range(vmem->memdev->mr.ram_block, offset, size);
-        if (ret) {
-            error_report("Unexpected error discarding RAM: %s",
-                         strerror(-ret));
+        if (ram_block_discard_range(rb, offset, size)) {
             return -EBUSY;
         }
     }
@@ -345,15 +342,12 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem,
 static int virtio_mem_unplug_all(VirtIOMEM *vmem)
 {
     RAMBlock *rb = vmem->memdev->mr.ram_block;
-    int ret;
 
     if (virtio_mem_is_busy()) {
         return -EBUSY;
     }
 
-    ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
-    if (ret) {
-        error_report("Unexpected error discarding RAM: %s", strerror(-ret));
+    if (ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb))) {
         return -EBUSY;
     }
     bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size);
@@ -625,14 +619,8 @@ static int virtio_mem_discard_range_cb(const VirtIOMEM *vmem, void *arg,
                                        uint64_t offset, uint64_t size)
 {
     RAMBlock *rb = vmem->memdev->mr.ram_block;
-    int ret;
 
-    ret = ram_block_discard_range(rb, offset, size);
-    if (ret) {
-        error_report("Unexpected error discarding RAM: %s", strerror(-ret));
-        return -EINVAL;
-    }
-    return 0;
+    return ram_block_discard_range(rb, offset, size) ? -EINVAL : 0;
 }
 
 static int virtio_mem_restore_unplugged(VirtIOMEM *vmem)
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 04/12] virtio-mem: Implement RamDiscardMgr interface
Date: Mon, 22 Feb 2021 12:57:00 +0100
Message-Id: <20210222115708.7623-5-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
Cc: Pankaj Gupta, Wei Yang, "Michael S. Tsirkin", David Hildenbrand, "Dr. David Alan Gilbert", Peter Xu, Auger Eric, Alex Williamson, teawater, Igor Mammedov, Paolo Bonzini, Marek Kedzierski

Let's properly notify when (un)plugging blocks, after discarding memory
and before allowing the guest to consume memory.
Handle errors from notifiers gracefully (e.g., no remaining VFIO mappings)
when plugging, rolling back the change and telling the guest that the VM
is busy.

One special case to take care of is replaying all notifications after
restoring the vmstate. The device starts out with all memory discarded,
so after loading the vmstate, we have to notify about all plugged blocks.

Acked-by: Michael S. Tsirkin
Reviewed-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-mem.c         | 253 ++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-mem.h |   3 +
 2 files changed, 253 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index bbe42ad83b..3132e4b2b1 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -172,7 +172,105 @@ static int virtio_mem_for_each_unplugged_range(const VirtIOMEM *vmem, void *arg,
     return ret;
 }
 
-static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa,
+static int virtio_mem_for_each_plugged_range(const VirtIOMEM *vmem, void *arg,
+                                             virtio_mem_range_cb cb)
+{
+    unsigned long first_bit, last_bit;
+    uint64_t offset, size;
+    int ret = 0;
+
+    first_bit = find_first_bit(vmem->bitmap, vmem->bitmap_size);
+    while (first_bit < vmem->bitmap_size) {
+        offset = first_bit * vmem->block_size;
+        last_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                      first_bit + 1) - 1;
+        size = (last_bit - first_bit + 1) * vmem->block_size;
+
+        ret = cb(vmem, arg, offset, size);
+        if (ret) {
+            break;
+        }
+        first_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                  last_bit + 2);
+    }
+    return ret;
+}
+
+static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
+                                     uint64_t size)
+{
+    RamDiscardListener *rdl;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        rdl->notify_discard(rdl, &vmem->memdev->mr, offset, size);
+    }
+}
+
+static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
+                                  uint64_t size)
+{
+    RamDiscardListener *rdl, *rdl2;
+    int ret = 0;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        ret = rdl->notify_populate(rdl, &vmem->memdev->mr, offset, size);
+        if (ret) {
+            break;
+        }
+    }
+
+    if (ret) {
+        /* Could be a mapping attempt resulted in memory getting populated. */
+        ram_block_discard_range(vmem->memdev->mr.ram_block, offset, size);
+
+        /* Notify all already-notified listeners. */
+        QLIST_FOREACH(rdl2, &vmem->rdl_list, next) {
+            if (rdl2 == rdl) {
+                break;
+            }
+            rdl2->notify_discard(rdl2, &vmem->memdev->mr, offset, size);
+        }
+    }
+    return ret;
+}
+
+static int virtio_mem_notify_discard_range_cb(const VirtIOMEM *vmem, void *arg,
+                                              uint64_t offset, uint64_t size)
+{
+    RamDiscardListener *rdl;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        if (!rdl->notify_discard_all) {
+            rdl->notify_discard(rdl, &vmem->memdev->mr, offset, size);
+        }
+    }
+    return 0;
+}
+
+static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
+{
+    bool individual_calls = false;
+    RamDiscardListener *rdl;
+
+    if (!vmem->size) {
+        return;
+    }
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        if (rdl->notify_discard_all) {
+            rdl->notify_discard_all(rdl, &vmem->memdev->mr);
+        } else {
+            individual_calls = true;
+        }
+    }
+
+    if (individual_calls) {
+        virtio_mem_for_each_unplugged_range(vmem, NULL,
+                                            virtio_mem_notify_discard_range_cb);
+    }
+}
+
+static bool virtio_mem_test_bitmap(const VirtIOMEM *vmem, uint64_t start_gpa,
                                    uint64_t size, bool plugged)
 {
     const unsigned long first_bit = (start_gpa - vmem->addr) / vmem->block_size;
@@ -225,7 +323,8 @@ static void virtio_mem_send_response_simple(VirtIOMEM *vmem,
     virtio_mem_send_response(vmem, elem, &resp);
 }
 
-static bool virtio_mem_valid_range(VirtIOMEM *vmem, uint64_t gpa, uint64_t size)
+static bool virtio_mem_valid_range(const VirtIOMEM *vmem, uint64_t gpa,
+                                   uint64_t size)
 {
     if (!QEMU_IS_ALIGNED(gpa, vmem->block_size)) {
         return false;
@@ -256,6 +355,9 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
         if (ram_block_discard_range(rb, offset, size)) {
             return -EBUSY;
         }
+        virtio_mem_notify_unplug(vmem, offset, size);
+    } else if (virtio_mem_notify_plug(vmem, offset, size)) {
+        return -EBUSY;
     }
     virtio_mem_set_bitmap(vmem, start_gpa, size, plug);
     return 0;
@@ -350,6 +452,8 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
     if (ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb))) {
         return -EBUSY;
     }
+    virtio_mem_notify_unplug_all(vmem);
+
     bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size);
     if (vmem->size) {
         vmem->size = 0;
@@ -598,6 +702,12 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
     qemu_register_reset(virtio_mem_system_reset, vmem);
     precopy_add_notifier(&vmem->precopy_notifier);
+
+    /*
+     * Set ourselves as RamDiscardMgr before the plug handler maps the memory
+     * region and exposes it via an address space.
+     */
+    memory_region_set_ram_discard_mgr(&vmem->memdev->mr, RAM_DISCARD_MGR(vmem));
 }
 
 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -605,6 +715,11 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOMEM *vmem = VIRTIO_MEM(dev);
 
+    /*
+     * The unplug handler unmapped the memory region, it cannot be
+     * found via an address space anymore. Unset ourselves.
+     */
+    memory_region_set_ram_discard_mgr(&vmem->memdev->mr, NULL);
     precopy_remove_notifier(&vmem->precopy_notifier);
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
@@ -630,13 +745,41 @@ static int virtio_mem_restore_unplugged(VirtIOMEM *vmem)
                                                virtio_mem_discard_range_cb);
 }
 
+static int virtio_mem_post_load_replay_cb(const VirtIOMEM *vmem, void *arg,
+                                          uint64_t offset, uint64_t size)
+{
+    RamDiscardListener *rdl;
+    int ret = 0;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        ret = rdl->notify_populate(rdl, &vmem->memdev->mr, offset, size);
+        if (ret) {
+            break;
+        }
+    }
+    return ret;
+}
+
 static int virtio_mem_post_load(void *opaque, int version_id)
 {
+    VirtIOMEM *vmem = VIRTIO_MEM(opaque);
+    int ret;
+
+    /*
+     * We started out with all memory discarded and our memory region is mapped
+     * into an address space. Replay, now that we updated the bitmap.
+     */
+    ret = virtio_mem_for_each_plugged_range(vmem, NULL,
+                                            virtio_mem_post_load_replay_cb);
+    if (ret) {
+        return ret;
+    }
+
     if (migration_in_incoming_postcopy()) {
         return 0;
     }
 
-    return virtio_mem_restore_unplugged(VIRTIO_MEM(opaque));
+    return virtio_mem_restore_unplugged(vmem);
 }
 
 typedef struct VirtIOMEMMigSanityChecks {
@@ -921,6 +1064,7 @@ static void virtio_mem_instance_init(Object *obj)
 
     notifier_list_init(&vmem->size_change_notifiers);
     vmem->precopy_notifier.notify = virtio_mem_precopy_notify;
+    QLIST_INIT(&vmem->rdl_list);
 
     object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
                         NULL, NULL, NULL);
@@ -940,11 +1084,104 @@ static Property virtio_mem_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static uint64_t virtio_mem_rdm_get_min_granularity(const RamDiscardMgr *rdm,
+                                                   const MemoryRegion *mr)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+
+    g_assert(mr == &vmem->memdev->mr);
+    return vmem->block_size;
+}
+
+static bool virtio_mem_rdm_is_populated(const RamDiscardMgr *rdm,
+                                        const MemoryRegion *mr,
+                                        ram_addr_t offset, ram_addr_t size)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+    uint64_t start_gpa = QEMU_ALIGN_DOWN(vmem->addr + offset, vmem->block_size);
+    uint64_t end_gpa = QEMU_ALIGN_UP(vmem->addr + offset + size,
+                                     vmem->block_size);
+
+    g_assert(mr == &vmem->memdev->mr);
+    if (!virtio_mem_valid_range(vmem, start_gpa, end_gpa - start_gpa)) {
+        return false;
+    }
+
+    return virtio_mem_test_bitmap(vmem, start_gpa, end_gpa - start_gpa, true);
+}
+
+static int virtio_mem_notify_populate_range_single_cb(const VirtIOMEM *vmem,
+                                                      void *arg,
+                                                      uint64_t offset,
+                                                      uint64_t size)
+{
+    RamDiscardListener *rdl = arg;
+
+    return rdl->notify_populate(rdl, &vmem->memdev->mr, offset, size);
+}
+
+static int virtio_mem_notify_discard_range_single_cb(const VirtIOMEM *vmem,
+                                                     void *arg,
+                                                     uint64_t offset,
+                                                     uint64_t size)
+{
+    RamDiscardListener *rdl = arg;
+
+    rdl->notify_discard(rdl, &vmem->memdev->mr, offset, size);
+    return 0;
+}
+
+static void virtio_mem_rdm_register_listener(RamDiscardMgr *rdm,
+                                             const MemoryRegion *mr,
+                                             RamDiscardListener *rdl)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+    int ret;
+
+    g_assert(mr == &vmem->memdev->mr);
+    QLIST_INSERT_HEAD(&vmem->rdl_list, rdl, next);
+    ret = virtio_mem_for_each_plugged_range(vmem, rdl,
+                                    virtio_mem_notify_populate_range_single_cb);
+    if (ret) {
+        error_report("%s: Replaying plugged ranges failed: %s", __func__,
+                     strerror(-ret));
+    }
+}
+
+static void virtio_mem_rdm_unregister_listener(RamDiscardMgr *rdm,
+                                               const MemoryRegion *mr,
+                                               RamDiscardListener *rdl)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+
+    g_assert(mr == &vmem->memdev->mr);
+    if (rdl->notify_discard_all) {
+        rdl->notify_discard_all(rdl, &vmem->memdev->mr);
+    } else {
+        virtio_mem_for_each_plugged_range(vmem, rdl,
+                                     virtio_mem_notify_discard_range_single_cb);
+
+    }
+    QLIST_REMOVE(rdl, next);
+}
+
+static int virtio_mem_rdm_replay_populated(const RamDiscardMgr *rdm,
+                                           const MemoryRegion *mr,
+                                           RamDiscardListener *rdl)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+
+    g_assert(mr == &vmem->memdev->mr);
+    return virtio_mem_for_each_plugged_range(vmem, rdl,
+                                    virtio_mem_notify_populate_range_single_cb);
+}
+
 static void virtio_mem_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
     VirtIOMEMClass *vmc = VIRTIO_MEM_CLASS(klass);
+    RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_CLASS(klass);
 
     device_class_set_props(dc, virtio_mem_properties);
     dc->vmsd = &vmstate_virtio_mem;
@@ -960,6 +1197,12 @@ static void virtio_mem_class_init(ObjectClass *klass, void *data)
     vmc->get_memory_region = virtio_mem_get_memory_region;
     vmc->add_size_change_notifier = virtio_mem_add_size_change_notifier;
     vmc->remove_size_change_notifier = virtio_mem_remove_size_change_notifier;
+
+    rdmc->get_min_granularity = virtio_mem_rdm_get_min_granularity;
+    rdmc->is_populated = virtio_mem_rdm_is_populated;
+    rdmc->register_listener = virtio_mem_rdm_register_listener;
+    rdmc->unregister_listener = virtio_mem_rdm_unregister_listener;
+    rdmc->replay_populated = virtio_mem_rdm_replay_populated;
 }
 
 static const TypeInfo virtio_mem_info = {
@@ -969,6 +1212,10 @@ static const TypeInfo virtio_mem_info = {
     .instance_init = virtio_mem_instance_init,
     .class_init = virtio_mem_class_init,
     .class_size = sizeof(VirtIOMEMClass),
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_RAM_DISCARD_MGR },
+        { }
+    },
 };
 
 static void virtio_register_types(void)
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 4eeb82d5dd..9a6e348fa2 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -67,6 +67,9 @@ struct VirtIOMEM {
 
     /* don't migrate unplugged memory */
     NotifierWithReturn precopy_notifier;
+
+    /* listeners to notify on plug/unplug activity. */
+    QLIST_HEAD(, RamDiscardListener) rdl_list;
 };
 
 struct VirtIOMEMClass {
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 05/12] vfio: Support for RamDiscardMgr in the !vIOMMU case
Date: Mon, 22 Feb 2021 12:57:01 +0100
Message-Id: <20210222115708.7623-6-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
Cc: Pankaj Gupta, Wei Yang, "Michael S. Tsirkin", David Hildenbrand, "Dr. David Alan Gilbert", Peter Xu, Auger Eric, Alex Williamson, teawater, Igor Mammedov, Paolo Bonzini, Marek Kedzierski

Implement support for RamDiscardMgr, to prepare for virtio-mem
support. Instead of mapping the whole memory section, we only map
"populated" parts and update the mapping when notified about
discarding/population of memory via the RamDiscardListener. Similarly, when
syncing the dirty bitmaps, sync only the actually mapped (populated) parts
by replaying via the notifier.

Using virtio-mem with vfio is still blocked via
ram_block_discard_disable()/ram_block_discard_require() after
this patch.
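As a rough illustration only (not part of the patch), the range handling
behind "only map populated parts" can be sketched in standalone C: a notified
range is clamped to the section's window, translated to an IOVA, and walked
in granularity-sized chunks. SectionWindow and the helper names below are
hypothetical stand-ins for the VFIORamDiscardListener fields; the real
notifiers issue vfio_dma_map()/vfio_dma_unmap() per range rather than
counting chunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the VFIORamDiscardListener fields used here. */
typedef struct {
    uint64_t offset_within_region;        /* window start inside the region */
    uint64_t offset_within_address_space; /* IOVA of the window start */
    uint64_t size;                        /* window length in bytes */
    uint64_t granularity;                 /* power of two */
} SectionWindow;

static uint64_t min_u64(uint64_t a, uint64_t b) { return a < b ? a : b; }
static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Clamp [offset, offset + size) to the window; false if nothing overlaps. */
static bool clamp_to_window(const SectionWindow *w, uint64_t offset,
                            uint64_t size, uint64_t *start, uint64_t *end)
{
    *start = max_u64(offset, w->offset_within_region);
    *end = min_u64(offset + size, w->offset_within_region + w->size);
    return *start < *end;
}

/* Translate an offset inside the memory region to an IOVA. */
static uint64_t region_to_iova(const SectionWindow *w, uint64_t mr_off)
{
    return mr_off - w->offset_within_region + w->offset_within_address_space;
}

/*
 * Count the chunks a populate notification would be split into. Each chunk
 * ends at the next granularity boundary (aligned within the region) or at
 * the clamped end, whichever comes first.
 */
static unsigned count_map_chunks(const SectionWindow *w, uint64_t offset,
                                 uint64_t size)
{
    uint64_t start, end, next;
    unsigned n = 0;

    assert(w->granularity && !(w->granularity & (w->granularity - 1)));
    if (!clamp_to_window(w, offset, size, &start, &end)) {
        return 0;
    }
    for (; start < end; start = next) {
        /* Same result as ROUND_UP(start + 1, granularity) for powers of 2. */
        next = (start + w->granularity) & ~(w->granularity - 1);
        next = min_u64(next, end);
        n++;
    }
    return n;
}
```

Mapping in minimum-granularity chunks is what allows a later discard of any
single block to be unmapped without splitting a larger mapping.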
Reviewed-by: Alex Williamson
Acked-by: Alex Williamson
Acked-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 hw/vfio/common.c              | 203 ++++++++++++++++++++++++++++++++++
 include/hw/vfio/vfio-common.h |  12 ++
 2 files changed, 215 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6ff1daa763..f68370de6c 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -654,6 +654,139 @@ out:
     rcu_read_unlock();
 }
 
+static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl,
+                                            const MemoryRegion *mr,
+                                            ram_addr_t offset, ram_addr_t size)
+{
+    VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+                                                listener);
+    const hwaddr mr_start = MAX(offset, vrdl->offset_within_region);
+    const hwaddr mr_end = MIN(offset + size,
+                              vrdl->offset_within_region + vrdl->size);
+    const hwaddr iova = mr_start - vrdl->offset_within_region +
+                        vrdl->offset_within_address_space;
+    int ret;
+
+    if (mr_start >= mr_end) {
+        return;
+    }
+
+    /* Unmap with a single call. */
+    ret = vfio_dma_unmap(vrdl->container, iova, mr_end - mr_start, NULL);
+    if (ret) {
+        error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+                     strerror(-ret));
+    }
+}
+
+static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl,
+                                            const MemoryRegion *mr,
+                                            ram_addr_t offset, ram_addr_t size)
+{
+    VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+                                                listener);
+    const hwaddr mr_end = MIN(offset + size,
+                              vrdl->offset_within_region + vrdl->size);
+    hwaddr mr_start = MAX(offset, vrdl->offset_within_region);
+    hwaddr mr_next, iova;
+    void *vaddr;
+    int ret;
+
+    /*
+     * Map in (aligned within memory region) minimum granularity, so we can
+     * unmap in minimum granularity later.
+     */
+    for (; mr_start < mr_end; mr_start = mr_next) {
+        mr_next = ROUND_UP(mr_start + 1, vrdl->granularity);
+        mr_next = MIN(mr_next, mr_end);
+
+        iova = mr_start - vrdl->offset_within_region +
+               vrdl->offset_within_address_space;
+        vaddr = memory_region_get_ram_ptr(vrdl->mr) + mr_start;
+
+        ret = vfio_dma_map(vrdl->container, iova, mr_next - mr_start,
+                           vaddr, mr->readonly);
+        if (ret) {
+            /* Rollback */
+            vfio_ram_discard_notify_discard(rdl, mr, offset, size);
+            return ret;
+        }
+    }
+    return 0;
+}
+
+static void vfio_ram_discard_notify_discard_all(RamDiscardListener *rdl,
+                                                const MemoryRegion *mr)
+{
+    VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+                                                listener);
+    int ret;
+
+    /* Unmap with a single call. */
+    ret = vfio_dma_unmap(vrdl->container, vrdl->offset_within_address_space,
+                         vrdl->size, NULL);
+    if (ret) {
+        error_report("%s: vfio_dma_unmap() failed: %s", __func__,
+                     strerror(-ret));
+    }
+}
+
+static void vfio_register_ram_discard_notifier(VFIOContainer *container,
+                                               MemoryRegionSection *section)
+{
+    RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(section->mr);
+    RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
+    VFIORamDiscardListener *vrdl;
+
+    vrdl = g_new0(VFIORamDiscardListener, 1);
+    vrdl->container = container;
+    vrdl->mr = section->mr;
+    vrdl->offset_within_region = section->offset_within_region;
+    vrdl->offset_within_address_space = section->offset_within_address_space;
+    vrdl->size = int128_get64(section->size);
+    vrdl->granularity = rdmc->get_min_granularity(rdm, section->mr);
+
+    g_assert(vrdl->granularity && is_power_of_2(vrdl->granularity));
+    g_assert(vrdl->granularity >= 1 << ctz64(container->pgsizes));
+
+    /* Ignore some corner cases not relevant in practice. */
+    g_assert(QEMU_IS_ALIGNED(vrdl->offset_within_region, TARGET_PAGE_SIZE));
+    g_assert(QEMU_IS_ALIGNED(vrdl->offset_within_address_space,
+                             TARGET_PAGE_SIZE));
+    g_assert(QEMU_IS_ALIGNED(vrdl->size, TARGET_PAGE_SIZE));
+
+    ram_discard_listener_init(&vrdl->listener,
+                              vfio_ram_discard_notify_populate,
+                              vfio_ram_discard_notify_discard,
+                              vfio_ram_discard_notify_discard_all);
+    rdmc->register_listener(rdm, section->mr, &vrdl->listener);
+    QLIST_INSERT_HEAD(&container->vrdl_list, vrdl, next);
+}
+
+static void vfio_unregister_ram_discard_listener(VFIOContainer *container,
+                                                 MemoryRegionSection *section)
+{
+    RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(section->mr);
+    RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
+    VFIORamDiscardListener *vrdl = NULL;
+
+    QLIST_FOREACH(vrdl, &container->vrdl_list, next) {
+        if (vrdl->mr == section->mr &&
+            vrdl->offset_within_region == section->offset_within_region) {
+            break;
+        }
+    }
+
+    if (!vrdl) {
+        hw_error("vfio: Trying to unregister missing RAM discard listener");
+    }
+
+    rdmc->unregister_listener(rdm, section->mr, &vrdl->listener);
+    QLIST_REMOVE(vrdl, next);
+
+    g_free(vrdl);
+}
+
 static void vfio_listener_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
 {
@@ -814,6 +947,16 @@ static void vfio_listener_region_add(MemoryListener *listener,
 
     /* Here we assume that memory_region_is_ram(section->mr)==true */
 
+    /*
+     * For RAM memory regions with a RamDiscardMgr, we only want to map the
+     * actually populated parts - and update the mapping whenever we're notified
+     * about changes.
+     */
+    if (memory_region_has_ram_discard_mgr(section->mr)) {
+        vfio_register_ram_discard_notifier(container, section);
+        return;
+    }
+
     vaddr = memory_region_get_ram_ptr(section->mr) +
             section->offset_within_region +
             (iova - section->offset_within_address_space);
@@ -950,6 +1093,10 @@ static void vfio_listener_region_del(MemoryListener *listener,
 
         pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1;
         try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask));
+    } else if (memory_region_has_ram_discard_mgr(section->mr)) {
+        vfio_unregister_ram_discard_listener(container, section);
+        /* Unregistering will trigger an unmap. */
+        try_unmap = false;
     }
 
     if (try_unmap) {
@@ -1077,6 +1224,59 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
     rcu_read_unlock();
 }
 
+static int vfio_ram_discard_notify_dirty_bitmap(RamDiscardListener *rdl,
+                                                const MemoryRegion *mr,
+                                                ram_addr_t offset,
+                                                ram_addr_t size)
+{
+    VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener,
+                                                listener);
+    const hwaddr mr_start = MAX(offset, vrdl->offset_within_region);
+    const hwaddr mr_end = MIN(offset + size,
+                              vrdl->offset_within_region + vrdl->size);
+    const hwaddr iova = mr_start - vrdl->offset_within_region +
+                        vrdl->offset_within_address_space;
+    ram_addr_t ram_addr;
+    int ret;
+
+    if (mr_start >= mr_end) {
+        return 0;
+    }
+
+    /*
+     * Sync the whole mapped region (spanning multiple individual mappings)
+     * in one go.
+     */
+    ram_addr = memory_region_get_ram_addr(vrdl->mr) + mr_start;
+    ret = vfio_get_dirty_bitmap(vrdl->container, iova, mr_end - mr_start,
+                                ram_addr);
+    return ret;
+}
+
+static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container,
+                                                   MemoryRegionSection *section)
+{
+    RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(section->mr);
+    RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
+    VFIORamDiscardListener tmp_vrdl, *vrdl = NULL;
+
+    QLIST_FOREACH(vrdl, &container->vrdl_list, next) {
+        if (vrdl->mr == section->mr &&
+            vrdl->offset_within_region == section->offset_within_region) {
+            break;
+        }
+    }
+
+    if (!vrdl) {
+        hw_error("vfio: Trying to sync missing RAM discard listener");
+    }
+
+    tmp_vrdl = *vrdl;
+    ram_discard_listener_init(&tmp_vrdl.listener,
+                              vfio_ram_discard_notify_dirty_bitmap, NULL, NULL);
+    return rdmc->replay_populated(rdm, section->mr, &tmp_vrdl.listener);
+}
+
 static int vfio_sync_dirty_bitmap(VFIOContainer *container,
                                   MemoryRegionSection *section)
 {
@@ -1108,6 +1308,8 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container,
             }
         }
         return 0;
+    } else if (memory_region_has_ram_discard_mgr(section->mr)) {
+        return vfio_sync_ram_discard_listener_dirty_bitmap(container, section);
     }
 
     ram_addr = memory_region_get_ram_addr(section->mr) +
@@ -1737,6 +1939,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     container->dirty_pages_supported = false;
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
+    QLIST_INIT(&container->vrdl_list);
 
     ret = vfio_init_container(container, group->fd, errp);
     if (ret) {
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index 6141162d7a..af6f8d1b22 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -91,6 +91,7 @@ typedef struct VFIOContainer {
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
+    QLIST_HEAD(, VFIORamDiscardListener) vrdl_list;
     QLIST_ENTRY(VFIOContainer) next;
 } VFIOContainer;
 
@@ -102,6 +103,17 @@ typedef struct VFIOGuestIOMMU {
     QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
 } VFIOGuestIOMMU;
 
+typedef struct VFIORamDiscardListener {
+    VFIOContainer *container;
+    MemoryRegion *mr;
+    hwaddr offset_within_region;
+    hwaddr offset_within_address_space;
+    hwaddr size;
+    uint64_t granularity;
+    RamDiscardListener listener;
+    QLIST_ENTRY(VFIORamDiscardListener) next;
+} VFIORamDiscardListener;
+
 typedef struct VFIOHostDMAWindow {
     hwaddr min_iova;
     hwaddr max_iova;
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 06/12] vfio: Query and store the maximum number of possible DMA mappings
Date: Mon, 22 Feb 2021 12:57:02 +0100
Message-Id: <20210222115708.7623-7-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
MIME-Version: 1.0
Cc: Pankaj Gupta, Wei Yang, "Michael S . Tsirkin", David Hildenbrand,
 "Dr . David Alan Gilbert", Peter Xu, Auger Eric, Alex Williamson, teawater,
 Igor Mammedov, Paolo Bonzini, Marek Kedzierski
Content-Type: text/plain; charset="utf-8"

Let's query the maximum number of possible DMA mappings by querying the
available mappings when creating the container (before any mappings are
created). We'll use this information soon to perform some sanity checks
and warn the user.

Reviewed-by: Alex Williamson
Acked-by: Alex Williamson
Acked-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 hw/vfio/common.c              | 4 ++++
 include/hw/vfio/vfio-common.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index f68370de6c..78be813a53 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -1937,6 +1937,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
     container->fd = fd;
     container->error = NULL;
     container->dirty_pages_supported = false;
+    container->dma_max_mappings = 0;
     QLIST_INIT(&container->giommu_list);
     QLIST_INIT(&container->hostwin_list);
     QLIST_INIT(&container->vrdl_list);
@@ -1968,7 +1969,10 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
         vfio_host_win_add(container, 0, (hwaddr)-1, info->iova_pgsizes);
         container->pgsizes = info->iova_pgsizes;
 
+        /* The default in the kernel ("dma_entry_limit") is 65535. */
+        container->dma_max_mappings = 65535;
         if (!ret) {
+            vfio_get_info_dma_avail(info, &container->dma_max_mappings);
             vfio_get_iommu_info_migration(container, info);
         }
         g_free(info);
diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
index af6f8d1b22..4b28c6e8ac 100644
--- a/include/hw/vfio/vfio-common.h
+++ b/include/hw/vfio/vfio-common.h
@@ -88,6 +88,7 @@ typedef struct VFIOContainer {
     uint64_t dirty_pgsizes;
     uint64_t max_dirty_bitmap_size;
     unsigned long pgsizes;
+    unsigned int dma_max_mappings;
     QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
     QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
     QLIST_HEAD(, VFIOGroup) group_list;
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 07/12] vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr
Date: Mon, 22 Feb 2021 12:57:03 +0100
Message-Id: <20210222115708.7623-8-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
MIME-Version: 1.0
Cc: Pankaj Gupta, Wei Yang, "Michael S . Tsirkin", David Hildenbrand,
 "Dr . David Alan Gilbert", Peter Xu, Auger Eric, Alex Williamson, teawater,
 Igor Mammedov, Paolo Bonzini, Marek Kedzierski
Content-Type: text/plain; charset="utf-8"

Although RamDiscardMgr can handle running into the maximum number of
DMA mappings by propagating errors when creating a DMA mapping, we want
to sanity check and warn the user early that there is a theoretical setup
issue and that virtio-mem might not be able to provide as much memory
towards a VM as desired.

As suggested by Alex, let's use the number of KVM memory slots to guess
how many other mappings we might see over time.

Acked-by: Alex Williamson
Reviewed-by: Alex Williamson
Acked-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 hw/vfio/common.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 78be813a53..166ec6ec62 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -761,6 +761,49 @@ static void vfio_register_ram_discard_notifier(VFIOContainer *container,
                               vfio_ram_discard_notify_discard_all);
     rdmc->register_listener(rdm, section->mr, &vrdl->listener);
     QLIST_INSERT_HEAD(&container->vrdl_list, vrdl, next);
+
+    /*
+     * Sanity-check if we have a theoretically problematic setup where we could
+     * exceed the maximum number of possible DMA mappings over time. We assume
+     * that each mapped section in the same address space as a RamDiscardMgr
+     * section consumes exactly one DMA mapping, with the exception of
+     * RamDiscardMgr sections; i.e., we don't expect to have gIOMMU sections in
+     * the same address space as RamDiscardMgr sections.
+     *
+     * We assume that each section in the address space consumes one memslot.
+     * We take the number of KVM memory slots as a best guess for the maximum
+     * number of sections in the address space we could have over time,
+     * also consuming DMA mappings.
+     */
+    if (container->dma_max_mappings) {
+        unsigned int vrdl_count = 0, vrdl_mappings = 0, max_memslots = 512;
+
+#ifdef CONFIG_KVM
+        if (kvm_enabled()) {
+            max_memslots = kvm_get_max_memslots();
+        }
+#endif
+
+        QLIST_FOREACH(vrdl, &container->vrdl_list, next) {
+            hwaddr start, end;
+
+            start = QEMU_ALIGN_DOWN(vrdl->offset_within_address_space,
+                                    vrdl->granularity);
+            end = ROUND_UP(vrdl->offset_within_address_space + vrdl->size,
+                           vrdl->granularity);
+            vrdl_mappings += (end - start) / vrdl->granularity;
+            vrdl_count++;
+        }
+
+        if (vrdl_mappings + max_memslots - vrdl_count >
+            container->dma_max_mappings) {
+            warn_report("%s: possibly running out of DMA mappings. E.g., try"
+                        " increasing the 'block-size' of virtio-mem devices."
+                        " Maximum possible DMA mappings: %d, Maximum possible"
+                        " memslots: %d", __func__, container->dma_max_mappings,
+                        max_memslots);
+        }
+    }
 }
 
 static void vfio_unregister_ram_discard_listener(VFIOContainer *container,
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 08/12] vfio: Support for RamDiscardMgr in the vIOMMU case
Date: Mon, 22 Feb 2021 12:57:04 +0100
Message-Id: <20210222115708.7623-9-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
MIME-Version: 1.0
Cc: Pankaj Gupta, Wei Yang, "Michael S . Tsirkin", David Hildenbrand,
 "Dr . David Alan Gilbert", Peter Xu, Auger Eric, Alex Williamson, teawater,
 Igor Mammedov, Paolo Bonzini, Marek Kedzierski
Content-Type: text/plain; charset="utf-8"

vIOMMU support works already with RamDiscardMgr as long as guests only
map populated memory. Both populated and discarded memory are mapped
into &address_space_memory, where vfio_get_xlat_addr() will find that
memory, to create the vfio mapping.

Sane guests will never map discarded memory (e.g., unplugged memory
blocks in virtio-mem) into an IOMMU - or keep it mapped into an IOMMU while
memory is getting discarded. However, there are two cases where a malicious
guest could trigger pinning of more memory than intended.

One case is easy to handle: the guest trying to map discarded memory
into an IOMMU.

The other case is harder to handle: the guest keeping memory mapped in
the IOMMU while it is getting discarded. We would have to walk over all
mappings when discarding memory and identify if any mapping would be a
violation. Let's keep it simple for now and print a warning, indicating
that setting RLIMIT_MEMLOCK can mitigate such attacks.

We have to take care of incoming migration: at the point the IOMMUs get
restored and start creating mappings in vfio, RamDiscardMgr
implementations might not be back up and running yet. Let's add vmstate
priorities to enforce the order when restoring.

Acked-by: Alex Williamson
Reviewed-by: Alex Williamson
Acked-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 hw/vfio/common.c            | 35 +++++++++++++++++++++++++++++++++++
 hw/virtio/virtio-mem.c      |  1 +
 include/migration/vmstate.h |  1 +
 3 files changed, 37 insertions(+)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 166ec6ec62..15ecd05a4b 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -36,6 +36,7 @@
 #include "qemu/range.h"
 #include "sysemu/kvm.h"
 #include "sysemu/reset.h"
+#include "sysemu/runstate.h"
 #include "trace.h"
 #include "qapi/error.h"
 #include "migration/migration.h"
@@ -574,6 +575,40 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr,
         error_report("iommu map to non memory area %"HWADDR_PRIx"",
                      xlat);
         return false;
+    } else if (memory_region_has_ram_discard_mgr(mr)) {
+        RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(mr);
+        RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);
+
+        /*
+         * Malicious VMs can map memory into the IOMMU, which is expected
+         * to remain discarded. vfio will pin all pages, populating memory.
+         * Disallow that. vmstate priorities make sure any RamDiscardMgr were
+         * already restored before IOMMUs are restored.
+         */
+        if (!rdmc->is_populated(rdm, mr, xlat, len)) {
+            error_report("iommu map to discarded memory (e.g., unplugged via"
+                         " virtio-mem): %"HWADDR_PRIx"",
+                         iotlb->translated_addr);
+            return false;
+        }
+
+        /*
+         * Malicious VMs might trigger discarding of IOMMU-mapped memory. The
+         * pages will remain pinned inside vfio until unmapped, resulting in a
+         * higher memory consumption than expected. If memory would get
+         * populated again later, there would be an inconsistency between pages
+         * pinned by vfio and pages seen by QEMU. This is the case until
+         * unmapped from the IOMMU (e.g., during device reset).
+         *
+         * With malicious guests, we really only care about pinning more memory
+         * than expected. RLIMIT_MEMLOCK set for the user/process can never be
+         * exceeded and can be used to mitigate this problem.
+         */
+        warn_report_once("Using vfio with vIOMMUs and coordinated discarding of"
+                         " RAM (e.g., virtio-mem) works, however, malicious"
+                         " guests can trigger pinning of more memory than"
+                         " intended via an IOMMU. It's possible to mitigate "
+                         " by setting/adjusting RLIMIT_MEMLOCK.");
     }
 
     /*
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 3132e4b2b1..194fb56a9a 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -854,6 +854,7 @@ static const VMStateDescription vmstate_virtio_mem_device = {
     .name = "virtio-mem-device",
     .minimum_version_id = 1,
     .version_id = 1,
+    .priority = MIG_PRI_VIRTIO_MEM,
     .post_load = virtio_mem_post_load,
     .fields = (VMStateField[]) {
         VMSTATE_WITH_TMP(VirtIOMEM, VirtIOMEMMigSanityChecks,
diff --git a/include/migration/vmstate.h b/include/migration/vmstate.h
index 075ee80096..3bf58ff043 100644
--- a/include/migration/vmstate.h
+++ b/include/migration/vmstate.h
@@ -153,6 +153,7 @@ typedef enum {
     MIG_PRI_DEFAULT = 0,
     MIG_PRI_IOMMU,              /* Must happen before PCI devices */
     MIG_PRI_PCI_BUS,            /* Must happen before IOMMU */
+    MIG_PRI_VIRTIO_MEM,         /* Must happen before IOMMU */
     MIG_PRI_GICV3_ITS,          /* Must happen before PCI devices */
     MIG_PRI_GICV3,              /* Must happen before the ITS */
     MIG_PRI_MAX,
-- 
2.29.2

From nobody Sun Feb 8 06:11:37 2026
From: David Hildenbrand
To: qemu-devel@nongnu.org
Subject: [PATCH v6 09/12] softmmu/physmem: Don't use atomic operations in ram_block_discard_(disable|require)
Date: Mon, 22 Feb 2021 12:57:05 +0100
Message-Id: <20210222115708.7623-10-david@redhat.com>
In-Reply-To: <20210222115708.7623-1-david@redhat.com>
References: <20210222115708.7623-1-david@redhat.com>
MIME-Version: 1.0
Cc: Pankaj Gupta, Wei Yang, "Michael S . Tsirkin", David Hildenbrand,
 Alex Williamson, Peter Xu, "Dr . David Alan Gilbert", Auger Eric,
 Pankaj Gupta, teawater, Igor Mammedov, Paolo Bonzini, Marek Kedzierski
Content-Type: text/plain; charset="utf-8"

We have users in migration context that don't hold the BQL (when
finishing migration). To prepare for further changes, use a dedicated mutex
instead of atomic operations. Keep using qatomic_read ("READ_ONCE") for the
functions that only extract the current state (e.g., used by
virtio-balloon); locking isn't necessary there.

While at it, split up the counter into two variables to make it easier
to understand.

Suggested-by: Peter Xu
Reviewed-by: Peter Xu
Reviewed-by: Pankaj Gupta
Acked-by: Michael S. Tsirkin
Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand
---
 softmmu/physmem.c | 70 ++++++++++++++++++++++++++---------------------
 1 file changed, 39 insertions(+), 31 deletions(-)

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 19e0aa9836..6550217c26 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -3677,56 +3677,64 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root)
     }
 }
 
-/*
- * If positive, discarding RAM is disabled. If negative, discarding RAM is
- * required to work and cannot be disabled.
- */
-static int ram_block_discard_disabled;
+static unsigned int ram_block_discard_requirers;
+static unsigned int ram_block_discard_disablers;
+static QemuMutex ram_block_discard_disable_mutex;
+
+static void ram_block_discard_disable_mutex_lock(void)
+{
+    static gsize initialized;
+
+    if (g_once_init_enter(&initialized)) {
+        qemu_mutex_init(&ram_block_discard_disable_mutex);
+        g_once_init_leave(&initialized, 1);
+    }
+    qemu_mutex_lock(&ram_block_discard_disable_mutex);
+}
+
+static void ram_block_discard_disable_mutex_unlock(void)
+{
+    qemu_mutex_unlock(&ram_block_discard_disable_mutex);
+}
 
 int ram_block_discard_disable(bool state)
 {
-    int old;
+    int ret = 0;
 
+    ram_block_discard_disable_mutex_lock();
     if (!state) {
-        qatomic_dec(&ram_block_discard_disabled);
-        return 0;
+        ram_block_discard_disablers--;
+    } else if (!ram_block_discard_requirers) {
+        ram_block_discard_disablers++;
+    } else {
+        ret = -EBUSY;
     }
-
-    do {
-        old = qatomic_read(&ram_block_discard_disabled);
-        if (old < 0) {
-            return -EBUSY;
-        }
-    } while (qatomic_cmpxchg(&ram_block_discard_disabled,
-                             old, old + 1) != old);
-    return 0;
+    ram_block_discard_disable_mutex_unlock();
+    return ret;
 }
 
 int ram_block_discard_require(bool state)
 {
-    int old;
+    int ret = 0;
 
+    ram_block_discard_disable_mutex_lock();
     if (!state) {
-        qatomic_inc(&ram_block_discard_disabled);
-        return 0;
+        ram_block_discard_requirers--;
+    } else if (!ram_block_discard_disablers) {
+        ram_block_discard_requirers++;
+    } else {
+        ret = -EBUSY;
     }
-
-    do {
-        old = qatomic_read(&ram_block_discard_disabled);
-        if (old > 0) {
-            return -EBUSY;
-        }
-    } while (qatomic_cmpxchg(&ram_block_discard_disabled,
-                             old, old - 1) != old);
-    return 0;
+    ram_block_discard_disable_mutex_unlock();
+    return ret;
 }
 
 bool ram_block_discard_is_disabled(void)
 {
-    return qatomic_read(&ram_block_discard_disabled) > 0;
+    return
qatomic_read(&ram_block_discard_disablers); } =20 bool ram_block_discard_is_required(void) { - return qatomic_read(&ram_block_discard_disabled) < 0; + return qatomic_read(&ram_block_discard_requirers); } --=20 2.29.2 From nobody Sun Feb 8 06:11:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1613995853; cv=none; d=zohomail.com; s=zohoarc; b=aE3yGv9pjnqaArBOAMc/VCgMYiYBUepoqQwspa0/eDz5AQTspXKoJSaEAs4mo0MTgaqlmpleEXQLwL3wXwd40H8qC63AkiHcntQJlUlXMIAi9AwSAN+aITqBBh/fl0C6TYPJM6LlPoJVz/DbwP9KSXQCCLUG9l7cpIkvS3LUwm8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1613995853; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=LwHAeY0imacXyN9k+gcTEwRbetXYrarv6niZbi0TXmg=; b=Dw5u1hZbWyKY49T4S+LQOOsjh8CV8mmAFPuxSTqIRxCuAw31gJDrUXdvGr/EvKtc6OaKHrJsS7XbXEu/S0ubTWf+P8DAAY6tTX7WtWJI+SRpuyIANY2wZaWgvRmVS/gq0siWLzQzPCxlp+8BHk1zVy+sLi1k0h86PZF3JNX34TE= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1613995853290452.6499382061063; Mon, 22 Feb 2021 04:10:53 -0800 (PST) Received: from localhost ([::1]:46936 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEA39-0002cP-KD for importer@patchew.org; Mon, 22 Feb 2021 07:10:51 -0500 Received: from eggs.gnu.org 
([2001:470:142:3::10]:41604) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lE9sJ-0008UT-IT for qemu-devel@nongnu.org; Mon, 22 Feb 2021 06:59:39 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:20719) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1lE9sD-0001kj-5p for qemu-devel@nongnu.org; Mon, 22 Feb 2021 06:59:39 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-23-dqk9LODwMiSxaTtf8cqurQ-1; Mon, 22 Feb 2021 06:59:30 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D854F801965; Mon, 22 Feb 2021 11:59:28 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-16.ams2.redhat.com [10.36.115.16]) by smtp.corp.redhat.com (Postfix) with ESMTP id D759919C45; Mon, 22 Feb 2021 11:59:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1613995172; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=LwHAeY0imacXyN9k+gcTEwRbetXYrarv6niZbi0TXmg=; b=h5bw1H8QBA+631YtIshF1tTlrVIgFKqunfnqOxkE2R9vutDFsaGdYjOFmT89MSMkZE8ajg fMQPY5JeCC9GvYSsd/Bw5/fijybuP3ksVS+K4TQukP5mNUXHPMyhhXjAEHlRi2uTJS7GBd swWsWFlOlhwBFPBtbZyd6xrqeHB/4AY= X-MC-Unique: dqk9LODwMiSxaTtf8cqurQ-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v6 10/12] softmmu/physmem: Extend ram_block_discard_(require|disable) by two discard types Date: Mon, 22 Feb 2021 12:57:06 +0100 Message-Id: <20210222115708.7623-11-david@redhat.com> In-Reply-To: 
<20210222115708.7623-1-david@redhat.com> References: <20210222115708.7623-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=216.205.24.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Wei Yang , "Michael S . Tsirkin" , David Hildenbrand , Alex Williamson , Peter Xu , "Dr . David Alan Gilbert" , Auger Eric , Pankaj Gupta , teawater , Igor Mammedov , Paolo Bonzini , Marek Kedzierski Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" We want to separate the two cases whereby we discard RAM: - uncoordinated: e.g., virtio-balloon - coordinated: e.g., virtio-mem coordinated via the RamDiscardMgr Reviewed-by: Pankaj Gupta Acked-by: Michael S. Tsirkin Cc: Paolo Bonzini Cc: "Michael S. Tsirkin" Cc: Alex Williamson Cc: Dr. 
David Alan Gilbert Cc: Igor Mammedov Cc: Pankaj Gupta Cc: Peter Xu Cc: Auger Eric Cc: Wei Yang Cc: teawater Cc: Marek Kedzierski Signed-off-by: David Hildenbrand --- include/exec/memory.h | 18 +++++++++++++-- softmmu/physmem.c | 54 ++++++++++++++++++++++++++++++++++++++----- 2 files changed, 64 insertions(+), 8 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index 6132910767..fa41c1aee8 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -2818,6 +2818,12 @@ static inline MemOp devend_memop(enum device_endian = end) */ int ram_block_discard_disable(bool state); =20 +/* + * See ram_block_discard_disable(): only disable uncoordinated discards, + * keeping coordinated discards (via the RamDiscardMgr) enabled. + */ +int ram_block_uncoordinated_discard_disable(bool state); + /* * Inhibit technologies that disable discarding of pages in RAM blocks. * @@ -2827,12 +2833,20 @@ int ram_block_discard_disable(bool state); int ram_block_discard_require(bool state); =20 /* - * Test if discarding of memory in ram blocks is disabled. + * See ram_block_discard_require(): only inhibit technologies that disable + * uncoordinated discarding of pages in RAM blocks, allowing co-existance = with + * technologies that only inhibit uncoordinated discards (via the + * RamDiscardMgr). + */ +int ram_block_coordinated_discard_require(bool state); + +/* + * Test if any discarding of memory in ram blocks is disabled. */ bool ram_block_discard_is_disabled(void); =20 /* - * Test if discarding of memory in ram blocks is required to work reliably. + * Test if any discarding of memory in ram blocks is required to work reli= ably. */ bool ram_block_discard_is_required(void); =20 diff --git a/softmmu/physmem.c b/softmmu/physmem.c index 6550217c26..19f19ad3a8 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -3677,8 +3677,14 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, M= emoryRegion *root) } } =20 +/* Require any discards to work. 
*/ static unsigned int ram_block_discard_requirers; +/* Require only coordinated discards to work. */ +static unsigned int ram_block_coordinated_discard_requirers; +/* Disable any discards. */ static unsigned int ram_block_discard_disablers; +/* Disable only uncoordinated discards. */ +static unsigned int ram_block_uncoordinated_discard_disablers; static QemuMutex ram_block_discard_disable_mutex; =20 static void ram_block_discard_disable_mutex_lock(void) @@ -3704,10 +3710,27 @@ int ram_block_discard_disable(bool state) ram_block_discard_disable_mutex_lock(); if (!state) { ram_block_discard_disablers--; - } else if (!ram_block_discard_requirers) { - ram_block_discard_disablers++; + } else if (ram_block_discard_requirers || + ram_block_coordinated_discard_requirers) { + ret =3D -EBUSY; } else { + ram_block_discard_disablers++; + } + ram_block_discard_disable_mutex_unlock(); + return ret; +} + +int ram_block_uncoordinated_discard_disable(bool state) +{ + int ret =3D 0; + + ram_block_discard_disable_mutex_lock(); + if (!state) { + ram_block_uncoordinated_discard_disablers--; + } else if (ram_block_discard_requirers) { ret =3D -EBUSY; + } else { + ram_block_uncoordinated_discard_disablers++; } ram_block_discard_disable_mutex_unlock(); return ret; @@ -3720,10 +3743,27 @@ int ram_block_discard_require(bool state) ram_block_discard_disable_mutex_lock(); if (!state) { ram_block_discard_requirers--; - } else if (!ram_block_discard_disablers) { - ram_block_discard_requirers++; + } else if (ram_block_discard_disablers || + ram_block_uncoordinated_discard_disablers) { + ret =3D -EBUSY; } else { + ram_block_discard_requirers++; + } + ram_block_discard_disable_mutex_unlock(); + return ret; +} + +int ram_block_coordinated_discard_require(bool state) +{ + int ret =3D 0; + + ram_block_discard_disable_mutex_lock(); + if (!state) { + ram_block_coordinated_discard_requirers--; + } else if (ram_block_discard_disablers) { ret =3D -EBUSY; + } else { + 
ram_block_coordinated_discard_requirers++; } ram_block_discard_disable_mutex_unlock(); return ret; @@ -3731,10 +3771,12 @@ int ram_block_discard_require(bool state) =20 bool ram_block_discard_is_disabled(void) { - return qatomic_read(&ram_block_discard_disablers); + return qatomic_read(&ram_block_discard_disablers) || + qatomic_read(&ram_block_uncoordinated_discard_disablers); } =20 bool ram_block_discard_is_required(void) { - return qatomic_read(&ram_block_discard_requirers); + return qatomic_read(&ram_block_discard_requirers) || + qatomic_read(&ram_block_coordinated_discard_requirers); } --=20 2.29.2 From nobody Sun Feb 8 06:11:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1613996144; cv=none; d=zohomail.com; s=zohoarc; b=VDbfn3R2wFCQPJJgfgqpQ7EIfcQU6vxODeaFbRpQrZqTuRwSRwzA+Kez+4tdY098ng27Kpb64MQrIjayWayY0xGxSZpuXcLAmYCDraJav1ePl7TBSkTIb47X2CH6lkHYO0VQyn1q0EkySub8mnRTKkdO62kmHfUhyghQQ91YwxI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1613996144; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=Lx/0pGHnw6tViSiAtQUtP/qp5LIjgk3FDjIRhud2lss=; b=jnHcR9l937s0HrwietpJvuRS9J0AJazGllzTwIlT5HNZp62vT++jwKIZ1pOHFi8zj7OmcuEUdrt1xuNThKF7g1HWQrVxe8fibAS7uM7IDEF0K4IPiVYehKZ/4Fle2bbqItWc3n7Rn5aAESeeCy8fprReQ19P8lncl7tEDvXu8Io= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: 
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1613996143884297.5109343189748; Mon, 22 Feb 2021 04:15:43 -0800 (PST) Received: from localhost ([::1]:55816 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEA7q-00071h-Q9 for importer@patchew.org; Mon, 22 Feb 2021 07:15:42 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:41642) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lE9sR-0000MR-35 for qemu-devel@nongnu.org; Mon, 22 Feb 2021 06:59:47 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:25731) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1lE9sP-0001oC-6j for qemu-devel@nongnu.org; Mon, 22 Feb 2021 06:59:46 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-164-njkXwNuJOTatQm_yhbO7iw-1; Mon, 22 Feb 2021 06:59:42 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3581680196C; Mon, 22 Feb 2021 11:59:41 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-16.ams2.redhat.com [10.36.115.16]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3123119C45; Mon, 22 Feb 2021 11:59:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1613995184; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Lx/0pGHnw6tViSiAtQUtP/qp5LIjgk3FDjIRhud2lss=; b=HTUPzMjxfiwAWuWJN8p0hwDBO19nYwIuddhETY/Q2o+lApEnLjxK/gFE9uogY1cHj/TrK1 
IsJ1SxGzRgtLaoLov85ZnnIoBFFHIWEQdZLQIsbGStLZTqJ9ljWeB65uhuMNyuwDFBeBFp h0ffhvKADxgfrhstgnqDa++Tj21w0YA= X-MC-Unique: njkXwNuJOTatQm_yhbO7iw-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v6 11/12] virtio-mem: Require only coordinated discards Date: Mon, 22 Feb 2021 12:57:07 +0100 Message-Id: <20210222115708.7623-12-david@redhat.com> In-Reply-To: <20210222115708.7623-1-david@redhat.com> References: <20210222115708.7623-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=63.128.21.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Wei Yang , "Michael S . Tsirkin" , David Hildenbrand , Alex Williamson , Peter Xu , "Dr . 
David Alan Gilbert" , Auger Eric , Pankaj Gupta , teawater , Igor Mammedov , Paolo Bonzini , Marek Kedzierski Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" We implement the RamDiscardMgr interface and only require coordinated discarding of RAM to work. Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Pankaj Gupta Acked-by: Michael S. Tsirkin Reviewed-by: Michael S. Tsirkin Cc: Paolo Bonzini Cc: "Michael S. Tsirkin" Cc: Alex Williamson Cc: Dr. David Alan Gilbert Cc: Igor Mammedov Cc: Pankaj Gupta Cc: Peter Xu Cc: Auger Eric Cc: Wei Yang Cc: teawater Cc: Marek Kedzierski Signed-off-by: David Hildenbrand --- hw/virtio/virtio-mem.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index 194fb56a9a..9c36e7f96d 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -676,7 +676,7 @@ static void virtio_mem_device_realize(DeviceState *dev,= Error **errp) return; } =20 - if (ram_block_discard_require(true)) { + if (ram_block_coordinated_discard_require(true)) { error_setg(errp, "Discarding RAM is disabled"); return; } @@ -684,7 +684,7 @@ static void virtio_mem_device_realize(DeviceState *dev,= Error **errp) ret =3D ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb)); if (ret) { error_setg_errno(errp, -ret, "Unexpected error discarding RAM"); - ram_block_discard_require(false); + ram_block_coordinated_discard_require(false); return; } =20 @@ -727,7 +727,7 @@ static void virtio_mem_device_unrealize(DeviceState *de= v) virtio_del_queue(vdev, 0); virtio_cleanup(vdev); g_free(vmem->bitmap); - ram_block_discard_require(false); + ram_block_coordinated_discard_require(false); } =20 static int virtio_mem_discard_range_cb(const VirtIOMEM *vmem, void *arg, --=20 2.29.2 From nobody Sun Feb 8 06:11:37 2026 Delivered-To: importer@patchew.org Authentication-Results: 
mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1613995683; cv=none; d=zohomail.com; s=zohoarc; b=kIwZrmPzMOAXtyEc9RCrDQPHVvVpsnXHLavwJx3kxEniticmIRrIYO2wBKKPqroFgq4tN/Dtad+Hv35wtXzWNiwkTc+xGn4k8R6xTrZhYOAbevL6nG5kD1SF3l3KRJi7hf15Bfz0j8u0cBy8KgFYPED2W3Dn4THwV09d7NCS7Zg= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1613995683; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=gFA42tresfvNuT4zr2Dtt7dL+z+012IfjADyVZJTHOI=; b=FlhvAkO26XA/fc376tZE7WBdqg1I5yrrfzBmkfoKZmmjxOEIQwCFDJutE74vRPE4k9cogbpPrIHAf0MykMv6E0qjeF/Lm2DhlmbPa9WD+KaWEO+UmKRduGYI6vRYkVPwT2q5ByZCa9RAscQKd7k44Bd5U8k9SiCNGRHW1LQwt0U= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 161399568355468.35299472017164; Mon, 22 Feb 2021 04:08:03 -0800 (PST) Received: from localhost ([::1]:39276 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEA0Q-0007He-CV for importer@patchew.org; Mon, 22 Feb 2021 07:08:02 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:41674) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lE9sg-0000xA-2A for qemu-devel@nongnu.org; Mon, 22 Feb 2021 07:00:02 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:34123) by eggs.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1lE9sd-0001uP-SQ for qemu-devel@nongnu.org; Mon, 22 Feb 2021 07:00:01 -0500 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-523-IT-_QZWDO0G_qmVsXODE2g-1; Mon, 22 Feb 2021 06:59:57 -0500 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7CD791008303; Mon, 22 Feb 2021 11:59:56 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-16.ams2.redhat.com [10.36.115.16]) by smtp.corp.redhat.com (Postfix) with ESMTP id 82C2119C45; Mon, 22 Feb 2021 11:59:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1613995199; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gFA42tresfvNuT4zr2Dtt7dL+z+012IfjADyVZJTHOI=; b=FiPI2D+RxHmNXlBnCPFANC8/X5GigQ3a7OLWGOFbvClXcp0Yg9+/+dt/ju3RXVBKBOITih ohB8OaA4BbQLHkd7wZNO3kO7xAsAGDzRzQH7uqpjzlXdHJr4RLkokNEt9t1ghPIJfl+cDB 4Uf3dFH2BQXtUO1kcGf6OdCuoJiJkkY= X-MC-Unique: IT-_QZWDO0G_qmVsXODE2g-1 From: David Hildenbrand To: qemu-devel@nongnu.org Subject: [PATCH v6 12/12] vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus Date: Mon, 22 Feb 2021 12:57:08 +0100 Message-Id: <20210222115708.7623-13-david@redhat.com> In-Reply-To: <20210222115708.7623-1-david@redhat.com> References: <20210222115708.7623-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=david@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com 
Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=216.205.24.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Pankaj Gupta , Wei Yang , "Michael S . Tsirkin" , David Hildenbrand , "Dr . David Alan Gilbert" , Peter Xu , Auger Eric , Alex Williamson , teawater , Igor Mammedov , Paolo Bonzini , Marek Kedzierski Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" We support coordinated discarding of RAM using the RamDiscardMgr for the VFIO_TYPE1 iommus. Let's unlock support for coordinated discards, keeping uncoordinated discards (e.g., via virtio-balloon) disabled if possible. This unlocks virtio-mem + vfio on x86-64. Note that vfio used via "nvme://" by the block layer has to be implemented/unlocked separately. For now, virtio-mem only supports x86-64; we don't restrict RamDiscardMgr to x86-64, though: arm64 and s390x are supposed to work as well, and we'll test them once virtio-mem support is unlocked there. The spapr IOMMUs will need special care, to be tackled later, e.g., once supporting virtio-mem. 
Note: The block size of a virtio-mem device has to be set to a sane size, depending on the maximum hotplug size - to not run out of vfio mappings. The default virtio-mem block size is usually in the range of a couple of MiB. The maximum number of mappings is 64k, shared with other users. Assume you want to hotplug 256 GiB using virtio-mem - the block size would have to be set to at least 8 MiB (resulting in 32768 separate mappings). Acked-by: Alex Williamson Reviewed-by: Alex Williamson Acked-by: Michael S. Tsirkin Cc: Paolo Bonzini Cc: "Michael S. Tsirkin" Cc: Alex Williamson Cc: Dr. David Alan Gilbert Cc: Igor Mammedov Cc: Pankaj Gupta Cc: Peter Xu Cc: Auger Eric Cc: Wei Yang Cc: teawater Cc: Marek Kedzierski Signed-off-by: David Hildenbrand --- hw/vfio/common.c | 63 +++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 51 insertions(+), 12 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 15ecd05a4b..d879b8ab92 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -135,6 +135,27 @@ static const char *index_to_str(VFIODevice *vbasedev, = int index) } } =20 +static int vfio_ram_block_discard_disable(VFIOContainer *container, bool s= tate) +{ + switch (container->iommu_type) { + case VFIO_TYPE1v2_IOMMU: + case VFIO_TYPE1_IOMMU: + /* We support coordinated discarding of RAM via the RamDiscardMgr.= */ + return ram_block_uncoordinated_discard_disable(state); + default: + /* + * VFIO_SPAPR_TCE_IOMMU most probably works just fine with + * RamDiscardMgr, however, it is completely untested. + * + * VFIO_SPAPR_TCE_v2_IOMMU with "DMA memory preregistering" does + * completely the opposite of managing mapping/pinning dynamically= as + * required by RamDiscardMgr. We would have to special-case sectio= ns + * with a RamDiscardMgr. 
+ */ + return ram_block_discard_disable(state); + } +} + int vfio_set_irq_signaling(VFIODevice *vbasedev, int index, int subindex, int action, int fd, Error **errp) { @@ -1979,15 +2000,25 @@ static int vfio_connect_container(VFIOGroup *group,= AddressSpace *as, * new memory, it will not yet set ram_block_discard_set_required() and * therefore, neither stops us here or deals with the sudden memory * consumption of inflated memory. + * + * We do support discarding of memory coordinated via the RamDiscardMgr + * with some IOMMU types. vfio_ram_block_discard_disable() handles the + * details once we know which type of IOMMU we are using. */ - ret =3D ram_block_discard_disable(true); - if (ret) { - error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken"= ); - return ret; - } =20 QLIST_FOREACH(container, &space->containers, next) { if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) { + ret =3D vfio_ram_block_discard_disable(container, true); + if (ret) { + error_setg_errno(errp, -ret, + "Cannot set discarding of RAM broken"); + if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, + &container->fd)) { + error_report("vfio: error disconnecting group %d from" + " container", group->groupid); + } + return ret; + } group->container =3D container; QLIST_INSERT_HEAD(&container->group_list, group, container_nex= t); vfio_kvm_device_add_group(group); @@ -2025,6 +2056,12 @@ static int vfio_connect_container(VFIOGroup *group, = AddressSpace *as, goto free_container_exit; } =20 + ret =3D vfio_ram_block_discard_disable(container, true); + if (ret) { + error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken"= ); + goto free_container_exit; + } + switch (container->iommu_type) { case VFIO_TYPE1v2_IOMMU: case VFIO_TYPE1_IOMMU: @@ -2072,7 +2109,7 @@ static int vfio_connect_container(VFIOGroup *group, A= ddressSpace *as, if (ret) { error_setg_errno(errp, errno, "failed to enable container"= ); ret =3D -errno; - goto free_container_exit; + goto 
enable_discards_exit; } } else { container->prereg_listener =3D vfio_prereg_listener; @@ -2084,7 +2121,7 @@ static int vfio_connect_container(VFIOGroup *group, A= ddressSpace *as, ret =3D -1; error_propagate_prepend(errp, container->error, "RAM memory listener initialization failed: "); - goto free_container_exit; + goto enable_discards_exit; } } =20 @@ -2097,7 +2134,7 @@ static int vfio_connect_container(VFIOGroup *group, A= ddressSpace *as, if (v2) { memory_listener_unregister(&container->prereg_listener); } - goto free_container_exit; + goto enable_discards_exit; } =20 if (v2) { @@ -2112,7 +2149,7 @@ static int vfio_connect_container(VFIOGroup *group, A= ddressSpace *as, if (ret) { error_setg_errno(errp, -ret, "failed to remove existing window"); - goto free_container_exit; + goto enable_discards_exit; } } else { /* The default table uses 4K pages */ @@ -2153,6 +2190,9 @@ listener_release_exit: vfio_kvm_device_del_group(group); vfio_listener_release(container); =20 +enable_discards_exit: + vfio_ram_block_discard_disable(container, false); + free_container_exit: g_free(container); =20 @@ -2160,7 +2200,6 @@ close_fd_exit: close(fd); =20 put_space_exit: - ram_block_discard_disable(false); vfio_put_address_space(space); =20 return ret; @@ -2282,7 +2321,7 @@ void vfio_put_group(VFIOGroup *group) } =20 if (!group->ram_block_discard_allowed) { - ram_block_discard_disable(false); + vfio_ram_block_discard_disable(group->container, false); } vfio_kvm_device_del_group(group); vfio_disconnect_container(group); @@ -2336,7 +2375,7 @@ int vfio_get_device(VFIOGroup *group, const char *nam= e, =20 if (!group->ram_block_discard_allowed) { group->ram_block_discard_allowed =3D true; - ram_block_discard_disable(false); + vfio_ram_block_discard_disable(group->container, false); } } =20 --=20 2.29.2
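The block-size note in the commit message of patch 12/12 (a 64k mapping limit shared with other users, and a 256 GB hotplug example read in binary units) can be checked with a short calculation. The following is only a sketch under stated assumptions: both helper names are hypothetical and not part of QEMU, every virtio-mem block is assumed to need its own vfio mapping in the worst case, and block sizes are assumed to be powers of two starting at 1 MiB.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU function: worst-case number of separate
 * vfio mappings if every virtio-mem block ends up mapped on its own.
 */
static uint64_t worst_case_mappings(uint64_t hotplug_bytes,
                                    uint64_t block_bytes)
{
    assert(block_bytes > 0);
    /* Round up so a trailing partial block still counts as one mapping. */
    return (hotplug_bytes + block_bytes - 1) / block_bytes;
}

/*
 * Hypothetical helper: smallest power-of-two block size (starting at 1 MiB,
 * an assumption of this sketch) whose worst-case mapping count fits into
 * the given budget. Returns 0 if the budget is 0.
 */
static uint64_t min_block_size(uint64_t hotplug_bytes, uint64_t budget)
{
    uint64_t block = UINT64_C(1) << 20;

    if (budget == 0) {
        return 0;
    }
    while (worst_case_mappings(hotplug_bytes, block) > budget) {
        block *= 2;
    }
    return block;
}
```

With 256 GiB of hotpluggable memory and 8 MiB blocks, worst_case_mappings() gives the 32768 mappings quoted in the note, i.e. half of the 64k limit, which leaves the other half for other users. Calling min_block_size() with a budget of 32768 returns 8 MiB accordingly.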