From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: Magnus Kulke
Subject: [PULL 29/35] accel/mshv: Handle overlapping mem mappings
Date: Thu, 9 Oct 2025 09:50:19 +0200
Message-ID: <20251009075026.505715-30-pbonzini@redhat.com>
In-Reply-To: <20251009075026.505715-1-pbonzini@redhat.com>
References: <20251009075026.505715-1-pbonzini@redhat.com>

From: Magnus Kulke

QEMU maps certain regions into the guest multiple times, as seen in the
trace below. Currently the MSHV kernel driver will reject those mappings.

To work around this, a record is kept (a static global list of "slots",
inspired by what the HVF accelerator has implemented). An overlapping
region is not registered with the hypervisor and is marked as
mapped=false. If there is an UNMAPPED_GPA exit, we look for a slot that
is unmapped and would cover the GPA. In that case we map out the
conflicting slot and map in the requested region.
mshv_set_phys_mem add=1 name=pc.bios
mshv_map_memory => u_a=7ffff4e00000 gpa=00fffc0000 size=00040000
mshv_set_phys_mem add=1 name=ioapic
mshv_set_phys_mem add=1 name=hpet
mshv_set_phys_mem add=0 name=pc.ram
mshv_unmap_memory u_a=7fff67e00000 gpa=0000000000 size=80000000
mshv_set_phys_mem add=1 name=pc.ram
mshv_map_memory u_a=7fff67e00000 gpa=0000000000 size=000c0000
mshv_set_phys_mem add=1 name=pc.rom
mshv_map_memory u_a=7ffff4c00000 gpa=00000c0000 size=00020000
mshv_set_phys_mem add=1 name=pc.bios
mshv_remap_attempt => u_a=7ffff4e20000 gpa=00000e0000 size=00020000

The mapping table is guarded by a mutex for concurrent modification and
by RCU for concurrent reads. Writes occur rarely, but we have to check
whether an unmapped region exists on each UNMAPPED_GPA exit, which
happens frequently.

Signed-off-by: Magnus Kulke
Link: https://lore.kernel.org/r/20250916164847.77883-24-magnuskulke@linux.microsoft.com
[Fix format strings for trace-events; mshv.h/mshv_int.h split.
- Paolo]
Signed-off-by: Paolo Bonzini
---
 include/system/mshv_int.h   |  22 +-
 accel/mshv/mem.c            | 406 +++++++++++++++++++++++++++++++++---
 accel/mshv/mshv-all.c       |   2 +
 target/i386/mshv/mshv-cpu.c |  43 ++++
 accel/mshv/trace-events     |   5 +
 5 files changed, 448 insertions(+), 30 deletions(-)

diff --git a/include/system/mshv_int.h b/include/system/mshv_int.h
index b29d39911db..6350c69e9d3 100644
--- a/include/system/mshv_int.h
+++ b/include/system/mshv_int.h
@@ -16,6 +16,8 @@
 
 #define MSHV_MSR_ENTRIES_COUNT 64
 
+#define MSHV_MAX_MEM_SLOTS 32
+
 typedef struct hyperv_message hv_message;
 
 struct AccelCPUState {
@@ -33,6 +35,12 @@ typedef struct MshvAddressSpace {
     AddressSpace *as;
 } MshvAddressSpace;
 
+typedef struct MshvMemorySlotManager {
+    size_t n_slots;
+    GList *slots;
+    QemuMutex mutex;
+} MshvMemorySlotManager;
+
 struct MshvState {
     AccelState parent_obj;
     int vm;
@@ -41,6 +49,7 @@ struct MshvState {
     int nr_as;
     MshvAddressSpace *as;
     int fd;
+    MshvMemorySlotManager msm;
 };
 
 typedef struct MshvMsiControl {
@@ -71,6 +80,12 @@ typedef enum MshvVmExit {
     MshvVmExitSpecial = 2,
 } MshvVmExit;
 
+typedef enum MshvRemapResult {
+    MshvRemapOk = 0,
+    MshvRemapNoMapping = 1,
+    MshvRemapNoOverlap = 2,
+} MshvRemapResult;
+
 void mshv_init_mmio_emu(void);
 int mshv_create_vcpu(int vm_fd, uint8_t vp_index, int *cpu_fd);
 void mshv_remove_vcpu(int vm_fd, int cpu_fd);
@@ -94,19 +109,22 @@ int mshv_hvcall(int fd, const struct mshv_root_hvcall *args);
 #endif
 
 /* memory */
-typedef struct MshvMemoryRegion {
+typedef struct MshvMemorySlot {
     uint64_t guest_phys_addr;
     uint64_t memory_size;
     uint64_t userspace_addr;
     bool readonly;
-} MshvMemoryRegion;
+    bool mapped;
+} MshvMemorySlot;
 
+MshvRemapResult mshv_remap_overlap_region(int vm_fd, uint64_t gpa);
 int mshv_guest_mem_read(uint64_t gpa, uint8_t *data, uintptr_t size,
                         bool is_secure_mode, bool instruction_fetch);
 int mshv_guest_mem_write(uint64_t gpa, const uint8_t *data, uintptr_t size,
                          bool is_secure_mode);
 void mshv_set_phys_mem(MshvMemoryListener *mml, MemoryRegionSection *section,
                        bool add);
+void mshv_init_memory_slot_manager(MshvState *mshv_state);
 
 /* msr */
 typedef struct MshvMsrEntry {
diff --git a/accel/mshv/mem.c b/accel/mshv/mem.c
index e55c38d4db9..0e2164af3ee 100644
--- a/accel/mshv/mem.c
+++ b/accel/mshv/mem.c
@@ -11,7 +11,9 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/lockable.h"
 #include "qemu/error-report.h"
+#include "qemu/rcu.h"
 #include "linux/mshv.h"
 #include "system/address-spaces.h"
 #include "system/mshv.h"
@@ -20,6 +22,137 @@
 #include
 #include "trace.h"
 
+typedef struct SlotsRCUReclaim {
+    struct rcu_head rcu;
+    GList *old_head;
+    MshvMemorySlot *removed_slot;
+} SlotsRCUReclaim;
+
+static void rcu_reclaim_slotlist(struct rcu_head *rcu)
+{
+    SlotsRCUReclaim *r = container_of(rcu, SlotsRCUReclaim, rcu);
+    g_list_free(r->old_head);
+    g_free(r->removed_slot);
+    g_free(r);
+}
+
+static void publish_slots(GList *new_head, GList *old_head,
+                          MshvMemorySlot *removed_slot)
+{
+    MshvMemorySlotManager *manager = &mshv_state->msm;
+
+    assert(manager);
+    qatomic_store_release(&manager->slots, new_head);
+
+    SlotsRCUReclaim *r = g_new(SlotsRCUReclaim, 1);
+    r->old_head = old_head;
+    r->removed_slot = removed_slot;
+
+    call_rcu1(&r->rcu, rcu_reclaim_slotlist);
+}
+
+/* Needs to be called with mshv_state->msm.mutex held */
+static int remove_slot(MshvMemorySlot *slot)
+{
+    GList *old_head, *new_head;
+    MshvMemorySlotManager *manager = &mshv_state->msm;
+
+    assert(manager);
+    old_head = qatomic_load_acquire(&manager->slots);
+
+    if (!g_list_find(old_head, slot)) {
+        error_report("slot requested for removal not found");
+        return -1;
+    }
+
+    new_head = g_list_copy(old_head);
+    new_head = g_list_remove(new_head, slot);
+    manager->n_slots--;
+
+    publish_slots(new_head, old_head, slot);
+
+    return 0;
+}
+
+/* Needs to be called with mshv_state->msm.mutex held */
+static MshvMemorySlot *append_slot(uint64_t gpa, uint64_t userspace_addr,
+                                   uint64_t size, bool readonly)
+{
+    GList *old_head, *new_head;
+    MshvMemorySlot *slot;
+    MshvMemorySlotManager *manager = &mshv_state->msm;
+
+    assert(manager);
+
+    old_head = qatomic_load_acquire(&manager->slots);
+
+    if (manager->n_slots >= MSHV_MAX_MEM_SLOTS) {
+        error_report("no free memory slots available");
+        return NULL;
+    }
+
+    slot = g_new0(MshvMemorySlot, 1);
+    slot->guest_phys_addr = gpa;
+    slot->userspace_addr = userspace_addr;
+    slot->memory_size = size;
+    slot->readonly = readonly;
+
+    new_head = g_list_copy(old_head);
+    new_head = g_list_append(new_head, slot);
+    manager->n_slots++;
+
+    publish_slots(new_head, old_head, NULL);
+
+    return slot;
+}
+
+static int slot_overlaps(const MshvMemorySlot *slot1,
+                         const MshvMemorySlot *slot2)
+{
+    uint64_t start_1 = slot1->userspace_addr,
+             start_2 = slot2->userspace_addr;
+    size_t len_1 = slot1->memory_size,
+           len_2 = slot2->memory_size;
+
+    if (slot1 == slot2) {
+        return -1;
+    }
+
+    return ranges_overlap(start_1, len_1, start_2, len_2) ? 0 : -1;
+}
+
+static bool is_mapped(MshvMemorySlot *slot)
+{
+    /* Subsequent reads of mapped field see a fully-initialized slot */
+    return qatomic_load_acquire(&slot->mapped);
+}
+
+/*
+ * Find slot that is:
+ * - overlapping in userspace
+ * - currently mapped in the guest
+ *
+ * Needs to be called with mshv_state->msm.mutex or RCU read lock held.
+ */
+static MshvMemorySlot *find_overlap_mem_slot(GList *head, MshvMemorySlot *slot)
+{
+    GList *found;
+    MshvMemorySlot *overlap_slot;
+
+    found = g_list_find_custom(head, slot, (GCompareFunc) slot_overlaps);
+
+    if (!found) {
+        return NULL;
+    }
+
+    overlap_slot = found->data;
+    if (!overlap_slot || !is_mapped(overlap_slot)) {
+        return NULL;
+    }
+
+    return overlap_slot;
+}
+
 static int set_guest_memory(int vm_fd,
                             const struct mshv_user_mem_region *region)
 {
@@ -27,38 +160,169 @@ static int set_guest_memory(int vm_fd,
 
     ret = ioctl(vm_fd, MSHV_SET_GUEST_MEMORY, region);
     if (ret < 0) {
-        error_report("failed to set guest memory");
-        return -errno;
+        error_report("failed to set guest memory: %s", strerror(errno));
+        return -1;
     }
 
     return 0;
 }
 
-static int map_or_unmap(int vm_fd, const MshvMemoryRegion *mr, bool map)
+static int map_or_unmap(int vm_fd, const MshvMemorySlot *slot, bool map)
 {
     struct mshv_user_mem_region region = {0};
 
-    region.guest_pfn = mr->guest_phys_addr >> MSHV_PAGE_SHIFT;
-    region.size = mr->memory_size;
-    region.userspace_addr = mr->userspace_addr;
+    region.guest_pfn = slot->guest_phys_addr >> MSHV_PAGE_SHIFT;
+    region.size = slot->memory_size;
+    region.userspace_addr = slot->userspace_addr;
 
     if (!map) {
         region.flags |= (1 << MSHV_SET_MEM_BIT_UNMAP);
-        trace_mshv_unmap_memory(mr->userspace_addr, mr->guest_phys_addr,
-                                mr->memory_size);
+        trace_mshv_unmap_memory(slot->userspace_addr, slot->guest_phys_addr,
+                                slot->memory_size);
         return set_guest_memory(vm_fd, &region);
     }
 
     region.flags = BIT(MSHV_SET_MEM_BIT_EXECUTABLE);
-    if (!mr->readonly) {
+    if (!slot->readonly) {
         region.flags |= BIT(MSHV_SET_MEM_BIT_WRITABLE);
     }
 
-    trace_mshv_map_memory(mr->userspace_addr, mr->guest_phys_addr,
-                          mr->memory_size);
+    trace_mshv_map_memory(slot->userspace_addr, slot->guest_phys_addr,
+                          slot->memory_size);
     return set_guest_memory(vm_fd, &region);
 }
 
+static int slot_matches_region(const MshvMemorySlot *slot1,
+                               const MshvMemorySlot *slot2)
+{
+    return (slot1->guest_phys_addr == slot2->guest_phys_addr &&
+            slot1->userspace_addr == slot2->userspace_addr &&
+            slot1->memory_size == slot2->memory_size) ? 0 : -1;
+}
+
+/* Needs to be called with mshv_state->msm.mutex held */
+static MshvMemorySlot *find_mem_slot_by_region(uint64_t gpa, uint64_t size,
+                                               uint64_t userspace_addr)
+{
+    MshvMemorySlot ref_slot = {
+        .guest_phys_addr = gpa,
+        .userspace_addr = userspace_addr,
+        .memory_size = size,
+    };
+    GList *found;
+    MshvMemorySlotManager *manager = &mshv_state->msm;
+
+    assert(manager);
+    found = g_list_find_custom(manager->slots, &ref_slot,
+                               (GCompareFunc) slot_matches_region);
+
+    return found ? found->data : NULL;
+}
+
+static int slot_covers_gpa(const MshvMemorySlot *slot, uint64_t *gpa_p)
+{
+    uint64_t gpa_offset, gpa = *gpa_p;
+
+    gpa_offset = gpa - slot->guest_phys_addr;
+    return (slot->guest_phys_addr <= gpa && gpa_offset < slot->memory_size)
+           ? 0 : -1;
+}
+
+/* Needs to be called with mshv_state->msm.mutex or RCU read lock held */
+static MshvMemorySlot *find_mem_slot_by_gpa(GList *head, uint64_t gpa)
+{
+    GList *found;
+    MshvMemorySlot *slot;
+
+    trace_mshv_find_slot_by_gpa(gpa);
+
+    found = g_list_find_custom(head, &gpa, (GCompareFunc) slot_covers_gpa);
+    if (found) {
+        slot = found->data;
+        trace_mshv_found_slot(slot->userspace_addr, slot->guest_phys_addr,
+                              slot->memory_size);
+        return slot;
+    }
+
+    return NULL;
+}
+
+/* Needs to be called with mshv_state->msm.mutex held */
+static void set_mapped(MshvMemorySlot *slot, bool mapped)
+{
+    /* prior writes to mapped field becomes visible before readers see slot */
+    qatomic_store_release(&slot->mapped, mapped);
+}
+
+MshvRemapResult mshv_remap_overlap_region(int vm_fd, uint64_t gpa)
+{
+    MshvMemorySlot *gpa_slot, *overlap_slot;
+    GList *head;
+    int ret;
+    MshvMemorySlotManager *manager = &mshv_state->msm;
+
+    /* fast path, called often by unmapped_gpa vm exit */
+    WITH_RCU_READ_LOCK_GUARD() {
+        assert(manager);
+        head = qatomic_load_acquire(&manager->slots);
+        /* return early if no slot is found */
+        gpa_slot = find_mem_slot_by_gpa(head, gpa);
+        if (gpa_slot == NULL) {
+            return MshvRemapNoMapping;
+        }
+
+        /* return early if no overlapping slot is found */
+        overlap_slot = find_overlap_mem_slot(head, gpa_slot);
+        if (overlap_slot == NULL) {
+            return MshvRemapNoOverlap;
+        }
+    }
+
+    /*
+     * We'll modify the mapping list, so we need to upgrade to mutex and
+     * recheck.
+     */
+    assert(manager);
+    QEMU_LOCK_GUARD(&manager->mutex);
+
+    /* return early if no slot is found */
+    gpa_slot = find_mem_slot_by_gpa(manager->slots, gpa);
+    if (gpa_slot == NULL) {
+        return MshvRemapNoMapping;
+    }
+
+    /* return early if no overlapping slot is found */
+    overlap_slot = find_overlap_mem_slot(manager->slots, gpa_slot);
+    if (overlap_slot == NULL) {
+        return MshvRemapNoOverlap;
+    }
+
+    /* unmap overlapping slot */
+    ret = map_or_unmap(vm_fd, overlap_slot, false);
+    if (ret < 0) {
+        error_report("failed to unmap overlap region");
+        abort();
+    }
+    set_mapped(overlap_slot, false);
+    warn_report("mapped out userspace_addr=0x%016lx gpa=0x%010lx size=0x%lx",
+                overlap_slot->userspace_addr,
+                overlap_slot->guest_phys_addr,
+                overlap_slot->memory_size);
+
+    /* map region for gpa */
+    ret = map_or_unmap(vm_fd, gpa_slot, true);
+    if (ret < 0) {
+        error_report("failed to map new region");
+        abort();
+    }
+    set_mapped(gpa_slot, true);
+    warn_report("mapped in userspace_addr=0x%016lx gpa=0x%010lx size=0x%lx",
+                gpa_slot->userspace_addr, gpa_slot->guest_phys_addr,
+                gpa_slot->memory_size);
+
+    return MshvRemapOk;
+}
+
 static int handle_unmapped_mmio_region_read(uint64_t gpa, uint64_t size,
                                             uint8_t *data)
 {
@@ -124,20 +388,97 @@ int mshv_guest_mem_write(uint64_t gpa, const uint8_t *data, uintptr_t size,
     return -1;
 }
 
-static int set_memory(const MshvMemoryRegion *mshv_mr, bool add)
+static int tracked_unmap(int vm_fd, uint64_t gpa, uint64_t size,
+                         uint64_t userspace_addr)
 {
-    int ret = 0;
+    int ret;
+    MshvMemorySlot *slot;
+    MshvMemorySlotManager *manager = &mshv_state->msm;
 
-    if (!mshv_mr) {
-        error_report("Invalid mshv_mr");
+    assert(manager);
+
+    QEMU_LOCK_GUARD(&manager->mutex);
+
+    slot = find_mem_slot_by_region(gpa, size, userspace_addr);
+    if (!slot) {
+        trace_mshv_skip_unset_mem(userspace_addr, gpa, size);
+        /* no work to do */
+        return 0;
+    }
+
+    if (!is_mapped(slot)) {
+        /* remove slot, no need to unmap */
+        return remove_slot(slot);
+    }
+
+    ret = map_or_unmap(vm_fd, slot, false);
+    if (ret < 0) {
+        error_report("failed to unmap memory region");
+        return ret;
+    }
+    return remove_slot(slot);
+}
+
+static int tracked_map(int vm_fd, uint64_t gpa, uint64_t size, bool readonly,
+                       uint64_t userspace_addr)
+{
+    MshvMemorySlot *slot, *overlap_slot;
+    int ret;
+    MshvMemorySlotManager *manager = &mshv_state->msm;
+
+    assert(manager);
+
+    QEMU_LOCK_GUARD(&manager->mutex);
+
+    slot = find_mem_slot_by_region(gpa, size, userspace_addr);
+    if (slot) {
+        error_report("memory region already mapped at gpa=0x%lx, "
+                     "userspace_addr=0x%lx, size=0x%lx",
+                     slot->guest_phys_addr, slot->userspace_addr,
+                     slot->memory_size);
         return -1;
     }
 
-    trace_mshv_set_memory(add, mshv_mr->guest_phys_addr,
-                          mshv_mr->memory_size,
-                          mshv_mr->userspace_addr, mshv_mr->readonly,
-                          ret);
-    return map_or_unmap(mshv_state->vm, mshv_mr, add);
+    slot = append_slot(gpa, userspace_addr, size, readonly);
+
+    overlap_slot = find_overlap_mem_slot(manager->slots, slot);
+    if (overlap_slot) {
+        trace_mshv_remap_attempt(slot->userspace_addr,
+                                 slot->guest_phys_addr,
+                                 slot->memory_size);
+        warn_report("attempt to map region [0x%lx-0x%lx], while "
+                    "[0x%lx-0x%lx] is already mapped in the guest",
+                    userspace_addr, userspace_addr + size - 1,
+                    overlap_slot->userspace_addr,
+                    overlap_slot->userspace_addr +
+                    overlap_slot->memory_size - 1);
+
+        /* do not register mem slot in hv, but record for later swap-in */
+        set_mapped(slot, false);
+
+        return 0;
+    }
+
+    ret = map_or_unmap(vm_fd, slot, true);
+    if (ret < 0) {
+        error_report("failed to map memory region");
+        return -1;
+    }
+    set_mapped(slot, true);
+
+    return 0;
+}
+
+static int set_memory(uint64_t gpa, uint64_t size, bool readonly,
+                      uint64_t userspace_addr, bool add)
+{
+    int vm_fd = mshv_state->vm;
+
+    if (add) {
+        return tracked_map(vm_fd, gpa, size, readonly, userspace_addr);
+    }
+
+    return tracked_unmap(vm_fd, gpa, size, userspace_addr);
 }
 
 /*
@@ -173,7 +514,9 @@ void mshv_set_phys_mem(MshvMemoryListener *mml, MemoryRegionSection *section,
     bool writable = !area->readonly && !area->rom_device;
     hwaddr start_addr, mr_offset, size;
     void *ram;
-    MshvMemoryRegion mshv_mr = {0};
+
+    size = align_section(section, &start_addr);
+    trace_mshv_set_phys_mem(add, section->mr->name, start_addr);
 
     size = align_section(section, &start_addr);
     trace_mshv_set_phys_mem(add, section->mr->name, start_addr);
@@ -200,14 +543,21 @@ void mshv_set_phys_mem(MshvMemoryListener *mml, MemoryRegionSection *section,
 
     ram = memory_region_get_ram_ptr(area) + mr_offset;
 
-    mshv_mr.guest_phys_addr = start_addr;
-    mshv_mr.memory_size = size;
-    mshv_mr.readonly = !writable;
-    mshv_mr.userspace_addr = (uint64_t)ram;
-
-    ret = set_memory(&mshv_mr, add);
+    ret = set_memory(start_addr, size, !writable, (uint64_t)ram, add);
     if (ret < 0) {
-        error_report("Failed to set memory region");
+        error_report("failed to set memory region");
         abort();
     }
 }
+
+void mshv_init_memory_slot_manager(MshvState *mshv_state)
+{
+    MshvMemorySlotManager *manager;
+
+    assert(mshv_state);
+    manager = &mshv_state->msm;
+
+    manager->n_slots = 0;
+    manager->slots = NULL;
+    qemu_mutex_init(&manager->mutex);
+}
diff --git a/accel/mshv/mshv-all.c b/accel/mshv/mshv-all.c
index fa1f8f35bdb..5edfcbad9d2 100644
--- a/accel/mshv/mshv-all.c
+++ b/accel/mshv/mshv-all.c
@@ -437,6 +437,8 @@ static int mshv_init(AccelState *as, MachineState *ms)
 
     mshv_init_msicontrol();
 
+    mshv_init_memory_slot_manager(s);
+
     ret = create_vm(mshv_fd, &vm_fd);
     if (ret < 0) {
         close(mshv_fd);
diff --git a/target/i386/mshv/mshv-cpu.c b/target/i386/mshv/mshv-cpu.c
index 7edc032cea3..de87142bffc 100644
--- a/target/i386/mshv/mshv-cpu.c
+++ b/target/i386/mshv/mshv-cpu.c
@@ -1170,6 +1170,43 @@ static int handle_mmio(CPUState *cpu, const struct hyperv_message *msg,
     return 0;
 }
 
+static int handle_unmapped_mem(int vm_fd, CPUState *cpu,
+                               const struct hyperv_message *msg,
+                               MshvVmExit *exit_reason)
+{
+    struct hv_x64_memory_intercept_message info = { 0 };
+    uint64_t gpa;
+    int ret;
+    enum MshvRemapResult remap_result;
+
+    ret = set_memory_info(msg, &info);
+    if (ret < 0) {
+        error_report("failed to convert message to memory info");
+        return -1;
+    }
+
+    gpa = info.guest_physical_address;
+
+    /* attempt to remap the region, in case of overlapping userspace mappings */
+    remap_result = mshv_remap_overlap_region(vm_fd, gpa);
+    *exit_reason = MshvVmExitIgnore;
+
+    switch (remap_result) {
+    case MshvRemapNoMapping:
+        /* if we didn't find a mapping, it is probably mmio */
+        return handle_mmio(cpu, msg, exit_reason);
+    case MshvRemapOk:
+        break;
+    case MshvRemapNoOverlap:
+        /* This should not happen, but we are forgiving it */
+        warn_report("found no overlap for unmapped region");
+        *exit_reason = MshvVmExitSpecial;
+        break;
+    }
+
+    return 0;
+}
+
 static int set_ioport_info(const struct hyperv_message *msg,
                            hv_x64_io_port_intercept_message *info)
 {
@@ -1507,6 +1544,12 @@ int mshv_run_vcpu(int vm_fd, CPUState *cpu, hv_message *msg, MshvVmExit *exit)
     case HVMSG_UNRECOVERABLE_EXCEPTION:
         return MshvVmExitShutdown;
     case HVMSG_UNMAPPED_GPA:
+        ret = handle_unmapped_mem(vm_fd, cpu, msg, &exit_reason);
+        if (ret < 0) {
+            error_report("failed to handle unmapped memory");
+            return -1;
+        }
+        return exit_reason;
     case HVMSG_GPA_INTERCEPT:
         ret = handle_mmio(cpu, msg, &exit_reason);
         if (ret < 0) {
diff --git a/accel/mshv/trace-events b/accel/mshv/trace-events
index a4dffeb24a3..36f0d59b385 100644
--- a/accel/mshv/trace-events
+++ b/accel/mshv/trace-events
@@ -26,3 +26,8 @@ mshv_map_memory(uint64_t userspace_addr, uint64_t gpa, uint64_t size) "\tu_a=0x%
 mshv_unmap_memory(uint64_t userspace_addr, uint64_t gpa, uint64_t size) "\tu_a=0x%" PRIx64 " gpa=0x%010" PRIx64 " size=0x%08" PRIx64
 mshv_set_phys_mem(bool add, const char *name, uint64_t gpa) "\tadd=%d name=%s gpa=0x%010" PRIx64
 mshv_handle_mmio(uint64_t gva, uint64_t gpa, uint64_t size, uint8_t access_type) "\tgva=0x%" PRIx64 " gpa=0x%010" PRIx64 " size=0x%" PRIx64 " access_type=%d"
+
+mshv_found_slot(uint64_t userspace_addr, uint64_t gpa, uint64_t size) "\tu_a=0x%" PRIx64 " gpa=0x%010" PRIx64 " size=0x%08" PRIx64
+mshv_skip_unset_mem(uint64_t userspace_addr, uint64_t gpa, uint64_t size) "\tu_a=0x%" PRIx64 " gpa=0x%010" PRIx64 " size=0x%08" PRIx64
+mshv_remap_attempt(uint64_t userspace_addr, uint64_t gpa, uint64_t size) "\tu_a=0x%" PRIx64 " gpa=0x%010" PRIx64 " size=0x%08" PRIx64
+mshv_find_slot_by_gpa(uint64_t gpa) "\tgpa=0x%010" PRIx64
-- 
2.51.0