From nobody Fri Apr 4 21:31:07 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=intel.com ARC-Seal: i=1; a=rsa-sha256; t=1739953694; cv=none; d=zohomail.com; s=zohoarc; b=NeaNLizkCg7p8qn707Um07ktRaoq9efPsoutiJCtnJBnehdkADgyyVR96LeJveScmuEBArZwqBxwZgetzdpLzZEeHr5TCw8vvVlLXrN+5+Cfr9k0GWWJ2OB3ZS1AIH//mRzeWDbRqzt1XT4d2v6mBNxx4unF3sM5d+zUNSFYUr0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1739953694; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=pTkSgvMGAOgTO9KZRr9dYBP8/x3p3/SwQWsIP7mmIMU=; b=LbE8osqxgafAr8p7AzL1eHwvpGdWLMVm+DdpDpVXdKIQD+5G4Lq4bzp82FLbp58IMGKg1aUbDh8bdlsOj0C9XoM1nkFytQRnuR760/7gGiCOzp4/J98+pmuhkgGKH7ioIGP4xO4QqIIXFRDQ5kn5zSTsGp9LCl/dlWA7oG45+X0= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1739953694872324.0397827846243; Wed, 19 Feb 2025 00:28:14 -0800 (PST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tkfQn-00069w-D1; Wed, 19 Feb 2025 03:27:45 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tkfQL-0005rV-DJ for qemu-devel@nongnu.org; Wed, 19 Feb 2025 03:27:18 -0500 Received: from mgamail.intel.com ([192.198.163.17]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tkfQH-0004eO-Sn for qemu-devel@nongnu.org; Wed, 19 Feb 2025 03:27:17 -0500 Received: from orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa111.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Feb 2025 00:27:09 -0800 Received: from spr-s2600bt.bj.intel.com ([10.240.192.127]) by orviesa005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 19 Feb 2025 00:27:02 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1739953634; x=1771489634; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2sVWgDs4RI7dmwZT9u6bFAhLp8LgsfK8oLChJpWHE7M=; b=bU1Q/InRK6Bo7NDYkB8vY/QaKhfs2BVCvm2DN61DrrUlJ119K7kB4p2I UM6rcy3xZ0kyu3ylYsKtmZytE3cde1X5E+G1MZetJZIiOwK14NV7o4Bnb xyOqUCgBQSo5rrayFFo2z0Gd1673DXadjdvS8crzYQfmZFU6vM0/0NqWl QuO2mLupHYS4w6FBqcpRSDGcG0fP8lLdEGyNfmHJ9XpNAVitcq+bdVWxR 1lJXCDOBwPbX08iPlVgC+u45aYk7QBxQnaOSlv3sqo6e6jYLEDGKX5mh6 X57HtGO2ZMywtlpgfaeFSyMjJxF+lKyIPDnTGWvvTHvWkqB0F3vRmwli+ Q==; X-CSE-ConnectionGUID: w5vPf67HQ+eSy7D8BF9GJQ== X-CSE-MsgGUID: ueEag07OQ+iJaISGGQHfKw== X-IronPort-AV: E=McAfee;i="6700,10204,11348"; a="40544239" X-IronPort-AV: E=Sophos;i="6.13,298,1732608000"; d="scan'208";a="40544239" X-CSE-ConnectionGUID: ObXa9MF/T/2YyiY9TIqGKQ== X-CSE-MsgGUID: kpdNl1mFSP2VAD0WPT3TwQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.12,224,1728975600"; d="scan'208";a="119851265" From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, eric.auger@redhat.com, mst@redhat.com, jasowang@redhat.com, peterx@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, joao.m.martins@oracle.com, clement.mathieu--drif@eviden.com, kevin.tian@intel.com, yi.l.liu@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Yi Sun , Paolo Bonzini , Richard Henderson , Eduardo Habkost , Marcel Apfelbaum Subject: [PATCH rfcv2 13/20] intel_iommu: Add PASID cache management infrastructure Date: Wed, 19 Feb 2025 16:22:21 +0800 Message-Id: <20250219082228.3303163-14-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250219082228.3303163-1-zhenzhong.duan@intel.com> References: <20250219082228.3303163-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=192.198.163.17; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -47 X-Spam_score: -4.8 X-Spam_bar: ---- X-Spam_report: (-4.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.423, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @intel.com) X-ZM-MESSAGEID: 1739953695546019000 Content-Type: text/plain; charset="utf-8" This adds an new entry VTDPASIDCacheEntry in VTDAddressSpace to cache the pasid entry and track PASID usage and future PASID tagged DMA address translation support in vIOMMU. VTDAddressSpace of PCI_NO_PASID is allocated when device is plugged and never freed. For other pasid, VTDAddressSpace instance is created/destroyed per the guest pasid entry set up/destroy for passthrough devices. While for emulated devices, VTDAddressSpace instance is created in the PASID tagged D= MA translation and be destroyed per guest PASID cache invalidation. This focus= es on the PASID cache management for passthrough devices as there is no PASID capable emulated devices yet. When guest modifies a PASID entry, QEMU will capture the guest pasid select= ive pasid cache invalidation, allocate or remove a VTDAddressSpace instance per= the invalidation reasons: *) a present pasid entry moved to non-present *) a present pasid entry to be a present entry *) a non-present pasid entry moved to present vIOMMU emulator could figure out the reason by fetching latest guest pasid = entry and compare it with the PASID cache. Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- hw/i386/intel_iommu_internal.h | 29 ++ include/hw/i386/intel_iommu.h | 6 + hw/i386/intel_iommu.c | 484 ++++++++++++++++++++++++++++++++- hw/i386/trace-events | 4 + 4 files changed, 513 insertions(+), 10 deletions(-) diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h index 18bc22fc72..632fda2853 100644 --- a/hw/i386/intel_iommu_internal.h +++ b/hw/i386/intel_iommu_internal.h @@ -315,6 +315,7 @@ typedef enum VTDFaultReason { * request while disabled */ VTD_FR_IR_SID_ERR =3D 0x26, /* Invalid Source-ID */ =20 + VTD_FR_RTADDR_INV_TTM =3D 0x31, /* Invalid TTM in RTADDR */ /* PASID directory entry access failure */ VTD_FR_PASID_DIR_ACCESS_ERR =3D 0x50, /* The Present(P) field of pasid directory entry is 0 */ @@ -492,6 +493,15 @@ typedef union VTDInvDesc VTDInvDesc; #define VTD_INV_DESC_PIOTLB_RSVD_VAL0 0xfff000000000f1c0ULL #define VTD_INV_DESC_PIOTLB_RSVD_VAL1 0xf80ULL =20 +#define VTD_INV_DESC_PASIDC_G (3ULL << 4) +#define VTD_INV_DESC_PASIDC_PASID(val) (((val) >> 32) & 0xfffffULL) +#define VTD_INV_DESC_PASIDC_DID(val) (((val) >> 16) & VTD_DOMAIN_ID_MASK) +#define VTD_INV_DESC_PASIDC_RSVD_VAL0 0xfff000000000f1c0ULL + +#define VTD_INV_DESC_PASIDC_DSI (0ULL << 4) +#define VTD_INV_DESC_PASIDC_PASID_SI (1ULL << 4) +#define VTD_INV_DESC_PASIDC_GLOBAL (3ULL << 4) + /* Information about page-selective IOTLB invalidate */ struct VTDIOTLBPageInvInfo { uint16_t domain_id; @@ -548,10 +558,28 @@ typedef struct VTDRootEntry VTDRootEntry; #define VTD_CTX_ENTRY_LEGACY_SIZE 16 #define VTD_CTX_ENTRY_SCALABLE_SIZE 32 =20 +#define VTD_SM_CONTEXT_ENTRY_PDTS(val) (((val) >> 9) & 0x7) #define VTD_SM_CONTEXT_ENTRY_RID2PASID_MASK 0xfffff #define VTD_SM_CONTEXT_ENTRY_RSVD_VAL0(aw) (0x1e0ULL | ~VTD_HAW_MASK(aw)) #define VTD_SM_CONTEXT_ENTRY_RSVD_VAL1 0xffffffffffe00000ULL =20 +typedef enum VTDPCInvType { + /* force reset all */ + VTD_PASID_CACHE_FORCE_RESET =3D 0, + /* pasid cache invalidation rely on guest PASID entry */ + VTD_PASID_CACHE_GLOBAL_INV, + VTD_PASID_CACHE_DOMSI, + VTD_PASID_CACHE_PASIDSI, +} VTDPCInvType; + +typedef struct VTDPASIDCacheInfo { + VTDPCInvType type; + uint16_t domain_id; + uint32_t pasid; + PCIBus *bus; + uint16_t devfn; +} VTDPASIDCacheInfo; + /* PASID Table Related Definitions */ #define VTD_PASID_DIR_BASE_ADDR_MASK (~0xfffULL) #define VTD_PASID_TABLE_BASE_ADDR_MASK (~0xfffULL) @@ -563,6 +591,7 @@ typedef struct VTDRootEntry VTDRootEntry; #define VTD_PASID_TABLE_BITS_MASK (0x3fULL) #define VTD_PASID_TABLE_INDEX(pasid) ((pasid) & VTD_PASID_TABLE_BITS_MASK) #define VTD_PASID_ENTRY_FPD (1ULL << 1) /* Fault Processing Disa= ble */ +#define VTD_PASID_TBL_ENTRY_NUM (1ULL << 6) =20 /* PASID Granular Translation Type Mask */ #define VTD_PASID_ENTRY_P 1ULL diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index 50f9b27a45..fbc9da903a 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -95,6 +95,11 @@ struct VTDPASIDEntry { uint64_t val[8]; }; =20 +typedef struct VTDPASIDCacheEntry { + struct VTDPASIDEntry pasid_entry; + bool cache_filled; +} VTDPASIDCacheEntry; + struct VTDAddressSpace { PCIBus *bus; uint8_t devfn; @@ -107,6 +112,7 @@ struct VTDAddressSpace { MemoryRegion iommu_ir_fault; /* Interrupt region for catching fault */ IntelIOMMUState *iommu_state; VTDContextCacheEntry context_cache_entry; + VTDPASIDCacheEntry pasid_cache_entry; QLIST_ENTRY(VTDAddressSpace) next; /* Superset of notifier flags that this address space has */ IOMMUNotifierFlag notifier_flags; diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index fafa199f52..b8f3b85803 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -86,6 +86,8 @@ struct vtd_iotlb_key { static void vtd_address_space_refresh_all(IntelIOMMUState *s); static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n); =20 +static void vtd_pasid_cache_reset(IntelIOMMUState *s); + static void vtd_panic_require_caching_mode(void) { error_report("We need to set caching-mode=3Don for intel-iommu to enab= le " @@ -390,6 +392,7 @@ static void vtd_reset_caches(IntelIOMMUState *s) vtd_iommu_lock(s); vtd_reset_iotlb_locked(s); vtd_reset_context_cache_locked(s); + vtd_pasid_cache_reset(s); vtd_iommu_unlock(s); } =20 @@ -825,6 +828,16 @@ static inline bool vtd_pe_type_check(IntelIOMMUState *= s, VTDPASIDEntry *pe) } } =20 +static inline uint32_t vtd_sm_ce_get_pdt_entry_num(VTDContextEntry *ce) +{ + return 1U << (VTD_SM_CONTEXT_ENTRY_PDTS(ce->val[0]) + 7); +} + +static inline uint16_t vtd_pe_get_domain_id(VTDPASIDEntry *pe) +{ + return VTD_SM_PASID_ENTRY_DID((pe)->val[1]); +} + static inline bool vtd_pdire_present(VTDPASIDDirEntry *pdire) { return pdire->val & 1; @@ -1617,6 +1630,54 @@ static int vtd_as_to_context_entry(VTDAddressSpace *= vtd_as, VTDContextEntry *ce) } } =20 +/* Translate to iommu pasid if PCI_NO_PASID */ +static int vtd_as_to_iommu_pasid(VTDAddressSpace *vtd_as, uint32_t *pasid) +{ + VTDContextEntry ce; + int ret; + + ret =3D vtd_as_to_context_entry(vtd_as, &ce); + if (ret) { + return ret; + } + + if (vtd_as->pasid =3D=3D PCI_NO_PASID) { + *pasid =3D VTD_CE_GET_RID2PASID(&ce); + } else { + *pasid =3D vtd_as->pasid; + } + + return 0; +} + +static gboolean vtd_find_as_by_sid_and_iommu_pasid(gpointer key, gpointer = value, + gpointer user_data) +{ + VTDAddressSpace *vtd_as =3D (VTDAddressSpace *)value; + struct vtd_as_raw_key *target =3D (struct vtd_as_raw_key *)user_data; + uint16_t sid =3D PCI_BUILD_BDF(pci_bus_num(vtd_as->bus), vtd_as->devfn= ); + uint32_t pasid; + + if (vtd_as_to_iommu_pasid(vtd_as, &pasid)) { + return false; + } + + return (pasid =3D=3D target->pasid) && (sid =3D=3D target->sid); +} + +/* Translate iommu pasid to vtd_as */ +static VTDAddressSpace *vtd_as_from_iommu_pasid(IntelIOMMUState *s, + uint16_t sid, uint32_t pas= id) +{ + struct vtd_as_raw_key key =3D { + .sid =3D sid, + .pasid =3D pasid + }; + + return g_hash_table_find(s->vtd_address_spaces, + vtd_find_as_by_sid_and_iommu_pasid, &key); +} + static int vtd_sync_shadow_page_hook(const IOMMUTLBEvent *event, void *private) { @@ -3062,6 +3123,412 @@ static bool vtd_process_piotlb_desc(IntelIOMMUState= *s, return true; } =20 +static inline int vtd_dev_get_pe_from_pasid(VTDAddressSpace *vtd_as, + uint32_t pasid, VTDPASIDEntry = *pe) +{ + IntelIOMMUState *s =3D vtd_as->iommu_state; + VTDContextEntry ce; + int ret; + + if (!s->root_scalable) { + return -VTD_FR_RTADDR_INV_TTM; + } + + ret =3D vtd_as_to_context_entry(vtd_as, &ce); + if (ret) { + return ret; + } + + return vtd_ce_get_pasid_entry(s, &ce, pe, pasid); +} + +static bool vtd_pasid_entry_compare(VTDPASIDEntry *p1, VTDPASIDEntry *p2) +{ + return !memcmp(p1, p2, sizeof(*p1)); +} + +/* + * This function fills in the pasid entry in &vtd_as. Caller + * of this function should hold iommu_lock. + */ +static void vtd_fill_pe_in_cache(IntelIOMMUState *s, VTDAddressSpace *vtd_= as, + VTDPASIDEntry *pe) +{ + VTDPASIDCacheEntry *pc_entry =3D &vtd_as->pasid_cache_entry; + + if (vtd_pasid_entry_compare(pe, &pc_entry->pasid_entry)) { + /* No need to go further as cached pasid entry is latest */ + return; + } + + pc_entry->pasid_entry =3D *pe; + pc_entry->cache_filled =3D true; + /* + * TODO: send pasid bind to host for passthru devices + */ +} + +/* + * This function is used to clear cached pasid entry in vtd_as + * instances. Caller of this function should hold iommu_lock. + */ +static gboolean vtd_flush_pasid(gpointer key, gpointer value, + gpointer user_data) +{ + VTDPASIDCacheInfo *pc_info =3D user_data; + VTDAddressSpace *vtd_as =3D value; + IntelIOMMUState *s =3D vtd_as->iommu_state; + VTDPASIDCacheEntry *pc_entry =3D &vtd_as->pasid_cache_entry; + VTDPASIDEntry pe; + uint16_t did; + uint32_t pasid; + int ret; + + /* Replay only fill pasid entry cache for passthrough device */ + if (!pc_entry->cache_filled) { + return false; + } + did =3D vtd_pe_get_domain_id(&pc_entry->pasid_entry); + + if (vtd_as_to_iommu_pasid(vtd_as, &pasid)) { + goto remove; + } + + switch (pc_info->type) { + case VTD_PASID_CACHE_FORCE_RESET: + goto remove; + case VTD_PASID_CACHE_PASIDSI: + if (pc_info->pasid !=3D pasid) { + return false; + } + /* Fall through */ + case VTD_PASID_CACHE_DOMSI: + if (pc_info->domain_id !=3D did) { + return false; + } + /* Fall through */ + case VTD_PASID_CACHE_GLOBAL_INV: + break; + default: + error_report("invalid pc_info->type"); + abort(); + } + + /* + * pasid cache invalidation may indicate a present pasid + * entry to present pasid entry modification. To cover such + * case, vIOMMU emulator needs to fetch latest guest pasid + * entry and check cached pasid entry, then update pasid + * cache and send pasid bind/unbind to host properly. + */ + ret =3D vtd_dev_get_pe_from_pasid(vtd_as, pasid, &pe); + if (ret) { + /* + * No valid pasid entry in guest memory. e.g. pasid entry + * was modified to be either all-zero or non-present. Either + * case means existing pasid cache should be removed. + */ + goto remove; + } + + vtd_fill_pe_in_cache(s, vtd_as, &pe); + return false; + +remove: + /* + * TODO: send pasid unbind to host for passthru devices + */ + pc_entry->cache_filled =3D false; + + /* + * Don't remove address space of PCI_NO_PASID which is created by PCI + * sub-system. + */ + if (vtd_as->pasid =3D=3D PCI_NO_PASID) { + return false; + } + return true; +} + +/* Caller of this function should hold iommu_lock */ +static void vtd_pasid_cache_reset(IntelIOMMUState *s) +{ + VTDPASIDCacheInfo pc_info; + + trace_vtd_pasid_cache_reset(); + + pc_info.type =3D VTD_PASID_CACHE_FORCE_RESET; + + /* + * Reset pasid cache is a big hammer, so use + * g_hash_table_foreach_remove which will free + * the vtd_as instances. Also, as a big + * hammer, use VTD_PASID_CACHE_FORCE_RESET to + * ensure all the vtd_as instances are + * dropped, meanwhile the change will be passed + * to host if HostIOMMUDeviceIOMMUFD is available. + */ + g_hash_table_foreach_remove(s->vtd_address_spaces, + vtd_flush_pasid, &pc_info); +} + +/* Caller of this function should hold iommu_lock. */ +static void vtd_sm_pasid_table_walk_one(IntelIOMMUState *s, + dma_addr_t pt_base, + int start, + int end, + VTDPASIDCacheInfo *info) +{ + VTDPASIDEntry pe; + int pasid =3D start; + int pasid_next; + + while (pasid < end) { + pasid_next =3D pasid + 1; + + if (!vtd_get_pe_in_pasid_leaf_table(s, pasid, pt_base, &pe) + && vtd_pe_present(&pe)) { + int bus_n =3D pci_bus_num(info->bus), devfn =3D info->devfn; + uint16_t sid =3D PCI_BUILD_BDF(bus_n, devfn); + VTDAddressSpace *vtd_as; + + vtd_as =3D vtd_as_from_iommu_pasid(s, sid, pasid); + if (!vtd_as) { + vtd_as =3D vtd_find_add_as(s, info->bus, devfn, pasid); + } + + if ((info->type =3D=3D VTD_PASID_CACHE_DOMSI || + info->type =3D=3D VTD_PASID_CACHE_PASIDSI) && + !(info->domain_id =3D=3D vtd_pe_get_domain_id(&pe))) { + /* + * VTD_PASID_CACHE_DOMSI and VTD_PASID_CACHE_PASIDSI + * requires domain ID check. If domain Id check fail, + * go to next pasid. + */ + pasid =3D pasid_next; + continue; + } + vtd_fill_pe_in_cache(s, vtd_as, &pe); + } + pasid =3D pasid_next; + } +} + +/* + * Currently, VT-d scalable mode pasid table is a two level table, + * this function aims to loop a range of PASIDs in a given pasid + * table to identify the pasid config in guest. + * Caller of this function should hold iommu_lock. + */ +static void vtd_sm_pasid_table_walk(IntelIOMMUState *s, + dma_addr_t pdt_base, + int start, + int end, + VTDPASIDCacheInfo *info) +{ + VTDPASIDDirEntry pdire; + int pasid =3D start; + int pasid_next; + dma_addr_t pt_base; + + while (pasid < end) { + pasid_next =3D ((end - pasid) > VTD_PASID_TBL_ENTRY_NUM) ? + (pasid + VTD_PASID_TBL_ENTRY_NUM) : end; + if (!vtd_get_pdire_from_pdir_table(pdt_base, pasid, &pdire) + && vtd_pdire_present(&pdire)) { + pt_base =3D pdire.val & VTD_PASID_TABLE_BASE_ADDR_MASK; + vtd_sm_pasid_table_walk_one(s, pt_base, pasid, pasid_next, inf= o); + } + pasid =3D pasid_next; + } +} + +static void vtd_replay_pasid_bind_for_dev(IntelIOMMUState *s, + int start, int end, + VTDPASIDCacheInfo *info) +{ + VTDContextEntry ce; + VTDAddressSpace *vtd_as; + + vtd_as =3D vtd_find_add_as(s, info->bus, info->devfn, PCI_NO_PASID); + + if (!vtd_as_to_context_entry(vtd_as, &ce)) { + uint32_t max_pasid; + + max_pasid =3D vtd_sm_ce_get_pdt_entry_num(&ce) * VTD_PASID_TBL_ENT= RY_NUM; + if (end > max_pasid) { + end =3D max_pasid; + } + vtd_sm_pasid_table_walk(s, + VTD_CE_GET_PASID_DIR_TABLE(&ce), + start, + end, + info); + } +} + +/* + * This function replay the guest pasid bindings to hosts by + * walking the guest PASID table. This ensures host will have + * latest guest pasid bindings. Caller should hold iommu_lock. + */ +static void vtd_replay_guest_pasid_bindings(IntelIOMMUState *s, + VTDPASIDCacheInfo *pc_info) +{ + VTDHostIOMMUDevice *vtd_hiod; + int start =3D 0, end =3D 1; /* only rid2pasid is supported */ + VTDPASIDCacheInfo walk_info; + GHashTableIter as_it; + + switch (pc_info->type) { + case VTD_PASID_CACHE_PASIDSI: + start =3D pc_info->pasid; + end =3D pc_info->pasid + 1; + /* + * PASID selective invalidation is within domain, + * thus fall through. + */ + case VTD_PASID_CACHE_DOMSI: + case VTD_PASID_CACHE_GLOBAL_INV: + /* loop all assigned devices */ + break; + case VTD_PASID_CACHE_FORCE_RESET: + /* For force reset, no need to go further replay */ + return; + default: + error_report("invalid pc_info->type for replay"); + abort(); + } + + /* + * In this replay, only needs to care about the devices which + * are backed by host IOMMU. For such devices, their vtd_hiod + * instances are in the s->vtd_host_iommu_dev. For devices which + * are not backed by host IOMMU, it is not necessary to replay + * the bindings since their cache could be re-created in the future + * DMA address translation. + */ + walk_info =3D *pc_info; + g_hash_table_iter_init(&as_it, s->vtd_host_iommu_dev); + while (g_hash_table_iter_next(&as_it, NULL, (void **)&vtd_hiod)) { + /* bus|devfn fields are not identical with pc_info */ + walk_info.bus =3D vtd_hiod->bus; + walk_info.devfn =3D vtd_hiod->devfn; + vtd_replay_pasid_bind_for_dev(s, start, end, &walk_info); + } +} + +/* + * This function syncs the pasid bindings between guest and host. + * It includes updating the pasid cache in vIOMMU and updating the + * pasid bindings per guest's latest pasid entry presence. + */ +static void vtd_pasid_cache_sync(IntelIOMMUState *s, + VTDPASIDCacheInfo *pc_info) +{ + if (!s->flts || !s->root_scalable || !s->dmar_enabled) { + return; + } + + /* + * Regards to a pasid cache invalidation, e.g. a PSI. + * it could be either cases of below: + * a) a present pasid entry moved to non-present + * b) a present pasid entry to be a present entry + * c) a non-present pasid entry moved to present + * + * Different invalidation granularity may affect different device + * scope and pasid scope. But for each invalidation granularity, + * it needs to do two steps to sync host and guest pasid binding. + * + * Here is the handling of a PSI: + * 1) loop all the existing vtd_as instances to update them + * according to the latest guest pasid entry in pasid table. + * this will make sure affected existing vtd_as instances + * cached the latest pasid entries. Also, during the loop, the + * host should be notified if needed. e.g. pasid unbind or pasid + * update. Should be able to cover case a) and case b). + * + * 2) loop all devices to cover case c) + * - For devices which are backed by HostIOMMUDeviceIOMMUFD instanc= es, + * we loop them and check if guest pasid entry exists. If yes, + * it is case c), we update the pasid cache and also notify + * host. + * - For devices which are not backed by HostIOMMUDeviceIOMMUFD, + * it is not necessary to create pasid cache at this phase since + * it could be created when vIOMMU does DMA address translation. + * This is not yet implemented since there is no emulated + * pasid-capable devices today. If we have such devices in + * future, the pasid cache shall be created there. + * Other granularity follow the same steps, just with different scope + * + */ + + vtd_iommu_lock(s); + /* Step 1: loop all the existing vtd_as instances */ + g_hash_table_foreach_remove(s->vtd_address_spaces, + vtd_flush_pasid, pc_info); + + /* + * Step 2: loop all the existing vtd_hiod instances. + * Ideally, needs to loop all devices to find if there is any new + * PASID binding regards to the PASID cache invalidation request. + * But it is enough to loop the devices which are backed by host + * IOMMU. For devices backed by vIOMMU (a.k.a emulated devices), + * if new PASID happened on them, their vtd_as instance could + * be created during future vIOMMU DMA translation. + */ + vtd_replay_guest_pasid_bindings(s, pc_info); + vtd_iommu_unlock(s); +} + +static bool vtd_process_pasid_desc(IntelIOMMUState *s, + VTDInvDesc *inv_desc) +{ + uint16_t domain_id; + uint32_t pasid; + VTDPASIDCacheInfo pc_info; + uint64_t mask[4] =3D {VTD_INV_DESC_PASIDC_RSVD_VAL0, VTD_INV_DESC_ALL_= ONE, + VTD_INV_DESC_ALL_ONE, VTD_INV_DESC_ALL_ONE}; + + if (!vtd_inv_desc_reserved_check(s, inv_desc, mask, true, + __func__, "pasid cache inv")) { + return false; + } + + domain_id =3D VTD_INV_DESC_PASIDC_DID(inv_desc->val[0]); + pasid =3D VTD_INV_DESC_PASIDC_PASID(inv_desc->val[0]); + + switch (inv_desc->val[0] & VTD_INV_DESC_PASIDC_G) { + case VTD_INV_DESC_PASIDC_DSI: + trace_vtd_pasid_cache_dsi(domain_id); + pc_info.type =3D VTD_PASID_CACHE_DOMSI; + pc_info.domain_id =3D domain_id; + break; + + case VTD_INV_DESC_PASIDC_PASID_SI: + /* PASID selective implies a DID selective */ + trace_vtd_pasid_cache_psi(domain_id, pasid); + pc_info.type =3D VTD_PASID_CACHE_PASIDSI; + pc_info.domain_id =3D domain_id; + pc_info.pasid =3D pasid; + break; + + case VTD_INV_DESC_PASIDC_GLOBAL: + trace_vtd_pasid_cache_gsi(); + pc_info.type =3D VTD_PASID_CACHE_GLOBAL_INV; + break; + + default: + error_report_once("invalid-inv-granu-in-pc_inv_desc hi: 0x%" PRIx64 + " lo: 0x%" PRIx64, inv_desc->val[1], inv_desc->val[0]); + return false; + } + + vtd_pasid_cache_sync(s, &pc_info); + return true; +} + static bool vtd_process_inv_iec_desc(IntelIOMMUState *s, VTDInvDesc *inv_desc) { @@ -3223,6 +3690,13 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s) } break; =20 + case VTD_INV_DESC_PC: + trace_vtd_inv_desc("pasid-cache", inv_desc.val[1], inv_desc.val[0]= ); + if (!vtd_process_pasid_desc(s, &inv_desc)) { + return false; + } + break; + case VTD_INV_DESC_PIOTLB: trace_vtd_inv_desc("p-iotlb", inv_desc.val[1], inv_desc.val[0]); if (!vtd_process_piotlb_desc(s, &inv_desc)) { @@ -3258,16 +3732,6 @@ static bool vtd_process_inv_desc(IntelIOMMUState *s) } break; =20 - /* - * TODO: the entity of below two cases will be implemented in future s= eries. - * To make guest (which integrates scalable mode support patch set in - * iommu driver) work, just return true is enough so far. - */ - case VTD_INV_DESC_PC: - if (s->scalable_mode) { - break; - } - /* fallthrough */ default: error_report_once("%s: invalid inv desc: hi=3D%"PRIx64", lo=3D%"PR= Ix64 " (unknown type)", __func__, inv_desc.hi, diff --git a/hw/i386/trace-events b/hw/i386/trace-events index 53c02d7ac8..a26b38b52c 100644 --- a/hw/i386/trace-events +++ b/hw/i386/trace-events @@ -24,6 +24,10 @@ vtd_inv_qi_head(uint16_t head) "read head %d" vtd_inv_qi_tail(uint16_t head) "write tail %d" vtd_inv_qi_fetch(void) "" vtd_context_cache_reset(void) "" +vtd_pasid_cache_gsi(void) "" +vtd_pasid_cache_reset(void) "" +vtd_pasid_cache_dsi(uint16_t domain) "Domain slective PC invalidation doma= in 0x%"PRIx16 +vtd_pasid_cache_psi(uint16_t domain, uint32_t pasid) "PASID slective PC in= validation domain 0x%"PRIx16" pasid 0x%"PRIx32 vtd_re_not_present(uint8_t bus) "Root entry bus %"PRIu8" not present" vtd_ce_not_present(uint8_t bus, uint8_t devfn) "Context entry bus %"PRIu8"= devfn %"PRIu8" not present" vtd_iotlb_page_hit(uint16_t sid, uint64_t addr, uint64_t slpte, uint16_t d= omain) "IOTLB page hit sid 0x%"PRIx16" iova 0x%"PRIx64" slpte 0x%"PRIx64" d= omain 0x%"PRIx16 --=20 2.34.1