Subject: [PULL 24/32] s390x/pci: Honor DMA limits set by vfio
From: Alex Williamson
To: qemu-devel@nongnu.org
Cc: Cornelia Huck, Matthew Rosato
Date: Mon, 26 Oct 2020 13:35:53 -0600
Message-ID: <160374095335.22414.6845048205460410915.stgit@gimli.home>
In-Reply-To: <160374054442.22414.10832953989449611268.stgit@gimli.home>
References: <160374054442.22414.10832953989449611268.stgit@gimli.home>

From: Matthew Rosato

When an s390 guest is using lazy unmapping, it can result in a very
large number of outstanding DMA requests, far beyond the default
limit configured for vfio.  Let's track DMA usage similar to vfio
in the host, and trigger the guest to flush their DMA mappings
before vfio runs out.

Signed-off-by: Matthew Rosato
Reviewed-by: Cornelia Huck
Signed-off-by: Alex Williamson
---
 hw/s390x/s390-pci-bus.c          | 16 +++++++++-----
 hw/s390x/s390-pci-inst.c         | 45 +++++++++++++++++++++++++++++++++-----
 hw/s390x/s390-pci-vfio.c         | 42 +++++++++++++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h  |  9 ++++++++
 include/hw/s390x/s390-pci-inst.h |  3 +++
 include/hw/s390x/s390-pci-vfio.h |  5 ++++
 6 files changed, 109 insertions(+), 11 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index a929340688cc..218717397ae1 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -17,6 +17,7 @@
 #include "cpu.h"
 #include "hw/s390x/s390-pci-bus.h"
 #include "hw/s390x/s390-pci-inst.h"
+#include "hw/s390x/s390-pci-vfio.h"
 #include "hw/pci/pci_bus.h"
 #include "hw/qdev-properties.h"
 #include "hw/pci/pci_bridge.h"
@@ -764,6 +765,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
     s->bus_no = 0;
     QTAILQ_INIT(&s->pending_sei);
     QTAILQ_INIT(&s->zpci_devs);
+    QTAILQ_INIT(&s->zpci_dma_limit);
 
     css_register_io_adapters(CSS_IO_ADAPTER_PCI, true, false,
                              S390_ADAPTER_SUPPRESSIBLE, errp);
@@ -941,17 +943,18 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             }
         }
 
+        pbdev->pdev = pdev;
+        pbdev->iommu = s390_pci_get_iommu(s, pci_get_bus(pdev), pdev->devfn);
+        pbdev->iommu->pbdev = pbdev;
+        pbdev->state = ZPCI_FS_DISABLED;
+
         if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
             pbdev->fh |= FH_SHM_VFIO;
+            pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
         } else {
             pbdev->fh |= FH_SHM_EMUL;
         }
 
-        pbdev->pdev = pdev;
-        pbdev->iommu = s390_pci_get_iommu(s, pci_get_bus(pdev), pdev->devfn);
-        pbdev->iommu->pbdev = pbdev;
-        pbdev->state = ZPCI_FS_DISABLED;
-
         if (s390_pci_msix_init(pbdev)) {
             error_setg(errp, "MSI-X support is mandatory "
                        "in the S390 architecture");
@@ -1004,6 +1007,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
         pbdev->fid = 0;
         QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
         g_hash_table_remove(s->zpci_table, &pbdev->idx);
+        if (pbdev->iommu->dma_limit) {
+            s390_pci_end_dma_count(s, pbdev->iommu->dma_limit);
+        }
         qdev_unrealize(dev);
     }
 }
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 639b13c8d626..4eadd9e79416 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -32,6 +32,20 @@
         }                                                          \
     } while (0)
 
+static inline void inc_dma_avail(S390PCIIOMMU *iommu)
+{
+    if (iommu->dma_limit) {
+        iommu->dma_limit->avail++;
+    }
+}
+
+static inline void dec_dma_avail(S390PCIIOMMU *iommu)
+{
+    if (iommu->dma_limit) {
+        iommu->dma_limit->avail--;
+    }
+}
+
 static void s390_set_status_code(CPUS390XState *env,
                                  uint8_t r, uint64_t status_code)
 {
@@ -572,7 +586,8 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
     return 0;
 }
 
-static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
+static uint32_t s390_pci_update_iotlb(S390PCIIOMMU *iommu,
+                                      S390IOTLBEntry *entry)
 {
     S390IOTLBEntry *cache = g_hash_table_lookup(iommu->iotlb, &entry->iova);
     IOMMUTLBEntry notify = {
@@ -585,14 +600,15 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
 
     if (entry->perm == IOMMU_NONE) {
         if (!cache) {
-            return;
+            goto out;
         }
         g_hash_table_remove(iommu->iotlb, &entry->iova);
+        inc_dma_avail(iommu);
     } else {
         if (cache) {
             if (cache->perm == entry->perm &&
                 cache->translated_addr == entry->translated_addr) {
-                return;
+                goto out;
             }
 
             notify.perm = IOMMU_NONE;
@@ -606,9 +622,13 @@ static void s390_pci_update_iotlb(S390PCIIOMMU *iommu, S390IOTLBEntry *entry)
         cache->len = PAGE_SIZE;
         cache->perm = entry->perm;
         g_hash_table_replace(iommu->iotlb, &cache->iova, cache);
+        dec_dma_avail(iommu);
     }
 
     memory_region_notify_iommu(&iommu->iommu_mr, 0, notify);
+
+out:
+    return iommu->dma_limit ? iommu->dma_limit->avail : 1;
 }
 
 int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
@@ -620,6 +640,7 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
     S390PCIIOMMU *iommu;
     S390IOTLBEntry entry;
     hwaddr start, end;
+    uint32_t dma_avail;
 
     if (env->psw.mask & PSW_MASK_PSTATE) {
         s390_program_interrupt(env, PGM_PRIVILEGED, ra);
@@ -658,6 +679,11 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
     }
 
     iommu = pbdev->iommu;
+    if (iommu->dma_limit) {
+        dma_avail = iommu->dma_limit->avail;
+    } else {
+        dma_avail = 1;
+    }
     if (!iommu->g_iota) {
         error = ERR_EVENT_INVALAS;
         goto err;
@@ -675,8 +701,9 @@ int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
         }
 
         start += entry.len;
-        while (entry.iova < start && entry.iova < end) {
-            s390_pci_update_iotlb(iommu, &entry);
+        while (entry.iova < start && entry.iova < end &&
+               (dma_avail > 0 || entry.perm == IOMMU_NONE)) {
+            dma_avail = s390_pci_update_iotlb(iommu, &entry);
             entry.iova += PAGE_SIZE;
             entry.translated_addr += PAGE_SIZE;
         }
@@ -689,7 +716,13 @@ err:
         s390_pci_generate_error_event(error, pbdev->fh, pbdev->fid, start, 0);
     } else {
         pbdev->fmb.counter[ZPCI_FMB_CNT_RPCIT]++;
-        setcc(cpu, ZPCI_PCI_LS_OK);
+        if (dma_avail > 0) {
+            setcc(cpu, ZPCI_PCI_LS_OK);
+        } else {
+            /* vfio DMA mappings are exhausted, trigger a RPCIT */
+            setcc(cpu, ZPCI_PCI_LS_ERR);
+            s390_set_status_code(env, r1, ZPCI_RPCIT_ST_INSUFF_RES);
+        }
     }
     return 0;
 }
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index cb3f4d98adf8..0621fa386ced 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -12,7 +12,9 @@
 #include
 
 #include "qemu/osdep.h"
+#include "hw/s390x/s390-pci-bus.h"
 #include "hw/s390x/s390-pci-vfio.h"
+#include "hw/vfio/pci.h"
 #include "hw/vfio/vfio-common.h"
 
 /*
@@ -52,3 +54,43 @@ retry:
     return vfio_get_info_dma_avail(info, avail);
 }
 
+S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
+                                          S390PCIBusDevice *pbdev)
+{
+    S390PCIDMACount *cnt;
+    uint32_t avail;
+    VFIOPCIDevice *vpdev = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+    int id;
+
+    assert(vpdev);
+
+    id = vpdev->vbasedev.group->container->fd;
+
+    if (!s390_pci_update_dma_avail(id, &avail)) {
+        return NULL;
+    }
+
+    QTAILQ_FOREACH(cnt, &s->zpci_dma_limit, link) {
+        if (cnt->id == id) {
+            cnt->users++;
+            return cnt;
+        }
+    }
+
+    cnt = g_new0(S390PCIDMACount, 1);
+    cnt->id = id;
+    cnt->users = 1;
+    cnt->avail = avail;
+    QTAILQ_INSERT_TAIL(&s->zpci_dma_limit, cnt, link);
+    return cnt;
+}
+
+void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
+{
+    assert(cnt);
+
+    cnt->users--;
+    if (cnt->users == 0) {
+        QTAILQ_REMOVE(&s->zpci_dma_limit, cnt, link);
+    }
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 97464d0ad33e..6a35f1365bec 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -262,6 +262,13 @@ typedef struct S390IOTLBEntry {
     uint64_t perm;
 } S390IOTLBEntry;
 
+typedef struct S390PCIDMACount {
+    int id;
+    int users;
+    uint32_t avail;
+    QTAILQ_ENTRY(S390PCIDMACount) link;
+} S390PCIDMACount;
+
 struct S390PCIIOMMU {
     Object parent_obj;
     S390PCIBusDevice *pbdev;
@@ -273,6 +280,7 @@ struct S390PCIIOMMU {
     uint64_t pba;
     uint64_t pal;
     GHashTable *iotlb;
+    S390PCIDMACount *dma_limit;
 };
 
 typedef struct S390PCIIOMMUTable {
@@ -348,6 +356,7 @@ struct S390pciState {
     GHashTable *zpci_table;
     QTAILQ_HEAD(, SeiContainer) pending_sei;
     QTAILQ_HEAD(, S390PCIBusDevice) zpci_devs;
+    QTAILQ_HEAD(, S390PCIDMACount) zpci_dma_limit;
 };
 
 S390pciState *s390_get_phb(void);
diff --git a/include/hw/s390x/s390-pci-inst.h b/include/hw/s390x/s390-pci-inst.h
index fa3bf8b5aad1..8ee3a3c23757 100644
--- a/include/hw/s390x/s390-pci-inst.h
+++ b/include/hw/s390x/s390-pci-inst.h
@@ -254,6 +254,9 @@ typedef struct ClpReqRspQueryPciGrp {
 #define ZPCI_STPCIFC_ST_INVAL_DMAAS   28
 #define ZPCI_STPCIFC_ST_ERROR_RECOVER 40
 
+/* Refresh PCI Translations status codes */
+#define ZPCI_RPCIT_ST_INSUFF_RES      16
+
 /* FIB function controls */
 #define ZPCI_FIB_FC_ENABLED     0x80
 #define ZPCI_FIB_FC_ERROR       0x40
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index 2a5a261cc9b1..96137839a0ef 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -12,6 +12,11 @@
 #ifndef HW_S390_PCI_VFIO_H
 #define HW_S390_PCI_VFIO_H
 
+#include "hw/s390x/s390-pci-bus.h"
+
 bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
+S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
+                                          S390PCIBusDevice *pbdev);
+void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
 
 #endif