From nobody Tue Apr 7 17:34:02 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A319C43217 for ; Wed, 9 Nov 2022 14:29:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231470AbiKIO3a (ORCPT ); Wed, 9 Nov 2022 09:29:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42608 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231204AbiKIO3Y (ORCPT ); Wed, 9 Nov 2022 09:29:24 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E4B110D2; Wed, 9 Nov 2022 06:29:23 -0800 (PST) Received: from pps.filterd (m0187473.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9ELUJo032113; Wed, 9 Nov 2022 14:29:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=HK5YMddnDvqd/bZDE7MU98+xJJ5Z4s1xsKhTNKbl0qQ=; b=JMFJLXehiQOpgZWnnKNpZIRGhVwUN8evLh2vbRFlE0SN1PqfUwIxAabPpSmelSLEmVX1 UzOafTwVUUEKGAzemNSoAdEZa/ENOcDScECTNf9qZlkiNYG6ZsPkwOkNyt2wVq1MmFJ2 whHg5oBCzHUPS1WlrzXOEgjBE0O2D3QtZ8fe6bVrqLGF0Qq2dVU33c0XbaBnXO4U0w90 xxS6za/w3seXvuh7YCJtwel2V82R+cCC5IimSvxodQKEqvxirZPbHj5Yh0MHhMjUeJLj LsSG+V0VVaiqMkvEyp2714knBLGSW7/Ba55FaeF93QwNdFOFlMJR012/aU/gpnY5bP17 TA== Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3krdw0g5yp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:10 +0000 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 
2A9ELLck013876; Wed, 9 Nov 2022 14:29:08 GMT Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by ppma04ams.nl.ibm.com with ESMTP id 3kngqddsb3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:08 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2A9ET5DW2949794 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 9 Nov 2022 14:29:05 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1DB11A4040; Wed, 9 Nov 2022 14:29:05 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A24DAA4053; Wed, 9 Nov 2022 14:29:04 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 9 Nov 2022 14:29:04 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe Cc: Gerd Bayer , Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 1/5] iommu/s390: Make attach succeed even if the device is in error state Date: Wed, 9 Nov 2022 15:28:59 +0100 Message-Id: <20221109142903.4080275-2-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221109142903.4080275-1-schnelle@linux.ibm.com> References: <20221109142903.4080275-1-schnelle@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Aekzp2NvoY9E3rUQitHJqoNwjME_XNoR X-Proofpoint-GUID: Aekzp2NvoY9E3rUQitHJqoNwjME_XNoR X-Proofpoint-Virus-Version: vendor=baseguard 
engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-09_06,2022-11-09_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 malwarescore=0 spamscore=0 phishscore=0 clxscore=1015 mlxlogscore=999 suspectscore=0 priorityscore=1501 impostorscore=0 mlxscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211090107 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" If a zPCI device is in the error state while switching IOMMU domains zpci_register_ioat() will fail and we would end up with the device not attached to any domain. In this state since zdev->dma_table =3D=3D NULL a reset via zpci_hot_reset_device() would wrongfully re-initialize the device for DMA API usage using zpci_dma_init_device(). As automatic recovery is currently disabled while attached to an IOMMU domain this only affects slot resets triggered through other means but will affect automatic recovery once we switch to using dma-iommu. Additionally with that switch common code expects attaching to the default domain to always work so zpci_register_ioat() should only fail if there is no chance to recover anyway, e.g. if the device has been unplugged. Improve the robustness of attach by specifically looking at the status returned by zpci_mod_fc() to determine if the device is unavailable and in this case simply ignore the error. Once the device is reset zpci_hot_reset_device() will then correctly set the domain's DMA translation tables. 
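As a rough illustration of the attach logic described above, the decision can be modelled in user space as follows. This is only a sketch and not the kernel code: sketch_attach_result() and SKETCH_ST_FUNC_NOT_AVAIL are hypothetical names, and the numeric status value is a placeholder standing in for ZPCI_PCI_ST_FUNC_NOT_AVAIL.

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder value; stands in for ZPCI_PCI_ST_FUNC_NOT_AVAIL (assumption). */
#define SKETCH_ST_FUNC_NOT_AVAIL 4

/*
 * Model of the attach decision: cc is the condition code returned by the
 * register-IOAT call, status is the status byte it filled in.
 * A non-zero cc is only treated as fatal if the device is NOT merely
 * unavailable, since a later reset is expected to re-establish the domain.
 */
static int sketch_attach_result(uint8_t cc, uint8_t status)
{
	if (cc && status != SKETCH_ST_FUNC_NOT_AVAIL)
		return -EIO;
	return 0;
}
```

With this model a successful registration and a registration that fails because the function is not available both count as a successful attach, while any other failure is still reported as -EIO.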
Signed-off-by: Niklas Schnelle Reviewed-by: Matthew Rosato --- arch/s390/include/asm/pci.h | 2 +- arch/s390/kvm/pci.c | 6 ++++-- arch/s390/pci/pci.c | 11 ++++++----- arch/s390/pci/pci_dma.c | 3 ++- drivers/iommu/s390-iommu.c | 9 +++++++-- 5 files changed, 20 insertions(+), 11 deletions(-) diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index 15f8714ca9b7..07361e2fd8c5 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -221,7 +221,7 @@ void zpci_device_reserved(struct zpci_dev *zdev); bool zpci_is_device_configured(struct zpci_dev *zdev); =20 int zpci_hot_reset_device(struct zpci_dev *zdev); -int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64); +int zpci_register_ioat(struct zpci_dev *, u8, u64, u64, u64, u8 *); int zpci_unregister_ioat(struct zpci_dev *, u8); void zpci_remove_reserved_devices(void); void zpci_update_fh(struct zpci_dev *zdev, u32 fh); diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c index c50c1645c0ae..03964c0e1fdf 100644 --- a/arch/s390/kvm/pci.c +++ b/arch/s390/kvm/pci.c @@ -434,6 +434,7 @@ static void kvm_s390_pci_dev_release(struct zpci_dev *z= dev) static int kvm_s390_pci_register_kvm(void *opaque, struct kvm *kvm) { struct zpci_dev *zdev =3D opaque; + u8 status; int rc; =20 if (!zdev) @@ -486,7 +487,7 @@ static int kvm_s390_pci_register_kvm(void *opaque, stru= ct kvm *kvm) =20 /* Re-register the IOMMU that was already created */ rc =3D zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, - virt_to_phys(zdev->dma_table)); + virt_to_phys(zdev->dma_table), &status); if (rc) goto clear_gisa; =20 @@ -516,6 +517,7 @@ static void kvm_s390_pci_unregister_kvm(void *opaque) { struct zpci_dev *zdev =3D opaque; struct kvm *kvm; + u8 status; =20 if (!zdev) return; @@ -554,7 +556,7 @@ static void kvm_s390_pci_unregister_kvm(void *opaque) =20 /* Re-register the IOMMU that was already created */ zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, - virt_to_phys(zdev->dma_table)); 
+ virt_to_phys(zdev->dma_table), &status); =20 out: spin_lock(&kvm->arch.kzdev_list_lock); diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index 73cdc5539384..a703dcd94a68 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -116,20 +116,20 @@ EXPORT_SYMBOL_GPL(pci_proc_domain); =20 /* Modify PCI: Register I/O address translation parameters */ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas, - u64 base, u64 limit, u64 iota) + u64 base, u64 limit, u64 iota, u8 *status) { u64 req =3D ZPCI_CREATE_REQ(zdev->fh, dmaas, ZPCI_MOD_FC_REG_IOAT); struct zpci_fib fib =3D {0}; - u8 cc, status; + u8 cc; =20 WARN_ON_ONCE(iota & 0x3fff); fib.pba =3D base; fib.pal =3D limit; fib.iota =3D iota | ZPCI_IOTA_RTTO_FLAG; fib.gd =3D zdev->gisa; - cc =3D zpci_mod_fc(req, &fib, &status); + cc =3D zpci_mod_fc(req, &fib, status); if (cc) - zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status= ); + zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, *statu= s); return cc; } EXPORT_SYMBOL_GPL(zpci_register_ioat); @@ -764,6 +764,7 @@ EXPORT_SYMBOL_GPL(zpci_disable_device); */ int zpci_hot_reset_device(struct zpci_dev *zdev) { + u8 status; int rc; =20 zpci_dbg(3, "rst fid:%x, fh:%x\n", zdev->fid, zdev->fh); @@ -787,7 +788,7 @@ int zpci_hot_reset_device(struct zpci_dev *zdev) =20 if (zdev->dma_table) rc =3D zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, - virt_to_phys(zdev->dma_table)); + virt_to_phys(zdev->dma_table), &status); else rc =3D zpci_dma_init_device(zdev); if (rc) { diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c index 227cf0a62800..dee825ee7305 100644 --- a/arch/s390/pci/pci_dma.c +++ b/arch/s390/pci/pci_dma.c @@ -547,6 +547,7 @@ static void s390_dma_unmap_sg(struct device *dev, struc= t scatterlist *sg, =09 int zpci_dma_init_device(struct zpci_dev *zdev) { + u8 status; int rc; =20 /* @@ -598,7 +599,7 @@ int zpci_dma_init_device(struct zpci_dev *zdev) =20 } if (zpci_register_ioat(zdev, 0, 
zdev->start_dma, zdev->end_dma, - virt_to_phys(zdev->dma_table))) { + virt_to_phys(zdev->dma_table), &status)) { rc =3D -EIO; goto free_bitmap; } diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index 7fb512bece9a..e2c886bc4376 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -98,6 +98,7 @@ static int s390_iommu_attach_device(struct iommu_domain *= domain, struct s390_domain *s390_domain =3D to_s390_domain(domain); struct zpci_dev *zdev =3D to_zpci_dev(dev); unsigned long flags; + u8 status; int cc; =20 if (!zdev) @@ -113,8 +114,12 @@ static int s390_iommu_attach_device(struct iommu_domai= n *domain, zpci_dma_exit_device(zdev); =20 cc =3D zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma, - virt_to_phys(s390_domain->dma_table)); - if (cc) + virt_to_phys(s390_domain->dma_table), &status); + /* + * If the device is undergoing error recovery the reset code + * will re-establish the new domain. + */ + if (cc && status !=3D ZPCI_PCI_ST_FUNC_NOT_AVAIL) return -EIO; zdev->dma_table =3D s390_domain->dma_table; =20 --=20 2.34.1 From nobody Tue Apr 7 17:34:02 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CACB5C4332F for ; Wed, 9 Nov 2022 14:29:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231315AbiKIO3h (ORCPT ); Wed, 9 Nov 2022 09:29:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42684 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231396AbiKIO32 (ORCPT ); Wed, 9 Nov 2022 09:29:28 -0500 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 82D7B140C9; Wed, 9 Nov 2022 06:29:27 -0800 (PST) Received: from pps.filterd (m0127361.ppops.net 
[127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9Ds3hl026270; Wed, 9 Nov 2022 14:29:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=4kTuTBw92c/DytV/q7UcPXb4WIEHFMdb7d5jYFs1+yo=; b=NDB4UIwFj0VrrS3jsjV9Wq5FweZYFRlvToeLdcalhV5W3kY512T+3wUXahEWOFKrkYwD QXDe+I0PW2+AUsvRXkZvCw23hhT1TpnxEGwY8RPcrWofreeZwmrErPgYBHzITKHYLkdH u6G2nwrjRVMRtjvbPWM7so4/F2c4qlNWcE1kOkadgiqTTSyZYw+FLGIXrUd34xnReavN OxZY/oTrx8vw/iPzDbHlwDjFLyqoR+DIUyuxqUCA4tC7X0uzDgiJY+QoNv4AuTm9WJly t38wLpXVBj7x2DYP6kdbHrTWh9XM3WhRrswIjlWIok4UMF1KMJH7bABBqknUPvZncAOs wg== Received: from ppma01fra.de.ibm.com (46.49.7a9f.ip4.static.sl-reverse.com [159.122.73.70]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3krdg7h4sx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:10 +0000 Received: from pps.filterd (ppma01fra.de.ibm.com [127.0.0.1]) by ppma01fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2A9EK5Ql000631; Wed, 9 Nov 2022 14:29:09 GMT Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by ppma01fra.de.ibm.com with ESMTP id 3kngs4m7qp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:08 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2A9ET5da54985202 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 9 Nov 2022 14:29:05 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 978C0A4051; Wed, 9 Nov 2022 14:29:05 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 2933EA4057; Wed, 9 Nov 2022 14:29:05 +0000 (GMT) Received: from 
tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 9 Nov 2022 14:29:05 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe Cc: Gerd Bayer , Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 2/5] iommu/s390: Add I/O TLB ops Date: Wed, 9 Nov 2022 15:29:00 +0100 Message-Id: <20221109142903.4080275-3-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221109142903.4080275-1-schnelle@linux.ibm.com> References: <20221109142903.4080275-1-schnelle@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: t-GWkEAdqw2b0yZIyQq-C-prvo5ZiU7U X-Proofpoint-GUID: t-GWkEAdqw2b0yZIyQq-C-prvo5ZiU7U X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-09_06,2022-11-09_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 priorityscore=1501 mlxscore=0 impostorscore=0 suspectscore=0 spamscore=0 mlxlogscore=999 adultscore=0 phishscore=0 malwarescore=0 clxscore=1015 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211090107 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Currently s390-iommu does an I/O TLB flush (RPCIT) for every update of the I/O translation table explicitly. For one, this is wasteful since RPCIT can be skipped after a mapping operation if zdev->tlb_refresh is unset. Moreover, we can do a single RPCIT for a range of pages, including when doing lazy unmapping.
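The range gathering that makes a single flush possible can be sketched in user space roughly as below. This is an illustrative model under stated assumptions, not the kernel implementation: struct sketch_gather and the sketch_gather_*() helpers are hypothetical names that only mimic how a gather structure with an inclusive end address could be widened by each unmap and then flushed once.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a gather structure. */
struct sketch_gather {
	unsigned long start;
	unsigned long end;	/* inclusive; 0 means nothing was gathered */
};

static void sketch_gather_init(struct sketch_gather *g)
{
	g->start = ULONG_MAX;
	g->end = 0;
}

/* Widen the gathered range so that it covers [iova, iova + size - 1]. */
static void sketch_gather_add_range(struct sketch_gather *g,
				    unsigned long iova, size_t size)
{
	unsigned long end = iova + size - 1;

	if (g->start > iova)
		g->start = iova;
	if (g->end < end)
		g->end = end;
}

/* Number of bytes one flush has to cover, or 0 if nothing was gathered. */
static size_t sketch_gather_flush_size(const struct sketch_gather *g)
{
	if (!g->end)
		return 0;
	return g->end - g->start + 1;
}
```

Note that in this model two disjoint unmaps result in one flush over the whole span between them, including any gap, which trades a slightly larger flush for fewer flush operations.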
Thankfully both of these optimizations can be achieved by implementing the IOMMU operations common code provides for the different types of I/O tlb flushes: * flush_iotlb_all: Flushes the I/O TLB for the entire IOVA space * iotlb_sync: Flushes the I/O TLB for a range of pages that can be gathered up, for example to implement lazy unmapping. * iotlb_sync_map: Flushes the I/O TLB after a mapping operation Signed-off-by: Niklas Schnelle --- v1->v2: - Don't skip IOTLB flushes for other devices on IOTLB flush failure (Jason) drivers/iommu/s390-iommu.c | 67 +++++++++++++++++++++++++++++++------- 1 file changed, 56 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index e2c886bc4376..9771bce86e94 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -199,14 +199,63 @@ static void s390_iommu_release_device(struct device *= dev) __s390_iommu_detach_device(zdev); } =20 +static void s390_iommu_flush_iotlb_all(struct iommu_domain *domain) +{ + struct s390_domain *s390_domain =3D to_s390_domain(domain); + struct zpci_dev *zdev; + unsigned long flags; + + spin_lock_irqsave(&s390_domain->list_lock, flags); + list_for_each_entry(zdev, &s390_domain->devices, iommu_list) { + zpci_refresh_trans((u64)zdev->fh << 32, zdev->start_dma, + zdev->end_dma - zdev->start_dma + 1); + } + spin_unlock_irqrestore(&s390_domain->list_lock, flags); +} + +static void s390_iommu_iotlb_sync(struct iommu_domain *domain, + struct iommu_iotlb_gather *gather) +{ + struct s390_domain *s390_domain =3D to_s390_domain(domain); + size_t size =3D gather->end - gather->start + 1; + struct zpci_dev *zdev; + unsigned long flags; + + /* If gather was never added to there is nothing to flush */ + if (!gather->end) + return; + + spin_lock_irqsave(&s390_domain->list_lock, flags); + list_for_each_entry(zdev, &s390_domain->devices, iommu_list) { + zpci_refresh_trans((u64)zdev->fh << 32, gather->start, + size); + } + 
spin_unlock_irqrestore(&s390_domain->list_lock, flags); +} + +static void s390_iommu_iotlb_sync_map(struct iommu_domain *domain, + unsigned long iova, size_t size) +{ + struct s390_domain *s390_domain =3D to_s390_domain(domain); + struct zpci_dev *zdev; + unsigned long flags; + + spin_lock_irqsave(&s390_domain->list_lock, flags); + list_for_each_entry(zdev, &s390_domain->devices, iommu_list) { + if (!zdev->tlb_refresh) + continue; + zpci_refresh_trans((u64)zdev->fh << 32, + iova, size); + } + spin_unlock_irqrestore(&s390_domain->list_lock, flags); +} + static int s390_iommu_update_trans(struct s390_domain *s390_domain, phys_addr_t pa, dma_addr_t dma_addr, unsigned long nr_pages, int flags) { phys_addr_t page_addr =3D pa & PAGE_MASK; - dma_addr_t start_dma_addr =3D dma_addr; unsigned long irq_flags, i; - struct zpci_dev *zdev; unsigned long *entry; int rc =3D 0; =20 @@ -225,15 +274,6 @@ static int s390_iommu_update_trans(struct s390_domain = *s390_domain, dma_addr +=3D PAGE_SIZE; } =20 - spin_lock(&s390_domain->list_lock); - list_for_each_entry(zdev, &s390_domain->devices, iommu_list) { - rc =3D zpci_refresh_trans((u64)zdev->fh << 32, - start_dma_addr, nr_pages * PAGE_SIZE); - if (rc) - break; - } - spin_unlock(&s390_domain->list_lock); - undo_cpu_trans: if (rc && ((flags & ZPCI_PTE_VALID_MASK) =3D=3D ZPCI_PTE_VALID)) { flags =3D ZPCI_PTE_INVALID; @@ -340,6 +380,8 @@ static size_t s390_iommu_unmap_pages(struct iommu_domai= n *domain, if (rc) return 0; =20 + iommu_iotlb_gather_add_range(gather, iova, size); + return size; } =20 @@ -384,6 +426,9 @@ static const struct iommu_ops s390_iommu_ops =3D { .detach_dev =3D s390_iommu_detach_device, .map_pages =3D s390_iommu_map_pages, .unmap_pages =3D s390_iommu_unmap_pages, + .flush_iotlb_all =3D s390_iommu_flush_iotlb_all, + .iotlb_sync =3D s390_iommu_iotlb_sync, + .iotlb_sync_map =3D s390_iommu_iotlb_sync_map, .iova_to_phys =3D s390_iommu_iova_to_phys, .free =3D s390_domain_free, } --=20 2.34.1 From nobody Tue Apr 7 
17:34:02 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA272C433FE for ; Wed, 9 Nov 2022 14:29:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231504AbiKIO3d (ORCPT ); Wed, 9 Nov 2022 09:29:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42648 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231343AbiKIO30 (ORCPT ); Wed, 9 Nov 2022 09:29:26 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8793418384; Wed, 9 Nov 2022 06:29:25 -0800 (PST) Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9DxYLZ006660; Wed, 9 Nov 2022 14:29:12 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=izBFDgbVKHNBefeE4boPFsbeLvNsC62xJH5EVEQLiiA=; b=QhMXisSfxp57kiOxT9hu41X9u4QUzFLS8mc9nNPI1kXijWBwfeOvTjhTDQtt/zls5Z4E TJV6nRrDTEWrquu4YYYwf88g86aM2I8r+iJmyI9I5SySf9HliOwJ1J9N481U4/bPorv/ LFqhDsSvWAscGebIbP12Aac8sZSxSxmu+rCw+wnsCKjPNfsa+vqzoimCUphj4ILsIAni q/98kV98s0/C/EVEOKEpcAxK6n8lp7fNIRlylVpmEtmz6f7zMPBt9OeK68r8zeyd1ubQ 0J7KrLGholQJnM6O0W06Y6nJmiyVkSBNPmI77tuIvDHlnYo2Eh+W5yWGokBSD9d3FLdY 5Q== Received: from ppma05fra.de.ibm.com (6c.4a.5195.ip4.static.sl-reverse.com [149.81.74.108]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3krdjprye2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:12 +0000 Received: from pps.filterd (ppma05fra.de.ibm.com [127.0.0.1]) by ppma05fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2A9EJtsV021709; Wed, 9 Nov 
2022 14:29:09 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma05fra.de.ibm.com with ESMTP id 3krcbr03jv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:09 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2A9ET6jO38470044 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 9 Nov 2022 14:29:06 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 19E1EA404D; Wed, 9 Nov 2022 14:29:06 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A2CD4A4040; Wed, 9 Nov 2022 14:29:05 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 9 Nov 2022 14:29:05 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe Cc: Gerd Bayer , Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 3/5] iommu/s390: Use RCU to allow concurrent domain_list iteration Date: Wed, 9 Nov 2022 15:29:01 +0100 Message-Id: <20221109142903.4080275-4-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221109142903.4080275-1-schnelle@linux.ibm.com> References: <20221109142903.4080275-1-schnelle@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: q_F5OMFiaz7D4I5FizMLr-E7pL5dVz2x X-Proofpoint-GUID: q_F5OMFiaz7D4I5FizMLr-E7pL5dVz2x X-Proofpoint-Virus-Version: vendor=baseguard 
engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-09_06,2022-11-09_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 clxscore=1015 malwarescore=0 adultscore=0 spamscore=0 mlxlogscore=999 suspectscore=0 priorityscore=1501 mlxscore=0 bulkscore=0 phishscore=0 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211090107 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The s390_domain->devices list is only added to when new devices are attached but is iterated through in read-only fashion for every mapping operation as well as for I/O TLB flushes and thus in performance critical code causing contention on the s390_domain->list_lock. Fortunately such a read-mostly linked list is a standard use case for RCU. This change closely follows the example for an RCU protected list given in Documentation/RCU/listRCU.rst. Signed-off-by: Niklas Schnelle --- v1->v2: - Free domain tables via call_rcu() (Jason) arch/s390/include/asm/pci.h | 1 + arch/s390/pci/pci.c | 2 +- drivers/iommu/s390-iommu.c | 44 +++++++++++++++++++++++-------------- 3 files changed, 29 insertions(+), 18 deletions(-) diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index 07361e2fd8c5..e4c3e4e04d30 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -119,6 +119,7 @@ struct zpci_dev { struct list_head entry; /* list of all zpci_devices, needed for hotplug,= etc.
*/ struct list_head iommu_list; struct kref kref; + struct rcu_head rcu; struct hotplug_slot hotplug_slot; =20 enum zpci_state state; diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index a703dcd94a68..ef38b1514c77 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -996,7 +996,7 @@ void zpci_release_device(struct kref *kref) break; } zpci_dbg(3, "rem fid:%x\n", zdev->fid); - kfree(zdev); + kfree_rcu(zdev, rcu); } =20 int zpci_report_error(struct pci_dev *pdev, diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index 9771bce86e94..cf5dcbcea4e0 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -10,6 +10,8 @@ #include #include #include +#include +#include #include =20 static const struct iommu_ops s390_iommu_ops; @@ -20,6 +22,7 @@ struct s390_domain { unsigned long *dma_table; spinlock_t dma_table_lock; spinlock_t list_lock; + struct rcu_head rcu; }; =20 static struct s390_domain *to_s390_domain(struct iommu_domain *dom) @@ -61,18 +64,28 @@ static struct iommu_domain *s390_domain_alloc(unsigned = domain_type) =20 spin_lock_init(&s390_domain->dma_table_lock); spin_lock_init(&s390_domain->list_lock); - INIT_LIST_HEAD(&s390_domain->devices); + INIT_LIST_HEAD_RCU(&s390_domain->devices); =20 return &s390_domain->domain; } =20 +static void s390_iommu_rcu_free_domain(struct rcu_head *head) +{ + struct s390_domain *s390_domain =3D container_of(head, struct s390_domain= , rcu); + + dma_cleanup_tables(s390_domain->dma_table); + kfree(s390_domain); +} + static void s390_domain_free(struct iommu_domain *domain) { struct s390_domain *s390_domain =3D to_s390_domain(domain); =20 + rcu_read_lock(); WARN_ON(!list_empty(&s390_domain->devices)); - dma_cleanup_tables(s390_domain->dma_table); - kfree(s390_domain); + rcu_read_unlock(); + + call_rcu(&s390_domain->rcu, s390_iommu_rcu_free_domain); } =20 static void __s390_iommu_detach_device(struct zpci_dev *zdev) @@ -84,7 +97,7 @@ static void __s390_iommu_detach_device(struct 
zpci_dev *z= dev) return; =20 spin_lock_irqsave(&s390_domain->list_lock, flags); - list_del_init(&zdev->iommu_list); + list_del_rcu(&zdev->iommu_list); spin_unlock_irqrestore(&s390_domain->list_lock, flags); =20 zpci_unregister_ioat(zdev, 0); @@ -127,7 +140,7 @@ static int s390_iommu_attach_device(struct iommu_domain= *domain, zdev->s390_domain =3D s390_domain; =20 spin_lock_irqsave(&s390_domain->list_lock, flags); - list_add(&zdev->iommu_list, &s390_domain->devices); + list_add_rcu(&zdev->iommu_list, &s390_domain->devices); spin_unlock_irqrestore(&s390_domain->list_lock, flags); =20 return 0; @@ -203,14 +216,13 @@ static void s390_iommu_flush_iotlb_all(struct iommu_d= omain *domain) { struct s390_domain *s390_domain =3D to_s390_domain(domain); struct zpci_dev *zdev; - unsigned long flags; =20 - spin_lock_irqsave(&s390_domain->list_lock, flags); - list_for_each_entry(zdev, &s390_domain->devices, iommu_list) { + rcu_read_lock(); + list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { zpci_refresh_trans((u64)zdev->fh << 32, zdev->start_dma, zdev->end_dma - zdev->start_dma + 1); } - spin_unlock_irqrestore(&s390_domain->list_lock, flags); + rcu_read_unlock(); } =20 static void s390_iommu_iotlb_sync(struct iommu_domain *domain, @@ -219,18 +231,17 @@ static void s390_iommu_iotlb_sync(struct iommu_domain= *domain, struct s390_domain *s390_domain =3D to_s390_domain(domain); size_t size =3D gather->end - gather->start + 1; struct zpci_dev *zdev; - unsigned long flags; =20 /* If gather was never added to there is nothing to flush */ if (!gather->end) return; =20 - spin_lock_irqsave(&s390_domain->list_lock, flags); - list_for_each_entry(zdev, &s390_domain->devices, iommu_list) { + rcu_read_lock(); + list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { zpci_refresh_trans((u64)zdev->fh << 32, gather->start, size); } - spin_unlock_irqrestore(&s390_domain->list_lock, flags); + rcu_read_unlock(); } =20 static void s390_iommu_iotlb_sync_map(struct 
iommu_domain *domain, @@ -238,16 +249,15 @@ static void s390_iommu_iotlb_sync_map(struct iommu_do= main *domain, { struct s390_domain *s390_domain =3D to_s390_domain(domain); struct zpci_dev *zdev; - unsigned long flags; =20 - spin_lock_irqsave(&s390_domain->list_lock, flags); - list_for_each_entry(zdev, &s390_domain->devices, iommu_list) { + rcu_read_lock(); + list_for_each_entry_rcu(zdev, &s390_domain->devices, iommu_list) { if (!zdev->tlb_refresh) continue; zpci_refresh_trans((u64)zdev->fh << 32, iova, size); } - spin_unlock_irqrestore(&s390_domain->list_lock, flags); + rcu_read_unlock(); } =20 static int s390_iommu_update_trans(struct s390_domain *s390_domain, --=20 2.34.1 From nobody Tue Apr 7 17:34:02 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E69FC4332F for ; Wed, 9 Nov 2022 14:29:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231522AbiKIO3j (ORCPT ); Wed, 9 Nov 2022 09:29:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231442AbiKIO32 (ORCPT ); Wed, 9 Nov 2022 09:29:28 -0500 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0601915723; Wed, 9 Nov 2022 06:29:27 -0800 (PST) Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9EMLwH016148; Wed, 9 Nov 2022 14:29:11 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=v+m/PVmU2IZKQ9M47SlpOzQCxGsERfoxOf9GirUx1Pg=; 
b=VdpFaC9/fjW0iyKGUiYXQBV9JIT9DYVUFwXhCKN7B3u9Ef2C/dbGMQwF7xQqn/LqHPQR WUE9z9Zy8L7IBa1EWYIjOoxewX1luTpwWmy085NV3BqiPMn/M1LjTYseTRRCISO86sLY WBhvCXZrKcukS1I87p5huTOrUSuudn65PFTWFz6hNS45oan+wobjpWA6PxQOeAhqGlbz UiO9zyytm0zWKdYqNttpQCS2iuJXTPer0OIXErS5xtChq4/IvvJPfeKyNmiKiumVJg3K 7NmgIAfMPEbFbpLxzX74RlKrcsStXv5ezxgre2iNnRi0lkuwUTCfoOpJXcLS90S6kz9r 9Q== Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3krdwpg4qn-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:11 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2A9EK54F026865; Wed, 9 Nov 2022 14:29:09 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma03ams.nl.ibm.com with ESMTP id 3kngqgdtqp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:09 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2A9ET6FZ38470046 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 9 Nov 2022 14:29:06 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 93541A4040; Wed, 9 Nov 2022 14:29:06 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 25FF6A4051; Wed, 9 Nov 2022 14:29:06 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 9 Nov 2022 14:29:06 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe Cc: Gerd Bayer , Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, 
hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 4/5] iommu/s390: Optimize IOMMU table walking Date: Wed, 9 Nov 2022 15:29:02 +0100 Message-Id: <20221109142903.4080275-5-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221109142903.4080275-1-schnelle@linux.ibm.com> References: <20221109142903.4080275-1-schnelle@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-GUID: 3nqj8d6u60titnld1rHHy-oHqu1-xk-4 X-Proofpoint-ORIG-GUID: 3nqj8d6u60titnld1rHHy-oHqu1-xk-4 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-09_06,2022-11-09_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 adultscore=0 mlxscore=0 mlxlogscore=893 priorityscore=1501 clxscore=1015 lowpriorityscore=0 spamscore=0 phishscore=0 suspectscore=0 malwarescore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211090107 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" When invalidating existing table entries for unmap, there is no need to know the physical address beforehand, so don't do an extra walk of the IOMMU table to get it. Also, when invalidating entries, not finding an entry indicates an invalid unmap rather than a lack of memory, so there is no need to undo updates in this case. Implement this by splitting s390_iommu_update_trans() into one variant for validating and one for invalidating translations.
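The different error handling of the two variants can be illustrated with a small userspace sketch. This is not the driver code: toy_table, toy_walk(), toy_validate(), toy_invalidate(), TOY_SLOTS and TOY_VALID are hypothetical stand-ins, and the sketch assumes a flat table in which a walk fails simply because the index is out of range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_SLOTS 4      /* hypothetical: walks past this index fail */
#define TOY_VALID 1UL    /* hypothetical valid bit in an entry */

static unsigned long toy_table[TOY_SLOTS];

/* Hypothetical stand-in for the table walk; NULL means "no entry". */
static unsigned long *toy_walk(size_t idx)
{
	return idx < TOY_SLOTS ? &toy_table[idx] : NULL;
}

/* Map path: needs the physical address and rolls back on failure. */
static int toy_validate(size_t first, size_t nr, unsigned long pa)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		unsigned long *entry = toy_walk(first + i);

		if (!entry)
			goto undo;
		*entry = (pa + (i << 12)) | TOY_VALID;
	}
	return 0;

undo:
	/* Invalidate the entries written so far; no address is needed. */
	while (i-- > 0)
		*toy_walk(first + i) = 0;
	return -ENOMEM;
}

/* Unmap path: no physical address, no rollback, a missing entry is -EINVAL. */
static int toy_invalidate(size_t first, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		unsigned long *entry = toy_walk(first + i);

		if (!entry)
			return -EINVAL;
		*entry = 0;
	}
	return 0;
}
```

In this sketch the unmap path never looks at a physical address, which mirrors why the extra table walk can go away.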
Signed-off-by: Niklas Schnelle --- drivers/iommu/s390-iommu.c | 69 ++++++++++++++++++++++++-------------- 1 file changed, 43 insertions(+), 26 deletions(-) diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index cf5dcbcea4e0..2b9a3e3bc606 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -260,14 +260,14 @@ static void s390_iommu_iotlb_sync_map(struct iommu_do= main *domain, rcu_read_unlock(); } =20 -static int s390_iommu_update_trans(struct s390_domain *s390_domain, - phys_addr_t pa, dma_addr_t dma_addr, - unsigned long nr_pages, int flags) +static int s390_iommu_validate_trans(struct s390_domain *s390_domain, + phys_addr_t pa, dma_addr_t dma_addr, + unsigned long nr_pages, int flags) { phys_addr_t page_addr =3D pa & PAGE_MASK; unsigned long irq_flags, i; unsigned long *entry; - int rc =3D 0; + int rc; =20 if (!nr_pages) return 0; @@ -275,7 +275,7 @@ static int s390_iommu_update_trans(struct s390_domain *= s390_domain, spin_lock_irqsave(&s390_domain->dma_table_lock, irq_flags); for (i =3D 0; i < nr_pages; i++) { entry =3D dma_walk_cpu_trans(s390_domain->dma_table, dma_addr); - if (!entry) { + if (unlikely(!entry)) { rc =3D -ENOMEM; goto undo_cpu_trans; } @@ -283,19 +283,43 @@ static int s390_iommu_update_trans(struct s390_domain= *s390_domain, page_addr +=3D PAGE_SIZE; dma_addr +=3D PAGE_SIZE; } + spin_unlock_irqrestore(&s390_domain->dma_table_lock, irq_flags); + + return 0; =20 undo_cpu_trans: - if (rc && ((flags & ZPCI_PTE_VALID_MASK) =3D=3D ZPCI_PTE_VALID)) { - flags =3D ZPCI_PTE_INVALID; - while (i-- > 0) { - page_addr -=3D PAGE_SIZE; - dma_addr -=3D PAGE_SIZE; - entry =3D dma_walk_cpu_trans(s390_domain->dma_table, - dma_addr); - if (!entry) - break; - dma_update_cpu_trans(entry, page_addr, flags); + while (i-- > 0) { + dma_addr -=3D PAGE_SIZE; + entry =3D dma_walk_cpu_trans(s390_domain->dma_table, + dma_addr); + if (!entry) + break; + dma_update_cpu_trans(entry, 0, ZPCI_PTE_INVALID); + } + 
spin_unlock_irqrestore(&s390_domain->dma_table_lock, irq_flags); + + return rc; +} + +static int s390_iommu_invalidate_trans(struct s390_domain *s390_domain, + dma_addr_t dma_addr, unsigned long nr_pages) +{ + unsigned long irq_flags, i; + unsigned long *entry; + int rc =3D 0; + + if (!nr_pages) + return 0; + + spin_lock_irqsave(&s390_domain->dma_table_lock, irq_flags); + for (i =3D 0; i < nr_pages; i++) { + entry =3D dma_walk_cpu_trans(s390_domain->dma_table, dma_addr); + if (unlikely(!entry)) { + rc =3D -EINVAL; + break; } + dma_update_cpu_trans(entry, 0, ZPCI_PTE_INVALID); + dma_addr +=3D PAGE_SIZE; } spin_unlock_irqrestore(&s390_domain->dma_table_lock, irq_flags); =20 @@ -308,8 +332,8 @@ static int s390_iommu_map_pages(struct iommu_domain *do= main, int prot, gfp_t gfp, size_t *mapped) { struct s390_domain *s390_domain =3D to_s390_domain(domain); - int flags =3D ZPCI_PTE_VALID, rc =3D 0; size_t size =3D pgcount << __ffs(pgsize); + int flags =3D ZPCI_PTE_VALID, rc =3D 0; =20 if (pgsize !=3D SZ_4K) return -EINVAL; @@ -327,8 +351,8 @@ static int s390_iommu_map_pages(struct iommu_domain *do= main, if (!(prot & IOMMU_WRITE)) flags |=3D ZPCI_TABLE_PROTECTED; =20 - rc =3D s390_iommu_update_trans(s390_domain, paddr, iova, - pgcount, flags); + rc =3D s390_iommu_validate_trans(s390_domain, paddr, iova, + pgcount, flags); if (!rc) *mapped =3D size; =20 @@ -373,20 +397,13 @@ static size_t s390_iommu_unmap_pages(struct iommu_dom= ain *domain, { struct s390_domain *s390_domain =3D to_s390_domain(domain); size_t size =3D pgcount << __ffs(pgsize); - int flags =3D ZPCI_PTE_INVALID; - phys_addr_t paddr; int rc; =20 if (WARN_ON(iova < s390_domain->domain.geometry.aperture_start || (iova + size - 1) > s390_domain->domain.geometry.aperture_end)) return 0; =20 - paddr =3D s390_iommu_iova_to_phys(domain, iova); - if (!paddr) - return 0; - - rc =3D s390_iommu_update_trans(s390_domain, paddr, iova, - pgcount, flags); + rc =3D s390_iommu_invalidate_trans(s390_domain, iova, pgcount); if 
(rc) return 0; =20 --=20 2.34.1 From nobody Tue Apr 7 17:34:02 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 755D6C4332F for ; Wed, 9 Nov 2022 14:29:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231551AbiKIO3r (ORCPT ); Wed, 9 Nov 2022 09:29:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42808 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231486AbiKIO3c (ORCPT ); Wed, 9 Nov 2022 09:29:32 -0500 Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27DA91D660; Wed, 9 Nov 2022 06:29:30 -0800 (PST) Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2A9ES939006282; Wed, 9 Nov 2022 14:29:12 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding; s=pp1; bh=+BistWnd4z+7A7K5tuip7U85T3+3GQn4XSKxpGUsq+Y=; b=GgxzX9u3+/yTAhAytmryq7bBadgpmDsleLlsa3zdJJS6/y8PgshW24B4ynlvBWjF7IaO FovsyRMBbtLYxASrLSitY3/VWAclQRbeGxMGTraa1RdctrpIxGhZ2dTh7ga6DRV5ZTKb iRWeJuJ0gXF/wj3WNI+xiA35QLRSk4OLTG795zH+XcD3XrvmsP+bncnIkLB9HXhEdOLC pcIuFrD6u3P1MTHntQnO3i4G7GcnedfNUeemJqupO1VbpBwlDHqnNZgZcUqSqaxHaZAV MzV+cper85wIFRjIw+DU4s/Hby8h/vcHIkkANK7bpxus4a60DdfijP8awdInN0q5kkFG vw== Received: from ppma02fra.de.ibm.com (47.49.7a9f.ip4.static.sl-reverse.com [159.122.73.71]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3kre03g0j5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:12 +0000 Received: from pps.filterd (ppma02fra.de.ibm.com [127.0.0.1]) by ppma02fra.de.ibm.com 
(8.16.1.2/8.16.1.2) with SMTP id 2A9ELak8032332; Wed, 9 Nov 2022 14:29:10 GMT Received: from b06cxnps4075.portsmouth.uk.ibm.com (d06relay12.portsmouth.uk.ibm.com [9.149.109.197]) by ppma02fra.de.ibm.com with ESMTP id 3kngpgm7y6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Nov 2022 14:29:10 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps4075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2A9ET7Ok39125570 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 9 Nov 2022 14:29:07 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1E243A4040; Wed, 9 Nov 2022 14:29:07 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 9EA5BA4053; Wed, 9 Nov 2022 14:29:06 +0000 (GMT) Received: from tuxmaker.boeblingen.de.ibm.com (unknown [9.152.85.9]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 9 Nov 2022 14:29:06 +0000 (GMT) From: Niklas Schnelle To: Matthew Rosato , iommu@lists.linux.dev, Joerg Roedel , Will Deacon , Robin Murphy , Jason Gunthorpe Cc: Gerd Bayer , Pierre Morel , linux-s390@vger.kernel.org, borntraeger@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, gerald.schaefer@linux.ibm.com, agordeev@linux.ibm.com, svens@linux.ibm.com, linux-kernel@vger.kernel.org Subject: [PATCH v2 5/5] s390/pci: use lock-free I/O translation updates Date: Wed, 9 Nov 2022 15:29:03 +0100 Message-Id: <20221109142903.4080275-6-schnelle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221109142903.4080275-1-schnelle@linux.ibm.com> References: <20221109142903.4080275-1-schnelle@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-GUID: -NJ252kKyB_6gvm23IRyInYhFsCRGi_E X-Proofpoint-ORIG-GUID: -NJ252kKyB_6gvm23IRyInYhFsCRGi_E X-Proofpoint-Virus-Version: 
vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.895,Hydra:6.0.545,FMLib:17.11.122.1 definitions=2022-11-09_06,2022-11-09_01,2022-06-22_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 adultscore=0 bulkscore=0 priorityscore=1501 spamscore=0 lowpriorityscore=0 phishscore=0 mlxscore=0 suspectscore=0 malwarescore=0 clxscore=1015 mlxlogscore=999 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2210170000 definitions=main-2211090107 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" I/O translation tables on s390 use 8-byte page table entries, and the tables are allocated lazily but only freed when the entire I/O translation table is torn down. Also, each IOVA can at any time only translate to one physical address. Furthermore, I/O table accesses by the IOMMU hardware are cache coherent. With a bit of care we can thus use atomic updates to manipulate the translation table without having to use a global lock at all. This is done analogously to the existing I/O translation table handling code used on Intel and AMD x86 systems.
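The lazy, lock-free table allocation this relies on can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions, not the kernel implementation: get_or_alloc_table(), ENTRY_INVALID and ENTRIES_PER_TABLE are hypothetical names, an invalid slot is assumed to be all zeroes, and calloc()/free() stand in for the kernel's table allocators.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define ENTRY_INVALID     ((uintptr_t)0)  /* hypothetical "no table installed" value */
#define ENTRIES_PER_TABLE 512             /* hypothetical number of entries per table */

/*
 * Return the next-level table referenced by *slot, allocating and
 * installing one on first use. Safe against concurrent callers without
 * a lock because tables are never freed while the slot is in use.
 */
static _Atomic uintptr_t *get_or_alloc_table(_Atomic uintptr_t *slot)
{
	uintptr_t cur = atomic_load(slot);
	uintptr_t expected = ENTRY_INVALID;
	_Atomic uintptr_t *table;

	if (cur != ENTRY_INVALID)
		return (_Atomic uintptr_t *)cur;

	table = calloc(ENTRIES_PER_TABLE, sizeof(*table));
	if (!table)
		return NULL;

	if (!atomic_compare_exchange_strong(slot, &expected, (uintptr_t)table)) {
		/* Someone else installed a table first: drop ours, use theirs. */
		free(table);
		return (_Atomic uintptr_t *)expected;
	}
	return table;
}
```

A second caller racing on the same slot either sees the installed table on its initial load or loses the compare-and-exchange and frees its own allocation, so both end up with the same table.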
Signed-off-by: Niklas Schnelle --- arch/s390/include/asm/pci.h | 1 - arch/s390/pci/pci_dma.c | 74 ++++++++++++++++++++++--------------- drivers/iommu/s390-iommu.c | 37 +++++++------------ 3 files changed, 58 insertions(+), 54 deletions(-) diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index e4c3e4e04d30..b248694e0024 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -157,7 +157,6 @@ struct zpci_dev { =20 /* DMA stuff */ unsigned long *dma_table; - spinlock_t dma_table_lock; int tlb_refresh; =20 spinlock_t iommu_bitmap_lock; diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c index dee825ee7305..ea478d11fbd1 100644 --- a/arch/s390/pci/pci_dma.c +++ b/arch/s390/pci/pci_dma.c @@ -63,37 +63,55 @@ static void dma_free_page_table(void *table) kmem_cache_free(dma_page_table_cache, table); } =20 -static unsigned long *dma_get_seg_table_origin(unsigned long *entry) +static unsigned long *dma_get_seg_table_origin(unsigned long *rtep) { + unsigned long old_rte, rte; unsigned long *sto; =20 - if (reg_entry_isvalid(*entry)) - sto =3D get_rt_sto(*entry); - else { + rte =3D READ_ONCE(*rtep); + if (reg_entry_isvalid(rte)) { + sto =3D get_rt_sto(rte); + } else { sto =3D dma_alloc_cpu_table(); if (!sto) return NULL; =20 - set_rt_sto(entry, virt_to_phys(sto)); - validate_rt_entry(entry); - entry_clr_protected(entry); + set_rt_sto(&rte, virt_to_phys(sto)); + validate_rt_entry(&rte); + entry_clr_protected(&rte); + + old_rte =3D cmpxchg(rtep, ZPCI_TABLE_INVALID, rte); + if (old_rte !=3D ZPCI_TABLE_INVALID) { + /* Someone else was faster, use theirs */ + dma_free_cpu_table(sto); + sto =3D get_rt_sto(old_rte); + } } return sto; } =20 -static unsigned long *dma_get_page_table_origin(unsigned long *entry) +static unsigned long *dma_get_page_table_origin(unsigned long *step) { + unsigned long old_ste, ste; unsigned long *pto; =20 - if (reg_entry_isvalid(*entry)) - pto =3D get_st_pto(*entry); - else { + ste =3D READ_ONCE(*step); + 
if (reg_entry_isvalid(ste)) { + pto =3D get_st_pto(ste); + } else { pto =3D dma_alloc_page_table(); if (!pto) return NULL; - set_st_pto(entry, virt_to_phys(pto)); - validate_st_entry(entry); - entry_clr_protected(entry); + set_st_pto(&ste, virt_to_phys(pto)); + validate_st_entry(&ste); + entry_clr_protected(&ste); + + old_ste =3D cmpxchg(step, ZPCI_TABLE_INVALID, ste); + if (old_ste !=3D ZPCI_TABLE_INVALID) { + /* Someone else was faster, use theirs */ + dma_free_page_table(pto); + pto =3D get_st_pto(old_ste); + } } return pto; } @@ -117,19 +135,24 @@ unsigned long *dma_walk_cpu_trans(unsigned long *rto,= dma_addr_t dma_addr) return &pto[px]; } =20 -void dma_update_cpu_trans(unsigned long *entry, phys_addr_t page_addr, int= flags) +void dma_update_cpu_trans(unsigned long *ptep, phys_addr_t page_addr, int = flags) { + unsigned long pte; + + pte =3D READ_ONCE(*ptep); if (flags & ZPCI_PTE_INVALID) { - invalidate_pt_entry(entry); + invalidate_pt_entry(&pte); } else { - set_pt_pfaa(entry, page_addr); - validate_pt_entry(entry); + set_pt_pfaa(&pte, page_addr); + validate_pt_entry(&pte); } =20 if (flags & ZPCI_TABLE_PROTECTED) - entry_set_protected(entry); + entry_set_protected(&pte); else - entry_clr_protected(entry); + entry_clr_protected(&pte); + + xchg(ptep, pte); } =20 static int __dma_update_trans(struct zpci_dev *zdev, phys_addr_t pa, @@ -137,18 +160,14 @@ static int __dma_update_trans(struct zpci_dev *zdev, = phys_addr_t pa, { unsigned int nr_pages =3D PAGE_ALIGN(size) >> PAGE_SHIFT; phys_addr_t page_addr =3D (pa & PAGE_MASK); - unsigned long irq_flags; unsigned long *entry; int i, rc =3D 0; =20 if (!nr_pages) return -EINVAL; =20 - spin_lock_irqsave(&zdev->dma_table_lock, irq_flags); - if (!zdev->dma_table) { - rc =3D -EINVAL; - goto out_unlock; - } + if (!zdev->dma_table) + return -EINVAL; =20 for (i =3D 0; i < nr_pages; i++) { entry =3D dma_walk_cpu_trans(zdev->dma_table, dma_addr); @@ -173,8 +192,6 @@ static int __dma_update_trans(struct zpci_dev *zdev, ph=
ys_addr_t pa, dma_update_cpu_trans(entry, page_addr, flags); } } -out_unlock: - spin_unlock_irqrestore(&zdev->dma_table_lock, irq_flags); return rc; } =20 @@ -558,7 +575,6 @@ int zpci_dma_init_device(struct zpci_dev *zdev) WARN_ON(zdev->s390_domain); =20 spin_lock_init(&zdev->iommu_bitmap_lock); - spin_lock_init(&zdev->dma_table_lock); =20 zdev->dma_table =3D dma_alloc_cpu_table(); if (!zdev->dma_table) { diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c index 2b9a3e3bc606..ed33c6cce083 100644 --- a/drivers/iommu/s390-iommu.c +++ b/drivers/iommu/s390-iommu.c @@ -20,7 +20,6 @@ struct s390_domain { struct iommu_domain domain; struct list_head devices; unsigned long *dma_table; - spinlock_t dma_table_lock; spinlock_t list_lock; struct rcu_head rcu; }; @@ -62,7 +61,6 @@ static struct iommu_domain *s390_domain_alloc(unsigned do= main_type) s390_domain->domain.geometry.aperture_start =3D 0; s390_domain->domain.geometry.aperture_end =3D ZPCI_TABLE_SIZE_RT - 1; =20 - spin_lock_init(&s390_domain->dma_table_lock); spin_lock_init(&s390_domain->list_lock); INIT_LIST_HEAD_RCU(&s390_domain->devices); =20 @@ -265,14 +263,10 @@ static int s390_iommu_validate_trans(struct s390_doma= in *s390_domain, unsigned long nr_pages, int flags) { phys_addr_t page_addr =3D pa & PAGE_MASK; - unsigned long irq_flags, i; unsigned long *entry; + unsigned long i; int rc; =20 - if (!nr_pages) - return 0; - - spin_lock_irqsave(&s390_domain->dma_table_lock, irq_flags); for (i =3D 0; i < nr_pages; i++) { entry =3D dma_walk_cpu_trans(s390_domain->dma_table, dma_addr); if (unlikely(!entry)) { @@ -283,7 +277,6 @@ static int s390_iommu_validate_trans(struct s390_domain= *s390_domain, page_addr +=3D PAGE_SIZE; dma_addr +=3D PAGE_SIZE; } - spin_unlock_irqrestore(&s390_domain->dma_table_lock, irq_flags); =20 return 0; =20 @@ -296,7 +289,6 @@ static int s390_iommu_validate_trans(struct s390_domain= *s390_domain, break; dma_update_cpu_trans(entry, 0, ZPCI_PTE_INVALID); } - 
spin_unlock_irqrestore(&s390_domain->dma_table_lock, irq_flags); =20 return rc; } @@ -304,14 +296,10 @@ static int s390_iommu_validate_trans(struct s390_doma= in *s390_domain, static int s390_iommu_invalidate_trans(struct s390_domain *s390_domain, dma_addr_t dma_addr, unsigned long nr_pages) { - unsigned long irq_flags, i; unsigned long *entry; + unsigned long i; int rc =3D 0; =20 - if (!nr_pages) - return 0; - - spin_lock_irqsave(&s390_domain->dma_table_lock, irq_flags); for (i =3D 0; i < nr_pages; i++) { entry =3D dma_walk_cpu_trans(s390_domain->dma_table, dma_addr); if (unlikely(!entry)) { @@ -321,7 +309,6 @@ static int s390_iommu_invalidate_trans(struct s390_doma= in *s390_domain, dma_update_cpu_trans(entry, 0, ZPCI_PTE_INVALID); dma_addr +=3D PAGE_SIZE; } - spin_unlock_irqrestore(&s390_domain->dma_table_lock, irq_flags); =20 return rc; } @@ -363,7 +350,8 @@ static phys_addr_t s390_iommu_iova_to_phys(struct iommu= _domain *domain, dma_addr_t iova) { struct s390_domain *s390_domain =3D to_s390_domain(domain); - unsigned long *sto, *pto, *rto, flags; + unsigned long *rto, *sto, *pto; + unsigned long ste, pte, rte; unsigned int rtx, sx, px; phys_addr_t phys =3D 0; =20 @@ -376,16 +364,17 @@ static phys_addr_t s390_iommu_iova_to_phys(struct iom= mu_domain *domain, px =3D calc_px(iova); rto =3D s390_domain->dma_table; =20 - spin_lock_irqsave(&s390_domain->dma_table_lock, flags); - if (rto && reg_entry_isvalid(rto[rtx])) { - sto =3D get_rt_sto(rto[rtx]); - if (sto && reg_entry_isvalid(sto[sx])) { - pto =3D get_st_pto(sto[sx]); - if (pto && pt_entry_isvalid(pto[px])) - phys =3D pto[px] & ZPCI_PTE_ADDR_MASK; + rte =3D READ_ONCE(rto[rtx]); + if (reg_entry_isvalid(rte)) { + sto =3D get_rt_sto(rte); + ste =3D READ_ONCE(sto[sx]); + if (reg_entry_isvalid(ste)) { + pto =3D get_st_pto(ste); + pte =3D READ_ONCE(pto[px]); + if (pt_entry_isvalid(pte)) + phys =3D pte & ZPCI_PTE_ADDR_MASK; } } - spin_unlock_irqrestore(&s390_domain->dma_table_lock, flags); =20 return phys; } --=20 
2.34.1