From nobody Thu Oct 2 20:44:09 2025 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7D4E5369974; Thu, 11 Sep 2025 18:33:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.158.5 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757615602; cv=none; b=AbvQ5SASIwnh/vqsBoZGDUcIXAINrc29EvqWuTYhGtytChn61Fs/C3f8CzP3XegAJS+3Ss4QibSO/HDMjgK6Agh2E019f/tAnvEIFWKiMuxhwWKX+kLTEOYFfW8qemyEHAYkSsx4DYRzipr1QN/TRwQOBVY5FSvUDxcPRWNDh94= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1757615602; c=relaxed/simple; bh=lISvvsVj3gZoCyb2hnU+9i3dJa2QHlhRVtFM4LwMPPU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lRPz6rIbZk2RACSY+gLa3XdUuanRbolMHFNsiTeDsjyR06IGdDPp6MnHOhDUa1qC8NA0CGmIkHPbiUj4tOxg5NcrvxWlrcUT60He8wruYAn85Dr2YWpnhSSufPDYrV/bltfX2dl6054ncFYYnWHiSBhuUJL0EnI6oghHYGzOwwY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=JsoPKZav; arc=none smtp.client-ip=148.163.158.5 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="JsoPKZav" Received: from pps.filterd (m0353725.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 58BClwkN026821; Thu, 11 Sep 2025 18:33:16 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=pp1; bh=iQGNNINXa0BmjUmxW dUGnyQbGrZ+66cpiZoBLKw60i8=; b=JsoPKZavFjTPWF7I/MPrrIHSo1IJiz840 diYgU2Anwstrwq4zDlDEOzrxibRVB26/7lghUbCeNwVXPygjSJ6Fk6BKY6yh7TMa Ri0/PtnWHrh2tdJqbnKlxi980nArf5BSyfswBOXLnaTlQ1jwxcumvTZK8NMAf2LY UFfna9RwFik6GUj1PX9Lxaj++xvGCoKXxY3VUIpC+5YGmkDun51Fs33dP+bWUKlq Qm0Eo+O1n1uevbmG8qCKFulhXUzKn+pQDYVJYXufgBlFtVVnsNJ9ZuT6vKyHuJAe 7vVKfp5jAb9top6nKUO14KmfKSb8l4Oj6tjTV96jCvg/rmwZaRWbA== Received: from ppma21.wdc07v.mail.ibm.com (5b.69.3da9.ip4.static.sl-reverse.com [169.61.105.91]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 490bct5v8e-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 11 Sep 2025 18:33:16 +0000 (GMT) Received: from pps.filterd (ppma21.wdc07v.mail.ibm.com [127.0.0.1]) by ppma21.wdc07v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id 58BHlBZG008428; Thu, 11 Sep 2025 18:33:15 GMT Received: from smtprelay03.dal12v.mail.ibm.com ([172.16.1.5]) by ppma21.wdc07v.mail.ibm.com (PPS) with ESMTPS id 49109py8mf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 11 Sep 2025 18:33:15 +0000 Received: from smtpav01.dal12v.mail.ibm.com (smtpav01.dal12v.mail.ibm.com [10.241.53.100]) by smtprelay03.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 58BIXEPn28771036 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 11 Sep 2025 18:33:14 GMT Received: from smtpav01.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 1427F58057; Thu, 11 Sep 2025 18:33:14 +0000 (GMT) Received: from smtpav01.dal12v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5FABC58058; Thu, 11 Sep 2025 18:33:13 +0000 (GMT) Received: from IBM-D32RQW3.ibm.com (unknown [9.61.249.32]) by smtpav01.dal12v.mail.ibm.com (Postfix) with ESMTP; Thu, 11 Sep 2025 18:33:13 +0000 (GMT) From: Farhan Ali To: linux-s390@vger.kernel.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org Cc: alex.williamson@redhat.com, helgaas@kernel.org, alifm@linux.ibm.com, schnelle@linux.ibm.com, mjrosato@linux.ibm.com Subject: [PATCH v3 07/10] s390/pci: Store PCI error information for passthrough devices Date: Thu, 11 Sep 2025 11:33:04 -0700 Message-ID: <20250911183307.1910-8-alifm@linux.ibm.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250911183307.1910-1-alifm@linux.ibm.com> References: <20250911183307.1910-1-alifm@linux.ibm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwOTA2MDAxMCBTYWx0ZWRfX9Rx08+wgZfpQ L90VIx6Rl4mJeTa5BDK6oZgj98p0py8Z6o+jpe9dW0HrTOd04iEPa5W+4MswzXvnuNSD531fTUT R7kMX6gutI6tP7jREPRhPSSMBxYiWwx0BtrXQ3nrGRTiRavqnYOpgZSePODHnhMSK1FdOCUntvS /p42jza90Nk/s6OMzBeHyyHxeBOES4c6vh9y5pZkEGYrPEA7xhuw8/aAuWjxq+rU74kFN5GwZEC YTVtRRC2+pZ8XMUBy7ouTHTuglSm8Z1OXSI9uLT1rz7AiE2S6x4JZpOgX6YzWYyzQYs7Xk1REFS V+ZjC6c35ZgZL/jy7HnRHTSmyxjs9N7kkdcM+Mcmyw7N3qPWJVoq31hU2idHeUP2S16s6LaMfqF rLf4Dqct X-Authority-Analysis: v=2.4 cv=SKNCVPvH c=1 sm=1 tr=0 ts=68c315ec cx=c_pps a=GFwsV6G8L6GxiO2Y/PsHdQ==:117 a=GFwsV6G8L6GxiO2Y/PsHdQ==:17 a=yJojWOMRYYMA:10 a=VnNF1IyMAAAA:8 a=AhHhU-3714CzMtceQIIA:9 X-Proofpoint-GUID: 4QA-rC5EdO55zvAuNHiYqFNCIup10lXJ X-Proofpoint-ORIG-GUID: 4QA-rC5EdO55zvAuNHiYqFNCIup10lXJ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1117,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-09-11_03,2025-09-11_02,2025-03-28_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 spamscore=0 priorityscore=1501 bulkscore=0 malwarescore=0 adultscore=0 suspectscore=0 impostorscore=0 phishscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.19.0-2507300000 definitions=main-2509060010 Content-Type: text/plain; charset="utf-8" For a passthrough device we need co-operation from user space to recover the device. This would require to bubble up any error information to user space. Let's store this error information for passthrough devices, so it can be retrieved later. Signed-off-by: Farhan Ali --- arch/s390/include/asm/pci.h | 28 ++++++++++ arch/s390/pci/pci.c | 1 + arch/s390/pci/pci_event.c | 95 +++++++++++++++++++------------- drivers/vfio/pci/vfio_pci_zdev.c | 2 + 4 files changed, 88 insertions(+), 38 deletions(-) diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h index f47f62fc3bfd..72e05af90e08 100644 --- a/arch/s390/include/asm/pci.h +++ b/arch/s390/include/asm/pci.h @@ -116,6 +116,31 @@ struct zpci_bus { enum pci_bus_speed max_bus_speed; }; =20 +/* Content Code Description for PCI Function Error */ +struct zpci_ccdf_err { + u32 reserved1; + u32 fh; /* function handle */ + u32 fid; /* function id */ + u32 ett : 4; /* expected table type */ + u32 mvn : 12; /* MSI vector number */ + u32 dmaas : 8; /* DMA address space */ + u32 reserved2 : 6; + u32 q : 1; /* event qualifier */ + u32 rw : 1; /* read/write */ + u64 faddr; /* failing address */ + u32 reserved3; + u16 reserved4; + u16 pec; /* PCI event code */ +} __packed; + +#define ZPCI_ERR_PENDING_MAX 16 +struct zpci_ccdf_pending { + size_t count; + int head; + int tail; + struct zpci_ccdf_err err[ZPCI_ERR_PENDING_MAX]; +}; + /* Private data per function */ struct zpci_dev { struct zpci_bus *zbus; @@ -191,6 +216,8 @@ struct zpci_dev { struct iommu_domain *s390_domain; /* attached IOMMU domain */ struct kvm_zdev *kzdev; struct mutex kzdev_lock; + struct zpci_ccdf_pending pending_errs; + struct mutex pending_errs_lock; spinlock_t dom_lock; /* protect s390_domain change */ }; =20 @@ -316,6 +343,7 @@ void zpci_debug_exit_device(struct zpci_dev *); int zpci_report_error(struct pci_dev *, struct zpci_report_error_header *); int zpci_clear_error_state(struct zpci_dev *zdev); int zpci_reset_load_store_blocked(struct zpci_dev *zdev); +void zpci_cleanup_pending_errors(struct zpci_dev *zdev); =20 #ifdef CONFIG_NUMA =20 diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c index 5baeb5f6f674..02c7b06cfd0e 100644 --- a/arch/s390/pci/pci.c +++ b/arch/s390/pci/pci.c @@ -896,6 +896,7 @@ struct zpci_dev *zpci_create_device(u32 fid, u32 fh, en= um zpci_state state) mutex_init(&zdev->state_lock); mutex_init(&zdev->fmb_lock); mutex_init(&zdev->kzdev_lock); + mutex_init(&zdev->pending_errs_lock); =20 return zdev; =20 diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c index 541d536be052..ac527410812b 100644 --- a/arch/s390/pci/pci_event.c +++ b/arch/s390/pci/pci_event.c @@ -18,23 +18,6 @@ #include "pci_bus.h" #include "pci_report.h" =20 -/* Content Code Description for PCI Function Error */ -struct zpci_ccdf_err { - u32 reserved1; - u32 fh; /* function handle */ - u32 fid; /* function id */ - u32 ett : 4; /* expected table type */ - u32 mvn : 12; /* MSI vector number */ - u32 dmaas : 8; /* DMA address space */ - u32 : 6; - u32 q : 1; /* event qualifier */ - u32 rw : 1; /* read/write */ - u64 faddr; /* failing address */ - u32 reserved3; - u16 reserved4; - u16 pec; /* PCI event code */ -} __packed; - /* Content Code Description for PCI Function Availability */ struct zpci_ccdf_avail { u32 reserved1; @@ -76,6 +59,41 @@ static bool is_driver_supported(struct pci_driver *drive= r) return true; } =20 +static void zpci_store_pci_error(struct pci_dev *pdev, + struct zpci_ccdf_err *ccdf) +{ + struct zpci_dev *zdev =3D to_zpci(pdev); + int i; + + mutex_lock(&zdev->pending_errs_lock); + if (zdev->pending_errs.count >=3D ZPCI_ERR_PENDING_MAX) { + pr_err("%s: Cannot store PCI error info for device", + pci_name(pdev)); + mutex_unlock(&zdev->pending_errs_lock); + return; + } + + i =3D zdev->pending_errs.tail % ZPCI_ERR_PENDING_MAX; + memcpy(&zdev->pending_errs.err[i], ccdf, sizeof(struct zpci_ccdf_err)); + zdev->pending_errs.tail++; + zdev->pending_errs.count++; + mutex_unlock(&zdev->pending_errs_lock); +} + +void zpci_cleanup_pending_errors(struct zpci_dev *zdev) +{ + struct pci_dev *pdev =3D NULL; + + mutex_lock(&zdev->pending_errs_lock); + pdev =3D pci_get_slot(zdev->zbus->bus, zdev->devfn); + if (zdev->pending_errs.count) + pr_err("%s: Unhandled PCI error events count=3D%zu", + pci_name(pdev), zdev->pending_errs.count); + memset(&zdev->pending_errs, 0, sizeof(struct zpci_ccdf_pending)); + mutex_unlock(&zdev->pending_errs_lock); +} +EXPORT_SYMBOL_GPL(zpci_cleanup_pending_errors); + static pci_ers_result_t zpci_event_notify_error_detected(struct pci_dev *p= dev, struct pci_driver *driver) { @@ -169,7 +187,8 @@ static pci_ers_result_t zpci_event_do_reset(struct pci_= dev *pdev, * and the platform determines which functions are affected for * multi-function devices. */ -static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *= pdev) +static pci_ers_result_t zpci_event_attempt_error_recovery(struct pci_dev *= pdev, + struct zpci_ccdf_err *ccdf) { pci_ers_result_t ers_res =3D PCI_ERS_RESULT_DISCONNECT; struct zpci_dev *zdev =3D to_zpci(pdev); @@ -188,13 +207,6 @@ static pci_ers_result_t zpci_event_attempt_error_recov= ery(struct pci_dev *pdev) } pdev->error_state =3D pci_channel_io_frozen; =20 - if (needs_mediated_recovery(pdev)) { - pr_info("%s: Cannot be recovered in the host because it is a pass-throug= h device\n", - pci_name(pdev)); - status_str =3D "failed (pass-through)"; - goto out_unlock; - } - driver =3D to_pci_driver(pdev->dev.driver); if (!is_driver_supported(driver)) { if (!driver) { @@ -210,12 +222,22 @@ static pci_ers_result_t zpci_event_attempt_error_reco= very(struct pci_dev *pdev) goto out_unlock; } =20 + if (needs_mediated_recovery(pdev)) + zpci_store_pci_error(pdev, ccdf); + ers_res =3D zpci_event_notify_error_detected(pdev, driver); if (ers_result_indicates_abort(ers_res)) { status_str =3D "failed (abort on detection)"; goto out_unlock; } =20 + if (needs_mediated_recovery(pdev)) { + pr_info("%s: Recovering passthrough device\n", pci_name(pdev)); + ers_res =3D PCI_ERS_RESULT_RECOVERED; + status_str =3D "in progress"; + goto out_unlock; + } + if (ers_res !=3D PCI_ERS_RESULT_NEED_RESET) { ers_res =3D zpci_event_do_error_state_clear(pdev, driver); if (ers_result_indicates_abort(ers_res)) { @@ -258,25 +280,20 @@ static pci_ers_result_t zpci_event_attempt_error_reco= very(struct pci_dev *pdev) * @pdev: PCI function for which to report * @es: PCI channel failure state to report */ -static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_= t es) +static void zpci_event_io_failure(struct pci_dev *pdev, pci_channel_state_= t es, + struct zpci_ccdf_err *ccdf) { struct pci_driver *driver; =20 pci_dev_lock(pdev); pdev->error_state =3D es; - /** - * While vfio-pci's error_detected callback notifies user-space QEMU - * reacts to this by freezing the guest. In an s390 environment PCI - * errors are rarely fatal so this is overkill. Instead in the future - * we will inject the error event and let the guest recover the device - * itself. - */ + if (needs_mediated_recovery(pdev)) - goto out; + zpci_store_pci_error(pdev, ccdf); driver =3D to_pci_driver(pdev->dev.driver); if (driver && driver->err_handler && driver->err_handler->error_detected) driver->err_handler->error_detected(pdev, pdev->error_state); -out: + pci_dev_unlock(pdev); } =20 @@ -312,6 +329,7 @@ static void __zpci_event_error(struct zpci_ccdf_err *cc= df) pr_err("%s: Event 0x%x reports an error for PCI function 0x%x\n", pdev ? pci_name(pdev) : "n/a", ccdf->pec, ccdf->fid); =20 + if (!pdev) goto no_pdev; =20 @@ -322,12 +340,13 @@ static void __zpci_event_error(struct zpci_ccdf_err *= ccdf) break; case 0x0040: /* Service Action or Error Recovery Failed */ case 0x003b: - zpci_event_io_failure(pdev, pci_channel_io_perm_failure); + zpci_event_io_failure(pdev, pci_channel_io_perm_failure, ccdf); break; default: /* PCI function left in the error state attempt to recover */ - ers_res =3D zpci_event_attempt_error_recovery(pdev); + ers_res =3D zpci_event_attempt_error_recovery(pdev, ccdf); if (ers_res !=3D PCI_ERS_RESULT_RECOVERED) - zpci_event_io_failure(pdev, pci_channel_io_perm_failure); + zpci_event_io_failure(pdev, pci_channel_io_perm_failure, + ccdf); break; } pci_dev_put(pdev); diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_z= dev.c index a7bc23ce8483..2be37eab9279 100644 --- a/drivers/vfio/pci/vfio_pci_zdev.c +++ b/drivers/vfio/pci/vfio_pci_zdev.c @@ -168,6 +168,8 @@ void vfio_pci_zdev_close_device(struct vfio_pci_core_de= vice *vdev) =20 zdev->mediated_recovery =3D false; =20 + zpci_cleanup_pending_errors(zdev); + if (!vdev->vdev.kvm) return; =20 --=20 2.43.0