From: Shiyang Ruan
To: qemu-devel@nongnu.org, linux-cxl@vger.kernel.org
Cc: jonathan.cameron@huawei.com, dan.j.williams@intel.com, dave@stgolabs.net,
 ira.weiny@intel.com, alison.schofield@intel.com, dave.jiang@intel.com,
 vishal.l.verma@intel.com
Subject: [RFC PATCH] cxl: avoid duplicating report from MCE & device
Date: Wed, 19 Jun 2024 00:53:10 +0800
Message-Id: <20240618165310.877974-1-ruansy.fnst@fujitsu.com>
X-Mailer: git-send-email 2.34.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Background:
Since a CXL device is a memory device, when the CPU consumes a poisoned
page on a CXL device, it always triggers an MCE via interrupt (INT18), no
matter which "-First" path is configured.  This is the first report.
Then, currently, in the FW-First path, the poison event is delivered
through the following process: CXL device -> firmware -> OS:ACPI->APEI->GHES
-> CPER -> trace report.  This is the second one.  These two reports
indicate the same poisoned page, which is the so-called "duplicate
report"[1].  The memory_failure() handling I'm trying to add in the
OS-First path could be yet another duplicate report.

Hope the flow below makes it easier to understand:
CPU accesses bad memory on CXL device, then
 -> MCE (INT18), *always* report (1)
 -> * FW-First (implemented now)
      -> CXL device -> FW
         -> OS:ACPI->APEI->GHES->CPER -> trace report (2.a)
    * OS-First (not implemented yet, I'm working on it)
      -> CXL device -> MSI
         -> OS:CXL driver -> memory_failure() (2.b)
so (1) and (2.a/b) are duplicates.

(I didn't get a response to my reply to [1], but I still have to make a
patch to solve this problem, so please correct me if my understanding is
wrong.)

This patch adds a new notifier_block, with the new priority MCE_PRIO_CXL,
to `x86_mce_decoder_chain`, so that the CXL memdev can check whether the
current poisoned page has already been reported.  If it has, the notifier
chain is stopped and the following memory_failure() is not called to
report it again.  In this way, if the poisoned page has already been
handled (recorded and reported) in (1) or (2), the other path won't
duplicate the report.  The record can be cleared when cxl_clear_poison()
is called.

[1] https://lore.kernel.org/linux-cxl/664d948fb86f0_e8be294f8@dwillia2-mobl3.amr.corp.intel.com.notmuch/

Signed-off-by: Shiyang Ruan
---
 arch/x86/include/asm/mce.h |   1 +
 drivers/cxl/core/mbox.c    | 130 +++++++++++++++++++++++++++++++++++++
 drivers/cxl/core/memdev.c  |   6 +-
 drivers/cxl/cxlmem.h       |   3 +
 4 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index dfd2e9699bd7..d8109c48e7d9 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -182,6 +182,7 @@ enum mce_notifier_prios {
 	MCE_PRIO_NFIT,
 	MCE_PRIO_EXTLOG,
 	MCE_PRIO_UC,
+	MCE_PRIO_CXL,
 	MCE_PRIO_EARLY,
 	MCE_PRIO_CEC,
 	MCE_PRIO_HIGHEST = MCE_PRIO_CEC
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index 2626f3fff201..0eb3c5401e81 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -4,6 +4,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -880,6 +882,9 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
 		if (cxlr)
 			hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
 
+		if (hpa != ULLONG_MAX && cxl_mce_recorded(hpa))
+			return;
+
 		if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
 			trace_cxl_general_media(cxlmd, type, cxlr, hpa,
 						&evt->gen_media);
@@ -1408,6 +1413,127 @@ int cxl_poison_state_init(struct cxl_memdev_state *mds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, CXL);
 
+struct cxl_mce_record {
+	struct list_head node;
+	u64 hpa;
+};
+LIST_HEAD(cxl_mce_records);
+DEFINE_MUTEX(cxl_mce_mutex);
+
+bool cxl_mce_recorded(u64 hpa)
+{
+	struct cxl_mce_record *cur, *next, *rec;
+	int rc;
+
+	rc = mutex_lock_interruptible(&cxl_mce_mutex);
+	if (rc)
+		return false;
+
+	list_for_each_entry_safe(cur, next, &cxl_mce_records, node) {
+		if (cur->hpa == hpa) {
+			mutex_unlock(&cxl_mce_mutex);
+			return true;
+		}
+	}
+
+	rec = kmalloc(sizeof(struct cxl_mce_record), GFP_KERNEL);
+	rec->hpa = hpa;
+	list_add(&rec->node, &cxl_mce_records);
+
+	mutex_unlock(&cxl_mce_mutex);
+
+	return false;
+}
+
+void cxl_mce_clear(u64 hpa)
+{
+	struct cxl_mce_record *cur, *next;
+	int rc;
+
+	rc = mutex_lock_interruptible(&cxl_mce_mutex);
+	if (rc)
+		return;
+
+	list_for_each_entry_safe(cur, next, &cxl_mce_records, node) {
+		if (cur->hpa == hpa) {
+			list_del(&cur->node);
+			break;
+		}
+	}
+
+	mutex_unlock(&cxl_mce_mutex);
+}
+
+struct cxl_contains_hpa_context {
+	bool contains;
+	u64 hpa;
+};
+
+static int __cxl_contains_hpa(struct device *dev, void *arg)
+{
+	struct cxl_contains_hpa_context *ctx = arg;
+	struct cxl_endpoint_decoder *cxled;
+	struct range *range;
+	u64 hpa = ctx->hpa;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	range = &cxled->cxld.hpa_range;
+
+	if (range->start <= hpa && hpa <= range->end) {
+		ctx->contains = true;
+		return 1;
+	}
+
+	return 0;
+}
+
+static bool cxl_contains_hpa(const struct cxl_memdev *cxlmd, u64 hpa)
+{
+	struct cxl_contains_hpa_context ctx = {
+		.contains = false,
+		.hpa = hpa,
+	};
+	struct cxl_port *port;
+
+	port = cxlmd->endpoint;
+	if (port && is_cxl_endpoint(port) && cxl_num_decoders_committed(port))
+		device_for_each_child(&port->dev, &ctx, __cxl_contains_hpa);
+
+	return ctx.contains;
+}
+
+static int cxl_handle_mce(struct notifier_block *nb, unsigned long val,
+			  void *data)
+{
+	struct mce *mce = (struct mce *)data;
+	struct cxl_memdev_state *mds = container_of(nb, struct cxl_memdev_state,
+						    mce_notifier);
+	u64 hpa;
+
+	if (!mce || !mce_usable_address(mce))
+		return NOTIFY_DONE;
+
+	hpa = mce->addr & MCI_ADDR_PHYSADDR;
+
+	/* Check if the PFN is located on this CXL device */
+	if (!pfn_valid(hpa >> PAGE_SHIFT) &&
+	    !cxl_contains_hpa(mds->cxlds.cxlmd, hpa))
+		return NOTIFY_DONE;
+
+	/*
+	 * Search PFN in the cxl_mce_records, if already exists, don't continue
+	 * to do memory_failure() to avoid a poison address being reported
+	 * more than once.
+	 */
+	if (cxl_mce_recorded(hpa))
+		return NOTIFY_STOP;
+	else
+		return NOTIFY_OK;
+}
+
 struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
 {
 	struct cxl_memdev_state *mds;
@@ -1427,6 +1553,10 @@ struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
 	mds->ram_perf.qos_class = CXL_QOS_CLASS_INVALID;
 	mds->pmem_perf.qos_class = CXL_QOS_CLASS_INVALID;
 
+	mds->mce_notifier.notifier_call = cxl_handle_mce;
+	mds->mce_notifier.priority = MCE_PRIO_CXL;
+	mce_register_decode_chain(&mds->mce_notifier);
+
 	return mds;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_memdev_state_create, CXL);
diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
index 0277726afd04..aa3ac89d17be 100644
--- a/drivers/cxl/core/memdev.c
+++ b/drivers/cxl/core/memdev.c
@@ -376,10 +376,14 @@ int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
 		goto out;
 
 	cxlr = cxl_dpa_to_region(cxlmd, dpa);
-	if (cxlr)
+	if (cxlr) {
+		u64 hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
+
+		cxl_mce_clear(hpa);
 		dev_warn_once(mds->cxlds.dev,
 			      "poison clear dpa:%#llx region: %s\n", dpa,
 			      dev_name(&cxlr->dev));
+	}
 
 	record = (struct cxl_poison_record) {
 		.address = cpu_to_le64(dpa),
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index 19aba81cdf13..fbf8d9f46984 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -501,6 +501,7 @@ struct cxl_memdev_state {
 	struct cxl_fw_state fw;
 
 	struct rcuwait mbox_wait;
+	struct notifier_block mce_notifier;
 	int (*mbox_send)(struct cxl_memdev_state *mds,
 			 struct cxl_mbox_cmd *cmd);
 };
@@ -836,6 +837,8 @@ int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
 int cxl_trigger_poison_list(struct cxl_memdev *cxlmd);
 int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa);
 int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa);
+bool cxl_mce_recorded(u64 pfn);
+void cxl_mce_clear(u64 pfn);
 
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
-- 
2.34.1
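
For anyone who wants to try the record-once behaviour outside the kernel, here is a minimal userspace sketch of what cxl_mce_recorded(), cxl_mce_clear() and the final decision in cxl_handle_mce() are meant to do. It is only an illustration under assumptions, not the code in the patch: every sketch_* name is hypothetical, a plain singly linked list and malloc()/free() stand in for list_head and kmalloc(), and there is no locking.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for NOTIFY_OK / NOTIFY_STOP; not kernel symbols. */
enum sketch_verdict { SKETCH_NOTIFY_OK, SKETCH_NOTIFY_STOP };

/* One remembered poisoned address (hypothetical userspace analogue). */
struct sketch_record {
	struct sketch_record *next;
	uint64_t hpa;
};

static struct sketch_record *sketch_records;

/* Return true if hpa is already recorded; otherwise record it, return false. */
bool sketch_recorded(uint64_t hpa)
{
	struct sketch_record *rec;

	for (rec = sketch_records; rec; rec = rec->next)
		if (rec->hpa == hpa)
			return true;

	rec = malloc(sizeof(*rec));
	if (!rec)
		return false;	/* cannot record: let the event be reported */

	rec->hpa = hpa;
	rec->next = sketch_records;
	sketch_records = rec;
	return false;
}

/* Forget hpa so that a later poison event on it is reported again. */
void sketch_clear(uint64_t hpa)
{
	struct sketch_record **pp = &sketch_records;

	while (*pp) {
		struct sketch_record *rec = *pp;

		if (rec->hpa == hpa) {
			*pp = rec->next;
			free(rec);
			return;
		}
		pp = &rec->next;
	}
}

/* First sighting: let the chain continue. Repeat: stop the chain. */
enum sketch_verdict sketch_handle(uint64_t hpa)
{
	return sketch_recorded(hpa) ? SKETCH_NOTIFY_STOP : SKETCH_NOTIFY_OK;
}
```

Calling sketch_handle() twice with the same address gives SKETCH_NOTIFY_OK and then SKETCH_NOTIFY_STOP; after sketch_clear() on that address, the next call gives SKETCH_NOTIFY_OK again, which is the behaviour the commit message describes for cxl_clear_poison().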