From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4541FC43217 for ; Thu, 5 May 2022 23:57:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387055AbiEFABa (ORCPT ); Thu, 5 May 2022 20:01:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34994 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387038AbiEFAB1 (ORCPT ); Thu, 5 May 2022 20:01:27 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C179347045 for ; Thu, 5 May 2022 16:57:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795065; x=1683331065; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=gFAa9l6D17AE+FwqBH16mNyyvqzE44nr84f76LNdCPk=; b=Klc7g6ZRUz9pfifhY4ADEr1KVFzmxuB1bLbuE+GVpCHib1bSR2ZOXkik aKG9gI5qaPXwCA1oyhfEtfTUnflD50MBwMFU9k2FRJHQf/V2Qjo7Q2xey juA0Lenk9YkpnmPU/+Y2J9n8vM4DwPXyy4S+dKHi42fGnpFTsMSYJ3d5Q rBkZuSkG6lrmb0pWElZ+mmQTRiCO6DP9YtYr/Eznti7Zu1Y6XBmBECJms M4I344Kp+CcF6aKSqS9czTZs9KfPJgYuE4CHeO4sJo174VoJZO1+iLxHS HvqQgDa9t3GuQyt0Qv522j62iK2XlLyA4qPXjFMM4XEiL9zG6WXmABJnA Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283608" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283608" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914319" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:44 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony 
Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 01/29] irq/matrix: Expose functions to allocate the best CPU for new vectors Date: Thu, 5 May 2022 16:59:40 -0700 Message-Id: <20220506000008.30892-2-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Certain types of interrupts, such as NMI, do not have an associated vector. They do, however, target specific CPUs. Thus, when assigning the destination CPU, it is beneficial to select the one with the lowest number of vectors. Prefix the functions matrix_find_best_cpu() and matrix_find_best_cpu_managed() with irq_ and expose them for IRQ controllers to use when allocating and activating vector-less IRQs. Cc: Andi Kleen Cc: "Ravi V. Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch. 
Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- include/linux/irq.h | 4 ++++ kernel/irq/matrix.c | 32 +++++++++++++++++++++++--------- 2 files changed, 27 insertions(+), 9 deletions(-) diff --git a/include/linux/irq.h b/include/linux/irq.h index f92788ccdba2..9e674e73d295 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -1223,6 +1223,10 @@ struct irq_matrix *irq_alloc_matrix(unsigned int mat= rix_bits, void irq_matrix_online(struct irq_matrix *m); void irq_matrix_offline(struct irq_matrix *m); void irq_matrix_assign_system(struct irq_matrix *m, unsigned int bit, bool= replace); +unsigned int irq_matrix_find_best_cpu(struct irq_matrix *m, + const struct cpumask *msk); +unsigned int irq_matrix_find_best_cpu_managed(struct irq_matrix *m, + const struct cpumask *msk); int irq_matrix_reserve_managed(struct irq_matrix *m, const struct cpumask = *msk); void irq_matrix_remove_managed(struct irq_matrix *m, const struct cpumask = *msk); int irq_matrix_alloc_managed(struct irq_matrix *m, const struct cpumask *m= sk, diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c index 1698e77645ac..810479f608f4 100644 --- a/kernel/irq/matrix.c +++ b/kernel/irq/matrix.c @@ -125,9 +125,16 @@ static unsigned int matrix_alloc_area(struct irq_matri= x *m, struct cpumap *cm, return area; } =20 -/* Find the best CPU which has the lowest vector allocation count */ -static unsigned int matrix_find_best_cpu(struct irq_matrix *m, - const struct cpumask *msk) +/** + * irq_matrix_find_best_cpu() - Find the best CPU for an IRQ + * @m: Matrix pointer + * @msk: On which CPUs the search will be performed + * + * Find the best CPU which has the lowest vector allocation count + * Returns: The best CPU to use + */ +unsigned int irq_matrix_find_best_cpu(struct irq_matrix *m, + const struct cpumask *msk) { unsigned int cpu, best_cpu, maxavl =3D 0; struct cpumap *cm; @@ -146,9 +153,16 @@ static unsigned int matrix_find_best_cpu(struct irq_ma= 
trix *m, return best_cpu; } =20 -/* Find the best CPU which has the lowest number of managed IRQs allocated= */ -static unsigned int matrix_find_best_cpu_managed(struct irq_matrix *m, - const struct cpumask *msk) +/** + * irq_matrix_find_best_cpu_managed() - Find the best CPU for a managed IRQ + * @m: Matrix pointer + * @msk: On which CPUs the search will be performed + * + * Find the best CPU which has the lowest number of managed IRQs allocated + * Returns: The best CPU to use + */ +unsigned int irq_matrix_find_best_cpu_managed(struct irq_matrix *m, + const struct cpumask *msk) { unsigned int cpu, best_cpu, allocated =3D UINT_MAX; struct cpumap *cm; @@ -292,7 +306,7 @@ int irq_matrix_alloc_managed(struct irq_matrix *m, cons= t struct cpumask *msk, if (cpumask_empty(msk)) return -EINVAL; =20 - cpu =3D matrix_find_best_cpu_managed(m, msk); + cpu =3D irq_matrix_find_best_cpu_managed(m, msk); if (cpu =3D=3D UINT_MAX) return -ENOSPC; =20 @@ -381,13 +395,13 @@ int irq_matrix_alloc(struct irq_matrix *m, const stru= ct cpumask *msk, struct cpumap *cm; =20 /* - * Not required in theory, but matrix_find_best_cpu() uses + * Not required in theory, but irq_matrix_find_best_cpu() uses * for_each_cpu() which ignores the cpumask on UP . 
*/ if (cpumask_empty(msk)) return -EINVAL; =20 - cpu =3D matrix_find_best_cpu(m, msk); + cpu =3D irq_matrix_find_best_cpu(m, msk); if (cpu =3D=3D UINT_MAX) return -ENOSPC; =20 --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A094C433FE for ; Thu, 5 May 2022 23:58:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387083AbiEFABj (ORCPT ); Thu, 5 May 2022 20:01:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35012 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387042AbiEFAB1 (ORCPT ); Thu, 5 May 2022 20:01:27 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7AD3C60DAA for ; Thu, 5 May 2022 16:57:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795066; x=1683331066; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=zlyJMMAdck3m7Lj6IAc06Z99mVZqGJ1kJoeEJPS6nAw=; b=EMQWTlsJIFEbbzkVX/qDkiVTjIupuSyobroCQk2Chn46yZXbG1L5B5ih J5o6Hn8/dKD/2lTg0UQ6YR+UNYgBmY04x/sVc7ovwDbazPi9tOTrpn/vW b5zF4NjgxR/4U/2Ax3qtqyJq3ENzFdETdbw7u7AvT+MSCfJOK5ic60X/r awXYLbbzdO8vJSXPzOIsC5YkRblhBrXMj+4q7zdppescKjvTe4STHI92j OjtR/3ECqpNHRSHw6Q6upqU4/E6QYzgLCRhQdwRW3J3vs3xCFC+91sTdT 9xytodTz9CUu6CfgNJylVmGAJTOlOqI8hpkAp27eMuyFUQhywS8e+8vlZ A==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283612" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283612" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914325" 
Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:44 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 02/29] x86/apic: Add irq_cfg::delivery_mode Date: Thu, 5 May 2022 16:59:41 -0700 Message-Id: <20220506000008.30892-3-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Currently, the delivery mode of all interrupts is set to the mode of the APIC driver in use. There are no restrictions in hardware to configure the delivery mode of each interrupt individually. Also, certain IRQs need to be configured with a specific delivery mode (e.g., NMI). Add a new member, delivery_mode, to struct irq_cfg. Subsequent changesets will update every irq_domain to set the delivery mode of each IRQ to that specified in its irq_cfg data. To keep the current behavior, when allocating an IRQ in the root domain (i.e., the x86_vector_domain), set the delivery mode of the IRQ as that of the APIC driver. Cc: Andi Kleen Cc: "Ravi V. Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Ashok Raj Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Updated indentation of the existing members of struct irq_cfg. * Reworded the commit message. 
Changes since v4: * Rebased to use new enumeration apic_delivery_modes. Changes since v3: * None Changes since v2: * Reduced scope to only add the interrupt delivery mode in struct irq_alloc_info. Changes since v1: * Introduced this patch. --- arch/x86/include/asm/hw_irq.h | 5 +++-- arch/x86/kernel/apic/vector.c | 9 +++++++++ 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h index d465ece58151..5ac5e6c603ee 100644 --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -88,8 +88,9 @@ struct irq_alloc_info { }; =20 struct irq_cfg { - unsigned int dest_apicid; - unsigned int vector; + unsigned int dest_apicid; + unsigned int vector; + enum apic_delivery_modes delivery_mode; }; =20 extern struct irq_cfg *irq_cfg(unsigned int irq); diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 3e6f6b448f6a..838e220e8860 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -567,6 +567,7 @@ static int x86_vector_alloc_irqs(struct irq_domain *dom= ain, unsigned int virq, irqd->chip_data =3D apicd; irqd->hwirq =3D virq + i; irqd_set_single_target(irqd); + /* * Prevent that any of these interrupts is invoked in * non interrupt context via e.g. generic_handle_irq() @@ -577,6 +578,14 @@ static int x86_vector_alloc_irqs(struct irq_domain *do= main, unsigned int virq, /* Don't invoke affinity setter on deactivated interrupts */ irqd_set_affinity_on_activate(irqd); =20 + /* + * Initialize the delivery mode of this irq to match the + * default delivery mode of the APIC. Children irq domains + * may take the delivery mode from the individual irq + * configuration rather than from the APIC driver. + */ + apicd->hw_irq_cfg.delivery_mode =3D apic->delivery_mode; + /* * Legacy vectors are already assigned when the IOAPIC * takes them over. They stay on the same vector. 
This is --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A443CC433EF for ; Thu, 5 May 2022 23:58:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387095AbiEFABm (ORCPT ); Thu, 5 May 2022 20:01:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387035AbiEFAB1 (ORCPT ); Thu, 5 May 2022 20:01:27 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 01EA360DB2 for ; Thu, 5 May 2022 16:57:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795067; x=1683331067; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=H881t49gt9ruU/EtOx7+sxhRpxmGtS427Z13jRF3PD8=; b=U3k/HwjtI1RpT+f9U5hA5Pnky5VzdJD0D5edFJxnMChqEq6c25uJ1EL8 cO1frK+72Ykotb0WtQvdJJTeDKt7MNlvbLjEQ3Bgs0kM/6byKkc7LbnB4 /62XrVS3cNIiaONBPcDXvivn0i92rFqUJz6/wZnCaZegGJ7aPy7Grz91W eJ0bU+xNHWAGoEExxIShlAFl5yxH13nMM3GtV/g4tcZjGQx2a5POQmn/C HhX0QDelzeNIW7dI0DIooLZtPkV8GIiepcutN+j9aCjpRJqqIx+P8XYop K1TQ8kVOxurZATT7dbNZoSY3czZ1whHrgwxxUoVAEosc09nf8De5tftqV g==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283613" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283613" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914332" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:45 -0700 From: Ricardo Neri To: Thomas Gleixner , 
x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 03/29] x86/apic/msi: Set the delivery mode individually for each IRQ Date: Thu, 5 May 2022 16:59:42 -0700 Message-Id: <20220506000008.30892-4-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" There are no restrictions in hardware that prevent each MSI message from having its own delivery mode. Use the mode specified in the provided IRQ hardware configuration data. Since most of the IRQs are configured to use the delivery mode of the APIC driver in use (set in all of them to APIC_DELIVERY_MODE_FIXED), the only functional changes occur where IRQs are configured to use a specific delivery mode. Changing the utility function __irq_msi_compose_msg() takes care of implementing the change in the local APIC, PCI-MSI, and DMAR-MSI irq_chips. The IO-APIC irq_chip configures the entries in the interrupt redirection table using the delivery mode specified in the corresponding MSI message. Since the MSI message is composed by a higher irq_chip in the hierarchy, it does not need to be updated. Cc: Andi Kleen Cc: "Ravi V. 
Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- arch/x86/kernel/apic/apic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index 189d3a5e471a..d1e12da1e9af 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2528,7 +2528,7 @@ void __irq_msi_compose_msg(struct irq_cfg *cfg, struc= t msi_msg *msg, msg->arch_addr_lo.dest_mode_logical =3D apic->dest_mode_logical; msg->arch_addr_lo.destid_0_7 =3D cfg->dest_apicid & 0xFF; =20 - msg->arch_data.delivery_mode =3D APIC_DELIVERY_MODE_FIXED; + msg->arch_data.delivery_mode =3D cfg->delivery_mode; msg->arch_data.vector =3D cfg->vector; =20 msg->address_hi =3D X86_MSI_BASE_ADDRESS_HIGH; --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFE21C433EF for ; Thu, 5 May 2022 23:58:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387069AbiEFABr (ORCPT ); Thu, 5 May 2022 20:01:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35042 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387048AbiEFAB2 (ORCPT ); Thu, 5 May 2022 20:01:28 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 307B360DB7 for ; Thu, 5 May 2022 16:57:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795067; x=1683331067; 
h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=Jz5/GbztoMpO5f33w54Dy1VgumudmsMpWQzH5+E5s4I=; b=XlZwEYI6/M/Oti/oKGwvzBgRy98haQGOy0pNl1ocMfxAxNWuh77VXoX8 mfn6j2B80IABNcm2f1O78T7HrFLtCiJjKHo2hMNbw33/X8Dk8y0kdMWqC fGxZUMJgnuGpV9lnIBA2YtHsP6CCcbNoT5mwYUJ6VnlikYVSe3tqS9whK m9aetlD8Dq2cnW1PFOzKRx4vf+D5jdptOJLtEXxoVv9mMX80y2xpfGVkZ KP7dGaAydb0qbZhBi8RzIxrREpIw/4dKxnXCCs4oaDFPX5YI5CArhld5c xKH4yHo0pWLiBnxXl+l//kY9iRCajNzs1b+Cc/OKRDspSQxpa1V1+6FF8 g==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283617" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283617" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914335" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:45 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 04/29] x86/apic: Add the X86_IRQ_ALLOC_AS_NMI irq allocation flag Date: Thu, 5 May 2022 16:59:43 -0700 Message-Id: <20220506000008.30892-5-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" There are cases in which it is necessary to set the delivery mode of an interrupt as NMI. 
Add a new flag that callers can specify when allocating an IRQ. Cc: Andi Kleen Cc: "Ravi V. Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Suggested-by: Thomas Gleixner Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch. Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- arch/x86/include/asm/irqdomain.h | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/include/asm/irqdomain.h b/arch/x86/include/asm/irqdom= ain.h index 125c23b7bad3..de1cf2e80443 100644 --- a/arch/x86/include/asm/irqdomain.h +++ b/arch/x86/include/asm/irqdomain.h @@ -10,6 +10,7 @@ enum { /* Allocate contiguous CPU vectors */ X86_IRQ_ALLOC_CONTIGUOUS_VECTORS =3D 0x1, X86_IRQ_ALLOC_LEGACY =3D 0x2, + X86_IRQ_ALLOC_AS_NMI =3D 0x4, }; =20 extern int x86_fwspec_is_ioapic(struct irq_fwspec *fwspec); --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8199CC433EF for ; Thu, 5 May 2022 23:58:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387147AbiEFABz (ORCPT ); Thu, 5 May 2022 20:01:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35104 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387045AbiEFAB3 (ORCPT ); Thu, 5 May 2022 20:01:29 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A271160DA8 for ; Thu, 5 May 2022 16:57:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795067; x=1683331067; h=from:to:cc:subject:date:message-id:in-reply-to: references; 
bh=D+sZP1VTaO2AD2PX2fNF30e9SIwhyViy7LM9jGqhP5M=; b=bsFDga76t8yiZChb3uUw+latfT3p6I20MC2+fZHfDlCOAUBGkh1sfEnK kpA+4PHEQeJp2QKk2dmULVMptvy+0tugwMIlITxS1dXtNcHIr8TTLsb+R FBXUEP23YSB3msNH5hKY9OuEXCMmUFN26M95esj7vjiaHGZUNTdilajVW aOSwA/bIspzJKJ3jESRzQ4fy6uNOS+26V3UT8MoEd1NtqX7Dc3c5P6Abd +KzCkmzSOf+LnaKnCNi/zsWpQAxr0fqvN48Iy2djIyPkI4O3u9MiEkbCO iD93KjAxh94L0XD/d7VA5jJ1GHbUcusVnvr1Cxs0yOCXclfmlAgeNSQBY g==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283619" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283619" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:46 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914338" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:45 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 05/29] x86/apic/vector: Do not allocate vectors for NMIs Date: Thu, 5 May 2022 16:59:44 -0700 Message-Id: <20220506000008.30892-6-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Vectors are meaningless when allocating IRQs with NMI as the delivery mode. In such case, skip the reservation of IRQ vectors. 
Do it in the lowest-level functions where the actual IRQ reservation takes place. Since NMIs target specific CPUs, keep the functionality to find the best CPU. Cc: Andi Kleen Cc: "Ravi V. Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri Suggested-by is good enough. --- Changes since v5: * Introduced this patch. Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- arch/x86/kernel/apic/vector.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 838e220e8860..11f881f45cec 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -245,11 +245,20 @@ assign_vector_locked(struct irq_data *irqd, const str= uct cpumask *dest) if (apicd->move_in_progress || !hlist_unhashed(&apicd->clist)) return -EBUSY; =20 + if (apicd->hw_irq_cfg.delivery_mode =3D=3D APIC_DELIVERY_MODE_NMI) { + cpu =3D irq_matrix_find_best_cpu(vector_matrix, dest); + apicd->cpu =3D cpu; + vector =3D 0; + goto no_vector; + } + vector =3D irq_matrix_alloc(vector_matrix, dest, resvd, &cpu); trace_vector_alloc(irqd->irq, vector, resvd, vector); if (vector < 0) return vector; apic_update_vector(irqd, vector, cpu); + +no_vector: apic_update_irq_cfg(irqd, vector, cpu); =20 return 0; @@ -321,12 +330,22 @@ assign_managed_vector(struct irq_data *irqd, const st= ruct cpumask *dest) /* set_affinity might call here for nothing */ if (apicd->vector && cpumask_test_cpu(apicd->cpu, vector_searchmask)) return 0; + + if (apicd->hw_irq_cfg.delivery_mode =3D=3D APIC_DELIVERY_MODE_NMI) { + cpu =3D irq_matrix_find_best_cpu_managed(vector_matrix, dest); + apicd->cpu =3D cpu; + vector =3D 0; + goto no_vector; + } + vector =3D irq_matrix_alloc_managed(vector_matrix, vector_searchmask, &cpu); trace_vector_alloc_managed(irqd->irq, vector, vector); 
if (vector < 0) return vector; apic_update_vector(irqd, vector, cpu); + +no_vector: apic_update_irq_cfg(irqd, vector, cpu); return 0; } @@ -376,6 +395,10 @@ static void x86_vector_deactivate(struct irq_domain *d= om, struct irq_data *irqd) if (apicd->has_reserved) return; =20 + /* NMI IRQs do not have associated vectors; nothing to do. */ + if (apicd->hw_irq_cfg.delivery_mode =3D=3D APIC_DELIVERY_MODE_NMI) + return; + raw_spin_lock_irqsave(&vector_lock, flags); clear_irq_vector(irqd); if (apicd->can_reserve) @@ -472,6 +495,10 @@ static void vector_free_reserved_and_managed(struct ir= q_data *irqd) trace_vector_teardown(irqd->irq, apicd->is_managed, apicd->has_reserved); =20 + /* NMI IRQs do not have associated vectors; nothing to do. */ + if (apicd->hw_irq_cfg.delivery_mode =3D=3D APIC_DELIVERY_MODE_NMI) + return; + if (apicd->has_reserved) irq_matrix_remove_reserved(vector_matrix); if (apicd->is_managed) --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 11653C433FE for ; Thu, 5 May 2022 23:58:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387096AbiEFABv (ORCPT ); Thu, 5 May 2022 20:01:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35106 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387046AbiEFAB3 (ORCPT ); Thu, 5 May 2022 20:01:29 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC32860DBA for ; Thu, 5 May 2022 16:57:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795067; x=1683331067; h=from:to:cc:subject:date:message-id:in-reply-to: references; 
bh=eY8ASsd7/aj+9aYNbmEq9GanUaeYVEui2oMFa5HCVds=; b=YMXxjxb13scJhai/c9kcuT3e1y9qaoy8f8nlPRARxffbHqPCprnB4xfO hXwBos4I70Mr0AyRixDeh0aN79ZtIDaIG1Dz8PuAeI+PvSgy8G8bpns7S cxeaF1gkEOeTa+Fx4vLn+VMg4qS7lHeWQd0g987MT5o2HktG/NKS+w+Sm bhqDCKkDSeA+aROvxoKiHyhVomN3jeiXj8uFyD4cz01pGe7b6on/BNeAu R7fjAvc1VUbT9L0kftyrHRrs9H2YY2tZabw9xYbnOtugI/6zp3kkuUdOd ZdEun7Ym253F4N6UesRlY0S74D2prZjSJBNM4rOBQ4824LeMQ6MBAr7Yt Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283621" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283621" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:46 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914342" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:46 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 06/29] x86/apic/vector: Implement support for NMI delivery mode Date: Thu, 5 May 2022 16:59:45 -0700 Message-Id: <20220506000008.30892-7-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The flag X86_IRQ_ALLOC_AS_NMI indicates to the interrupt controller that it should configure the delivery mode of an IRQ as NMI. Implement such request. 
This causes irq_domain children in the hierarchy to configure their irq_chips accordingly. When no specific delivery mode is requested, continue using the delivery mode of the APIC driver in use. Cc: Andi Kleen Cc: "Ravi V. Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Suggested-by: Thomas Gleixner Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch. Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- arch/x86/kernel/apic/vector.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c index 11f881f45cec..df4d7b9f6e27 100644 --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -570,6 +570,10 @@ static int x86_vector_alloc_irqs(struct irq_domain *do= main, unsigned int virq, if ((info->flags & X86_IRQ_ALLOC_CONTIGUOUS_VECTORS) && nr_irqs > 1) return -ENOSYS; =20 + /* Only one IRQ per NMI */ + if ((info->flags & X86_IRQ_ALLOC_AS_NMI) && nr_irqs !=3D 1) + return -EINVAL; + /* * Catch any attempt to touch the cascade interrupt on a PIC * equipped system. @@ -610,7 +614,15 @@ static int x86_vector_alloc_irqs(struct irq_domain *do= main, unsigned int virq, * default delivery mode of the APIC. Children irq domains * may take the delivery mode from the individual irq * configuration rather than from the APIC driver. + * + * Vectors are meaningless if the delivery mode is NMI. Since + * nr_irqs is 1, we can return. 
*/ + if (info->flags & X86_IRQ_ALLOC_AS_NMI) { + apicd->hw_irq_cfg.delivery_mode =3D APIC_DELIVERY_MODE_NMI; + return 0; + } + apicd->hw_irq_cfg.delivery_mode =3D apic->delivery_mode; =20 /* --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 167C0C433F5 for ; Thu, 5 May 2022 23:58:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387155AbiEFAB7 (ORCPT ); Thu, 5 May 2022 20:01:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387056AbiEFAB3 (ORCPT ); Thu, 5 May 2022 20:01:29 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32C7E60DBD for ; Thu, 5 May 2022 16:57:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795068; x=1683331068; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=s6kx2WD+U+gm4lQdJ9AXtLnfiPC2EdKdrJZetgovSOw=; b=LAQSZFRt0VfLpAxWTIJ93sXM7+vniu54efLdIofxKBwFoy9j7jF3poZ5 YtcyHPYKJORJ67r9K34BiOafHzmMr7xizSFXrQIahVDwbd382b2M4r+2J z7gCod/C8XrRSfOc4a0D4yuhlhXRUIKdp5pZkZ3jJk+PxhZZXYMyrnl1D hE6bs1vl5j7gXoamhTPtbXo0Jq3bPAr2Ko5mSaZWHh4GoH0sFHCeBtiy7 VlIRq063rDeI1TUYi4Vd21NNwkMz/Eqi+Pr24XryCbfr0G2z/YRHn59em 3jT8t3/W+CgWaBuiSLhbCg8loti/KQDXj8LmcsLKgDvvSFuSiDVmEOwvB A==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283623" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283623" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; 
d="scan'208";a="694914346" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:46 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 07/29] iommu/vt-d: Clear the redirection hint when the destination mode is physical Date: Thu, 5 May 2022 16:59:46 -0700 Message-Id: <20220506000008.30892-8-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" When the destination mode of an interrupt is physical APICID, the interrupt is delivered only to the single CPU whose physical APICID is specified in the destination ID field. Therefore, the redirection hint is meaningless. Furthermore, on certain processors, the IOMMU does not deliver the interrupt when the delivery mode is NMI, the redirection hint is set, and the destination mode is physical. Clearing the redirection hint ensures that the NMI is delivered. Cc: Andi Kleen Cc: David Woodhouse Cc: "Ravi V. Shankar" Cc: Lu Baolu Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Suggested-by: Ashok Raj Reviewed-by: Lu Baolu Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch.
Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- drivers/iommu/intel/irq_remapping.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index a67319597884..d2764a71f91a 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1128,7 +1128,17 @@ static void prepare_irte(struct irte *irte, int vector, unsigned int dest) irte->dlvry_mode = apic->delivery_mode; irte->vector = vector; irte->dest_id = IRTE_DEST(dest); - irte->redir_hint = 1; + + /* + * When using the destination mode of physical APICID, only the + * processor specified in @dest receives the interrupt. Thus, the + * redirection hint is meaningless. + * + * Furthermore, on some processors, NMIs with physical destination mode + * and the redirection hint set are delivered as regular interrupts + * or not delivered at all. + */ + irte->redir_hint = apic->dest_mode_logical; } struct irq_remap_ops intel_irq_remap_ops = { -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 580D5C433EF for ; Thu, 5 May 2022 23:58:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387163AbiEFACE (ORCPT ); Thu, 5 May 2022 20:02:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35188 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387070AbiEFABb (ORCPT ); Thu, 5 May 2022 20:01:31 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3309E60DB7 for ; Thu, 5 May 2022 16:57:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795069; x=1683331069; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=UTXQyhtYGmsXw7bhN7zaVr8NO2RbQS3Z18S2QU+fmiU=; b=bAi8KwMmiiRz11vHIlPSx5xBISbI+rV/zdITXHwh/BVlhd6RqYfbHxUA zmGvQzC7N27/aUzi//Bd0cloTmrRnKqwXfauyBZAPIQ/f9lQ1UYVBzBs3 05dvNEkP/UIl0xzgNe1adz4rD+BNXiLR1o5kQzFcXj1lVnv7EbtYbw9Wb NnfF5S553k2QT7bt+TVtNRYvo2Zt9hSWJORaOujxImmxx5+/XWSqmLttM 9rU9IZwK739fnLf256bidM41NAW8MZr8q6GNgVNQHyAaIPxWxMQ7FWVLO CSfqVz1cz/jrM31K2bjNamXnALYtAAwjCxOiolobPJCEFt4H6psI8ZqjJ w==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283625" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283625" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914350" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:47 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. 
Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 08/29] iommu/vt-d: Rework prepare_irte() to support per-IRQ delivery mode Date: Thu, 5 May 2022 16:59:47 -0700 Message-Id: <20220506000008.30892-9-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" struct irq_cfg::delivery_mode specifies the delivery mode of each IRQ separately. Configuring the delivery mode of an IRTE would require adding a third argument to prepare_irte(). Instead, simply take a pointer to the irq_cfg for which an IRTE is being configured. This change does not cause functional changes. Cc: Andi Kleen Cc: David Woodhouse Cc: "Ravi V. Shankar" Cc: Lu Baolu Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Ashok Raj Reviewed-by: Tony Luck Reviewed-by: Lu Baolu Signed-off-by: Ricardo Neri --- Changes since v5: * Only change the signature of prepare_irte(). A separate patch changes the setting of the delivery_mode. Changes since v4: * None Changes since v3: * None Changes since v2: * None Changes since v1: * Introduced this patch. 
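As a rough illustration of the signature change described in the commit message (this is not part of the patch), the sketch below contrasts the old and new calling conventions. The *_sketch types are reduced, hypothetical stand-ins for struct irte and struct irq_cfg, default_delivery_mode is an assumed placeholder for apic->delivery_mode, and the IRTE_DEST() conversion is deliberately left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the kernel's struct irq_cfg. */
struct irq_cfg_sketch {
	unsigned int dest_apicid;
	unsigned int vector;
	unsigned int delivery_mode;
};

/* Hypothetical, reduced stand-in for the kernel's struct irte. */
struct irte_sketch {
	unsigned int dlvry_mode;
	unsigned int vector;
	unsigned int dest_id;
};

/* Assumed placeholder for apic->delivery_mode. */
static const unsigned int default_delivery_mode = 0;

/* Old shape: one scalar argument per value taken from the irq_cfg. */
void prepare_irte_old(struct irte_sketch *irte, int vector, unsigned int dest)
{
	assert(irte);
	memset(irte, 0, sizeof(*irte));
	irte->dlvry_mode = default_delivery_mode;
	irte->vector = (unsigned int)vector;
	irte->dest_id = dest;
}

/* New shape: pass the irq_cfg, so later per-IRQ fields need no new arguments. */
void prepare_irte_new(struct irte_sketch *irte, const struct irq_cfg_sketch *cfg)
{
	assert(irte && cfg);
	memset(irte, 0, sizeof(*irte));
	irte->dlvry_mode = default_delivery_mode; /* still the APIC default at this step */
	irte->vector = cfg->vector;
	irte->dest_id = cfg->dest_apicid;
}
```

Filling an entry through either function from the same configuration values gives identical fields, which is what "no functional changes" amounts to in this sketch.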
--- drivers/iommu/intel/irq_remapping.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index d2764a71f91a..66d37186ec28 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1111,7 +1111,7 @@ void intel_irq_remap_add_device(struct dmar_pci_notify_info *info) dev_set_msi_domain(&info->dev->dev, map_dev_to_ir(info->dev)); } -static void prepare_irte(struct irte *irte, int vector, unsigned int dest) +static void prepare_irte(struct irte *irte, struct irq_cfg *irq_cfg) { memset(irte, 0, sizeof(*irte)); @@ -1126,8 +1126,8 @@ static void prepare_irte(struct irte *irte, int vector, unsigned int dest) */ irte->trigger_mode = 0; irte->dlvry_mode = apic->delivery_mode; - irte->vector = vector; - irte->dest_id = IRTE_DEST(dest); + irte->vector = irq_cfg->vector; + irte->dest_id = IRTE_DEST(irq_cfg->dest_apicid); /* * When using the destination mode of physical APICID, only the @@ -1278,8 +1278,7 @@ static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data, { struct irte *irte = &data->irte_entry; - prepare_irte(irte, irq_cfg->vector, irq_cfg->dest_apicid); - + prepare_irte(irte, irq_cfg); switch (info->type) { case X86_IRQ_ALLOC_TYPE_IOAPIC: /* Set source-id of interrupt request */ -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECAF1C433EF for ; Thu, 5 May 2022 23:58:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1381614AbiEFACH (ORCPT ); Thu, 5 May 2022 20:02:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35192 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with
ESMTP id S1387071AbiEFABb (ORCPT ); Thu, 5 May 2022 20:01:31 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E81460DB9 for ; Thu, 5 May 2022 16:57:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795069; x=1683331069; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=5aqgNQHykrRDChcwPWM1DjTNVBAFYGhJ4hN9bwVMei0=; b=RA2xMnQYXJ6kezS5sIydBflhJ2grYm53g7pjI8DBRErQ4+8L885GiJeC lboDZ5/DScCzQn42sV0W8SX/BALOltkD9HSds4hm6hN3OAe1qqHuuUmwR +AdnyZC6YL8uJto7xsuMUKbkBBbN2VzZonE9+sFTGS5eRZJquxoXlTKmc XvbEUTVQl9ZJYaLAYsx7IRb6AERn/87zGACkBnxKdwu6hUQL19WdbJfhL DqHLYvIbECthUsy8CGJzQjnMWQSbBBul68Dos68QXtcYNUWmRgz4TJJaN w7UrfeemYvhWRv+2QuzSLsmfW8DLb676Y3tGuVwmH1w0AdzfgvBKQUe4y g==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283628" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283628" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914353" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:47 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. 
Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 09/29] iommu/vt-d: Set the IRTE delivery mode individually for each IRQ Date: Thu, 5 May 2022 16:59:48 -0700 Message-Id: <20220506000008.30892-10-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" There are no hardware requirements to use the same delivery mode for all interrupts. Use the mode specified in the provided IRQ hardware configuration data. Since all IRQs are currently configured to use the delivery mode of the APIC driver, the only functional changes occur where IRQs are configured to use a specific delivery mode. Cc: Andi Kleen Cc: David Woodhouse Cc: "Ravi V. Shankar" Cc: Lu Baolu Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Reviewed-by: Lu Baolu Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch. Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- drivers/iommu/intel/irq_remapping.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index 66d37186ec28..fb2d71bea98d 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1125,7 +1125,7 @@ static void prepare_irte(struct irte *irte, struct irq_cfg *irq_cfg) * irq migration in the presence of interrupt-remapping.
*/ irte->trigger_mode = 0; - irte->dlvry_mode = apic->delivery_mode; + irte->dlvry_mode = irq_cfg->delivery_mode; irte->vector = irq_cfg->vector; irte->dest_id = IRTE_DEST(irq_cfg->dest_apicid); -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 081B3C433EF for ; Thu, 5 May 2022 23:58:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387175AbiEFACJ (ORCPT ); Thu, 5 May 2022 20:02:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387094AbiEFABm (ORCPT ); Thu, 5 May 2022 20:01:42 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B030E60DA8 for ; Thu, 5 May 2022 16:57:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795069; x=1683331069; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=5M26CnkIjGrveCxZY+LFQe6aJhW+irvkad9bEfjWjOs=; b=n3jyQnvYgV2Ow2qEPLCBV//jLqjdjsAS0ozlvOZFgTnHu7xkh9jTFrwQ Em+Ml4GGTfwbio7do6pExzZjVJRQJFsPz50Cz6CrOY2fe3+J/hGd/0NnT vWmcVN4GQ/WucrAfUSt7F2y6ghZVU0xUEq7Wzam+K128DPanK3NUK0KyK YojAusZ3rROkD8qd82SZISlm6tzpDEsYoZe6Jvd6FCKi7wKM43rzAxx4v mOAmcfHnYEgHtrI89Qwyav8xZWzUOyDy/UV4z8l6AJfaxPLMZvTG2/Ywe 7D/CG+x+WF37X1jknk4tAYOaTLES45ZbvkvVUY5DnwqeA0Km+nfKHqa/9 g==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283629" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283629" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:48 -0700 X-ExtLoop1: 1 X-IronPort-AV:
E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914357" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:48 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 10/29] iommu/vt-d: Implement minor tweaks for NMI irqs Date: Thu, 5 May 2022 16:59:49 -0700 Message-Id: <20220506000008.30892-11-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The Intel IOMMU interrupt remapping driver already correctly programs the delivery mode of individual irqs as per their irq_data. Improve the handling of NMIs: allow only one irq per NMI. Also, it is not necessary to clean up irq vectors after updating affinity, since NMIs do not have associated vectors. Cc: Andi Kleen Cc: David Woodhouse Cc: "Ravi V. Shankar" Cc: Lu Baolu Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Lu Baolu Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch.
Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- drivers/iommu/intel/irq_remapping.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c index fb2d71bea98d..791a9331e257 100644 --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1198,8 +1198,12 @@ intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask, * After this point, all the interrupts will start arriving * at the new destination. So, time to cleanup the previous * vector allocation. + * + * Do it only for non-NMI irqs. NMIs don't have associated + * vectors. */ - send_cleanup_vector(cfg); + if (cfg->delivery_mode != APIC_DELIVERY_MODE_NMI) + send_cleanup_vector(cfg); return IRQ_SET_MASK_OK_DONE; } @@ -1352,6 +1356,9 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain, if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI) info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS; + if ((info->flags & X86_IRQ_ALLOC_AS_NMI) && nr_irqs != 1) + return -EINVAL; + ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg); if (ret < 0) return ret; -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ADF93C433F5 for ; Thu, 5 May 2022 23:58:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387202AbiEFACZ (ORCPT ); Thu, 5 May 2022 20:02:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35012 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387105AbiEFABm (ORCPT ); Thu, 5 May 2022 20:01:42 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 486E56128F for ; Thu, 5 May 2022 16:57:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795071; x=1683331071; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=fO8Z/cunSnljRh67cIhZwnOKie+7UxkAbp0EHXuT2pk=; b=R/BZoREPEagQvkn9cTpfdazTWZh+mJ1svF8QIhLuUSTtcClwe8DCJdHy HROzjNKx+tVbDE+f2A9b88/0GlmxtnRfAqp8uCe0X1kfUq8KnKzoIgLzI TNplY29Fd8ew9sD41vB7w5X09sf8qnlTAOAo3+ctvYvueI7txZLsXAie0 ULfsgyecekv0bdBE/zJLr9oOmeLn4EQpaORaIvAmCW1uR3d/yG7t18mH7 v6dug/3Ok02NtAU7Dl5r6jtJhi1nyobTLq9pxKEt5YvDUqs5/GXpZXbFN MWo0Mu0j4xw+d93mqyR55+Co1GiHwuMzRJC0qtJ1xQRhInBFB+xtAYqWN g==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283630" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283630" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914361" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:48 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. 
Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri , Suravee Suthikulpanit Subject: [PATCH v6 11/29] iommu/amd: Expose [set|get]_dev_entry_bit() Date: Thu, 5 May 2022 16:59:50 -0700 Message-Id: <20220506000008.30892-12-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" These functions are used to check and set specific bits in a Device Table Entry. For instance, they can be used to modify the setting of the NMIPass field. Currently, these functions are used only for ACPI-specified devices. However, if an interrupt is to be allocated with NMI as its delivery mode, the Device Table Entry needs to be modified accordingly in irq_remapping_alloc(). As a first step, expose these two functions. No functional changes. Cc: Andi Kleen Cc: "Ravi V.
Shankar" Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- drivers/iommu/amd/amd_iommu.h | 3 +++ drivers/iommu/amd/init.c | 4 ++-- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h index 1ab31074f5b3..9f3d1564c84e 100644 --- a/drivers/iommu/amd/amd_iommu.h +++ b/drivers/iommu/amd/amd_iommu.h @@ -128,4 +128,7 @@ static inline void amd_iommu_apply_ivrs_quirks(void) { } extern void amd_iommu_domain_set_pgtable(struct protection_domain *domain, u64 *root, int mode); + +extern void set_dev_entry_bit(u16 devid, u8 bit); +extern int get_dev_entry_bit(u16 devid, u8 bit); #endif diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c index b4a798c7b347..823e76b284f1 100644 --- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -914,7 +914,7 @@ static void iommu_enable_gt(struct amd_iommu *iommu) } /* sets a specific bit in the device table entry.
*/ -static void set_dev_entry_bit(u16 devid, u8 bit) +void set_dev_entry_bit(u16 devid, u8 bit) { int i = (bit >> 6) & 0x03; int _bit = bit & 0x3f; @@ -922,7 +922,7 @@ static void set_dev_entry_bit(u16 devid, u8 bit) amd_iommu_dev_table[devid].data[i] |= (1UL << _bit); } -static int get_dev_entry_bit(u16 devid, u8 bit) +int get_dev_entry_bit(u16 devid, u8 bit) { int i = (bit >> 6) & 0x03; int _bit = bit & 0x3f; -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0989C433EF for ; Thu, 5 May 2022 23:58:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387186AbiEFACT (ORCPT ); Thu, 5 May 2022 20:02:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35142 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387104AbiEFABm (ORCPT ); Thu, 5 May 2022 20:01:42 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 832D460DA1 for ; Thu, 5 May 2022 16:57:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795071; x=1683331071; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=3xUpwlx5TiZjojK61Skg6W12+8Hy6giKnPG/YThYLqA=; b=S8xjDMgbdvpuazQnaNj4yQ5YOOE5QOvopc+AOo7QodhSxRnnZbkh5buN ZszO4vTCACiqsBONHGlB0DMVQ9rjOsUGofKtdtkDzqDf8Kch9sL/p9zQ4 GN+wJCdpOAUU+qbYksv+d0YnGUqodmJkYz3yGMAOOlJ1vKfG8MNRAucCp 8Yz6ombY2VtAZsvOLIJyxOH08lhvcly/Snfv2w7nJS7ohfMYpqu4fJn6w GTk32aksBHBjgq1wY3to1ZpzXJdTfpW5PYdgoGDY0rF5TTL1kJcNHSUmi MLV7e++dHeia7l3uTs32feZmKKMvd8ApbU744v2m4cNFefTtBAUlU87+P w==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283633" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600";
d="scan'208";a="250283633" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914367" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:49 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri , Suravee Suthikulpanit Subject: [PATCH v6 12/29] iommu/amd: Enable NMIPass when allocating an NMI irq Date: Thu, 5 May 2022 16:59:51 -0700 Message-Id: <20220506000008.30892-13-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" As per the AMD I/O Virtualization Technology (IOMMU) Specification, the AMD IOMMU only remaps fixed and arbitrated MSIs. NMIs are controlled by the NMIPass bit of a Device Table Entry. When set, the IOMMU passes through NMI interrupt messages unmapped. Otherwise, they are aborted. Furthermore, Section 2.2.5 Table 19 states that the IOMMU will also abort NMIs when the destination mode is logical. Update the NMIPass setting of a device's DTE when an NMI irq is being allocated. Only do so when the destination mode of the APIC is not logical. Cc: Andi Kleen Cc: "Ravi V. 
Shankar" Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- drivers/iommu/amd/iommu.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a1ada7bff44e..4d7421b6858d 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3156,6 +3156,15 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, info->type != X86_IRQ_ALLOC_TYPE_PCI_MSIX) return -EINVAL; + if (info->flags & X86_IRQ_ALLOC_AS_NMI) { + /* Only one IRQ per NMI */ + if (nr_irqs != 1) + return -EINVAL; + + /* NMIs are aborted when the destination mode is logical. */ + if (apic->dest_mode_logical) + return -EPERM; + } /* * With IRQ remapping enabled, don't need contiguous CPU vectors * to support multiple MSI interrupts. @@ -3208,6 +3217,15 @@ static int irq_remapping_alloc(struct irq_domain *domain, unsigned int virq, goto out_free_parent; } + if (info->flags & X86_IRQ_ALLOC_AS_NMI) { + struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; + + if (!get_dev_entry_bit(devid, DEV_ENTRY_NMI_PASS)) { + set_dev_entry_bit(devid, DEV_ENTRY_NMI_PASS); + iommu_flush_dte(iommu, devid); + } + } + for (i = 0; i < nr_irqs; i++) { irq_data = irq_domain_get_irq_data(domain, virq + i); cfg = irq_data ?
irqd_cfg(irq_data) : NULL; -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30180C433EF for ; Thu, 5 May 2022 23:59:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387250AbiEFACj (ORCPT ); Thu, 5 May 2022 20:02:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35558 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387108AbiEFABm (ORCPT ); Thu, 5 May 2022 20:01:42 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BFD0D61295 for ; Thu, 5 May 2022 16:57:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795071; x=1683331071; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=60YeowQrYtVQTC/Y9/6aG1nPAySkU1vh5niJqWVSb28=; b=WYRBDqbmyC1mf2GOXxhmq7iTYY4kZm8YZd6zHR5HrpZdJG0d5mn5A3Uh mWuTcc9mFgLkgsFDQQMPs1B3YHjFmH+I5yjpBiOSPLKK8lY2n+uAsxXfi HZ4OxCQBdvlJGevFPGWn6QBJwNrpFzrBFj44UVKT/5KrOJ6uyOSLsgMuW oJ1T+0uPxI1hKNLiwPxcBpBGSO0ep8OlzZOrlxNtJLgMGWBxfF4ZjPZIt iwA1Ff1o/UnehMLB/Ojzpc1faC3A03+kgaPK4aiHwv815hdA9I+cYLqAU wW19A6XL5bYAeScnXC35V3y4mxWspD3M8nZb92Ad6nWOFHQtUhGv8PrxY A==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283636" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283636" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914372" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:49 -0700 From: Ricardo Neri To:
Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri , Suravee Suthikulpanit Subject: [PATCH v6 13/29] iommu/amd: Compose MSI messages for NMI irqs in non-IR format Date: Thu, 5 May 2022 16:59:52 -0700 Message-Id: <20220506000008.30892-14-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" If NMIPass is enabled in a device's DTE, the IOMMU lets NMI interrupt messages pass through unmapped. Therefore, the contents of the MSI message, not an IRTE, determine how and where the NMI is delivered. Since the IOMMU driver owns the MSI message of the NMI irq, compose it using the non-interrupt-remapping format. Also, let descendant irqchips write the MSI as appropriate for the device. Cc: Andi Kleen Cc: "Ravi V. 
Shankar" Cc: Joerg Roedel Cc: Suravee Suthikulpanit Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- drivers/iommu/amd/iommu.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index 4d7421b6858d..6e07949b3e2a 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3111,7 +3111,16 @@ static void irq_remapping_prepare_irte(struct amd_ir_data *data, case X86_IRQ_ALLOC_TYPE_HPET: case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: - fill_msi_msg(&data->msi_entry, irte_info->index); + if (irq_cfg->delivery_mode == APIC_DELIVERY_MODE_NMI) + /* + * The IOMMU lets NMIs pass through unmapped. Thus, the + * MSI message, not the IRTE, determines the irq + * configuration. Since we own the MSI message, + * compose it. Descendant irqchips will write it. + */ + __irq_msi_compose_msg(irq_cfg, &data->msi_entry, true); + else + fill_msi_msg(&data->msi_entry, irte_info->index); break; default: @@ -3509,6 +3518,18 @@ static int amd_ir_set_affinity(struct irq_data *data, */ send_cleanup_vector(cfg); + /* + * When the delivery mode of an irq is NMI, the IOMMU lets the NMI + * interrupt messages pass through unmapped. Hence, changes in the + * destination are to be reflected in the NMI message itself, not the + * IRTE. Thus, descendant irqchips must set the affinity and compose + * and write the MSI message. + * + * Also, NMIs do not have an associated vector. No need for cleanup.
+ */ + if (cfg->delivery_mode =3D=3D APIC_DELIVERY_MODE_NMI) + return IRQ_SET_MASK_OK; + return IRQ_SET_MASK_OK_DONE; } =20 --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2442C433EF for ; Thu, 5 May 2022 23:58:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387126AbiEFAC3 (ORCPT ); Thu, 5 May 2022 20:02:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387107AbiEFABm (ORCPT ); Thu, 5 May 2022 20:01:42 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78FA7612A1 for ; Thu, 5 May 2022 16:57:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795073; x=1683331073; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=xjphrR/fdJYxZybC1OeOpVaKdDVRMW9p0gc6LYNcZJo=; b=URLFvnE4ct6dF+OEekMBR4GhFlwSu2/Hxk9ClCVf2+A3qv+DEQguIvDe J5598G22iJk9/WHZKyl9NjXOkB0ua4Szci85NHZbFVPwy1yDuke6ZgsGH B3mA3aeGm/w6QdlreD5CLycPFS+9aoHczb5Cbc3l1CRflqolmpmHS6pju w4bVaJrP7jgncroa6doHL4ideimHNPu3zg45HdFoZR0JBbs73lf0qa4GB FPV2r20Elqx+xrxGVvVIoJdGp6TmZBI9TZprQOAfHCAJ5zQckHnjSkz1m Ya5TH0g+bR7E4dTHh4gnX8y6c1OC0zCgHZly3HMYy27oJjS5mrGDqicE4 Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283638" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283638" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914376" Received: from ranerica-svr.sc.intel.com 
([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:50 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 14/29] x86/hpet: Expose hpet_writel() in header Date: Thu, 5 May 2022 16:59:53 -0700 Message-Id: <20220506000008.30892-15-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" In order to allow hpet_writel() to be used by other components (e.g., the HPET-based hardlockup detector), expose it in the HPET header file. Cc: Andi Kleen Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * None Changes since v4: * Dropped exposing hpet_readq() as it is not needed. 
Changes since v3: * None Changes since v2: * None Changes since v1: * None --- arch/x86/include/asm/hpet.h | 1 + arch/x86/kernel/hpet.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h index ab9f3dd87c80..be9848f0883f 100644 --- a/arch/x86/include/asm/hpet.h +++ b/arch/x86/include/asm/hpet.h @@ -72,6 +72,7 @@ extern int is_hpet_enabled(void); extern int hpet_enable(void); extern void hpet_disable(void); extern unsigned int hpet_readl(unsigned int a); +extern void hpet_writel(unsigned int d, unsigned int a); extern void force_hpet_resume(void); =20 #ifdef CONFIG_HPET_EMULATE_RTC diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index 71f336425e58..47678e7927ff 100644 --- a/arch/x86/kernel/hpet.c +++ b/arch/x86/kernel/hpet.c @@ -79,7 +79,7 @@ inline unsigned int hpet_readl(unsigned int a) return readl(hpet_virt_address + a); } =20 -static inline void hpet_writel(unsigned int d, unsigned int a) +inline void hpet_writel(unsigned int d, unsigned int a) { writel(d, hpet_virt_address + a); } --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0ABDEC433EF for ; Thu, 5 May 2022 23:59:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387265AbiEFACp (ORCPT ); Thu, 5 May 2022 20:02:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387116AbiEFABr (ORCPT ); Thu, 5 May 2022 20:01:47 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 851BD612A4 for ; Thu, 5 May 2022 16:57:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795073; x=1683331073; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=TyPidIU9s9JbwJNwvFvgLrTu4E+Y5y97BFEdQg2cUk8=; b=me41SlqkT0DwRfIzvo/ciQAoCLD1qsyy+SULhWMBbDYPBkr6IZBCGILz uIBBkZP4/VYX7Ffh8bgJA8nLbrhIbPTBA2xAc+WtH8pbKY1H7yx4tz8uq ck7ONI9SE6ZKZ7wXCYnavlaJ3ogLfekAiQBLloYruWD/sz6cu3NzM2ReF bwGOb053f/Rbuc44c5xkws0eciOSfLdaYQhqIyc8VViJ9s+gw663F4dPV PlJ1Kd4Jb3u9vsIFkfKHMmRreHCTv3+2LxNqywYCsWgJJHEiT3/oNxi5i jY5rcNx6HdPYnoYWaPxmbP9kVjV2p7ka0JfPewu+Nu7FXmgFQlHdpjPFo w==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283639" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283639" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914380" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:50 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. 
Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 15/29] x86/hpet: Add helper function hpet_set_comparator_periodic() Date: Thu, 5 May 2022 16:59:54 -0700 Message-Id: <20220506000008.30892-16-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Programming an HPET channel as periodic requires setting the HPET_TN_SETVAL bit in the channel configuration. Plus, the comparator register must be written twice (once for the comparator value and once for the periodic value). Since this programming might be needed in several places (e.g., the HPET clocksource and the HPET-based hardlockup detector), add a helper function for this purpose. A helper function hpet_set_comparator_oneshot() could also be implemented. However, such function would only program the comparator register and the function would be quite small. Hence, it is better to not bloat the code with such an obvious function. Cc: Andi Kleen Cc: Tony Luck Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Originally-by: Suravee Suthikulpanit Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- When programming the HPET channel in periodic mode, a udelay(1) between the two successive writes to HPET_Tn_CMP was introduced in commit e9e2cdb41241 ("[PATCH] clockevents: i386 drivers"). The commit message does not give any reason for such delay. The hardware specification does not seem to require it. The refactoring in this patch simply carries such delay. 
--- Changes since v5: * None Changes since v4: * Implement function only for periodic mode. This removed extra logic to use a non-zero period value as a proxy for periodic mode programming. (Thomas) * Added a comment on the history of the udelay() when programming the channel in periodic mode. (Ashok) Changes since v3: * Added back a missing hpet_writel() for time configuration. Changes since v2: * Introduced this patch. Changes since v1: * N/A --- arch/x86/include/asm/hpet.h | 2 ++ arch/x86/kernel/hpet.c | 49 ++++++++++++++++++++++++++++--------- 2 files changed, 39 insertions(+), 12 deletions(-) diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h index be9848f0883f..486e001413c7 100644 --- a/arch/x86/include/asm/hpet.h +++ b/arch/x86/include/asm/hpet.h @@ -74,6 +74,8 @@ extern void hpet_disable(void); extern unsigned int hpet_readl(unsigned int a); extern void hpet_writel(unsigned int d, unsigned int a); extern void force_hpet_resume(void); +extern void hpet_set_comparator_periodic(int channel, unsigned int cmp, + unsigned int period); =20 #ifdef CONFIG_HPET_EMULATE_RTC =20 diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index 47678e7927ff..2c6713b40921 100644 --- a/arch/x86/kernel/hpet.c +++ b/arch/x86/kernel/hpet.c @@ -294,6 +294,39 @@ static void hpet_enable_legacy_int(void) hpet_legacy_int_enabled =3D true; } =20 +/** + * hpet_set_comparator_periodic() - Helper function to set periodic channel + * @channel: The HPET channel + * @cmp: The value to be written to the comparator/accumulator + * @period: Number of ticks per period + * + * Helper function for updating comparator, accumulator and period values. + * + * In periodic mode, HPET needs HPET_TN_SETVAL to be set before writing + * to the Tn_CMP to update the accumulator. Then, HPET needs a second + * write (with HPET_TN_SETVAL cleared) to Tn_CMP to set the period. + * The HPET_TN_SETVAL bit is automatically cleared after the first write.
+ * + * This function takes a 1 microsecond delay. However, this function is su= pposed + * to be called only once (or when reprogramming the timer) as it deals wi= th a + * periodic timer channel. + * + * See the following documents: + * - Intel IA-PC HPET (High Precision Event Timers) Specification + * - AMD-8111 HyperTransport I/O Hub Data Sheet, Publication # 24674 + */ +void hpet_set_comparator_periodic(int channel, unsigned int cmp, unsigned = int period) +{ + unsigned int v =3D hpet_readl(HPET_Tn_CFG(channel)); + + hpet_writel(v | HPET_TN_SETVAL, HPET_Tn_CFG(channel)); + + hpet_writel(cmp, HPET_Tn_CMP(channel)); + + udelay(1); + hpet_writel(period, HPET_Tn_CMP(channel)); +} + static int hpet_clkevt_set_state_periodic(struct clock_event_device *evt) { unsigned int channel =3D clockevent_to_channel(evt)->num; @@ -306,19 +339,11 @@ static int hpet_clkevt_set_state_periodic(struct cloc= k_event_device *evt) now =3D hpet_readl(HPET_COUNTER); cmp =3D now + (unsigned int)delta; cfg =3D hpet_readl(HPET_Tn_CFG(channel)); - cfg |=3D HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_SETVAL | - HPET_TN_32BIT; + cfg |=3D HPET_TN_ENABLE | HPET_TN_PERIODIC | HPET_TN_32BIT; hpet_writel(cfg, HPET_Tn_CFG(channel)); - hpet_writel(cmp, HPET_Tn_CMP(channel)); - udelay(1); - /* - * HPET on AMD 81xx needs a second write (with HPET_TN_SETVAL - * cleared) to T0_CMP to set the period. The HPET_TN_SETVAL - * bit is automatically cleared after the first write. 
- * (See AMD-8111 HyperTransport I/O Hub Data Sheet, - * Publication # 24674) - */ - hpet_writel((unsigned int)delta, HPET_Tn_CMP(channel)); + + hpet_set_comparator_periodic(channel, cmp, (unsigned int)delta); + hpet_start_counter(); hpet_print_config(); =20 --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30672C433F5 for ; Thu, 5 May 2022 23:58:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387241AbiEFACh (ORCPT ); Thu, 5 May 2022 20:02:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35562 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387106AbiEFABm (ORCPT ); Thu, 5 May 2022 20:01:42 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7B3D60DAA for ; Thu, 5 May 2022 16:57:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795073; x=1683331073; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=yu43tYTm3rL820NWGTlU5e3BSTdrJeh1cVn1o9Y1twQ=; b=XZulpZrOMMMKJxovH7naHeH3WcgDbT7/GBMVfNzwn4gIMqU3IpJ2Xfo9 LNTvFHm0tHHKR0dr/brMB7FC1utQB27pmtGUa/OeBBB1aqR761CvgCbuz QZTOETcWtvH31IqGcJvw1/t7EPq4G+57GqqTy6TwAQ6tiQdz79YxxkPg1 N97MX4odfxHaibHWYwbNU9wojq/S65Yy0h/7XnFHXJH+nV5qq7XTxW5SX 8hRinGyNVFw4FLetuyE96LxV4LppYtlKAZbPI+VcXpv4TPgSiM9RG1th5 RZz0N1SdLsNECXGUmigyNreqvPycb0HBsZMcgC9sao9LkU7gxmmKwe93H A==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283641" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283641" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:51 -0700 
X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914383" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:50 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 16/29] x86/hpet: Prepare IRQ assignments to use the X86_ALLOC_AS_NMI flag Date: Thu, 5 May 2022 16:59:55 -0700 Message-Id: <20220506000008.30892-17-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The flag X86_ALLOC_AS_NMI indicates that the IRQs to be allocated in an IRQ domain need to be configured as NMIs. Add an as_nmi argument to hpet_assign_irq(). Even though the HPET clock events do not need NMI IRQs, the HPET hardlockup detector does. A subsequent changeset will implement the reservation of a channel for it. Cc: Andi Kleen Cc: "Ravi V. Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Suggested-by: Thomas Gleixner Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch. 
Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- arch/x86/kernel/hpet.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index 2c6713b40921..02d25e00e93f 100644 --- a/arch/x86/kernel/hpet.c +++ b/arch/x86/kernel/hpet.c @@ -618,7 +618,7 @@ static inline int hpet_dev_id(struct irq_domain *domain) } =20 static int hpet_assign_irq(struct irq_domain *domain, struct hpet_channel = *hc, - int dev_num) + int dev_num, bool as_nmi) { struct irq_alloc_info info; =20 @@ -627,6 +627,8 @@ static int hpet_assign_irq(struct irq_domain *domain, s= truct hpet_channel *hc, info.data =3D hc; info.devid =3D hpet_dev_id(domain); info.hwirq =3D dev_num; + if (as_nmi) + info.flags |=3D X86_IRQ_ALLOC_AS_NMI; =20 return irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, &info); } @@ -755,7 +757,7 @@ static void __init hpet_select_clockevents(void) =20 sprintf(hc->name, "hpet%d", i); =20 - irq =3D hpet_assign_irq(hpet_domain, hc, hc->num); + irq =3D hpet_assign_irq(hpet_domain, hc, hc->num, false); if (irq <=3D 0) continue; =20 --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 793CDC433F5 for ; Thu, 5 May 2022 23:59:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233742AbiEFAC6 (ORCPT ); Thu, 5 May 2022 20:02:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35548 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387122AbiEFABs (ORCPT ); Thu, 5 May 2022 20:01:48 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81B48612AC for ; Thu, 5 May 2022 16:57:54 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795074; x=1683331074; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=eMvcJDLLgv1piJPvn9rpsQ5W0zxPoYZ7HqfVKzdrDNs=; b=g/tj6SB7eJFUNZLsZ3LhFWCGdd10Xz9MKBaIfRpP9IW2YfmJefGgwa2+ 4ddlXKmn3mQPSUYG1JmM0O4sfDR5UdFrtfxQtyDR+QxfgJXzaCMTVf8zC SpKKIMXEoOW7H/UlEcXdgk5Owqm5V/ik/wdjM+A2xGfCse3hEzbEFI9XK 4PQ835JONUcHQE3zV3SzfFTJHm9JvQjbHTGlq3rC4Yu/Ug2P6k1dqDnI/ Pki4RLkjLElxqU+YzXQX0jfU3DZjZcYyaiMj5AqZ/rvNeTCT5wyBhC/fm +43fkWETQwU/mBPAiyFbYUjWVvvWbPSW/ZsPJNtKp4v3AgxeYKaqnxzv9 w==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283643" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283643" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:51 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914386" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:51 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. 
Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 17/29] x86/hpet: Reserve an HPET channel for the hardlockup detector Date: Thu, 5 May 2022 16:59:56 -0700 Message-Id: <20220506000008.30892-18-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The HPET hardlockup detector needs a dedicated HPET channel. Hence, create a new HPET_MODE_NMI_WATCHDOG mode category to indicate that it cannot be used for other purposes. Using MSI interrupts greatly simplifies the implementation of the detector. Specifically, it helps to avoid the complexities of routing the interrupt via the IO-APIC (e.g., potential race conditions that arise from re-programming the IO-APIC while also servicing an NMI). Therefore, only reserve the timer if it supports Front Side Bus interrupt delivery. HPET channels are reserved at various stages. First, from x86_late_time_init(), hpet_time_init() checks if the HPET timer supports Legacy Replacement Routing. If this is the case, channels 0 and 1 are reserved as HPET_MODE_LEGACY. At a later stage, from lockup_detector_init(), reserve the HPET channel for the hardlockup detector. Then, the HPET clocksource reserves the channels it needs and then the remaining channels are given to the HPET char driver via hpet_alloc(). Hence, the channel assigned to the HPET hardlockup detector depends on whether the first two channels are reserved for legacy mode. Lastly, only reserve the channel for the hardlockup detector if enabled in the kernel command line. Cc: Andi Kleen Cc: Stephane Eranian Cc: "Ravi V. 
Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Added a check for the allowed maximum frequency of the HPET. * Added hpet_hld_free_timer() to properly free the reserved HPET channel if the initialization is not completed. * Call hpet_assign_irq() with as_nmi =3D true. * Relocated declarations of functions and data structures of the detector to not depend on CONFIG_HPET_TIMER. Changes since v4: * Reworked timer reservation to use Thomas' rework on HPET channel management. * Removed hard-coded channel number for the hardlockup detector. * Provided more details on the sequence of HPET channel reservations. (Thomas Gleixner) * Only reserve a channel for the hardlockup detector if enabled via kernel command line. The function reserving the channel is called from hardlockup detector. (Thomas Gleixner) * Shorten the name of hpet_hardlockup_detector_get_timer() to hpet_hld_get_timer(). (Andi) * Simplify error handling when a channel is not found. 
(Tony) Changes since v3: * None Changes since v2: * None Changes since v1: * None --- arch/x86/include/asm/hpet.h | 22 ++++++++ arch/x86/kernel/hpet.c | 105 ++++++++++++++++++++++++++++++++++++ 2 files changed, 127 insertions(+) diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h index 486e001413c7..5762bd0169a1 100644 --- a/arch/x86/include/asm/hpet.h +++ b/arch/x86/include/asm/hpet.h @@ -103,4 +103,26 @@ static inline int is_hpet_enabled(void) { return 0; } #define default_setup_hpet_msi NULL =20 #endif + +#ifdef CONFIG_X86_HARDLOCKUP_DETECTOR_HPET +/** + * struct hpet_hld_data - Data needed to operate the detector + * @has_periodic: The HPET channel supports periodic mode + * @channel: HPET channel assigned to the detector + * @channel_priv: Private data of the assigned channel + * @ticks_per_second: Frequency of the HPET timer + * @irq: IRQ number assigned to the HPET channel + */ +struct hpet_hld_data { + bool has_periodic; + u32 channel; + struct hpet_channel *channel_priv; + u64 ticks_per_second; + int irq; +}; + +extern struct hpet_hld_data *hpet_hld_get_timer(void); +extern void hpet_hld_free_timer(struct hpet_hld_data *hdata); +#endif /* CONFIG_X86_HARDLOCKUP_DETECTOR_HPET */ + #endif /* _ASM_X86_HPET_H */ diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index 02d25e00e93f..ee9275c013f5 100644 --- a/arch/x86/kernel/hpet.c +++ b/arch/x86/kernel/hpet.c @@ -20,6 +20,7 @@ enum hpet_mode { HPET_MODE_LEGACY, HPET_MODE_CLOCKEVT, HPET_MODE_DEVICE, + HPET_MODE_NMI_WATCHDOG, }; =20 struct hpet_channel { @@ -216,6 +217,7 @@ static void __init hpet_reserve_platform_timers(void) break; case HPET_MODE_CLOCKEVT: case HPET_MODE_LEGACY: + case HPET_MODE_NMI_WATCHDOG: hpet_reserve_timer(&hd, hc->num); break; } @@ -1496,3 +1498,106 @@ irqreturn_t hpet_rtc_interrupt(int irq, void *dev_i= d) } EXPORT_SYMBOL_GPL(hpet_rtc_interrupt); #endif + +#ifdef CONFIG_X86_HARDLOCKUP_DETECTOR_HPET + +/* + * We program the timer in 32-bit mode to reduce the
number of register + * accesses. The maximum value of watch_thresh is 60 seconds. The HPET cou= nter + * should not wrap around more frequently than that. Thus, the frequency o= f the + * HPET timer must be less than 71.582788 MHz. For safety, limit the frequ= ency + * to 85% of the maximum frequency. + * + * The frequency of the HPET on systems in the field is usually less than = 24MHz. + */ +#define HPET_HLD_MAX_FREQ 60845000ULL + +static struct hpet_hld_data *hld_data; + +/** + * hpet_hld_free_timer - Free the reserved timer for the hardlockup detect= or + * + * Free the resources held by the HPET channel reserved for the hard lockup + * detector and make it available for other uses. + * + * Returns: none + */ +void hpet_hld_free_timer(struct hpet_hld_data *hdata) +{ + hdata->channel_priv->mode =3D HPET_MODE_UNUSED; + hdata->channel_priv->in_use =3D 0; + kfree(hld_data); +} + +/** + * hpet_hld_get_timer - Get an HPET channel for the hardlockup detector + * + * Reserve an HPET channel and return the timer information to caller only = if a + * channel is available and supports FSB mode. This function is called by = the + * hardlockup detector only if enabled in the kernel command line. + * + * Returns: a pointer with the properties of the reserved HPET channel. + */ +struct hpet_hld_data *hpet_hld_get_timer(void) +{ + struct hpet_channel *hc =3D hpet_base.channels; + int i, irq; + + if (hpet_freq > HPET_HLD_MAX_FREQ) + return NULL; + + for (i =3D 0; i < hpet_base.nr_channels; i++) { + hc =3D hpet_base.channels + i; + + /* + * Associate the first unused channel to the hardlockup + * detector. Bail out if we cannot find one. This may happen if + * the HPET clocksource has taken all the timers. The HPET driver + * (/dev/hpet) should not take timers at this point as channels + * for such driver can only be reserved from user space.
+ */ + if (hc->mode =3D=3D HPET_MODE_UNUSED) + break; + } + + if (i =3D=3D hpet_base.nr_channels) + return NULL; + + if (!(hc->boot_cfg & HPET_TN_FSB_CAP)) + return NULL; + + hld_data =3D kzalloc(sizeof(*hld_data), GFP_KERNEL); + if (!hld_data) + return NULL; + + hc->mode =3D HPET_MODE_NMI_WATCHDOG; + hc->in_use =3D 1; + hld_data->channel_priv =3D hc; + + if (hc->boot_cfg & HPET_TN_PERIODIC_CAP) + hld_data->has_periodic =3D true; + + if (!hpet_domain) + hpet_domain =3D hpet_create_irq_domain(hpet_blockid); + + if (!hpet_domain) + goto err; + + /* Assign an IRQ with NMI delivery mode. */ + irq =3D hpet_assign_irq(hpet_domain, hc, hc->num, true); + if (irq <=3D 0) + goto err; + + hc->irq =3D irq; + hld_data->irq =3D irq; + hld_data->channel =3D i; + hld_data->ticks_per_second =3D hpet_freq; + + return hld_data; + +err: + hpet_hld_free_timer(hld_data); + hld_data =3D NULL; + return NULL; +} +#endif /* CONFIG_X86_HARDLOCKUP_DETECTOR_HPET */ --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99FBEC433EF for ; Thu, 5 May 2022 23:59:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1357309AbiEFADV (ORCPT ); Thu, 5 May 2022 20:03:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387128AbiEFABs (ORCPT ); Thu, 5 May 2022 20:01:48 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1A8B60DAE for ; Thu, 5 May 2022 16:57:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795074; x=1683331074; h=from:to:cc:subject:date:message-id:in-reply-to: references; 
bh=t1y1ryZh2p05ZGgOOcX19YcqK36Xcxw5j/JRqkc11No=; b=H14gOds5Pn6jcoTi5IAwEoiWHMhOMcjOzXjS3Wkw93u5Yrlhazx5bUSX c/Zr3HcGO8G8aFtznMWPA6GqGNCF7+OoeTVbbkspF4Yh80zz0uu0eC+Nr hU6eT6zcVLY46UqcRZBwfniqBiyQ629G3uHNd3Pa7mIgjw3iau6Mn1JPG axPwiBnZ/7yA14LIjBnP/D0Dh22N0uYRehyOmmnBHDYpK0agZ/Fgk8t70 g6vWQmUB9j+3uG67+RWsgo0z/AWD5UTE2y6gbnW3wuQqmHN+5CRqkcL4L 3hG51FjBCyzfXUlZ0d0YaelKH3absuVcZIblxuT4q06tmlWf0o7/l7PIz A==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283645" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283645" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914391" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:51 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 18/29] watchdog/hardlockup: Define a generic function to detect hardlockups Date: Thu, 5 May 2022 16:59:57 -0700 Message-Id: <20220506000008.30892-19-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The procedure to detect hardlockups is independent of the underlying mechanism that generates the non-maskable interrupt used to drive the detector. 
Thus, it can be put in a separate, generic function. In this manner, it can be invoked by various implementations of the NMI watchdog. For this purpose, move the bulk of watchdog_overflow_callback() to the new function inspect_for_hardlockups(). This function can then be called from the applicable NMI handlers. No functional changes. Cc: Andi Kleen Cc: Nicholas Piggin Cc: Andrew Morton Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * None Changes since v4: * None Changes since v3: * None Changes since v2: * None Changes since v1: * None --- include/linux/nmi.h | 1 + kernel/watchdog_hld.c | 18 +++++++++++------- 2 files changed, 12 insertions(+), 7 deletions(-) diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 750c7f395ca9..1b68f48ad440 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -207,6 +207,7 @@ int proc_nmi_watchdog(struct ctl_table *, int , void *,= size_t *, loff_t *); int proc_soft_watchdog(struct ctl_table *, int , void *, size_t *, loff_t = *); int proc_watchdog_thresh(struct ctl_table *, int , void *, size_t *, loff_= t *); int proc_watchdog_cpumask(struct ctl_table *, int, void *, size_t *, loff_= t *); +void inspect_for_hardlockups(struct pt_regs *regs); =20 #ifdef CONFIG_HAVE_ACPI_APEI_NMI #include diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c index 247bf0b1582c..b352e507b17f 100644 --- a/kernel/watchdog_hld.c +++ b/kernel/watchdog_hld.c @@ -106,14 +106,8 @@ static struct perf_event_attr wd_hw_attr =3D { .disabled =3D 1, }; =20 -/* Callback function for perf event subsystem */ -static void watchdog_overflow_callback(struct perf_event *event, - struct perf_sample_data *data, - struct pt_regs *regs) +void inspect_for_hardlockups(struct pt_regs *regs) { - /* Ensure the watchdog never gets throttled */ - event->hw.interrupts =3D 0; - if 
(__this_cpu_read(watchdog_nmi_touch) =3D=3D true) { __this_cpu_write(watchdog_nmi_touch, false); return; @@ -163,6 +157,16 @@ static void watchdog_overflow_callback(struct perf_eve= nt *event, return; } =20 +/* Callback function for perf event subsystem */ +static void watchdog_overflow_callback(struct perf_event *event, + struct perf_sample_data *data, + struct pt_regs *regs) +{ + /* Ensure the watchdog never gets throttled */ + event->hw.interrupts =3D 0; + inspect_for_hardlockups(regs); +} + static int hardlockup_detector_event_create(void) { unsigned int cpu =3D smp_processor_id(); --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59976C433EF for ; Thu, 5 May 2022 23:59:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387309AbiEFADX (ORCPT ); Thu, 5 May 2022 20:03:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35764 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387129AbiEFABs (ORCPT ); Thu, 5 May 2022 20:01:48 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5274C60DB4 for ; Thu, 5 May 2022 16:57:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795075; x=1683331075; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=/O+HEX9RnyO8ZBUK3lK021eADBWVL0Yl8Mx87L868do=; b=Bacrrn10qNt3fZKGU4HocyRE3pr6vF9pu1IyA8P6kjJh6awJn6hucxao r+YypSpKObHslD76oqqbPUhWF2hbsv/vPzxKDH8iJ8x5qUYqJcj+I1YNu niSmiC21WXLaijVuFYAJ5TgV2dPrqhOlXFC4iMrdPxs3ps6+UP1VABPxf o2AK6+iUcLuSiQlH2HhatywFJ+HYPP8nb7BKWeKlsCGQQ1p+53aFF6pv5 p5fn0nDzzEdwzzWR7Isxdk0hSNNRa/fEdqSTKc7fOLoVtv568T6xlDCn5 
cg/VTSGP7+OceqZZBEcO/77q9fmIxuZ2nS/lfF3i2/1VPP2XIodGt1PPi Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283646" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283646" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914406" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:52 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 19/29] watchdog/hardlockup: Decouple the hardlockup detector from perf Date: Thu, 5 May 2022 16:59:58 -0700 Message-Id: <20220506000008.30892-20-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The current default implementation of the hardlockup detector assumes that it is implemented using perf events. However, the hardlockup detector can be driven by other sources of non-maskable interrupts (e.g., a properly configured timer). Group and wrap in #ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF all the code specific to perf: creating and managing perf events, and stopping and starting the perf-based detector.
The generic portion of the detector (monitor the timers' thresholds, check timestamps and detect hardlockups as well as the implementation of arch_touch_nmi_watchdog()) is now selected with the new intermediate config symbol CONFIG_HARDLOCKUP_DETECTOR_CORE. The perf-based implementation of the detector selects the new intermediate symbol. Other implementations should do the same. Cc: Andi Kleen Cc: Nicholas Piggin Cc: Andrew Morton Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * None Changes since v4: * None Changes since v3: * Squashed into this patch a previous patch to make arch_touch_nmi_watchdog() part of the core detector code. Changes since v2: * Undid split of the generic hardlockup detector into a separate file. (Thomas Gleixner) * Added a new intermediate symbol CONFIG_HARDLOCKUP_DETECTOR_CORE to select generic parts of the detector (Paul E. McKenney, Thomas Gleixner). Changes since v1: * Make the generic detector code with CONFIG_HARDLOCKUP_DETECTOR. 
--- include/linux/nmi.h | 5 ++++- kernel/Makefile | 2 +- kernel/watchdog_hld.c | 32 ++++++++++++++++++++------------ lib/Kconfig.debug | 4 ++++ 4 files changed, 29 insertions(+), 14 deletions(-) diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 1b68f48ad440..cf12380e51b3 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -94,8 +94,11 @@ static inline void hardlockup_detector_disable(void) {} # define NMI_WATCHDOG_SYSCTL_PERM 0444 #endif =20 -#if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF) +#if defined(CONFIG_HARDLOCKUP_DETECTOR_CORE) extern void arch_touch_nmi_watchdog(void); +#endif + +#if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF) extern void hardlockup_detector_perf_stop(void); extern void hardlockup_detector_perf_restart(void); extern void hardlockup_detector_perf_disable(void); diff --git a/kernel/Makefile b/kernel/Makefile index 847a82bfe0e3..27e75b735ef7 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -95,7 +95,7 @@ obj-$(CONFIG_FAIL_FUNCTION) +=3D fail_function.o obj-$(CONFIG_KGDB) +=3D debug/ obj-$(CONFIG_DETECT_HUNG_TASK) +=3D hung_task.o obj-$(CONFIG_LOCKUP_DETECTOR) +=3D watchdog.o -obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) +=3D watchdog_hld.o +obj-$(CONFIG_HARDLOCKUP_DETECTOR_CORE) +=3D watchdog_hld.o obj-$(CONFIG_SECCOMP) +=3D seccomp.o obj-$(CONFIG_RELAY) +=3D relay.o obj-$(CONFIG_SYSCTL) +=3D utsname_sysctl.o diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c index b352e507b17f..bb6435978c46 100644 --- a/kernel/watchdog_hld.c +++ b/kernel/watchdog_hld.c @@ -22,12 +22,8 @@ =20 static DEFINE_PER_CPU(bool, hard_watchdog_warn); static DEFINE_PER_CPU(bool, watchdog_nmi_touch); -static DEFINE_PER_CPU(struct perf_event *, watchdog_ev); -static DEFINE_PER_CPU(struct perf_event *, dead_event); -static struct cpumask dead_events_mask; =20 static unsigned long hardlockup_allcpu_dumped; -static atomic_t watchdog_cpus =3D ATOMIC_INIT(0); =20 notrace void arch_touch_nmi_watchdog(void) { @@ -98,14 +94,6 @@ static inline bool 
watchdog_check_timestamp(void) } #endif =20 -static struct perf_event_attr wd_hw_attr =3D { - .type =3D PERF_TYPE_HARDWARE, - .config =3D PERF_COUNT_HW_CPU_CYCLES, - .size =3D sizeof(struct perf_event_attr), - .pinned =3D 1, - .disabled =3D 1, -}; - void inspect_for_hardlockups(struct pt_regs *regs) { if (__this_cpu_read(watchdog_nmi_touch) =3D=3D true) { @@ -157,6 +145,24 @@ void inspect_for_hardlockups(struct pt_regs *regs) return; } =20 +#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF +#undef pr_fmt +#define pr_fmt(fmt) "NMI perf watchdog: " fmt + +static DEFINE_PER_CPU(struct perf_event *, watchdog_ev); +static DEFINE_PER_CPU(struct perf_event *, dead_event); +static struct cpumask dead_events_mask; + +static atomic_t watchdog_cpus =3D ATOMIC_INIT(0); + +static struct perf_event_attr wd_hw_attr =3D { + .type =3D PERF_TYPE_HARDWARE, + .config =3D PERF_COUNT_HW_CPU_CYCLES, + .size =3D sizeof(struct perf_event_attr), + .pinned =3D 1, + .disabled =3D 1, +}; + /* Callback function for perf event subsystem */ static void watchdog_overflow_callback(struct perf_event *event, struct perf_sample_data *data, @@ -298,3 +304,5 @@ int __init hardlockup_detector_perf_init(void) } return ret; } + +#endif /* CONFIG_HARDLOCKUP_DETECTOR_PERF */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index 55b9acb2f524..1640532cdc6a 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1079,9 +1079,13 @@ config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE default 0 if !BOOTPARAM_SOFTLOCKUP_PANIC default 1 if BOOTPARAM_SOFTLOCKUP_PANIC =20 +config HARDLOCKUP_DETECTOR_CORE + bool + config HARDLOCKUP_DETECTOR_PERF bool select SOFTLOCKUP_DETECTOR + select HARDLOCKUP_DETECTOR_CORE =20 # # Enables a timestamp based low pass filter to compensate for perf based --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by 
smtp.lore.kernel.org (Postfix) with ESMTP id 4485EC433FE for ; Thu, 5 May 2022 23:59:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387184AbiEFADG (ORCPT ); Thu, 5 May 2022 20:03:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387141AbiEFABt (ORCPT ); Thu, 5 May 2022 20:01:49 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 622EC612BB for ; Thu, 5 May 2022 16:57:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795076; x=1683331076; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=Nult8sm5xsB/xmikf6OohjRwLwTEdiOtEi3jvHsqyXk=; b=XbU5pEJw2XhDEePLdjTKFhY54/ePm8S6yhvSvV2w1yv07NbMSgkP0mp6 XJXptj2uC3YuSn9J4SeCqdApp8HAOQwfYyhdr9or+/JJk+r3ovVSHqsxh 52bM9VzCSnO+0HJh1EcferownI9pXDMxq3xJHPv9c1RpakZ/iHeZSTtCx DMb+MxA07d1tLgT6zG9UN435RXF/PNdJPPOtYlgnAA3YavdRHE9kSeQJ5 /tViV01w5OY48FyUKPGXzmHGjsUCJGx6nytbb3W8bIMW7u03YBxdSxuJq 2ImK/5xa+r3YvdkD7L6EUSwOCGoRsH9JgXfw0Aj2PChK7mJXCZjNLtE68 w==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283648" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283648" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914415" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:52 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. 
Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 20/29] init/main: Delay initialization of the lockup detector after smp_init() Date: Thu, 5 May 2022 16:59:59 -0700 Message-Id: <20220506000008.30892-21-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Certain implementations of the hardlockup detector require support for Inter-Processor Interrupt shorthands. On x86, support for these can only be determined after all the possible CPUs have booted once (in smp_init()). Other architectures may not need such a check. lockup_detector_init() only initializes the data structures of the lockup detector. Hence, it has no ordering dependency on smp_init() and can be called after it. Cc: Andi Kleen Cc: Nicholas Piggin Cc: Andrew Morton Cc: Stephane Eranian Cc: "Ravi V.
Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri Acked-by: Nicholas Piggin --- Changes since v5: * Introduced this patch Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- init/main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/init/main.c b/init/main.c index 98182c3c2c4b..62c52c9e4c2b 100644 --- a/init/main.c +++ b/init/main.c @@ -1600,9 +1600,11 @@ static noinline void __init kernel_init_freeable(voi= d) =20 rcu_init_tasks_generic(); do_pre_smp_initcalls(); - lockup_detector_init(); =20 smp_init(); + + lockup_detector_init(); + sched_init_smp(); =20 padata_init(); --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1ADF5C433F5 for ; Thu, 5 May 2022 23:59:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245703AbiEFADN (ORCPT ); Thu, 5 May 2022 20:03:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35012 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1385624AbiEFACO (ORCPT ); Thu, 5 May 2022 20:02:14 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8B61D612BD for ; Thu, 5 May 2022 16:57:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795076; x=1683331076; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=L3Ao7/ADM/27W4au5Ct5JCZDRw2G77dHHek57JRUTvg=; b=E2NtPqViFWrn2atGGZYkTXkyMPghh01m6m/aaUYnjDRPlyrpGbPSNngE UROZnMTxsZlxcj7P/G7WP/mXQ10bKYo30mllB8ulfdqkHLjhX83LgEufs 
cYp96O3Pdsfq+TMstygsrenDXUP9CCJWAvpzyPjOeG563AXruixSxvQw4 NhrbxMk0w93hYcc3t95G3vFREDWZplKDti8NYIp2uO4H2QqwxqpQ9Cs4z Wgbb8RiovDg1QIvouwOL74ff5B9ZVvMHxtVmWIanP5RZXqXX8lbG7tqnY FgsBUd101imyK2fRhEvDcFrX3O2R+qP8BKDqnHHt+uUBw9FVFVZvrSVnW Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283651" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283651" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914422" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:53 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 21/29] x86/nmi: Add an NMI_WATCHDOG NMI handler category Date: Thu, 5 May 2022 17:00:00 -0700 Message-Id: <20220506000008.30892-22-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Add NMI_WATCHDOG as a new category of NMI handler. This new category is to be used with the HPET-based hardlockup detector. This detector does not have a direct way of checking if the HPET timer is the source of the NMI. Instead, it estimates this indirectly using the time-stamp counter.
Therefore, false positives are possible if another NMI occurs within the estimated time window. For this reason, the handler of the detector should be called after all the NMI_LOCAL handlers. A simple way of achieving this is with a new NMI handler category. Cc: Andi Kleen Cc: Andrew Morton Cc: "Ravi V. Shankar" Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Updated to call instrumentation_end() as per f051f6979550 ("x86/nmi: Protect NMI entry against instrumentation") Changes since v4: * None Changes since v3: * None Changes since v2: * Introduced this patch. Changes since v1: * N/A --- arch/x86/include/asm/nmi.h | 1 + arch/x86/kernel/nmi.c | 10 ++++++++++ 2 files changed, 11 insertions(+) diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h index 1cb9c17a4cb4..4a0d5b562c91 100644 --- a/arch/x86/include/asm/nmi.h +++ b/arch/x86/include/asm/nmi.h @@ -28,6 +28,7 @@ enum { NMI_UNKNOWN, NMI_SERR, NMI_IO_CHECK, + NMI_WATCHDOG, NMI_MAX }; =20 diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c index e73f7df362f5..fde387e0812a 100644 --- a/arch/x86/kernel/nmi.c +++ b/arch/x86/kernel/nmi.c @@ -61,6 +61,10 @@ static struct nmi_desc nmi_desc[NMI_MAX] =3D .lock =3D __RAW_SPIN_LOCK_UNLOCKED(&nmi_desc[3].lock), .head =3D LIST_HEAD_INIT(nmi_desc[3].head), }, + { + .lock =3D __RAW_SPIN_LOCK_UNLOCKED(&nmi_desc[4].lock), + .head =3D LIST_HEAD_INIT(nmi_desc[4].head), + }, =20 }; =20 @@ -168,6 +172,8 @@ int __register_nmi_handler(unsigned int type, struct nm= iaction *action) */ WARN_ON_ONCE(type =3D=3D NMI_SERR && !list_empty(&desc->head)); WARN_ON_ONCE(type =3D=3D NMI_IO_CHECK && !list_empty(&desc->head)); + WARN_ON_ONCE(type =3D=3D NMI_WATCHDOG && !list_empty(&desc->head)); + =20 /* * some handlers need to be executed first otherwise a fake @@ -379,6 +385,10 @@ static noinstr void default_do_nmi(struct pt_regs *reg=
s) } raw_spin_unlock(&nmi_reason_lock); =20 + handled =3D nmi_handle(NMI_WATCHDOG, regs); + if (handled =3D=3D NMI_HANDLED) + goto out; + /* * Only one NMI can be latched at a time. To handle * this we may process multiple nmi handlers at once to --=20 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E14DEC433F5 for ; Fri, 6 May 2022 00:00:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236329AbiEFADk (ORCPT ); Thu, 5 May 2022 20:03:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36696 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387181AbiEFACQ (ORCPT ); Thu, 5 May 2022 20:02:16 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38B9A61602 for ; Thu, 5 May 2022 16:57:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795077; x=1683331077; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=NyURYc8oxJjVJOB30x31veFmVCCJTY1vDrVWNQ4s5WQ=; b=LhlrLsGtCo0LNpvdKefAeymtvoyV9yN6KVAaLZoRF+uHDUvDeCWLrqMD RLmZjV0oREhK+xgHQtylI8wnfIL6MbpOmIXFkxkJBmhuCqoUPutMypicF 3a67GZ61ExvTE7g4Z5RvGpcdl6Y9NGRjuYkoMhXjRT77WTHY+a+6cWbHn m8Wa/6qbpDjsphmGMmRi6NK1hBNBMt+cv3hkbo1er9EMl5i4EJrjJR3mu s5c3DjegxNCNV0z7TjS9xfmsEtYcvLijIiBmc6Fs4LXlRKOkCjY5eU70E LIjtZmVn0w/GfINHUtBfiZbaDZHYocMmH2G4cyRUcQI0i/x3VneiKc28c w==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283655" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283655" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:54 -0700 X-ExtLoop1: 1 
X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914428" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:53 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 22/29] x86/watchdog/hardlockup: Add an HPET-based hardlockup detector Date: Thu, 5 May 2022 17:00:01 -0700 Message-Id: <20220506000008.30892-23-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Implement a hardlockup detector that uses an HPET channel as the source of the non-maskable interrupt. Implement the basic functionality to start, stop, and configure the timer. Designate as the handling CPU one of the CPUs that the detector monitors. Use it to service the NMI from the HPET channel. When servicing the HPET NMI, issue an inter-processor interrupt to the rest of the monitored CPUs. Only enable the detector if IPI shorthands are enabled in the system. During operation, the HPET registers are only accessed to kick the timer. This operation can be avoided if the detector is assigned a periodic HPET channel. To configure the HPET channel interrupt, the detector relies on the interrupt subsystem to configure the delivery mode as NMI (as requested in hpet_hld_get_timer()) throughout the IRQ hierarchy. This covers systems with and without interrupt remapping enabled.
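The NMI fan-out described in this commit message can be sketched in plain C. This is only a user-space model under stated assumptions, not the kernel code: 64-bit masks stand in for monitored_cpumask and inspect_cpumask, and the names struct hld_model, model_hpet_nmi() and model_ipi_nmi() are hypothetical helpers introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model: one bit per CPU, at most 64 CPUs. */
struct hld_model {
	uint64_t monitored;	/* CPUs the detector monitors */
	uint64_t inspect;	/* CPUs flagged to treat the next NMI as ours */
};

enum nmi_result { NOT_OURS, HANDLED_IGNORED, HANDLED_INSPECTED };

/*
 * Models the handling CPU when the HPET channel fires. If more than one
 * CPU is monitored, flag every online CPU and report that an NMI IPI
 * would be broadcast. The handling CPU always inspects itself.
 */
static bool model_hpet_nmi(struct hld_model *m, uint64_t online)
{
	/* x & (x - 1) is non-zero only when more than one bit is set. */
	if (m->monitored & (m->monitored - 1)) {
		m->inspect = online;
		return true;
	}
	return false;
}

/*
 * Models any other CPU receiving an NMI. A CPU claims the NMI only if
 * its flag was set, clears the flag, and inspects for hardlockups only
 * if it is monitored.
 */
static enum nmi_result model_ipi_nmi(struct hld_model *m, unsigned int cpu)
{
	uint64_t bit = UINT64_C(1) << cpu;

	if (!(m->inspect & bit))
		return NOT_OURS;

	m->inspect &= ~bit;
	return (m->monitored & bit) ? HANDLED_INSPECTED : HANDLED_IGNORED;
}
```

The per-CPU flag is what lets a CPU tell a watchdog IPI apart from an unrelated NMI: a second NMI on the same CPU finds its bit already cleared and is left for other handlers.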
The detector is not functional at this stage. A subsequent changeset will invoke the interfaces implemented in this changeset to start, stop, and reconfigure the detector. Another subsequent changeset implements logic to determine if the HPET timer caused the NMI. For now, implement a stub function. Cc: Andi Kleen Cc: Stephane Eranian Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Squashed a previously separate patch to support interrupt remapping into this patch. There is no need to handle interrupt remapping separately. All the necessary plumbing is done in the interrupt subsystem. Now it uses request_irq(). * Use IPI shorthands to send an NMI to the CPUs being monitored. (Thomas) * Added extra check to only use the HPET hardlockup detector if the IPI shorthands are enabled. (Thomas) * Relocated flushing of outstanding interrupts from enable_timer() to disable_timer(). On some systems, making any change in the configuration of the HPET channel causes it to issue an interrupt. * Added a new cpumask to function as a per-cpu test bit to determine if a CPU should check for hardlockups. * Dropped pointless X86_64 || X86_32 check in Kconfig. (Tony) * Dropped pointless dependency on CONFIG_HPET. * Added dependency on CONFIG_GENERIC_MSI_IRQ, needed to build the [|IR]- HPET-MSI irq_chip. * Added hardlockup_detector_hpet_start() to be used when tsc_khz is recalibrated. * Reworked the periodic setting of the HPET channel. Rather than changing it every time the channel is disabled or enabled, do it only once. While here, wrap the code in an initial setup function. * Implemented hardlockup_detector_hpet_start() to be called when tsc_khz is refined. * Enhanced inline comments for clarity. * Added missing #include files. * Relocated function declarations to not depend on CONFIG_HPET_TIMER.
Changes since v4: * Dropped hpet_hld_data.enabled_cpus and instead use cpumask_weight(). * Renamed hpet_hld_data.cpu_monitored_mask to hld_data_data.cpu_monitored_mask and converted it to cpumask_var_t. * Flushed out any outstanding interrupt before enabling the HPET channel. * Removed unnecessary MSI_DATA_LEVEL_ASSERT from the MSI message. * Added comments in hardlockup_detector_nmi_handler() to explain how CPUs are targeted for an IPI. * Updated code to only issue an IPI when needed (i.e., there are monitored CPUs to be inspected via an IPI). * Reworked hardlockup_detector_hpet_init() for readability. * Now reserve the cpumasks in the hardlockup detector code and not in the generic HPET code. * Handled the case of watchdog_thresh =3D 0 when disabling the detector. * Made this detector available to i386. * Reworked logic to kick the timer to remove a local variable. (Andi) * Added a comment on what type of timer channel will be assigned to the detector. (Andi) * Reworded prompt comment in Kconfig. (Andi) * Removed unneeded switch to level interrupt mode when disabling the timer. (Andi) * Disabled the HPET timer to avoid a race between an incoming interrupt and an update of the MSI destination ID. (Ashok) * Corrected a typo in an inline comment. (Tony) * Made the HPET hardlockup detector depend on HARDLOCKUP_DETECTOR instead of selecting it. Changes since v3: * Fixed typo in Kconfig.debug. (Randy Dunlap) * Added missing slab.h to include the definition of kfree to fix a build break. Changes since v2: * Removed use of struct cpumask in favor of a variable length array in conjunction with kzalloc. (Peter Zijlstra) * Removed redundant documentation of functions. (Thomas Gleixner) * Added CPU as argument hardlockup_detector_hpet_enable()/disable(). (Thomas Gleixner). Changes since v1: * Do not target CPUs in a round-robin manner. Instead, the HPET timer always targets the same CPU; other CPUs are monitored via an interprocessor interrupt. 
* Dropped support for IO APIC interrupts and instead use only MSI interrupts. * Removed use of generic irq code to set interrupt affinity and NMI delivery. Instead, configure the interrupt directly in HPET registers. (Thomas Gleixner) * Fixed unconditional return NMI_HANDLED when the HPET timer is programmed for FSB/MSI delivery. (Peter Zijlstra) --- arch/x86/Kconfig.debug | 10 + arch/x86/include/asm/hpet.h | 21 ++ arch/x86/kernel/Makefile | 1 + arch/x86/kernel/watchdog_hld_hpet.c | 386 ++++++++++++++++++++++++++++ 4 files changed, 418 insertions(+) create mode 100644 arch/x86/kernel/watchdog_hld_hpet.c diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug index d872a7522e55..bc34239589db 100644 --- a/arch/x86/Kconfig.debug +++ b/arch/x86/Kconfig.debug @@ -114,6 +114,16 @@ config IOMMU_LEAK config HAVE_MMIOTRACE_SUPPORT def_bool y =20 +config X86_HARDLOCKUP_DETECTOR_HPET + bool "HPET Timer for Hard Lockup Detection" + select HARDLOCKUP_DETECTOR_CORE + depends on HARDLOCKUP_DETECTOR && HPET_TIMER && GENERIC_MSI_IRQ + help + The hardlockup detector is driven by one counter of the Performance + Monitoring Unit (PMU) per CPU. Say y to instead drive the + hardlockup detector using a High-Precision Event Timer and make the + PMU counters available for other purposes. 
+ config X86_DECODER_SELFTEST bool "x86 instruction decoder selftest" depends on DEBUG_KERNEL && INSTRUCTION_DECODER diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h index 5762bd0169a1..c88901744848 100644 --- a/arch/x86/include/asm/hpet.h +++ b/arch/x86/include/asm/hpet.h @@ -105,6 +105,8 @@ static inline int is_hpet_enabled(void) { return 0; } #endif =20 #ifdef CONFIG_X86_HARDLOCKUP_DETECTOR_HPET +#include + /** * struct hpet_hld_data - Data needed to operate the detector * @has_periodic: The HPET channel supports periodic mode @@ -112,6 +114,10 @@ static inline int is_hpet_enabled(void) { return 0; } * @channe_priv: Private data of the assigned channel * @ticks_per_second: Frequency of the HPET timer * @irq: IRQ number assigned to the HPET channel + * @handling_cpu: CPU handling the HPET interrupt + * @monitored_cpumask: CPUs monitored by the hardlockup detector + * @inspect_cpumask: CPUs that will be inspected at a given time. + * Each CPU clears itself upon inspection. 
*/ struct hpet_hld_data { bool has_periodic; @@ -119,10 +125,25 @@ struct hpet_hld_data { struct hpet_channel *channel_priv; u64 ticks_per_second; int irq; + u32 handling_cpu; + cpumask_var_t monitored_cpumask; + cpumask_var_t inspect_cpumask; }; =20 extern struct hpet_hld_data *hpet_hld_get_timer(void); extern void hpet_hld_free_timer(struct hpet_hld_data *hdata); +int hardlockup_detector_hpet_init(void); +void hardlockup_detector_hpet_start(void); +void hardlockup_detector_hpet_stop(void); +void hardlockup_detector_hpet_enable(unsigned int cpu); +void hardlockup_detector_hpet_disable(unsigned int cpu); +#else +static inline int hardlockup_detector_hpet_init(void) +{ return -ENODEV; } +static inline void hardlockup_detector_hpet_start(void) {} +static inline void hardlockup_detector_hpet_stop(void) {} +static inline void hardlockup_detector_hpet_enable(unsigned int cpu) {} +static inline void hardlockup_detector_hpet_disable(unsigned int cpu) {} #endif /* CONFIG_X86_HARDLOCKUP_DETECTOR_HPET */ =20 #endif /* _ASM_X86_HPET_H */ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 1a2dc328cb5e..c700b00a2d86 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -115,6 +115,7 @@ obj-$(CONFIG_VM86) +=3D vm86_32.o obj-$(CONFIG_EARLY_PRINTK) +=3D early_printk.o =20 obj-$(CONFIG_HPET_TIMER) +=3D hpet.o +obj-$(CONFIG_X86_HARDLOCKUP_DETECTOR_HPET) +=3D watchdog_hld_hpet.o =20 obj-$(CONFIG_AMD_NB) +=3D amd_nb.o obj-$(CONFIG_DEBUG_NMI_SELFTEST) +=3D nmi_selftest.o diff --git a/arch/x86/kernel/watchdog_hld_hpet.c b/arch/x86/kernel/watchdog= _hld_hpet.c new file mode 100644 index 000000000000..9fc7ac2c5059 --- /dev/null +++ b/arch/x86/kernel/watchdog_hld_hpet.c @@ -0,0 +1,386 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * A hardlockup detector driven by an HPET timer. + * + * Copyright (C) Intel Corporation 2022 + * + * A hardlockup detector driven by an HPET timer. It implements the same + * interfaces as the PMU-based hardlockup detector. 
+ * + * The HPET timer channel designated for the hardlockup detector sends an + * NMI to one of the CPUs in the watchdog_allowed_mask. That CPU then + * sends an NMI IPI to the rest of the CPUs in the system. Each individual + * CPU checks for hardlockups. + * + * This detector is only enabled when the system has IPI shorthands + * enabled. Therefore, all the CPUs in the system get the broadcast NMI. + * A cpumask is used to check if a specific CPU needs to check for + * hardlockups. CPUs that are offline have their local APIC soft-disabled. + * They also get the NMI but "ignore" it in the NMI handler. + */ + +#define pr_fmt(fmt) "NMI hpet watchdog: " fmt + +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +static struct hpet_hld_data *hld_data; +static bool hardlockup_use_hpet; + +extern struct static_key_false apic_use_ipi_shorthand; + +static void __init setup_hpet_channel(struct hpet_hld_data *hdata) +{ + u32 v; + + v =3D hpet_readl(HPET_Tn_CFG(hdata->channel)); + if (hdata->has_periodic) + v |=3D HPET_TN_PERIODIC; + else + v &=3D ~HPET_TN_PERIODIC; + + v |=3D HPET_TN_32BIT; + hpet_writel(v, HPET_Tn_CFG(hdata->channel)); +} + +/** + * kick_timer() - Reprogram timer to expire in the future + * @hdata: A data structure with the timer instance to update + * @force: Force reprogramming + * + * Reprogram the timer to expire within watchdog_thresh seconds in the fut= ure. + * If the timer supports periodic mode, it is not kicked unless @force is + * true. + */ +static void kick_timer(struct hpet_hld_data *hdata, bool force) +{ + u64 new_compare, count, period =3D 0; + + /* Kick the timer only when needed. */ + if (!force && hdata->has_periodic) + return; + + /* + * Update the comparator in increments of watchdog_thresh seconds relative + * to the current count. Since watchdog_thresh is given in seconds, we + * are able to update the comparator before the counter reaches such new + * value.
+ * + * Let it wrap around if needed. + */ + + count =3D hpet_readl(HPET_COUNTER); + new_compare =3D count + watchdog_thresh * hdata->ticks_per_second; + + if (!hdata->has_periodic) { + hpet_writel(new_compare, HPET_Tn_CMP(hdata->channel)); + return; + } + + period =3D watchdog_thresh * hdata->ticks_per_second; + hpet_set_comparator_periodic(hdata->channel, (u32)new_compare, + (u32)period); +} + +static void disable_timer(struct hpet_hld_data *hdata) +{ + u32 v; + + v =3D hpet_readl(HPET_Tn_CFG(hdata->channel)); + v &=3D ~HPET_TN_ENABLE; + /* + * Prepare to flush out any outstanding interrupt. This can only be + * done in level-triggered mode. + */ + v |=3D HPET_TN_LEVEL; + hpet_writel(v, HPET_Tn_CFG(hdata->channel)); + + /* + * Even though we use the HPET channel in edge-triggered mode, hardware + * seems to keep an outstanding interrupt and posts an MSI message when + * making any change to it (e.g., enabling or setting to FSB mode). + * Flush out the interrupt status bit of our channel. + */ + hpet_writel(1 << hdata->channel, HPET_STATUS); +} + +static void enable_timer(struct hpet_hld_data *hdata) +{ + u32 v; + + v =3D hpet_readl(HPET_Tn_CFG(hdata->channel)); + v &=3D ~HPET_TN_LEVEL; + v |=3D HPET_TN_ENABLE; + hpet_writel(v, HPET_Tn_CFG(hdata->channel)); +} + +/** + * is_hpet_hld_interrupt() - Check if an HPET timer caused the interrupt + * @hdata: A data structure with the timer instance to enable + * + * Returns: + * True if the HPET watchdog timer caused the interrupt. False otherwise. + */ +static bool is_hpet_hld_interrupt(struct hpet_hld_data *hdata) +{ + return false; +} + +/** + * hardlockup_detector_nmi_handler() - NMI Interrupt handler + * @type: Type of NMI handler; not used. + * @regs: Register values as seen when the NMI was asserted + * + * Check if it was caused by the expiration of the HPET timer. If yes, ins= pect + * for lockups by issuing an IPI to the rest of the CPUs. Also, kick the + * timer if it is non-periodic. 
+ *
+ * Returns:
+ * NMI_DONE if the HPET timer did not cause the interrupt. NMI_HANDLED
+ * otherwise.
+ */
+static int hardlockup_detector_nmi_handler(unsigned int type,
+					   struct pt_regs *regs)
+{
+	struct hpet_hld_data *hdata = hld_data;
+	int cpu;
+
+	/*
+	 * The CPU handling the HPET NMI will land here and trigger the
+	 * inspection of hardlockups in the rest of the monitored
+	 * CPUs.
+	 */
+	if (is_hpet_hld_interrupt(hdata)) {
+		/*
+		 * Kick the timer first. If the HPET channel is periodic, it
+		 * helps to reduce the delta between the expected TSC value and
+		 * its actual value the next time the HPET channel fires.
+		 */
+		kick_timer(hdata, !(hdata->has_periodic));
+
+		if (cpumask_weight(hld_data->monitored_cpumask) > 1) {
+			/*
+			 * Since we cannot know the source of an NMI, the best
+			 * we can do is to use a flag to indicate to all online
+			 * CPUs that they will get an NMI and that the source of
+			 * that NMI is the hardlockup detector. Offline CPUs
+			 * also receive the NMI but they ignore it.
+			 *
+			 * Even though we are in NMI context, we have concluded
+			 * that the NMI came from the HPET channel assigned to
+			 * the detector, an event that is infrequent and only
+			 * occurs in the handling CPU. There should not be races
+			 * with other NMIs.
+			 */
+			cpumask_copy(hld_data->inspect_cpumask,
+				     cpu_online_mask);
+
+			/* If we are here, IPI shorthands are enabled. */
+			apic->send_IPI_allbutself(NMI_VECTOR);
+		}
+
+		inspect_for_hardlockups(regs);
+		return NMI_HANDLED;
+	}
+
+	/* The rest of the CPUs will land here after receiving the IPI.
+	 */
+	cpu = smp_processor_id();
+	if (cpumask_test_and_clear_cpu(cpu, hld_data->inspect_cpumask)) {
+		if (cpumask_test_cpu(cpu, hld_data->monitored_cpumask))
+			inspect_for_hardlockups(regs);
+
+		return NMI_HANDLED;
+	}
+
+	return NMI_DONE;
+}
+
+/**
+ * setup_hpet_irq() - Configure the interrupt delivery of an HPET timer
+ * @hdata:	Data associated with the instance of the HPET timer to configure
+ *
+ * Configure the interrupt parameters of an HPET timer. If supported, configure
+ * interrupts to be delivered via the Front-Side Bus. Also, install an interrupt
+ * handler.
+ *
+ * Returns:
+ * 0 on success. An error code if setup was unsuccessful.
+ */
+static int setup_hpet_irq(struct hpet_hld_data *hdata)
+{
+	int ret;
+	u32 v;
+
+	/*
+	 * hld_data->irq was configured to deliver the interrupt as
+	 * NMI. Thus, there is no need for a regular interrupt handler.
+	 */
+	ret = request_irq(hld_data->irq, no_action,
+			  IRQF_TIMER | IRQF_NOBALANCING,
+			  "hpet_hld", hld_data);
+	if (ret)
+		return ret;
+
+	ret = register_nmi_handler(NMI_WATCHDOG,
+				   hardlockup_detector_nmi_handler, 0,
+				   "hpet_hld");
+
+	if (ret) {
+		free_irq(hld_data->irq, hld_data);
+		return ret;
+	}
+
+	v = hpet_readl(HPET_Tn_CFG(hdata->channel));
+	v |= HPET_TN_FSB;
+
+	hpet_writel(v, HPET_Tn_CFG(hdata->channel));
+
+	return 0;
+}
+
+/**
+ * hardlockup_detector_hpet_enable() - Enable the hardlockup detector
+ * @cpu:	CPU Index in which the watchdog will be enabled.
+ *
+ * Enable the hardlockup detector in @cpu. Also, start the detector if not done
+ * before.
+ */
+void hardlockup_detector_hpet_enable(unsigned int cpu)
+{
+	cpumask_set_cpu(cpu, hld_data->monitored_cpumask);
+
+	/*
+	 * If this is the first CPU on which the detector is enabled,
+	 * start everything. The HPET channel is disabled at this point.
+	 */
+	if (cpumask_weight(hld_data->monitored_cpumask) == 1) {
+		hld_data->handling_cpu = cpu;
+		/*
+		 * Only update the affinity of the HPET channel interrupt when
+		 * disabled.
+		 */
+		if (irq_set_affinity(hld_data->irq,
+				     cpumask_of(hld_data->handling_cpu))) {
+			pr_warn_once("Failed to set affinity. Hardlockup detector not started");
+			return;
+		}
+
+		kick_timer(hld_data, true);
+		enable_timer(hld_data);
+	}
+}
+
+/**
+ * hardlockup_detector_hpet_disable() - Disable the hardlockup detector
+ * @cpu:	CPU index in which the watchdog will be disabled
+ *
+ * Disable the hardlockup detector in @cpu. If @cpu is also handling the NMI
+ * from the HPET timer, update the affinity of the interrupt.
+ */
+void hardlockup_detector_hpet_disable(unsigned int cpu)
+{
+	cpumask_clear_cpu(cpu, hld_data->monitored_cpumask);
+
+	if (hld_data->handling_cpu != cpu)
+		return;
+
+	disable_timer(hld_data);
+	if (!cpumask_weight(hld_data->monitored_cpumask))
+		return;
+
+	/*
+	 * If watchdog_thresh is zero, then the hardlockup detector is being
+	 * disabled.
+	 */
+	if (!watchdog_thresh)
+		return;
+
+	hld_data->handling_cpu = cpumask_any_but(hld_data->monitored_cpumask,
+						 cpu);
+	/*
+	 * Only update the affinity of the HPET channel interrupt when
+	 * disabled.
+	 */
+	if (irq_set_affinity(hld_data->irq,
+			     cpumask_of(hld_data->handling_cpu))) {
+		pr_warn_once("Failed to set affinity. Hardlockup detector stopped");
+		return;
+	}
+
+	enable_timer(hld_data);
+}
+
+void hardlockup_detector_hpet_stop(void)
+{
+	disable_timer(hld_data);
+}
+
+void hardlockup_detector_hpet_start(void)
+{
+	kick_timer(hld_data, true);
+	enable_timer(hld_data);
+}
+
+/**
+ * hardlockup_detector_hpet_init() - Initialize the hardlockup detector
+ *
+ * Only initialize and configure the detector if an HPET is available on the
+ * system, the TSC is stable, and IPI shorthands are enabled.
+ *
+ * Returns:
+ * 0 on success. An error code if initialization was unsuccessful.
+ */
+int __init hardlockup_detector_hpet_init(void)
+{
+	int ret;
+
+	if (!hardlockup_use_hpet)
+		return -ENODEV;
+
+	if (!is_hpet_enabled())
+		return -ENODEV;
+
+	if (!static_branch_likely(&apic_use_ipi_shorthand))
+		return -ENODEV;
+
+	if (check_tsc_unstable())
+		return -ENODEV;
+
+	hld_data = hpet_hld_get_timer();
+	if (!hld_data)
+		return -ENODEV;
+
+	disable_timer(hld_data);
+
+	setup_hpet_channel(hld_data);
+
+	ret = setup_hpet_irq(hld_data);
+	if (ret)
+		goto err_no_irq;
+
+	if (!zalloc_cpumask_var(&hld_data->monitored_cpumask, GFP_KERNEL))
+		goto err_no_monitored_cpumask;
+
+	if (!zalloc_cpumask_var(&hld_data->inspect_cpumask, GFP_KERNEL))
+		goto err_no_inspect_cpumask;
+
+	return 0;
+
+err_no_inspect_cpumask:
+	free_cpumask_var(hld_data->monitored_cpumask);
+err_no_monitored_cpumask:
+	ret = -ENOMEM;
+err_no_irq:
+	hpet_hld_free_timer(hld_data);
+	hld_data = NULL;
+
+	return ret;
+}
-- 
2.17.1


From nobody Sun Feb 8 02:51:47 2026
From: Ricardo Neri
To: Thomas Gleixner, x86@kernel.org
Cc: Tony Luck, Andi Kleen, Stephane Eranian, Andrew Morton, Joerg Roedel,
 Suravee Suthikulpanit, David Woodhouse, Lu Baolu, Nicholas Piggin,
 "Ravi V. Shankar", Ricardo Neri, iommu@lists.linux-foundation.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri
Subject: [PATCH v6 23/29] x86/watchdog/hardlockup/hpet: Determine if HPET timer caused NMI
Date: Thu, 5 May 2022 17:00:02 -0700
Message-Id: <20220506000008.30892-24-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>
References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>

It is not possible to determine the source of a non-maskable interrupt
(NMI) in x86.
When dealing with an HPET channel, the only direct method to
determine whether it caused an NMI would be to read the Interrupt Status
register.

However, reading HPET registers is slow and, therefore, not to be done
while in NMI context. Furthermore, status is not available if the HPET
channel is programmed to deliver an MSI interrupt.

An indirect manner to infer if an incoming NMI was caused by the HPET
channel of the detector is to use the time-stamp counter (TSC). Compute the
value that the TSC is expected to have at the next interrupt of the HPET
channel and compare it with the value it has when the interrupt does
happen. If the actual value falls within a small error window, assume
that the HPET channel of the detector is the source of the NMI.

Let tsc_delta be the difference between the value the TSC has now and the
value it will have when the next HPET channel interrupt happens. Define the
error window as a percentage of tsc_delta.

Below is a table that characterizes the error in the expected TSC value
when the HPET channel fires on a variety of systems. It presents the error
as a percentage of tsc_delta and in microseconds.

The table summarizes the error of 4096 interrupts of the HPET channel
collected after the system has been up for 5 minutes as well as since boot.

The maximum observed error on any system is 0.045%. When the error since
boot is considered, the maximum observed error is 0.198%.

To find the most common error value, the collected data is grouped into
buckets of 0.000001 percentage points of the error and 10ns, respectively.
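
As a rough illustration of the window arithmetic described above, the
user-space C sketch below computes tsc_delta from a threshold in seconds
and a TSC frequency in kHz, derives a half-window by shifting tsc_delta
right by 9 positions (about 0.195% on each side of the expected value, so
roughly 0.39% end to end), and checks whether a TSC reading falls inside
it. The helper names and the wrap-safe unsigned comparison are assumptions
made for this sketch only; the patch itself uses time_in_range64().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: TSC ticks expected to elapse in thresh_sec seconds. */
static uint64_t tsc_delta_ticks(uint64_t thresh_sec, uint64_t tsc_khz)
{
	return thresh_sec * tsc_khz * 1000ULL;
}

/*
 * Hypothetical helper: half of the error window. tsc_delta >> 8 is about
 * 0.39% of tsc_delta; shifting by 9 gives half of that.
 */
static uint64_t tsc_half_window(uint64_t tsc_delta)
{
	return tsc_delta >> 9;
}

/*
 * True if tsc_now lies in [tsc_next - half, tsc_next + half]. Unsigned
 * subtraction wraps modulo 2^64, so the comparison also holds across a
 * counter wrap-around (an assumption of this sketch, not of the patch).
 */
static bool tsc_in_window(uint64_t tsc_now, uint64_t tsc_next, uint64_t half)
{
	return (tsc_now - (tsc_next - half)) <= 2 * half;
}
```

For a 2 GHz TSC (tsc_khz of 2000000) and a threshold of 10 seconds,
tsc_delta is 20000000000 ticks and the half-window is 39062500 ticks.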
The most common error on any system is 0.01317%.

Allow a maximum error that is twice as big as the maximum error observed
in these experiments: 0.4%.

watchdog_thresh              1s                  10s                 60s
Error wrt expected
TSC value                    %        us         %        us         %        us

AMD EPYC 7742 64-Core Processor
Abs max since boot     0.04517    451.74   0.00171    171.04   0.00034    201.89
Abs max                0.04517    451.74   0.00171    171.04   0.00034    201.89
Mode                   0.00002      0.18   0.00002      2.07  -0.00003    -19.20

Intel(R) Xeon(R) CPU E7-8890 - INTEL_FAM6_HASWELL_X
Abs max since boot     0.00811     81.15   0.00462    462.40   0.00014     81.65
Abs max                0.00811     81.15   0.00084     84.31   0.00014     81.65
Mode                  -0.00422    -42.16  -0.00043    -42.50  -0.00007    -40.40

Intel(R) Xeon(R) Platinum 8170M - INTEL_FAM6_SKYLAKE_X
Abs max since boot     0.10530   1053.04   0.01324   1324.27   0.00407   2443.25
Abs max                0.01166    116.59   0.00114    114.11   0.00024    143.47
Mode                  -0.01023   -102.32  -0.00103   -102.44  -0.00022   -132.38

Intel(R) Xeon(R) CPU E5-2699A v4 - INTEL_FAM6_BROADWELL_X
Abs max since boot     0.00010     99.34   0.00099     98.83   0.00016     97.50
Abs max                0.00010     99.34   0.00099     98.83   0.00016     97.50
Mode                  -0.00007    -74.29  -0.00074    -73.99  -0.00012    -73.12

Intel(R) Xeon(R) Gold 5318H - INTEL_FAM6_COOPERLAKE_X
Abs max since boot     0.11262   1126.17   0.01109   1109.17   0.00409   2455.73
Abs max                0.01073    107.31   0.00109    109.02   0.00019    115.34
Mode                  -0.00953    -95.26  -0.00094    -93.63  -0.00015    -90.42

Intel(R) Xeon(R) Platinum 8360Y - INTEL_FAM6_ICELAKE_X
Abs max since boot     0.19853   1985.30   0.00784    783.53  -0.00017   -104.77
Abs max                0.01550    155.02   0.00158    157.56   0.00020    117.74
Mode                  -0.01317   -131.65  -0.00136   -136.42  -0.00018   -105.06

Cc: Andi Kleen
Cc: Stephane Eranian
Cc: "Ravi V. Shankar"
Cc: iommu@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x86@kernel.org
Suggested-by: Andi Kleen
Reviewed-by: Tony Luck
Signed-off-by: Ricardo Neri
---
NOTE: The error characterization data is repeated here from the cover
letter.
---
Changes since v5:
 * Reworked is_hpet_hld_interrupt() to reduce indentation.
 * Use time_in_range64() to compare the actual TSC value vs the expected
   value. This makes it more readable. (Tony)
 * Reduced the error window of the expected TSC value at the time of the
   HPET channel expiration.
 * Described better the heuristics used to determine if the HPET channel
   caused the NMI. (Tony)
 * Added a table to characterize the error in the expected TSC value when
   the HPET channel fires.
 * Removed references to groups of monitored CPUs. Instead, use tsc_khz
   directly.

Changes since v4:
 * Compute the TSC expected value at the next HPET interrupt based on the
   number of monitored packages and not the number of monitored CPUs.

Changes since v3:
 * None

Changes since v2:
 * Reworked condition to check if the expected TSC value is within the
   error margin to avoid an unnecessary conditional. (Peter Zijlstra)
 * Removed TSC error margin from struct hld_data; use a global variable
   instead. (Peter Zijlstra)

Changes since v1:
 * Introduced this patch.
---
 arch/x86/include/asm/hpet.h         |  3 ++
 arch/x86/kernel/watchdog_hld_hpet.c | 54 +++++++++++++++++++++++++++--
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index c88901744848..af0a504b5cff 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -113,6 +113,8 @@ static inline int is_hpet_enabled(void) { return 0; }
  * @channel:		HPET channel assigned to the detector
  * @channe_priv:	Private data of the assigned channel
  * @ticks_per_second:	Frequency of the HPET timer
+ * @tsc_next:		Estimated value of the TSC at the next
+ *			HPET timer interrupt
  * @irq:		IRQ number assigned to the HPET channel
  * @handling_cpu:	CPU handling the HPET interrupt
  * @monitored_cpumask:	CPUs monitored by the hardlockup detector
@@ -124,6 +126,7 @@ struct hpet_hld_data {
 	u32			channel;
 	struct hpet_channel	*channel_priv;
 	u64			ticks_per_second;
+	u64			tsc_next;
 	int			irq;
 	u32			handling_cpu;
 	cpumask_var_t		monitored_cpumask;
diff --git a/arch/x86/kernel/watchdog_hld_hpet.c b/arch/x86/kernel/watchdog_hld_hpet.c
index 9fc7ac2c5059..3effdbf29095 100644
--- a/arch/x86/kernel/watchdog_hld_hpet.c
+++ b/arch/x86/kernel/watchdog_hld_hpet.c
@@ -34,6 +34,7 @@
 
 static struct hpet_hld_data *hld_data;
 static bool hardlockup_use_hpet;
+static u64 tsc_next_error;
 
 extern struct static_key_false apic_use_ipi_shorthand;
 
@@ -59,10 +60,39 @@ static void __init setup_hpet_channel(struct hpet_hld_data *hdata)
  * Reprogram the timer to expire within watchdog_thresh seconds in the future.
  * If the timer supports periodic mode, it is not kicked unless @force is
  * true.
+ *
+ * Also, compute the expected value of the time-stamp counter at the time of
+ * expiration as well as a deviation from the expected value.
  */
 static void kick_timer(struct hpet_hld_data *hdata, bool force)
 {
-	u64 new_compare, count, period = 0;
+	u64 tsc_curr, tsc_delta, new_compare, count, period = 0;
+
+	tsc_curr = rdtsc();
+
+	/*
+	 * Compute the delta between the value of the TSC now and the value
+	 * it will have the next time the HPET channel fires.
+	 */
+	tsc_delta = watchdog_thresh * tsc_khz * 1000L;
+	hdata->tsc_next = tsc_curr + tsc_delta;
+
+	/*
+	 * Define an error window between the expected TSC value and the actual
+	 * value it will have the next time the HPET channel fires. Define this
+	 * error as percentage of tsc_delta.
+	 *
+	 * The systems that have been tested so far exhibit an error of 0.05%
+	 * of the expected TSC value once the system is up and running. Systems
+	 * that refine tsc_khz exhibit a larger initial error up to 0.2%.
+	 *
+	 * To be safe, allow a maximum error of ~0.4%. This error value can be
+	 * computed by right-shifting tsc_delta by 8 positions. Shift 9
+	 * positions to calculate half the error.
+	 * When the HPET channel fires,
+	 * check if the actual TSC value is in the range
+	 * [tsc_next - tsc_next_error, tsc_next + tsc_next_error]
+	 */
+	tsc_next_error = tsc_delta >> 9;
 
 	/* Kick the timer only when needed. */
 	if (!force && hdata->has_periodic)
@@ -126,12 +156,32 @@ static void enable_timer(struct hpet_hld_data *hdata)
  * is_hpet_hld_interrupt() - Check if an HPET timer caused the interrupt
  * @hdata:	A data structure with the timer instance to enable
  *
+ * Checking directly whether the HPET was the source of this NMI is not
+ * possible, as x86 does not report the source of an NMI. Furthermore, we
+ * have programmed the HPET channel for MSI delivery, which does not have a
+ * status bit. Also, reading HPET registers is slow.
+ *
+ * Instead, we just assume that any NMI delivered within a time window
+ * of when the HPET was expected to fire probably came from the HPET.
+ *
+ * The window is estimated using the TSC. Check the comments in
+ * kick_timer() for details on the size of the time window.
+ *
  * Returns:
  * True if the HPET watchdog timer caused the interrupt. False otherwise.
  */
 static bool is_hpet_hld_interrupt(struct hpet_hld_data *hdata)
 {
-	return false;
+	u64 tsc_curr, tsc_curr_min, tsc_curr_max;
+
+	if (smp_processor_id() != hdata->handling_cpu)
+		return false;
+
+	tsc_curr = rdtsc();
+	tsc_curr_min = tsc_curr - tsc_next_error;
+	tsc_curr_max = tsc_curr + tsc_next_error;
+
+	return time_in_range64(hdata->tsc_next, tsc_curr_min, tsc_curr_max);
 }
 
 /**
-- 
2.17.1


From nobody Sun Feb 8 02:51:47 2026
From: Ricardo Neri
To: Thomas Gleixner, x86@kernel.org
Cc: Tony Luck, Andi Kleen, Stephane Eranian, Andrew Morton, Joerg Roedel,
 Suravee Suthikulpanit, David Woodhouse, Lu Baolu, Nicholas Piggin,
 "Ravi V. Shankar", Ricardo Neri, iommu@lists.linux-foundation.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri
Subject: [PATCH v6 24/29] watchdog/hardlockup: Use parse_option_str() to handle "nmi_watchdog"
Date: Thu, 5 May 2022 17:00:03 -0700
Message-Id: <20220506000008.30892-25-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>
References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>

Prepare hardlockup_panic_setup() to handle a comma-separated list of
options. Thus, it can continue parsing its own command-line options while
ignoring parameters that are relevant only to specific implementations of
the hardlockup detector. Such implementations may use an early_param to
parse their own options.

Cc: Andi Kleen
Cc: Nicholas Piggin
Cc: Stephane Eranian
Cc: "Ravi V. Shankar"
Cc: iommu@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x86@kernel.org
Reviewed-by: Tony Luck
Signed-off-by: Ricardo Neri
---
Changes since v5:
 * Corrected typo in commit message. (Tony)

Changes since v4:
 * None

Changes since v3:
 * None

Changes since v2:
 * Introduced this patch.

Changes since v1:
 * None
---
 kernel/watchdog.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 9166220457bc..6443841a755f 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -73,13 +73,13 @@ void __init hardlockup_detector_disable(void)
 
 static int __init hardlockup_panic_setup(char *str)
 {
-	if (!strncmp(str, "panic", 5))
+	if (parse_option_str(str, "panic"))
 		hardlockup_panic = 1;
-	else if (!strncmp(str, "nopanic", 7))
+	else if (parse_option_str(str, "nopanic"))
 		hardlockup_panic = 0;
-	else if (!strncmp(str, "0", 1))
+	else if (parse_option_str(str, "0"))
 		nmi_watchdog_user_enabled = 0;
-	else if (!strncmp(str, "1", 1))
+	else if (parse_option_str(str, "1"))
 		nmi_watchdog_user_enabled = 1;
 	return 1;
 }
-- 
2.17.1


From nobody Sun Feb 8 02:51:47 2026
From: Ricardo Neri
To: Thomas Gleixner, x86@kernel.org
Cc: Tony Luck, Andi Kleen, Stephane Eranian, Andrew Morton, Joerg Roedel,
 Suravee Suthikulpanit, David Woodhouse, Lu Baolu, Nicholas Piggin,
 "Ravi V. Shankar", Ricardo Neri, iommu@lists.linux-foundation.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri
Subject: [PATCH v6 25/29] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter
Date: Thu, 5 May 2022 17:00:04 -0700
Message-Id: <20220506000008.30892-26-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>
References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>

Keep the HPET-based hardlockup detector disabled unless explicitly enabled
via a command-line argument. If such parameter is not given, the
initialization of the HPET-based hardlockup detector fails and the NMI
watchdog will fall back to use the perf-based implementation.
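
To make the opt-in behaviour concrete, here is a small user-space C sketch
of how a value such as "panic,hpet" could be scanned for a whole
comma-separated token. option_in_list(), struct nmi_watchdog_opts and
parse_nmi_watchdog() are hypothetical names invented for this
illustration; the sketch only approximates the kind of exact-token
matching that parse_option_str() is used for in this series and is not
the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if @option is one whole token of @list. */
static bool option_in_list(const char *list, const char *option)
{
	size_t len = strlen(option);

	while (*list) {
		const char *end = strchr(list, ',');
		size_t tok_len = end ? (size_t)(end - list) : strlen(list);

		/* Match whole tokens only, so "hpetx" does not select "hpet". */
		if (tok_len == len && !strncmp(list, option, len))
			return true;
		if (!end)
			break;
		list = end + 1;
	}

	return false;
}

/* Hypothetical result type: which options were present in the argument. */
struct nmi_watchdog_opts {
	bool panic;	/* "panic" was listed */
	bool use_hpet;	/* "hpet" was listed; stays false unless requested */
};

static struct nmi_watchdog_opts parse_nmi_watchdog(const char *arg)
{
	struct nmi_watchdog_opts opts = {
		.panic = option_in_list(arg, "panic"),
		.use_hpet = option_in_list(arg, "hpet"),
	};

	return opts;
}
```

With this sketch, "panic,hpet" selects both options, while "nopanic"
selects neither, since only whole tokens are compared.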

Implement the command-line parsing using an early_param, as
__setup("nmi_watchdog=") only parses generic options.

Cc: Andi Kleen
Cc: Stephane Eranian
Cc: "Ravi V. Shankar"
Cc: iommu@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x86@kernel.org
Reviewed-by: Tony Luck
Signed-off-by: Ricardo Neri
---
Changes since v5:
 * None

Changes since v4:
 * None

Changes since v3:
 * None

Changes since v2:
 * Do not imply that using nmi_watchdog=hpet means the detector is
   enabled. Instead, print a warning in such case.

Changes since v1:
 * Added documentation to the function handling the nmi_watchdog kernel
   command-line argument.
---
 .../admin-guide/kernel-parameters.txt |  8 ++++++-
 arch/x86/kernel/watchdog_hld_hpet.c   | 22 +++++++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 269be339d738..89eae950fdb8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3370,7 +3370,7 @@
 			Format: [state][,regs][,debounce][,die]
 
 	nmi_watchdog=	[KNL,BUGS=X86] Debugging features for SMP kernels
-			Format: [panic,][nopanic,][num]
+			Format: [panic,][nopanic,][num,][hpet]
 			Valid num: 0 or 1
 			0 - turn hardlockup detector in nmi_watchdog off
 			1 - turn hardlockup detector in nmi_watchdog on
@@ -3381,6 +3381,12 @@
 			please see 'nowatchdog'.
 			This is useful when you use a panic=... timeout and
 			need the box quickly up again.
+			When hpet is specified, the NMI watchdog will be driven
+			by an HPET timer, if available in the system. Otherwise,
+			it falls back to the default implementation (perf or
+			architecture-specific). Specifying hpet has no effect
+			if the NMI watchdog is not enabled (either at build time
+			or via the command line).
 
 			These settings can be accessed at runtime via
 			the nmi_watchdog and hardlockup_panic sysctls.
diff --git a/arch/x86/kernel/watchdog_hld_hpet.c b/arch/x86/kernel/watchdog_hld_hpet.c
index 3effdbf29095..4413d5fb94f4 100644
--- a/arch/x86/kernel/watchdog_hld_hpet.c
+++ b/arch/x86/kernel/watchdog_hld_hpet.c
@@ -379,6 +379,28 @@ void hardlockup_detector_hpet_start(void)
 	enable_timer(hld_data);
 }
 
+/**
+ * hardlockup_detector_hpet_setup() - Parse command-line parameters
+ * @str:	A string containing the kernel command line
+ *
+ * Parse the nmi_watchdog parameter from the kernel command line. If
+ * selected by the user, use this implementation to detect hardlockups.
+ */
+static int __init hardlockup_detector_hpet_setup(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (parse_option_str(str, "hpet"))
+		hardlockup_use_hpet = true;
+
+	if (!nmi_watchdog_user_enabled && hardlockup_use_hpet)
+		pr_err("Selecting HPET NMI watchdog has no effect with NMI watchdog disabled\n");
+
+	return 0;
+}
+early_param("nmi_watchdog", hardlockup_detector_hpet_setup);
+
 /**
  * hardlockup_detector_hpet_init() - Initialize the hardlockup detector
  *
-- 
2.17.1


From nobody Sun Feb 8 02:51:47 2026
From: Ricardo Neri
To: Thomas Gleixner, x86@kernel.org
Cc: Tony Luck, Andi Kleen, Stephane Eranian, Andrew Morton, Joerg Roedel,
 Suravee Suthikulpanit, David Woodhouse, Lu Baolu, Nicholas Piggin,
 "Ravi V. Shankar", Ricardo Neri, iommu@lists.linux-foundation.org,
 linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri
Subject: [PATCH v6 26/29] x86/watchdog: Add a shim hardlockup detector
Date: Thu, 5 May 2022 17:00:05 -0700
Message-Id: <20220506000008.30892-27-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>
References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com>

The generic hardlockup detector is based on perf.
It also provides a set of weak functions that CPU architectures can
override. Add a shim hardlockup detector for x86 that overrides such
functions and can select between perf and HPET implementations of the
detector.

For clarity, add the intermediate Kconfig symbol X86_HARDLOCKUP_DETECTOR
that is selected whenever the core of the hardlockup detector is selected.

Cc: Andi Kleen
Cc: Stephane Eranian
Cc: "Ravi V. Shankar"
Cc: iommu@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x86@kernel.org
Suggested-by: Nicholas Piggin
Reviewed-by: Tony Luck
Signed-off-by: Ricardo Neri
---
Changes since v5:
 * Added watchdog_nmi_start() to be used when tsc_khz is recalibrated.
 * Always build the x86-specific hardlockup detector shim; not only
   when the HPET-based detector is selected.
 * Corrected a typo in comment in watchdog_nmi_probe() (Ani)
 * Removed useless local ret variable in watchdog_nmi_enable(). (Ani)

Changes since v4:
 * Use a switch to enable and disable the various available detectors.
   (Andi)

Changes since v3:
 * Fixed style in multi-line comment. (Randy Dunlap)

Changes since v2:
 * Pass cpu number as argument to hardlockup_detector_[enable|disable].
   (Thomas Gleixner)

Changes since v1:
 * Introduced this patch: Added an x86-specific shim hardlockup detector.
(Nicholas Piggin) --- arch/x86/Kconfig.debug | 3 ++ arch/x86/kernel/Makefile | 2 + arch/x86/kernel/watchdog_hld.c | 85 ++++++++++++++++++++++++++++++++++ 3 files changed, 90 insertions(+) create mode 100644 arch/x86/kernel/watchdog_hld.c diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug index bc34239589db..599001157847 100644 --- a/arch/x86/Kconfig.debug +++ b/arch/x86/Kconfig.debug @@ -6,6 +6,9 @@ config TRACE_IRQFLAGS_NMI_SUPPORT config EARLY_PRINTK_USB bool +config X86_HARDLOCKUP_DETECTOR + def_bool y if HARDLOCKUP_DETECTOR_CORE + config X86_VERBOSE_BOOTUP bool "Enable verbose x86 bootup info messages" default y diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index c700b00a2d86..af3d54e4c836 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -114,6 +114,8 @@ obj-$(CONFIG_KGDB) += kgdb.o obj-$(CONFIG_VM86) += vm86_32.o obj-$(CONFIG_EARLY_PRINTK) += early_printk.o +obj-$(CONFIG_X86_HARDLOCKUP_DETECTOR) += watchdog_hld.o + obj-$(CONFIG_HPET_TIMER) += hpet.o obj-$(CONFIG_X86_HARDLOCKUP_DETECTOR_HPET) += watchdog_hld_hpet.o diff --git a/arch/x86/kernel/watchdog_hld.c b/arch/x86/kernel/watchdog_hld.c new file mode 100644 index 000000000000..ef11f0af4ef5 --- /dev/null +++ b/arch/x86/kernel/watchdog_hld.c @@ -0,0 +1,85 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * A shim hardlockup detector. It overrides the weak stubs of the generic + * implementation to select between the perf- or the hpet-based implementation.
+ * + * Copyright (C) Intel Corporation 2022 + */ + +#include +#include + +enum x86_hardlockup_detector { + X86_HARDLOCKUP_DETECTOR_PERF, + X86_HARDLOCKUP_DETECTOR_HPET, +}; + +static enum x86_hardlockup_detector detector_type __read_mostly; + +int watchdog_nmi_enable(unsigned int cpu) +{ + switch (detector_type) { + case X86_HARDLOCKUP_DETECTOR_PERF: + hardlockup_detector_perf_enable(); + break; + case X86_HARDLOCKUP_DETECTOR_HPET: + hardlockup_detector_hpet_enable(cpu); + break; + default: + return -ENODEV; + } + + return 0; +} + +void watchdog_nmi_disable(unsigned int cpu) +{ + switch (detector_type) { + case X86_HARDLOCKUP_DETECTOR_PERF: + hardlockup_detector_perf_disable(); + break; + case X86_HARDLOCKUP_DETECTOR_HPET: + hardlockup_detector_hpet_disable(cpu); + break; + } +} + +int __init watchdog_nmi_probe(void) +{ + int ret; + + /* + * Try first with the HPET hardlockup detector. It will only + * succeed if selected at build time and requested in the + * nmi_watchdog command-line parameter. This ensures that the + * perf-based detector is used by default, if selected at + * build time. + */ + ret = hardlockup_detector_hpet_init(); + if (!ret) { + detector_type = X86_HARDLOCKUP_DETECTOR_HPET; + return ret; + } + + ret = hardlockup_detector_perf_init(); + if (!ret) { + detector_type = X86_HARDLOCKUP_DETECTOR_PERF; + return ret; + } + + /* Neither detector could be initialized; report the last error. */ + return ret; +} + +void watchdog_nmi_stop(void) +{ + /* Only the HPET lockup detector defines a stop function. */ + if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET) + hardlockup_detector_hpet_stop(); +} + +void watchdog_nmi_start(void) +{ + /* Only the HPET lockup detector defines a start function.
*/ + if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET) + hardlockup_detector_hpet_start(); +} -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32580C433F5 for ; Fri, 6 May 2022 00:01:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387394AbiEFAEx (ORCPT ); Thu, 5 May 2022 20:04:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387227AbiEFAC0 (ORCPT ); Thu, 5 May 2022 20:02:26 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0B47661627 for ; Thu, 5 May 2022 16:58:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795081; x=1683331081; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=CEzigczvRyU1oQzEnVej8ViYVBfDe59Bvau/LRgGfAw=; b=JD21qsHSBnCi18XMReoR2u3TNRwNJpyvZ4bq83K9XfJcxLgAbRDwfHL9 Plt7xfb4lLsjXs6eT0OuiXHKlkqRQ4uKI74LbXPtr25HDJRAMBj71t/aD O1M0BP2xUt+dCstU7irXXrV0immxW80YN72EeUYVrqLRo9nMWI2C+uKD8 CURGybN8lFYGWE+kjba63VhPRqHff7NOdxlUW8r2kIePptQohiCncH8DV XPZka48dte/4PpTQ0VU+2BIfoJCQKPemSnjYs9Cht+fyNBNvYIbiZJFP0 9JT3HY/Gxlteps9UK2qi658GCGp+UCoN2OXx/3gk7NIdOZoTNH1j55QBs A==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283662" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283662" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914451" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by
orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:56 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 27/29] watchdog: Expose lockup_detector_reconfigure() Date: Thu, 5 May 2022 17:00:06 -0700 Message-Id: <20220506000008.30892-28-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" When there are multiple implementations of the NMI watchdog, there may be situations in which switching from one to another is needed. If the time-stamp counter becomes unstable, the HPET-based NMI watchdog can no longer be used. Similarly, the HPET-based NMI watchdog relies on tsc_khz and needs to be informed when it is refined. Reloading the NMI watchdog or switching to another hardlockup detector can be done cleanly by updating the arch-specific stub and then reconfiguring the whole lockup detector. Expose lockup_detector_reconfigure() to achieve this goal. Cc: Andi Kleen Cc: Nicholas Piggin Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * None Changes since v4: * Switching to the perf-based lockup detector under the hood is hacky. Instead, reconfigure the whole lockup detector. Changes since v3: * None Changes since v2: * Introduced this patch.
Changes since v1: * N/A --- include/linux/nmi.h | 2 ++ kernel/watchdog.c | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/include/linux/nmi.h b/include/linux/nmi.h index cf12380e51b3..73827a477288 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -16,6 +16,7 @@ void lockup_detector_init(void); void lockup_detector_soft_poweroff(void); void lockup_detector_cleanup(void); bool is_hardlockup(void); +void lockup_detector_reconfigure(void); extern int watchdog_user_enabled; extern int nmi_watchdog_user_enabled; @@ -37,6 +38,7 @@ extern int sysctl_hardlockup_all_cpu_backtrace; static inline void lockup_detector_init(void) { } static inline void lockup_detector_soft_poweroff(void) { } static inline void lockup_detector_cleanup(void) { } +static inline void lockup_detector_reconfigure(void) { } #endif /* !CONFIG_LOCKUP_DETECTOR */ #ifdef CONFIG_SOFTLOCKUP_DETECTOR diff --git a/kernel/watchdog.c b/kernel/watchdog.c index 6443841a755f..e5b67544f8c8 100644 --- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -537,7 +537,7 @@ int lockup_detector_offline_cpu(unsigned int cpu) return 0; } -static void lockup_detector_reconfigure(void) +void lockup_detector_reconfigure(void) { cpus_read_lock(); watchdog_nmi_stop(); @@ -579,7 +579,7 @@ static __init void lockup_detector_setup(void) } #else /* CONFIG_SOFTLOCKUP_DETECTOR */ -static void lockup_detector_reconfigure(void) +void lockup_detector_reconfigure(void) { cpus_read_lock(); watchdog_nmi_stop(); -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44B6CC433EF for ; Fri, 6 May 2022 00:01:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387336AbiEFAEn (ORCPT ); Thu, 5 May 2022 20:04:43 -0400 Received: from
lindbergh.monkeyblade.net ([23.128.96.19]:37048 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387123AbiEFAC1 (ORCPT ); Thu, 5 May 2022 20:02:27 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1F2E061289 for ; Thu, 5 May 2022 16:58:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795084; x=1683331084; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=OMAzr1b+gR0ghYqlTvIkOvlFxefky6ROYu4/AGgKvYI=; b=KOScxYLh9RpWN1+/gGfLBOuP7yLtkhBmAM3/h/WJ1UiPvuuONwyh6dZh 1qFzF5j7u2HsLCxaLB7BFjGQUG4GWs8JpTynKLW8BFQbk61jeqWjQLi8v N1bTLTTcT08xicRk2JM4dAGtBPT9qhFsn30yWBfBr/Vn3VXIIRgKuq+Q9 lyFSgz957eK0c69/gS/tb0SBAFwUO87dbVxeLP9VK2QxqVaCQzMtNmOIo T5N8qav91tsq7NCJ7LbFxSbcDavQZWV8JHdCAH7n1c84cNVbj5uO+zO8k gaLga0n68CQuuesblf6mP7HqsJkFheOiyxP+Cu8WFoC2rxCjJIWBz9bpy g==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283664" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283664" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:57 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914455" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:56 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. 
Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 28/29] x86/tsc: Restart NMI watchdog after refining tsc_khz Date: Thu, 5 May 2022 17:00:07 -0700 Message-Id: <20220506000008.30892-29-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The HPET hardlockup detector relies on tsc_khz to estimate the value that the TSC will have when its HPET channel fires. A refined tsc_khz gives a better estimate of the expected TSC value, whereas the early value of tsc_khz may lead to a large error in it. Restarting the NMI watchdog has the effect of kicking its HPET channel and making use of the refined tsc_khz. When the HPET hardlockup detector is not in use, restarting the NMI watchdog is a no-op. Cc: Andi Kleen Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Signed-off-by: Ricardo Neri --- Changes since v5: * Introduced this patch Changes since v4: * N/A Changes since v3: * N/A Changes since v2: * N/A Changes since v1: * N/A --- arch/x86/kernel/tsc.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index cafacb2e58cc..cc1843044d88 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -1386,6 +1386,12 @@ static void tsc_refine_calibration_work(struct work_struct *work) /* Inform the TSC deadline clockevent devices about the recalibration */ lapic_update_tsc_freq(); + /* + * If in use, the HPET hardlockup detector relies on tsc_khz.
+ * Reconfigure it to make use of the refined tsc_khz. + */ + lockup_detector_reconfigure(); + /* Update the sched_clock() rate to match the clocksource one */ for_each_possible_cpu(cpu) set_cyc2ns_scale(tsc_khz, cpu, tsc_stop); -- 2.17.1 From nobody Sun Feb 8 02:51:47 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4469C433EF for ; Fri, 6 May 2022 00:01:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387353AbiEFAEp (ORCPT ); Thu, 5 May 2022 20:04:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37606 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387094AbiEFACl (ORCPT ); Thu, 5 May 2022 20:02:41 -0400 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3CE061637 for ; Thu, 5 May 2022 16:58:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1651795084; x=1683331084; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=+ZwU4efMPyk6LipSXhNxIogBqScj1OvsPJuz+1wxs4g=; b=VChzPwj/PQuOBhNK/FhVuRBaTubzzNmtuuak/CGOyFEvkjt7BZoOEWZa ijOyLePElG0OcXdLw/IIx6loKGRAUzCGGfZRZV+rCWRt3XDjeskroXOHA T5CMd3tkGN+Dzo4y/SG2WWQ0U+Ejk17gp6lVQqQYVfSPg+9hAeSC1O0Jj gbRIdghndZqTnub6wxM1ZlEjGRJHfCXLp2LRt7rONW3hM15to1aCTuLu3 C4FS79WxFJ+ZX2YYaAsW4YpMqDUUBpXVjVbIiMu5hjzB2RtFGFKOVc53K s/72sSx48eFdh6AWNq6CVjBwK0IKKgYszvwGUEYxDdHtEIu+EWIyyZbor w==; X-IronPort-AV: E=McAfee;i="6400,9594,10338"; a="250283666" X-IronPort-AV: E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="250283666" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 May 2022 16:57:57 -0700 X-ExtLoop1: 1 X-IronPort-AV:
E=Sophos;i="5.91,203,1647327600"; d="scan'208";a="694914459" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga004.jf.intel.com with ESMTP; 05 May 2022 16:57:57 -0700 From: Ricardo Neri To: Thomas Gleixner , x86@kernel.org Cc: Tony Luck , Andi Kleen , Stephane Eranian , Andrew Morton , Joerg Roedel , Suravee Suthikulpanit , David Woodhouse , Lu Baolu , Nicholas Piggin , "Ravi V. Shankar" , Ricardo Neri , iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org, Ricardo Neri Subject: [PATCH v6 29/29] x86/tsc: Switch to perf-based hardlockup detector if TSC becomes unstable Date: Thu, 5 May 2022 17:00:08 -0700 Message-Id: <20220506000008.30892-30-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> References: <20220506000008.30892-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The HPET-based hardlockup detector relies on the TSC to determine whether an observed NMI originated from the HPET timer. Hence, this detector can no longer be used with an unstable TSC. In such a case, permanently stop the HPET-based hardlockup detector and start the perf-based detector. Cc: Andi Kleen Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Suggested-by: Thomas Gleixner Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v5: * Relocated the declaration of hardlockup_detector_switch_to_perf() to x86/nmi.h. It does not depend on HPET. * Removed function stub. The shim hardlockup detector is always for x86. Changes since v4: * Added a stub version of hardlockup_detector_switch_to_perf() for !CONFIG_HPET_TIMER.
(lkp) * Reconfigure the whole lockup detector instead of unconditionally starting the perf-based hardlockup detector. Changes since v3: * None Changes since v2: * Introduced this patch. Changes since v1: * N/A --- arch/x86/include/asm/nmi.h | 6 ++++++ arch/x86/kernel/tsc.c | 2 ++ arch/x86/kernel/watchdog_hld.c | 6 ++++++ 3 files changed, 14 insertions(+) diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h index 4a0d5b562c91..47752ff67d8b 100644 --- a/arch/x86/include/asm/nmi.h +++ b/arch/x86/include/asm/nmi.h @@ -63,4 +63,10 @@ void stop_nmi(void); void restart_nmi(void); void local_touch_nmi(void); +#ifdef CONFIG_X86_HARDLOCKUP_DETECTOR +void hardlockup_detector_switch_to_perf(void); +#else +static inline void hardlockup_detector_switch_to_perf(void) { } +#endif + #endif /* _ASM_X86_NMI_H */ diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index cc1843044d88..74772ffc79d1 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -1176,6 +1176,8 @@ void mark_tsc_unstable(char *reason) clocksource_mark_unstable(&clocksource_tsc_early); clocksource_mark_unstable(&clocksource_tsc); + + hardlockup_detector_switch_to_perf(); } EXPORT_SYMBOL_GPL(mark_tsc_unstable); diff --git a/arch/x86/kernel/watchdog_hld.c b/arch/x86/kernel/watchdog_hld.c index ef11f0af4ef5..7940977c6312 100644 --- a/arch/x86/kernel/watchdog_hld.c +++ b/arch/x86/kernel/watchdog_hld.c @@ -83,3 +83,9 @@ void watchdog_nmi_start(void) if (detector_type == X86_HARDLOCKUP_DETECTOR_HPET) hardlockup_detector_hpet_start(); } + +void hardlockup_detector_switch_to_perf(void) +{ + detector_type = X86_HARDLOCKUP_DETECTOR_PERF; + lockup_detector_reconfigure(); +} -- 2.17.1
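For readers following the last four patches, the selection and fallback flow of the shim can be approximated by the userspace sketch below. It is only a model, not kernel code: every model_* name, the hpet_requested and perf_available knobs, and reconfigure_count are hypothetical stand-ins introduced for illustration, and the sketch assumes that the probe reports an error when neither detector can be initialized.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mirrors the two detector kinds of the shim; all names here are illustrative. */
enum model_detector { MODEL_DETECTOR_PERF, MODEL_DETECTOR_HPET };

/* Hypothetical knobs standing in for build-time and command-line selection. */
static bool hpet_requested;
static bool perf_available = true;

static enum model_detector detector_type;
static int reconfigure_count;

/* Stand-ins for hardlockup_detector_hpet_init() and hardlockup_detector_perf_init(). */
static int model_hpet_init(void) { return hpet_requested ? 0 : -ENODEV; }
static int model_perf_init(void) { return perf_available ? 0 : -ENODEV; }

/* Try HPET first; fall back to perf; report the error if both fail. */
int model_probe(void)
{
	int ret = model_hpet_init();

	if (!ret) {
		detector_type = MODEL_DETECTOR_HPET;
		return 0;
	}

	ret = model_perf_init();
	if (!ret)
		detector_type = MODEL_DETECTOR_PERF;

	return ret;
}

/* Stand-in for lockup_detector_reconfigure(); it only counts invocations. */
static void model_reconfigure(void)
{
	assert(detector_type == MODEL_DETECTOR_PERF ||
	       detector_type == MODEL_DETECTOR_HPET);
	reconfigure_count++;
}

/* Models hardlockup_detector_switch_to_perf(): select perf, then reconfigure. */
void model_switch_to_perf(void)
{
	detector_type = MODEL_DETECTOR_PERF;
	model_reconfigure();
}

/* Models the hook added to mark_tsc_unstable(). */
void model_mark_tsc_unstable(void)
{
	model_switch_to_perf();
}
```

With hpet_requested set, model_probe() selects the HPET model; a later call to model_mark_tsc_unstable() moves the selection to perf and triggers exactly one reconfiguration, which is the ordering the series relies on.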