With NMI source reporting enabled, the NMI handler can prioritize the
handling of sources that are reported explicitly. If the source is
unknown, fall back to the existing processing flow, i.e. invoke all NMI
handlers.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
---
arch/x86/kernel/nmi.c | 48 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index e2122ec9313c..32c285722734 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -149,12 +149,60 @@ static inline int do_handle_nmi(struct nmiaction *a, struct pt_regs *regs, unsig
return thishandled;
}
+static inline int nmi_handle_src(unsigned int type, struct pt_regs *regs)
+{
+ unsigned long source_bitmask;
+ struct nmiaction *a;
+ int handled = 0;
+ int vec = 1;
+
+ if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) || type != NMI_LOCAL)
+ return 0;
+
+ source_bitmask = fred_event_data(regs);
+ if (!source_bitmask) {
+ pr_warn_ratelimited("NMI received without source information!\n");
+ return 0;
+ }
+
+ /*
+ * Per NMI source specification, there is no guarantee that a valid
+ * NMI vector is always delivered, even when the source specified
+ * one. It is software's responsibility to check all available NMI
+ * sources when bit 0 is set in the NMI source bitmap. i.e. we have
+ * to call every handler as if we have no NMI source.
+ * On the other hand, if we do get non-zero vectors, we know exactly
+ * what the sources are. So we only call the handlers with the bit set.
+ */
+ if (source_bitmask & BIT(NMI_SOURCE_VEC_UNKNOWN)) {
+ pr_warn_ratelimited("NMI received with unknown source\n");
+ return 0;
+ }
+
+ rcu_read_lock();
+ /* Bit 0 is for unknown NMI sources, skip it. */
+ for_each_set_bit_from(vec, &source_bitmask, NR_NMI_SOURCE_VECTORS) {
+ a = rcu_dereference(nmiaction_src_table[vec]);
+ if (!a) {
+ pr_warn_ratelimited("NMI received %d no handler", vec);
+ continue;
+ }
+ handled += do_handle_nmi(a, regs, type);
+ }
+ rcu_read_unlock();
+ return handled;
+}
+
static int nmi_handle(unsigned int type, struct pt_regs *regs)
{
struct nmi_desc *desc = nmi_to_desc(type);
struct nmiaction *a;
int handled=0;
+ handled = nmi_handle_src(type, regs);
+ if (handled)
+ return handled;
+
rcu_read_lock();
/*
--
2.25.1
On 5/29/24 13:33, Jacob Pan wrote:
> +
> + /*
> + * Per NMI source specification, there is no guarantee that a valid
> + * NMI vector is always delivered, even when the source specified
> + * one. It is software's responsibility to check all available NMI
> + * sources when bit 0 is set in the NMI source bitmap. i.e. we have
> + * to call every handler as if we have no NMI source.
> + * On the other hand, if we do get non-zero vectors, we know exactly
> + * what the sources are. So we only call the handlers with the bit set.
> + */
> + if (source_bitmask & BIT(NMI_SOURCE_VEC_UNKNOWN)) {
> + pr_warn_ratelimited("NMI received with unknown source\n");
> + return 0;
> + }
> +
Note: if bit 0 is set, you can process any other bits first (on the
general assumption that if you bother with NMI source then those events
are performance sensitive), and you could even exclude them from the
poll. This is an optimization, and what you have here is correct from a
functional point of view.
> + source_bitmask = fred_event_data(regs);
> + if (!source_bitmask) {
> + pr_warn_ratelimited("NMI received without source information!\n");
> + return 0;
> + }
If the event data word is 0, it probably should be treated as a
*permanent* failure, as it is a Should Not Happen[TM] situation, and
means there is an implementation (or, perhaps more likely,
virtualization!) bug, and as such it may not be safe to trust the NMI
source information in the future.
> + if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) || type != NMI_LOCAL)
> + return 0;
I'm not sure I understand why you are requiring type to be NMI_LOCAL here?
-hpa
Hi Peter,
On Wed, 29 May 2024 13:47:09 -0700, "H. Peter Anvin" <hpa@zytor.com> wrote:
> On 5/29/24 13:33, Jacob Pan wrote:
> > +
> > + /*
> > + * Per NMI source specification, there is no guarantee that a valid
> > + * NMI vector is always delivered, even when the source specified
> > + * one. It is software's responsibility to check all available NMI
> > + * sources when bit 0 is set in the NMI source bitmap. i.e. we have
> > + * to call every handler as if we have no NMI source.
> > + * On the other hand, if we do get non-zero vectors, we know exactly
> > + * what the sources are. So we only call the handlers with the bit set.
> > + */
> > + if (source_bitmask & BIT(NMI_SOURCE_VEC_UNKNOWN)) {
> > + pr_warn_ratelimited("NMI received with unknown source\n");
> > + return 0;
> > + }
> > +
>
> Note: if bit 0 is set, you can process any other bits first (on the
> general assumption that if you bother with NMI source then those events
> are performance sensitive), and you could even exclude them from the
> poll. This is an optimization, and what you have here is correct from a
> functional point of view.
>
Yes, it is a good optimization, but isn't it also rare for bit 0 to be
set?
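
The ordering suggested above can be sketched in user-space C. This is
only an illustration under stated assumptions, not kernel code:
src_table, register_src(), handle_bitmask(), count_call() and the table
size of 16 are hypothetical stand-ins for nmiaction_src_table and the
handler dispatch in the patch.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SRC_VECTORS  16	/* hypothetical table size */
#define SRC_VEC_UNKNOWN 0	/* bit 0: at least one source is unknown */

typedef int (*src_handler_fn)(unsigned int vec);

/* Hypothetical per-vector handler table, indexed by source vector. */
static src_handler_fn src_table[NR_SRC_VECTORS];
static int calls[NR_SRC_VECTORS];

/* Demo handler: records that it ran and claims the event. */
static int count_call(unsigned int vec)
{
	calls[vec]++;
	return 1;
}

static void register_src(unsigned int vec, src_handler_fn fn)
{
	assert(vec != SRC_VEC_UNKNOWN && vec < NR_SRC_VECTORS);
	src_table[vec] = fn;
}

static int handle_bitmask(uint64_t bitmask)
{
	uint64_t done = 0;
	int handled = 0;
	unsigned int vec;

	/* Pass 1: run the explicitly reported sources first. */
	for (vec = 1; vec < NR_SRC_VECTORS; vec++) {
		if (!(bitmask & (1ULL << vec)) || !src_table[vec])
			continue;
		handled += src_table[vec](vec);
		done |= 1ULL << vec;
	}

	/* Pass 2: bit 0 set, so poll the rest, skipping those already run. */
	if (bitmask & (1ULL << SRC_VEC_UNKNOWN)) {
		for (vec = 1; vec < NR_SRC_VECTORS; vec++) {
			if (src_table[vec] && !(done & (1ULL << vec)))
				handled += src_table[vec](vec);
		}
	}

	return handled;
}
```

With handlers registered on vectors 2 and 5, a bitmask with bits 0 and 2
set runs vector 2 first and then polls only vector 5.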
> > + source_bitmask = fred_event_data(regs);
> > + if (!source_bitmask) {
> > + pr_warn_ratelimited("NMI received without source information!\n");
> > + return 0;
> > + }
>
> If the event data word is 0, it probably should be treated as a
> *permanent* failure, as it is a Should Not Happen[TM] situation, and
> means there is an implementation (or, perhaps more likely,
> virtualization!) bug, and as such it may not be safe to trust the NMI
> source information in the future.
>
Good point, I will add a flag to permanently disable NMI source reporting
for this boot cycle if that happens.
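
A minimal sketch of such a sticky flag, in user-space C. The names
src_reporting_broken and src_bitmask_usable() are hypothetical, and a
real kernel version would also need to consider concurrent NMIs on
other CPUs, which this sketch ignores.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: once set, it stays set until the next boot. */
static bool src_reporting_broken;

/*
 * Return true if the source bitmask may be used for dispatch.
 * An all-zero bitmask should never be delivered, so seeing one
 * marks source reporting as untrustworthy from then on.
 */
static bool src_bitmask_usable(uint64_t bitmask)
{
	if (src_reporting_broken)
		return false;

	if (bitmask == 0) {
		src_reporting_broken = true;
		return false;
	}

	return true;
}
```

After a single zero bitmask, every later call returns false, so the
caller always falls back to polling all handlers.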
> > + if (!cpu_feature_enabled(X86_FEATURE_NMI_SOURCE) || type != NMI_LOCAL)
> > + return 0;
>
> I'm not sure I understand why you are requiring type to be NMI_LOCAL here?
>
It is only that this current implementation does not include external
NMIs. AFAIK, there are no users, i.e. no device MSIs delivered as NMIs. I
saw an effort to build an HPET NMI watchdog, but it has not materialized.
Thanks,
Jacob
On 5/29/24 13:33, Jacob Pan wrote:
> +
> + rcu_read_lock();
> + /* Bit 0 is for unknown NMI sources, skip it. */
> + for_each_set_bit_from(vec, &source_bitmask, NR_NMI_SOURCE_VECTORS) {
> + a = rcu_dereference(nmiaction_src_table[vec]);
> + if (!a) {
> + pr_warn_ratelimited("NMI received %d no handler", vec);
> + continue;
> + }
In this case, you should assume some chipset hardware or VMM is giving
you garbage in the event bitmask, and treat it as if bit 0 were set.
-hpa
Hi Peter,
On Wed, 29 May 2024 14:12:19 -0700, "H. Peter Anvin" <hpa@zytor.com> wrote:
> On 5/29/24 13:33, Jacob Pan wrote:
> > +
> > + rcu_read_lock();
> > + /* Bit 0 is for unknown NMI sources, skip it. */
> > + for_each_set_bit_from(vec, &source_bitmask, NR_NMI_SOURCE_VECTORS) {
> > + a = rcu_dereference(nmiaction_src_table[vec]);
> > + if (!a) {
> > + pr_warn_ratelimited("NMI received %d no handler", vec);
> > + continue;
> > + }
>
> In this case, you should assume some chipset hardware or VMM is giving
> you garbage in the event bitmask, and treat it as if bit 0 were set.
>
Right, it should return 0 and poll all handlers.
Thanks,
Jacob
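
One way to get that behavior is to validate the bitmask before calling
any handler, so that a handler is not run once by vector and again by
the fallback poll. The user-space sketch below assumes a hypothetical
handler_table and hypothetical helpers bitmask_has_unhandled_bit() and
dispatch_by_source(); it is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_SRC_VECTORS 16	/* hypothetical table size */

typedef int (*vec_handler_fn)(void);

/* Hypothetical per-vector handler table. */
static vec_handler_fn handler_table[NR_SRC_VECTORS];

/* Demo handler that always claims the event. */
static int demo_handler(void)
{
	return 1;
}

/* True if any set bit above bit 0 has no registered handler. */
static bool bitmask_has_unhandled_bit(uint64_t bitmask)
{
	unsigned int vec;

	for (vec = 1; vec < 64; vec++) {
		if (!(bitmask & (1ULL << vec)))
			continue;
		if (vec >= NR_SRC_VECTORS || !handler_table[vec])
			return true;
	}

	return false;
}

/* Returns 0 when the caller should poll every handler instead. */
static int dispatch_by_source(uint64_t bitmask)
{
	unsigned int vec;
	int handled = 0;

	/* Garbage in the bitmask is treated the same as bit 0 being set. */
	if ((bitmask & 1ULL) || bitmask_has_unhandled_bit(bitmask))
		return 0;

	for (vec = 1; vec < NR_SRC_VECTORS; vec++) {
		if (bitmask & (1ULL << vec))
			handled += handler_table[vec]();
	}

	return handled;
}
```

With only vector 3 registered, a bitmask that also sets bit 4 returns 0
without running any handler, leaving the work to the full poll.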