[PATCH] x86/msr: add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)

Marc Herbert posted 1 patch 3 months, 1 week ago
There is a newer version of this series
arch/x86/kernel/msr.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
[PATCH] x86/msr: add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
Posted by Marc Herbert 3 months, 1 week ago
While restricting access, commit a7e1f67ed29f ("x86/msr: Filter MSR
writes") also added warning and tainting. But the warning message never
mentioned tainting. Moreover, this uses the "CPU_OUT_OF_SPEC" flag which
is not clearly related to MSRs: that flag is overloaded by several,
fairly different situations, including some much scarier ones. So,
without an expert around (thank you Dave Hansen), it would have been
practically impossible to root cause the tainting from just the log file
at hand.  Fix this by simply appending the CPU_OUT_OF_SPEC flag to the
warning message.

This readability issue happened when staring at logs involving the
Intel Memory Latency Checker (among many other things going on in that
log). The MLC disables hardware prefetch.

Signed-off-by: Marc Herbert <marc.herbert@linux.intel.com>
---
 arch/x86/kernel/msr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/msr.c b/arch/x86/kernel/msr.c
index e17c16c54a37..21355130cc78 100644
--- a/arch/x86/kernel/msr.c
+++ b/arch/x86/kernel/msr.c
@@ -98,8 +98,8 @@ static int filter_write(u32 reg)
 	if (!__ratelimit(&fw_rs))
 		return 0;
 
-	pr_warn("Write to unrecognized MSR 0x%x by %s (pid: %d).\n",
-	        reg, current->comm, current->pid);
+	pr_warn("Write to unrecognized MSR 0x%x by %s (pid: %d), tainting %s\n",
+		reg, current->comm, current->pid, taint_flags[TAINT_CPU_OUT_OF_SPEC].desc);
 	pr_warn("See https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/about for details.\n");
 
 	return 0;

---
base-commit: dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa
change-id: 20251030-tainted-msr-4f4ec8acd95c

Best regards,
--  
Marc Herbert <marc.herbert@linux.intel.com>
Re: [PATCH] x86/msr: add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
Posted by kernel test robot 3 months, 1 week ago
Hi Marc,

kernel test robot noticed the following build errors:

[auto build test ERROR on dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa]

url:    https://github.com/intel-lab-lkp/linux/commits/Marc-Herbert/x86-msr-add-CPU_OUT_OF_SPEC-taint-name-to-unrecognized-pr_warn-msg/20251101-111139
base:   dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa
patch link:    https://lore.kernel.org/r/20251101-tainted-msr-v1-1-e00658ba04d4%40linux.intel.com
patch subject: [PATCH] x86/msr: add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
config: x86_64-randconfig-103-20251101 (https://download.01.org/0day-ci/archive/20251102/202511020456.NiTKb752-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251102/202511020456.NiTKb752-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511020456.NiTKb752-lkp@intel.com/

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "taint_flags" [arch/x86/kernel/msr.ko] undefined!

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH] x86/msr: add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
Posted by Borislav Petkov 3 months, 1 week ago
On Sat, Nov 01, 2025 at 03:10:24AM +0000, Marc Herbert wrote:
> While restricting access, commit a7e1f67ed29f ("x86/msr: Filter MSR
> writes") also added warning and tainting. But the warning message never
> mentioned tainting. Moreover, this uses the "CPU_OUT_OF_SPEC" flag which
> is not clearly related to MSRs: that flag is overloaded by several,
> fairly different situations, including some much scarier ones. So,

Taint flags are expensive and we don't have flag for everything. And when
userspace is poking at MSRs, that's similar to putting the CPU in
a out-of-specification mode of sorts. So I took what's closest.

> without an expert around (thank you Dave Hansen), it would have been
> practically impossible to root cause the tainting from just the log file
> at hand.  Fix this by simply appending the CPU_OUT_OF_SPEC flag to the
> warning message.
>
> This readability issue happened when staring at logs involving the
> Intel Memory Latency Checker (among many other things going on in that
> log). The MLC disables hardware prefetch.

Thanks for the background.

What is not clear to me is why do you need to dump the fact that it tainted
here and dump the taint flag too?

Also, why aren't you using print_tainted() if that is really necessary?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH] x86/msr: add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
Posted by Marc Herbert 3 months, 1 week ago
Thanks Boris for looking at this!

Borislav Petkov <bp@alien8.de> writes:
> Taint flags are expensive and we don't have flag for everything. And when
> userspace is poking at MSRs, that's similar to putting the CPU in
> a out-of-specification mode of sorts. So I took what's closest.

I tried to be very thorough to justify the change but I did not mean
some alternative implementation was available, sorry if I gave that
impression. I just struggled to make sense of some unrelated crash logs
and that's really all from my point of view. I mean all for now at
least.

> What is not clear to me is why do you need to dump the fact that it tainted
> here and dump the taint flag too?

You mean the "%s", tainted_flags[OOSPEC].desc bit? Not sure I understand
your question. That was just a way to re-use the existing string. But as
found by the kernel test robot, that was a mistake because
tainted_flags[] is not exported to modules. I missed that msr.c could be
compiled as a module, apologies. I've removed that in v2, I want to make
this change even simpler and append this instead:

  pr_warn( "... (pid: %d), tainting CPU_OUT_OF_SPEC.\n",

> Also, why aren't you using print_tainted()... 

AFAIK print_tainted() shows all current flags, not just
CPU_OUT_OF_SPEC. So a full print_tainted() display would not make
unambiguous which specific flag is being set, even more so considering
writing MSRs can happen a very long time after boot. But, much worse:
the CPU_OUT_OF_SPEC taint loop has not even been set yet when this
pr_warn() is run...

> ...  if that is really necessary?

This patch is not "really necessary". Its purpose is only to save hours
or even days for people trying to make sense of crash logs. In my ideal
world, it should always be easy to tell "who tainted what when" from the
logs without an corresponding expert and/or searching the source code. I
admit this could harm job security ;-)

This is not unusual, for instance TAINT_OOT_MODULE is not printed
literally in the logs yet this other pr_warn() clearly points the finger
at mod->name(s) in the logs:

  pr_warn("%s: loading out-of-tree module taints kernel, mod->name);

Some other flags have a much narrower range which makes the lack of
finger pointing a non-issue.
Re: [PATCH] x86/msr: add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)
Posted by Borislav Petkov 3 months, 1 week ago
Hi,

On Sat, Nov 01, 2025 at 08:19:27PM -0700, Marc Herbert wrote:
> I just struggled to make sense of some unrelated crash logs and that's
> really all from my point of view. I mean all for now at least.

There's the answer to my question! :-)

>   pr_warn( "... (pid: %d), tainting CPU_OUT_OF_SPEC.\n",

Right, or simply "tainting kernel" suffices. Like other code does. So that you
can find the word "taint" in the logs and you know where it comes from. As
this is what you want.

> This patch is not "really necessary". Its purpose is only to save hours
> or even days for people trying to make sense of crash logs. In my ideal
> world, it should always be easy to tell "who tainted what when" from the
> logs without an corresponding expert and/or searching the source code.

Yap, makes sense.

> I admit this could harm job security ;-)

Yeah, although there are always other, more interesting things to hack on. :-)

So yeah, v2 with the agreed upon changes sounds good.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette