Currently collect_cpu_info() is only returning what was cached earlier
instead of reading the current revision from the proper MSR.
Collect the current revision and report that value instead of reflecting
what was cached in the past.
[TBD:
Need to change microcode/amd.c. I didn't quite follow the logic since
it reports the revision from the patch file, instead of reporting the
real PATCH_LEVEL MSR.
Untested on AMD.
]
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: x86 <x86@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner (Intel) <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Stefan Talpalaru <stefantalpalaru@yahoo.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Peter Zilstra (Intel) <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Cooper <Andrew.Cooper3@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Martin Pohlack <mpohlack@amazon.de>
---
arch/x86/kernel/cpu/microcode/intel.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 467cf37ea90a..de8e591c42cd 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -542,6 +542,13 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
{
struct cpuinfo_x86 *c = &cpu_data(cpu_num);
unsigned int val[2];
+ int rev;
+
+ /*
+ * intel_get_microcode_revision() reads a per-core MSR
+ * to read the revision (MSR_IA32_UCODE_REV).
+ */
+ WARN_ON_ONCE(cpu_num != smp_processor_id());
memset(csig, 0, sizeof(*csig));
@@ -553,7 +560,9 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
csig->pf = 1 << ((val[1] >> 18) & 7);
}
- csig->rev = c->microcode;
+ rev = intel_get_microcode_revision();
+ c->microcode = rev;
+ csig->rev = rev;
return 0;
}
--
2.37.2
On 1/30/23 13:39, Ashok Raj wrote: > Currently collect_cpu_info() is only returning what was cached earlier > instead of reading the current revision from the proper MSR. > > Collect the current revision and report that value instead of reflecting > what was cached in the past. > > [TBD: > Need to change microcode/amd.c. I didn't quite follow the logic since > it reports the revision from the patch file, instead of reporting the > real PATCH_LEVEL MSR. > > Untested on AMD. > ] This thread is meandering a bit. I think it's because this changelog doesn't have a problem statement. It's hard to agree on a patch being a solution to anything if we haven't first agreed on the problem. What is the problem? What does this "fix"?
On Wed, Feb 01, 2023 at 11:13:58AM -0800, Dave Hansen wrote: > On 1/30/23 13:39, Ashok Raj wrote: > > Currently collect_cpu_info() is only returning what was cached earlier > > instead of reading the current revision from the proper MSR. > > > > Collect the current revision and report that value instead of reflecting > > what was cached in the past. > > > > [TBD: > > Need to change microcode/amd.c. I didn't quite follow the logic since > > it reports the revision from the patch file, instead of reporting the > > real PATCH_LEVEL MSR. > > > > Untested on AMD. > > ] > > This thread is meandering a bit. I think it's because this changelog > doesn't have a problem statement. It's hard to agree on a patch being a > solution to anything if we haven't first agreed on the problem. > > What is the problem? I alluded here.. But yes, clearly missed in the commit log. https://lore.kernel.org/lkml/Y9mW7EiL%2FBpYFLWn@a4bf019067fa.jf.intel.com/ Thomas alluded here https://lore.kernel.org/lkml/87y1pygiyf.ffs@tglx/ that error handling in __reload_late()::wait_for_siblings() code patch is completely broken. This is one that I "assumed" he was referring to, since all we need is to update the current revision, but we end up depending on the behavior of apply_microcode() and that might accidentally have some side effects. Instead only call the collect_cpu_info() and allow that to update the per-cpu revision instead. And there is no risk in performing that vs accidentally letting it fall through with an apply_microcode() that might have risks. > > What does this "fix"? The code performs this delicate late-load dance to prevent sibling threads to be quiet while performing the update. At wait_for_siblings() when all threads arrive, then the sibling does the apply_microcode() which seems wrong.
On Mon, Jan 30, 2023 at 01:39:49PM -0800, Ashok Raj wrote:
> Currently collect_cpu_info() is only returning what was cached earlier
> instead of reading the current revision from the proper MSR.
Because this is how this is supposed to work: you collect what's
currently applied - which should be exactly the same as what's in the
MSR - and then see if there's a new patch in the cache and if so, update
it.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
>> Currently collect_cpu_info() is only returning what was cached earlier >> instead of reading the current revision from the proper MSR. > > Because this is how this is supposed to work: you collect what's > currently applied - which should be exactly the same as what's in the > MSR - and then see if there's a new patch in the cache and if so, update > it. But those get out of step when applying ucode on one logical CPU does an update to other(s) (in this case the HT sibling for the same core). Would you prefer that Ashok leave collect_cpu_info() as-is, and create a new function to read the MSR? -Tony
On Tue, Jan 31, 2023 at 05:34:14PM +0000, Luck, Tony wrote:
> But those get out of step when applying ucode on one logical CPU does
> an update to other(s) (in this case the HT sibling for the same core).
They shouldn't.
I presume you're talking about late update. If so and if it finds a
patch in the cache, it'll do this:
apply_microcode_intel:
/*
* Save us the MSR write below - which is a particular expensive
* operation - when the other hyperthread has updated the microcode
* already.
*/
rev = intel_get_microcode_revision();
if (rev >= mc->hdr.rev) {
ret = UCODE_OK;
goto out;
}
and at the out: label it'll update the revision.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
> I presume you're talking about late update. If so and if it finds a
> patch in the cache, it'll do this:
Yes. This is about late update.
>
> apply_microcode_intel:
>
> /*
> * Save us the MSR write below - which is a particular expensive
> * operation - when the other hyperthread has updated the microcode
> * already.
> */
> rev = intel_get_microcode_revision();
What happens here if the update on the first hyperthread failed (sure, it shouldn't,
but stuff happens at large scale). In this case the current rev is still older that the
the cache version ... so there is no "goto out", and this hyperthread will now write
the MSR to initiate microcode update here, while the first thread is off executing
arbitrary code (the situation that we want to avoid).
> if (rev >= mc->hdr.rev) {
> ret = UCODE_OK;
> goto out;
> }
>
> and at the out: label it'll update the revision.
-Tony
On Tue, Jan 31, 2023 at 08:49:52PM +0000, Luck, Tony wrote:
> What happens here if the update on the first hyperthread failed (sure, it shouldn't,
> but stuff happens at large scale). In this case the current rev is still older that the
> the cache version ... so there is no "goto out", and this hyperthread will now write
> the MSR to initiate microcode update here, while the first thread is off executing
> arbitrary code (the situation that we want to avoid).
Lemme see if I can follow: we sync all threads in __reload_late() and
once they all arrive, we send them down into ->apply_microcode.
T0 arrives, and fails the update. That is this piece:
/* write microcode via MSR 0x79 */
wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc->bits);
rev = intel_get_microcode_revision();
if (rev != mc->hdr.rev) {
pr_err("CPU%d update to revision 0x%x failed\n",
cpu, mc->hdr.rev);
return UCODE_ERROR;
}
We return here without updating cpu_sig.rev, as we should.
T1 arrives, updates successfully and updates its cpu_sig.rev.
T0's patch level has been updated too with that because the microcode
engine is shared between the threads. T0's cpu_sig.rev isn't, however,
as that has happened "behind its back", so to speak.
Is that the scenario you're talking about?
If so, if you look at __reload_late(), it'll say
pr_warn("Error reloading microcode on CPU %d\n", cpu);
and the large scale operator will know.
And well, the easy fix is, do the reload again. :-)
That'll update the cached values too.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
> T0 arrives, and fails the update. That is this piece:
>
> /* write microcode via MSR 0x79 */
> wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc->bits);
>
> rev = intel_get_microcode_revision();
>
> if (rev != mc->hdr.rev) {
> pr_err("CPU%d update to revision 0x%x failed\n",
> cpu, mc->hdr.rev);
> return UCODE_ERROR;
> }
>
> We return here without updating cpu_sig.rev, as we should.
>
> T1 arrives, updates successfully and updates its cpu_sig.rev.
In an ideal world yes. But what if T1 arrives here and tries to do the
update while T0, which has returned out of the microcode update
code and could be doing anything, happen to be doing WRMSR(some MSR
that the ucode update is tinkering with).
Now T0 explodes (not literally, I hope!) but does something crazy because
it was in the middle of some microcode flow that got updated between two
operations.
-Tony
On Tue, Jan 31, 2023 at 10:43:23PM +0000, Luck, Tony wrote:
> In an ideal world yes. But what if T1 arrives here and tries to do the
> update while T0, which has returned out of the microcode update
> code and could be doing anything, happen to be doing WRMSR(some MSR
> that the ucode update is tinkering with).
>
> Now T0 explodes (not literally, I hope!) but does something crazy because
> it was in the middle of some microcode flow that got updated between two
> operations.
So first of all, I'm wondering whether the scenario you're chasing is
something completely hypothetical or you're actually thinking of
something concrete which has actually happened or there's high potential
for it.
In that case, that late patching sync algorithm would need to be made
more robust to handle cases like that.
Because from what I'm reading above, this doesn't sound like the
reporting is wrong only but more like, if T0 fails the update and T1
gets to do that update for a change, then crap can happen.
Which means, our update dance cannot handle that case properly.
Hmmm...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
> So first of all, I'm wondering whether the scenario you're chasing is > something completely hypothetical or you're actually thinking of > something concrete which has actually happened or there's high potential > for it. Rare, but very real. For one of the speculative side channel microcode updates there were some reports from large data centers that the kernel choked on a WRMSR(IA32_SPEC_CTRL) > In that case, that late patching sync algorithm would need to be made > more robust to handle cases like that. " Yup. when you get through the cleanups and the "minrev" update Ashok has a handful of final[1] patches that add robustness. -Tony [1] Yes, there is an end in sight to all these microcode changes.
On Wed, Feb 01, 2023 at 01:53:32PM +0100, Borislav Petkov wrote: > On Tue, Jan 31, 2023 at 10:43:23PM +0000, Luck, Tony wrote: > > In an ideal world yes. But what if T1 arrives here and tries to do the > > update while T0, which has returned out of the microcode update > > code and could be doing anything, happen to be doing WRMSR(some MSR > > that the ucode update is tinkering with). > > > > Now T0 explodes (not literally, I hope!) but does something crazy because > > it was in the middle of some microcode flow that got updated between two > > operations. > > So first of all, I'm wondering whether the scenario you're chasing is > something completely hypothetical or you're actually thinking of > something concrete which has actually happened or there's high potential > for it. > > In that case, that late patching sync algorithm would need to be made > more robust to handle cases like that. That's correct. But fundamentally we sent the sibling down the apply_microcode() path just to make sure the per-thread info is updated. It appears the code is using a side effect that the revision got updated even though we don't actually intend to perform a wrmsr on the sibling in the normal case that primary completes the update. If the purpose is only to update the revision, using the collect_cpu_info() which seems more appropriate for that purpose, and doesn't have any implied issues with using a wrmsr flow. It's not broken today, but the code isn't future proof. Calling the revision update only keeps those questions at bay. I think this is what Thomas implied to cleanup in his comments. > > Because from what I'm reading above, this doesn't sound like the > reporting is wrong only but more like, if T0 fails the update and T1 > gets to do that update for a change, then crap can happen. > > Which means, our update dance cannot handle that case properly. > It doesn't need to if we don't do an apply_microcode() for the sibling. Cheers, Ashok
On Wed, Feb 01, 2023 at 07:13:30AM -0800, Ashok Raj wrote:
> If the purpose is only to update the revision, using the collect_cpu_info()
> which seems more appropriate for that purpose,
collect_cpu_info() is not called "update local revision" so no, it is
not appropriate.
I can't understand the rest of your handwaving so I'll give you the
usual spiel: first, define exactly what the real-life problem is you're
trying to solve.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Tue, Jan 31, 2023 at 10:08:48PM +0100, Borislav Petkov wrote:
> On Tue, Jan 31, 2023 at 08:49:52PM +0000, Luck, Tony wrote:
> > What happens here if the update on the first hyperthread failed (sure, it shouldn't,
> > but stuff happens at large scale). In this case the current rev is still older that the
> > the cache version ... so there is no "goto out", and this hyperthread will now write
> > the MSR to initiate microcode update here, while the first thread is off executing
> > arbitrary code (the situation that we want to avoid).
>
> Lemme see if I can follow: we sync all threads in __reload_late() and
> once they all arrive, we send them down into ->apply_microcode.
I was interpreting Thomas's response here
https://lore.kernel.org/lkml/87y1pygiyf.ffs@tglx/
---------
#3
Not to talk about the completely broken error handling in the actual
microcode loading case in __reload_late()::wait_for_siblings code path.
---------
I thought sending sibling down the apply_microcode() just to update
revision might have bad interaction, so revise this to be more explicit and
update only the revision number and not have any other intended side
effect.
Patch3 was a prep-patch for patch4, to eliminate the call to
apply_microcode(). Maybe that wasn't the issue?
It's likely Thomas hinted at somthing else and I took the wrong hint.
>
> T0 arrives, and fails the update. That is this piece:
>
> /* write microcode via MSR 0x79 */
> wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc->bits);
>
> rev = intel_get_microcode_revision();
>
> if (rev != mc->hdr.rev) {
> pr_err("CPU%d update to revision 0x%x failed\n",
> cpu, mc->hdr.rev);
> return UCODE_ERROR;
> }
>
> We return here without updating cpu_sig.rev, as we should.
>
> T1 arrives, updates successfully and updates its cpu_sig.rev.
>
> T0's patch level has been updated too with that because the microcode
> engine is shared between the threads. T0's cpu_sig.rev isn't, however,
> as that has happened "behind its back", so to speak.
>
> Is that the scenario you're talking about?
The fix in patch4 was just attempting to not call apply_microcode(), since
at this time the other CPUs that arrived in wait_for_siblings: have left,
doing an update when the others are strictly not in renedzvous is the
condition we try to avoid.
Again, maybe this is me reading Thomas's request incorrectly.
Cheers,
Ashok
On Tue, Jan 31, 2023 at 09:34:14AM -0800, Tony Luck wrote:
> >> Currently collect_cpu_info() is only returning what was cached earlier
> >> instead of reading the current revision from the proper MSR.
> >
> > Because this is how this is supposed to work: you collect what's
> > currently applied - which should be exactly the same as what's in the
> > MSR - and then see if there's a new patch in the cache and if so, update
> > it.
>
> But those get out of step when applying ucode on one logical CPU does
> an update to other(s) (in this case the HT sibling for the same core).
>
> Would you prefer that Ashok leave collect_cpu_info() as-is, and create
> a new function to read the MSR?
>
Untested patch below. I think we could add a new function if needed as you
suggest.
Boris, do you have the same dependency to take the siblings through an
update to update some portions of the ucode? or they are only for early
loading?
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index d144f918a896..0038690a5d4b 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -673,8 +673,10 @@ static int collect_cpu_info_amd(int cpu, struct cpu_signature *csig)
* mc_bp_resume() can call apply_microcode()
*/
p = find_patch(cpu);
- if (p && (p->patch_id == csig->rev))
+ if (p) {
+ csig->rev = c->microcode = ((struct microcode_amd *)p->data)->hdr.patch_id;
uci->mc = p->data;
+ }
pr_info("CPU%d: patch_level=0x%08x\n", cpu, csig->rev);
© 2016 - 2025 Red Hat, Inc.