[v1] x86/microcode: Support for Intel Staging Feature

[PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Chang S. Bae 1 year, 1 month ago

Hi all,

Changes since the RFC posting [1]:

* Simplified the staging address discovery code. Leveraging the staging
  topology, stage only if package id changes (Thomas).

* Cleaned up the MSR read logic (Boris and Dave).

* Renamed functions to align with the do_something naming convention
  (Boris).

* Polished staging result messages (Boris).

* Dropped the WBINVD removal as mainlined now.

This series is based on 6.13-rc2. You can also find it from this repo:
    git://github.com/intel-staging/microcode.git staging_v1

I would appreciate further reviews and feedback.

Thanks,
Chang

---

Here is the original cover letter with minor updates -- removing the
WBINVD story and updating the function names:

== Latency Spike Issue ==

As microcode images have increased in size, a corresponding rise in load
latency has become inevitable. This latency spike significantly impacts
late loading, which remains in use despite the cautions highlighted in
the documentation [2]. The issue is especially critical for continuously
running workloads and virtual machines, where excessive delays can lead
to timeouts.

== Staging for Latency Reduction ==

Currently, writing to MSR_IA32_UCODE_WRITE triggers the entire update
process -- loading, validating, and activation -- all of which contribute
to the latency during CPU halt. The staging feature mitigates this by
refactoring all but the activation step out of the critical path,
allowing CPUs to continue serving workloads while staging takes place.

== Validation ==

We internally established pseudocode to clearly define all essential
steps for interacting with the firmware. Any firmware implementation
supporting staging should adhere to this contract. This patch set
incorporates that staging logic, which I successfully tested on one
firmware implementation. Multiple teams at Intel have also validated the
feature across different implementations.

Preliminary results from a pre-production system show a significant
reduction in latency (about 40%) with the staging approach alone.
Further improvements are possible with additional optimizations [*].

== Call for Review ==

Here are several key points to highlight for feedback:

  1. Staging Integration Approach:

     In the core code, the high-level sequence for late loading is:

     (1) request_microcode_fw(), and
     (2) load_late_stop_cpus()->apply_microcode()

     Staging doesn't fit neatly into either steps, as it involves the
     loading process but not the activation. Therefore, a new callback is
     introduced:

       core::load_late_locked()
       -> intel::staging_microcode()
          -> intel_staging::do_stage()

  2. Code Abstraction:

     The newly added intel_staging.c file contains all staging-related
     code to keep it self-contained. Ideally, the entire firmware
     interaction could eventually be abstracted into a single MSR write,
     which remains a long-term goal. Fortunately, recent protocol
     simplifications have made this more feasible.

  3. Staging Policy (TODO):

     While staging is always attempted, the system will fall back to the
     legacy update method if staging fails. There is an open question
     regarding staging policy: should it be mandatory, without fallback,
     in certain usage scenarios? This could lead further refinements in
     the flow depending on feedback and use cases.

  4. Specification Updates

     Recent specification updates have simplified the staging protocol
     and clarified the behavior of MSR_IA32_UCODE_WRITE in conjunction
     with staging:

     4.1. Protocol Simplification

     The specification update [3] has significantly reduced the
     complexity of staging code, trimming the kernel code from ~1K lines
     in preliminary implementations. Thanks to Dave for guiding this
     redesign effort.

     4.2. Clarification of Legacy Update Behavior

     Chapter 5 of the specification adds further clarification on
     MSR_IA32_UCODE_WRITE. Key points are summarized below:

     (a) When staging is not performed or failed, a WRMSR will still load
     the patch image, but with higher latency.

     (b) During an active staging process, MSR_IA32_UCODE_WRITE can
     load a new microcode image, again with higher latency.

     (c) If the versions differ between the staged microcode and the
     version loaded via MSR_IA32_UCODE_WRITE, the version loaded through
     the MSR takes precedence.

     I'd also make sure there is no further ambiguity in this documentation
     [3]. Feel free to provide feedback if anything seems unclear or
     unreasonable.

As noted [*], an additional series focused on further latency
optimizations will follow. However, the staging approach was prioritized
due to its significant first-order impact on latency.

[1]: https://lore.kernel.org/all/20241001161042.465584-1-chang.seok.bae@intel.com/
[2]: https://docs.kernel.org/arch/x86/microcode.html#why-is-late-loading-dangerous
[3]: https://cdrdv2.intel.com/v1/dl/getContent/782715
[*]: Further latency improvements will be addressed in the upcoming
     ‘Uniform’ feature series.

Chang S. Bae (6):
  x86/microcode: Introduce staging option to reduce late-loading latency
  x86/msr-index: Define MSR index and bit for the microcode staging
    feature
  x86/microcode/intel: Prepare for microcode staging
  x86/microcode/intel_staging: Implement staging logic
  x86/microcode/intel_staging: Support mailbox data transfer
  x86/microcode/intel: Enable staging when available

 arch/x86/include/asm/msr-index.h              |   9 ++
 arch/x86/kernel/cpu/microcode/Makefile        |   2 +-
 arch/x86/kernel/cpu/microcode/core.c          |  12 +-
 arch/x86/kernel/cpu/microcode/intel.c         |  53 +++++++
 arch/x86/kernel/cpu/microcode/intel_staging.c | 149 ++++++++++++++++++
 arch/x86/kernel/cpu/microcode/internal.h      |   7 +-
 6 files changed, 228 insertions(+), 4 deletions(-)
 create mode 100644 arch/x86/kernel/cpu/microcode/intel_staging.c

-- 
2.45.2

Re: [PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Chang S. Bae 1 year ago

On 12/10/2024 5:42 PM, Chang S. Bae wrote:
> 
> This series is based on 6.13-rc2. You can also find it from this repo:
>      git://github.com/intel-staging/microcode.git staging_v1

I've rebased on 6.14-rc1 and found no conflict, pushing to the repo:
     git://github.com/intel-staging/microcode.git staging_v1-for-6.14-rc1

>    3. Staging Policy (TODO):
> 
>       While staging is always attempted, the system will fall back to the
>       legacy update method if staging fails. There is an open question
>       regarding staging policy: should it be mandatory, without fallback,
>       in certain usage scenarios? This could lead further refinements in
>       the flow depending on feedback and use cases.

While I noted this as "TODO", the policy part may require a separate 
follow-up.

At the moment, I’d like to assess its current state and ensure the 
review process covers details up to the mailbox handling logic, while 
fully respecting the maintainers' schedule.

Also, feedback from those considering this feature is highly appreciated 
here. If the series looks good, I hope to collect ack or review tags to 
help move it forward.

Thanks,
Chang

Re: [PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Colin Mitchell 11 months, 2 weeks ago

As a potential user, I'd advocate for an option to disable legacy
loading if staging is supported.

The difference in quiesce time between staging and legacy loading
can be significant. Since late loading is more of an explicit active
action, I would believe allowing the initiating process freedom of a retry
loop or handling any errors from staging makes sense.

Presumably load_late_locked could return an error on failure
to stage leading to an appropriate print.

Re: [PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Dave Hansen 11 months, 2 weeks ago

On 2/28/25 14:27, Colin Mitchell wrote:
> As a potential user, I'd advocate for an option to disable legacy
> loading if staging is supported.
> 
> The difference in quiesce time between staging and legacy loading
> can be significant. Since late loading is more of an explicit active
> action, I would believe allowing the initiating process freedom of a retry
> loop or handling any errors from staging makes sense.
> 
> Presumably load_late_locked could return an error on failure
> to stage leading to an appropriate print.

Hey Colin!

You might not have noticed, but this series hasn't even been merged yet.
If this series is an important thing to your employer, I'd really
appreciate some reviews on it from you or anyone else to whom it is
important. That would really speed things along!

Requesting new features to be piled on top of an out-of-tree
under-reviewed patch set is likely to slow things down more than speed
them up.

Re: [PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Borislav Petkov 11 months, 2 weeks ago

On Fri, Feb 28, 2025 at 02:27:15PM -0800, Colin Mitchell wrote:
> As a potential user, I'd advocate for an option to disable legacy
> loading if staging is supported.

What happens if staging is failing indefinitely, for whatever reason?

You can't load any microcode anymore if you've disabled the legacy method
too.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Dave Hansen 11 months, 2 weeks ago

On 2/28/25 14:52, Borislav Petkov wrote:
> On Fri, Feb 28, 2025 at 02:27:15PM -0800, Colin Mitchell wrote:
>> As a potential user, I'd advocate for an option to disable legacy
>> loading if staging is supported.
> What happens if staging is failing indefinitely, for whatever reason?
> 
> You can't load any microcode anymore if you've disabled the legacy method
> too.

Yeah, I'm perplexed by this choice too.

You seem to be saying that you'd rather be (for instance) insecure
running old microcode than have the latency blip from a legacy microcode
load.

What action would you take if a staging-load fails? Retry again a few
times? Go back to the CPU vendor and get a new image? Or just ignore it?

Re: [PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Colin Mitchell 10 months, 2 weeks ago

> On 2/28/25 14:52, Borislav Petkov wrote:
> You can't load any microcode anymore if you've disabled the legacy method
> too.

Staging, if I've read the code correctly here, is only used for late loading.
There is performance reason to use staging for early kernel microcode 
loading pre-SMP. Therefore, if staging perpetually fails, it can be applied
without staging on the next reboot.

> On 2/28/25 15:23, Dave Hansen wrote:
> You seem to be saying that you'd rather be (for instance) insecure
> running old microcode than have the latency blip from a legacy microcode
> load.
> What action would you take if a staging-load fails? Retry again a few
> times? Go back to the CPU vendor and get a new image? Or just ignore it?

That's correct, but the latency tradeoff scales with the platform specific
size of the microcode patch. I'd prefer to have a more deterministic
update path and believe the potential latency blip would be significant
enough to justify the option.

Adding configuration would allow me to handle the error as needed.
A retry loop would be a first step but I could also look to migrate VMs
off the machine if the platform specific latency blip would negatively 
affect sensitive guest VMs. While an ideal solution imo would then
allow me to force legacy loading, I could also settle with it being done
through a reboot where early boot would already skip staging.

Re: [PATCH 0/6] x86/microcode: Support for Intel Staging Feature

Posted by Dave Hansen 10 months, 1 week ago

On 3/26/25 14:29, Colin Mitchell wrote:
>> On 2/28/25 15:23, Dave Hansen wrote:
>> You seem to be saying that you'd rather be (for instance) insecure
>> running old microcode than have the latency blip from a legacy microcode
>> load.
>> What action would you take if a staging-load fails? Retry again a few
>> times? Go back to the CPU vendor and get a new image? Or just ignore it?
> That's correct, but the latency tradeoff scales with the platform specific
> size of the microcode patch. I'd prefer to have a more deterministic
> update path and believe the potential latency blip would be significant
> enough to justify the option.
> 
> Adding configuration would allow me to handle the error as needed.
> A retry loop would be a first step but I could also look to migrate VMs
> off the machine if the platform specific latency blip would negatively 
> affect sensitive guest VMs. While an ideal solution imo would then
> allow me to force legacy loading, I could also settle with it being done
> through a reboot where early boot would already skip staging.

There's a lot to unpack there.

But, for the purposes of this series, I think what's here is fine for
now. Let's leave staging _purely_ as an opportunistic optimization.

If folks want to make this more configurable like making staging
*mandatory* and disabling legacy loading then we'll look at the patches
(and their justifications) as folks submit them. A good justification
would be something along these lines:

	Legacy microcode loading causes a 5,000ms latency blip. Our
	customers have been complaining to us for years about those
	legacy loading blips. Migrating a VM causes a 1ms latency blip.
	Those 4,999ms mean a lot to the folks running those VMs. As a
	CSP, we would like the flexibility to avoid the gigantic legacy
	microcode loading blips because they are bad and getting worse.

It becomes less compelling if it's something like this:

	Legacy microcode loading causes a 50ms latency blip. Migrating a
	VM causes a 49ms latency blip. That millisecond is super
	important.

... and increasingly less so as it becomes:

	We like knobs and flexibility for $REASONS.

You don't have to have down-to-the-millisecond numbers here. Orders of
magnitude are fine. But if you can't demonstrate (or don't anticipate)
orders of magnitude improvement from the knob, then it's probably not
worth it. It better be a 10x improvement, not 10%.