[PATCH v4 0/6] x86: Support for Intel Microcode Staging Feature
Posted by Chang S. Bae 1 month, 3 weeks ago
Dear x86 maintainers,

This is another posting of the series, now including a Tested-by tag.
There are no code changes.

The use case for this feature is late loading where load times have
increased, most likely in cloud service environments. I reached out to
potential users for reviews and feedback, and Anselm Busse has confirmed
his testing.

While additional reviews or endorsements could help ease the maintainers'
burden and build confidence, there hasn't been much recent traction. At
this point, I believe it's better to move forward and get a clear
assessment from the maintainers -- either to prompt more engagement from
prospective users or to determine its readiness for merging.

Of course, I appreciate any further feedback on this set.

As usual, the patch series is based on tip/master and is also available
in this repo:
  git://github.com/intel-staging/microcode.git staging_v4

Last posting (V3):
  https://lore.kernel.org/lkml/20250409232713.4536-1-chang.seok.bae@intel.com/

Thanks,
Chang

Chang S. Bae (6):
  x86/microcode: Introduce staging step to reduce late-loading time
  x86/microcode/intel: Establish staging control logic
  x86/microcode/intel: Define staging state struct
  x86/microcode/intel: Implement staging handler
  x86/microcode/intel: Support mailbox transfer
  x86/microcode/intel: Enable staging when available

 arch/x86/include/asm/msr-index.h         |   9 +
 arch/x86/include/asm/topology.h          |   1 +
 arch/x86/kernel/cpu/microcode/core.c     |  11 +
 arch/x86/kernel/cpu/microcode/intel.c    | 338 +++++++++++++++++++++++
 arch/x86/kernel/cpu/microcode/internal.h |   4 +-
 5 files changed, 362 insertions(+), 1 deletion(-)


base-commit: ca76508b9352e8c770b58213cc6c4700e459b7c2
-- 
2.48.1
[PATCH v5 0/7] x86: Support for Intel Microcode Staging Feature
Posted by Chang S. Bae 1 month, 1 week ago
Hi all,

This is another iteration -- changes since v4 [*]:

  * Reworked the preparatory change before referencing
    cpu_primary_thread_mask, based on Dave’s feedback. This is now patch1.

  * Incorporated further feedback from Dave on the staging code, mainly
    to improve clarity, reduce ambiguity, and fix minor issues. Each
    patch includes some details.

  * Collected Chao’s review tag (thanks!) for the first and last patches.

As usual, the series is also available here:
  git://github.com/intel-staging/microcode.git staging_v5

[*] https://lore.kernel.org/lkml/20250813172649.15474-1-chang.seok.bae@intel.com/

Thanks,
Chang

Chang S. Bae (7):
  x86/cpu/topology: Make primary thread mask available with SMP=n
  x86/microcode: Introduce staging step to reduce late-loading time
  x86/microcode/intel: Establish staging control logic
  x86/microcode/intel: Define staging state struct
  x86/microcode/intel: Implement staging handler
  x86/microcode/intel: Support mailbox transfer
  x86/microcode/intel: Enable staging when available

 arch/x86/include/asm/msr-index.h         |   9 +
 arch/x86/include/asm/topology.h          |  12 +-
 arch/x86/kernel/cpu/microcode/core.c     |  11 +
 arch/x86/kernel/cpu/microcode/intel.c    | 386 +++++++++++++++++++++++
 arch/x86/kernel/cpu/microcode/internal.h |   4 +-
 arch/x86/kernel/cpu/topology.c           |   4 -
 arch/x86/kernel/cpu/topology_common.c    |   3 +
 arch/x86/kernel/smpboot.c                |   3 -
 8 files changed, 418 insertions(+), 14 deletions(-)


base-commit: 7182bf4176f93be42225d2ef983894febfa4a1b1
-- 
2.48.1

Re: [PATCH v5 0/7] x86: Support for Intel Microcode Staging Feature
Posted by Luck, Tony 1 month, 1 week ago
On Sat, Aug 23, 2025 at 08:52:03AM -0700, Chang S. Bae wrote:
> This is another iteration -- changes since v4 [*]:
[...]

Reviewed-by: Tony Luck <tony.luck@intel.com>

-Tony
Re: [PATCH v5 0/7] x86: Support for Intel Microcode Staging Feature
Posted by Chang S. Bae 1 month, 1 week ago
On 8/26/2025 3:13 PM, Luck, Tony wrote:
> 
> Reviewed-by: Tony Luck <tony.luck@intel.com>

Thanks!
[PATCH v5 1/7] x86/cpu/topology: Make primary thread mask available with SMP=n
Posted by Chang S. Bae 1 month, 1 week ago
cpu_primary_thread_mask is only defined when CONFIG_SMP=y. However, even
in UP kernels there is always exactly one CPU, which can reasonably be
treated as the primary thread.

Historically, topology_is_primary_thread() always returned true with
CONFIG_SMP=n. A recent commit:

  4b455f59945aa ("cpu/SMT: Provide a default topology_is_primary_thread()")

replaced it with a generic implementation, noting:

  "When disabling SMT, the primary thread of the SMT will remain
   enabled/active. Architectures that have a special primary thread (e.g.
   x86) need to override this function. ..."

For consistency and clarity, make the primary thread mask available
regardless of SMP, similar to cpu_possible_mask and cpu_present_mask.

Move __cpu_primary_thread_mask into common code to prevent build issues.
Let cpu_mark_primary_thread() configure the mask even for UP kernels,
alongside other masks. Then, topology_is_primary_thread() can
consistently reference it.
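
For reference, after this change the existing x86 helper then works the
same way on both SMP and UP builds:

  static inline bool topology_is_primary_thread(unsigned int cpu)
  {
  	return cpumask_test_cpu(cpu, cpu_primary_thread_mask);
  }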

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
V4 -> V5: New patch

Preparatory patch to set up the mask correctly before its new use in
patch3.
---
 arch/x86/include/asm/topology.h       | 12 ++++++------
 arch/x86/kernel/cpu/topology.c        |  4 ----
 arch/x86/kernel/cpu/topology_common.c |  3 +++
 arch/x86/kernel/smpboot.c             |  3 ---
 4 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index 6c79ee7c0957..281252af6e9d 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -218,6 +218,12 @@ static inline unsigned int topology_amd_nodes_per_pkg(void)
 	return __amd_nodes_per_pkg;
 }
 
+#else /* CONFIG_SMP */
+static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
+static inline int topology_max_smt_threads(void) { return 1; }
+static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
+#endif /* !CONFIG_SMP */
+
 extern struct cpumask __cpu_primary_thread_mask;
 #define cpu_primary_thread_mask ((const struct cpumask *)&__cpu_primary_thread_mask)
 
@@ -231,12 +237,6 @@ static inline bool topology_is_primary_thread(unsigned int cpu)
 }
 #define topology_is_primary_thread topology_is_primary_thread
 
-#else /* CONFIG_SMP */
-static inline int topology_phys_to_logical_pkg(unsigned int pkg) { return 0; }
-static inline int topology_max_smt_threads(void) { return 1; }
-static inline unsigned int topology_amd_nodes_per_pkg(void) { return 1; }
-#endif /* !CONFIG_SMP */
-
 static inline void arch_fix_phys_package_id(int num, u32 slot)
 {
 }
diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c
index e35ccdc84910..f083023f7dd9 100644
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -75,15 +75,11 @@ bool arch_match_cpu_phys_id(int cpu, u64 phys_id)
 	return phys_id == (u64)cpuid_to_apicid[cpu];
 }
 
-#ifdef CONFIG_SMP
 static void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid)
 {
 	if (!(apicid & (__max_threads_per_core - 1)))
 		cpumask_set_cpu(cpu, &__cpu_primary_thread_mask);
 }
-#else
-static inline void cpu_mark_primary_thread(unsigned int cpu, unsigned int apicid) { }
-#endif
 
 /*
  * Convert the APIC ID to a domain level ID by masking out the low bits
diff --git a/arch/x86/kernel/cpu/topology_common.c b/arch/x86/kernel/cpu/topology_common.c
index b5a5e1411469..71625795d711 100644
--- a/arch/x86/kernel/cpu/topology_common.c
+++ b/arch/x86/kernel/cpu/topology_common.c
@@ -16,6 +16,9 @@ EXPORT_SYMBOL_GPL(x86_topo_system);
 unsigned int __amd_nodes_per_pkg __ro_after_init;
 EXPORT_SYMBOL_GPL(__amd_nodes_per_pkg);
 
+/* CPUs which are the primary SMT threads */
+struct cpumask __cpu_primary_thread_mask __read_mostly;
+
 void topology_set_dom(struct topo_scan *tscan, enum x86_topology_domains dom,
 		      unsigned int shift, unsigned int ncpus)
 {
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 33e166f6ab12..7804175d2d87 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -103,9 +103,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_die_map);
 EXPORT_PER_CPU_SYMBOL(cpu_die_map);
 
-/* CPUs which are the primary SMT threads */
-struct cpumask __cpu_primary_thread_mask __read_mostly;
-
 /* Representing CPUs for which sibling maps can be computed */
 static cpumask_var_t cpu_sibling_setup_mask;
 
-- 
2.48.1
[PATCH v5 2/7] x86/microcode: Introduce staging step to reduce late-loading time
Posted by Chang S. Bae 1 month, 1 week ago
As microcode patch sizes continue to grow, late-loading latency spikes
can lead to timeouts and disruptions in running workloads. This trend of
increasing patch sizes is expected to continue, so a foundational
solution is needed to address the issue.

To mitigate the problem, a new staging feature is introduced. This option
processes most of the microcode update (excluding activation) on a
non-critical path, allowing CPUs to remain operational during the
majority of the update. By offloading work from the critical path,
staging can significantly reduces latency spikes.

Integrate staging as a preparatory step in late-loading. Introduce a new
callback for staging, which is invoked at the beginning of
load_late_stop_cpus(), before CPUs enter the rendezvous phase.

Staging follows an opportunistic model:

  *  If successful, it reduces CPU rendezvous time.
  *  If it fails, the process falls back to the legacy path to finish
     the loading process, but with potentially higher latency.

Extend struct microcode_ops to incorporate staging properties, which will
be implemented in the vendor code separately.
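
For illustration, the vendor-side wiring then looks roughly like the
sketch below. The callback assignment matches patch 3 of this series;
staging_is_available() is a hypothetical predicate standing in for the
enumeration check added in patch 7:

  static struct microcode_ops microcode_intel_ops = {
  	.collect_cpu_info	= collect_cpu_info,
  	.apply_microcode	= apply_microcode_late,
  	.finalize_late_load	= finalize_late_load,
  	.stage_microcode	= stage_microcode,
  	.use_nmi		= IS_ENABLED(CONFIG_X86_64),
  };

  /* At init time, opt in only when the hardware supports staging: */
  if (staging_is_available())
  	microcode_intel_ops.use_staging = true;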

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Anselm Busse <abusse@amazon.de>
Reviewed-by: Chao Gao <chao.gao@intel.com>
---
V4 -> V5:
* Collect Chao's review tag

V1 -> V2:
* Move invocation inside of load_late_stop_cpus() (Boris)
* Add more note about staging (Dave)

There were discussions about whether staging success should be enforced
by a configurable option. That topic is identified as follow-up work,
separate from this series.
    https://lore.kernel.org/lkml/54308373-7867-4b76-be34-63730953f83c@intel.com/
---
 arch/x86/kernel/cpu/microcode/core.c     | 11 +++++++++++
 arch/x86/kernel/cpu/microcode/internal.h |  4 +++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index b92e09a87c69..34e569ee1db2 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -552,6 +552,17 @@ static int load_late_stop_cpus(bool is_safe)
 		pr_err("You should switch to early loading, if possible.\n");
 	}
 
+	/*
+	 * Pre-load the microcode image into a staging device. This
+	 * process is preemptible and does not require stopping CPUs.
+	 * Successful staging simplifies the subsequent late-loading
+	 * process, reducing rendezvous time.
+	 *
+	 * Even if the transfer fails, the update will proceed as usual.
+	 */
+	if (microcode_ops->use_staging)
+		microcode_ops->stage_microcode();
+
 	atomic_set(&late_cpus_in, num_online_cpus());
 	atomic_set(&offline_in_nmi, 0);
 	loops_per_usec = loops_per_jiffy / (TICK_NSEC / 1000);
diff --git a/arch/x86/kernel/cpu/microcode/internal.h b/arch/x86/kernel/cpu/microcode/internal.h
index 50a9702ae4e2..adf02ebbf7a3 100644
--- a/arch/x86/kernel/cpu/microcode/internal.h
+++ b/arch/x86/kernel/cpu/microcode/internal.h
@@ -31,10 +31,12 @@ struct microcode_ops {
 	 * See also the "Synchronization" section in microcode_core.c.
 	 */
 	enum ucode_state	(*apply_microcode)(int cpu);
+	void			(*stage_microcode)(void);
 	int			(*collect_cpu_info)(int cpu, struct cpu_signature *csig);
 	void			(*finalize_late_load)(int result);
 	unsigned int		nmi_safe	: 1,
-				use_nmi		: 1;
+				use_nmi		: 1,
+				use_staging	: 1;
 };
 
 struct early_load_data {
-- 
2.48.1
Re: [PATCH v5 2/7] x86/microcode: Introduce staging step to reduce late-loading time
Posted by Borislav Petkov 1 month ago
On Sat, Aug 23, 2025 at 08:52:05AM -0700, Chang S. Bae wrote:
> As microcode patch sizes continue to grow, late-loading latency spikes
> can lead to timeouts and disruptions in running workloads. This trend of
> increasing patch sizes is expected to continue, so a foundational
> solution is needed to address the issue.
> 
> To mitigate the problem, a new staging feature is introduced. This option
> processes most of the microcode update (excluding activation) on a
> non-critical path, allowing CPUs to remain operational during the
> majority of the update. By offloading work from the critical path,
> staging can significantly reduces latency spikes.

"reduce"

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v5 2/7] x86/microcode: Introduce staging step to reduce late-loading time
Posted by Chang S. Bae 4 weeks, 1 day ago
On 9/4/2025 5:08 AM, Borislav Petkov wrote:
> On Sat, Aug 23, 2025 at 08:52:05AM -0700, Chang S. Bae wrote:
>>
>> staging can significantly reduces latency spikes.
> 
> "reduce"

Sorry for the typo. Fixed.
[PATCH v5 3/7] x86/microcode/intel: Establish staging control logic
Posted by Chang S. Bae 1 month, 1 week ago
When microcode staging is initiated, operations are carried out through
an MMIO interface. Each package has a unique interface specified by the
IA32_MCU_STAGING_MBOX_ADDR MSR, which maps to a set of 32-bit registers.

Prepare staging with the following steps:

  1.  Ensure the microcode image is 32-bit aligned to match the MMIO
      register size.

  2.  Identify each MMIO interface based on its per-package scope.

  3.  Invoke the staging function for each identified interface, which
      will be implemented separately.

Also, define cpu_primary_thread_mask for the CONFIG_SMP=n case, allowing
consistent use when narrowing down primary threads to locate the
per-package interface.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Anselm Busse <abusse@amazon.de>
Link: https://lore.kernel.org/all/871pznq229.ffs@tglx
---
V4 -> V5:
* Rebase on the primary thread cpumask fix (Dave)
* Clean up the rev print code (Dave)
* rdmsrl_on_cpu() -> rdmsrq_on_cpu() (Chao)

V2 -> V3:
* Remove a global variable and adjust stage_microcode() (Dave).
* Simplify for_each_cpu() loop control code
* Handle rdmsrl_on_cpu() return code explicitly (Chao)

V1 -> V2:
* Adjust to reference the staging_state struct.
* Add lockdep_assert_cpus_held() (Boris)
---
 arch/x86/include/asm/msr-index.h      |  2 ++
 arch/x86/kernel/cpu/microcode/intel.c | 49 +++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b65c3ba5fa14..0356155f9264 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -913,6 +913,8 @@
 #define MSR_IA32_UCODE_WRITE		0x00000079
 #define MSR_IA32_UCODE_REV		0x0000008b
 
+#define MSR_IA32_MCU_STAGING_MBOX_ADDR	0x000007a5
+
 /* Intel SGX Launch Enclave Public Key Hash MSRs */
 #define MSR_IA32_SGXLEPUBKEYHASH0	0x0000008C
 #define MSR_IA32_SGXLEPUBKEYHASH1	0x0000008D
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 371ca6eac00e..d309fb1f058f 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -299,6 +299,54 @@ static __init struct microcode_intel *scan_microcode(void *data, size_t size,
 	return size ? NULL : patch;
 }
 
+/*
+ * Handle the staging process using the mailbox MMIO interface.
+ * Return the result state.
+ */
+static enum ucode_state do_stage(u64 mmio_pa)
+{
+	pr_debug_once("Staging implementation is pending.\n");
+	return UCODE_ERROR;
+}
+
+static void stage_microcode(void)
+{
+	unsigned int pkg_id = UINT_MAX;
+	enum ucode_state ret;
+	int cpu, err;
+	u64 mmio_pa;
+
+	if (!IS_ALIGNED(get_totalsize(&ucode_patch_late->hdr), sizeof(u32)))
+		return;
+
+	lockdep_assert_cpus_held();
+
+	/*
+	 * The MMIO address is unique per package, and all the SMT
+	 * primary threads are online here. Find each MMIO space by
+	 * their package ids to avoid duplicate staging.
+	 */
+	for_each_cpu(cpu, cpu_primary_thread_mask) {
+		if (topology_logical_package_id(cpu) == pkg_id)
+			continue;
+		pkg_id = topology_logical_package_id(cpu);
+
+		err = rdmsrq_on_cpu(cpu, MSR_IA32_MCU_STAGING_MBOX_ADDR, &mmio_pa);
+		if (WARN_ON_ONCE(err))
+			return;
+
+		ret = do_stage(mmio_pa);
+		if (ret != UCODE_OK) {
+			pr_err("Error: staging failed with %s for CPU%d at package %u.\n",
+			       ret == UCODE_TIMEOUT ? "timeout" : "error state",
+			       cpu, pkg_id);
+			return;
+		}
+	}
+
+	pr_info("Staging of patch revision 0x%x succeeded.\n", ucode_patch_late->hdr.rev);
+}
+
 static enum ucode_state __apply_microcode(struct ucode_cpu_info *uci,
 					  struct microcode_intel *mc,
 					  u32 *cur_rev)
@@ -627,6 +675,7 @@ static struct microcode_ops microcode_intel_ops = {
 	.collect_cpu_info	= collect_cpu_info,
 	.apply_microcode	= apply_microcode_late,
 	.finalize_late_load	= finalize_late_load,
+	.stage_microcode	= stage_microcode,
 	.use_nmi		= IS_ENABLED(CONFIG_X86_64),
 };
 
-- 
2.48.1
Re: [PATCH v5 3/7] x86/microcode/intel: Establish staging control logic
Posted by Borislav Petkov 1 month ago
On Sat, Aug 23, 2025 at 08:52:06AM -0700, Chang S. Bae wrote:
> When microcode staging is initiated, operations are carried out through
> an MMIO interface. Each package has a unique interface specified by the
> IA32_MCU_STAGING_MBOX_ADDR MSR, which maps to a set of 32-bit registers.
> 
> Prepare staging with the following steps:
> 
>   1.  Ensure the microcode image is 32-bit aligned to match the MMIO
>       register size.
> 
>   2.  Identify each MMIO interface based on its per-package scope.
> 
>   3.  Invoke the staging function for each identified interface, which
>       will be implemented separately.
> 
> Also, define cpu_primary_thread_mask for the CONFIG_SMP=n case, allowing
> consistent use when narrowing down primary threads to locate the
> per-package interface.

This paragraph is stale now and can go.

> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index b65c3ba5fa14..0356155f9264 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -913,6 +913,8 @@
>  #define MSR_IA32_UCODE_WRITE		0x00000079
>  #define MSR_IA32_UCODE_REV		0x0000008b
>  
> +#define MSR_IA32_MCU_STAGING_MBOX_ADDR	0x000007a5

Doesn't look sorted to me.

> +
>  /* Intel SGX Launch Enclave Public Key Hash MSRs */
>  #define MSR_IA32_SGXLEPUBKEYHASH0	0x0000008C
>  #define MSR_IA32_SGXLEPUBKEYHASH1	0x0000008D
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
> index 371ca6eac00e..d309fb1f058f 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -299,6 +299,54 @@ static __init struct microcode_intel *scan_microcode(void *data, size_t size,
>  	return size ? NULL : patch;
>  }
>  
> +/*
> + * Handle the staging process using the mailbox MMIO interface.
> + * Return the result state.
> + */
> +static enum ucode_state do_stage(u64 mmio_pa)
> +{
> +	pr_debug_once("Staging implementation is pending.\n");
> +	return UCODE_ERROR;
> +}
> +
> +static void stage_microcode(void)
> +{
> +	unsigned int pkg_id = UINT_MAX;
> +	enum ucode_state ret;
> +	int cpu, err;
> +	u64 mmio_pa;
> +
> +	if (!IS_ALIGNED(get_totalsize(&ucode_patch_late->hdr), sizeof(u32)))
> +		return;
> +
> +	lockdep_assert_cpus_held();
> +
> +	/*
> +	 * The MMIO address is unique per package, and all the SMT
> +	 * primary threads are online here. Find each MMIO space by
> +	 * their package ids to avoid duplicate staging.
> +	 */
> +	for_each_cpu(cpu, cpu_primary_thread_mask) {
> +		if (topology_logical_package_id(cpu) == pkg_id)
> +			continue;

<---- newline here.

> +		pkg_id = topology_logical_package_id(cpu);
> +
> +		err = rdmsrq_on_cpu(cpu, MSR_IA32_MCU_STAGING_MBOX_ADDR, &mmio_pa);
> +		if (WARN_ON_ONCE(err))
> +			return;
> +
> +		ret = do_stage(mmio_pa);
> +		if (ret != UCODE_OK) {
> +			pr_err("Error: staging failed with %s for CPU%d at package %u.\n",
> +			       ret == UCODE_TIMEOUT ? "timeout" : "error state",

What does "error state" mean?

Are we going to dump additional error state so that it is clear why it failed?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v5 3/7] x86/microcode/intel: Establish staging control logic
Posted by Chang S. Bae 4 weeks, 1 day ago
On 9/4/2025 5:13 AM, Borislav Petkov wrote:
> On Sat, Aug 23, 2025 at 08:52:06AM -0700, Chang S. Bae wrote:
>>
>> Also, define cpu_primary_thread_mask for the CONFIG_SMP=n case, allowing
>> consistent use when narrowing down primary threads to locate the
>> per-package interface.
> 
> This paragraph is stale now and can go.

Ah, right. Sorry, I've chopped it out.

>> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
>> index b65c3ba5fa14..0356155f9264 100644
>> --- a/arch/x86/include/asm/msr-index.h
>> +++ b/arch/x86/include/asm/msr-index.h
>> @@ -913,6 +913,8 @@
>>   #define MSR_IA32_UCODE_WRITE		0x00000079
>>   #define MSR_IA32_UCODE_REV		0x0000008b
>>   
>> +#define MSR_IA32_MCU_STAGING_MBOX_ADDR	0x000007a5
> 
> Doesn't look sorted to me.

Okay, I think this 'MCU' looks a bit fuzzy. Intel folks already use this 
acronym elsewhere (e.g. MSR_IA32_MCU_OPT_CTRL). It might be clearer to 
consolidate them under one section, like:

/* Intel microcode update MSRs */
#define MSR_IA32_MCU_OPT_CTRL           0x00000123
#define MSR_IA32_MCU_ENUMERATION        0x0000007b  <- patch7 adds this
#define MSR_IA32_MCU_STAGING_MBOX_ADDR  0x000007a5
...

>> +	/*
>> +	 * The MMIO address is unique per package, and all the SMT
>> +	 * primary threads are online here. Find each MMIO space by
>> +	 * their package ids to avoid duplicate staging.
>> +	 */
>> +	for_each_cpu(cpu, cpu_primary_thread_mask) {
>> +		if (topology_logical_package_id(cpu) == pkg_id)
>> +			continue;
> 
> <---- newline here.

Fixed. Thanks.

>> +		ret = do_stage(mmio_pa);
>> +		if (ret != UCODE_OK) {
>> +			pr_err("Error: staging failed with %s for CPU%d at package %u.\n",
>> +			       ret == UCODE_TIMEOUT ? "timeout" : "error state",
> 
> What does "error state" mean?
> 
> Are we going to dump additional error state so that it is clear why it failed?

Yeah, right. The wording "error state" is vague.

The next two patches in this series introduce helpers that return an 
error code and *also* update this ucode_state. The latter could go away. 
Instead, the error code could just be passed down to here and decoded like:

static const char *staging_errstr(int code)
{
   switch (code) {
   case -ETIMEDOUT:	return "timeout";
   ...
   default:		return "unknown error";
   }
}

static void stage_microcode(void)
{
   ...

   err = do_stage(mmio_pa);
   if (err) {
     pr_err("Error: staging failed with %s for CPU%d at package %u.\n",
            staging_errstr(err), cpu, pkg_id);
     return;
   }
}

I've attached a diff on top of V5 to illustrate what it will look like.
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 455b50514d87..ae4b08e71e20 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -98,7 +98,6 @@ struct extended_sigtable {
  * @chunk_size:		Size of each data piece
  * @bytes_sent:		Total bytes transmitted so far
  * @offset:		Current offset in the microcode image
- * @state:		Current state of the staging process
  */
 struct staging_state {
 	void __iomem		*mmio_base;
@@ -106,7 +105,6 @@ struct staging_state {
 	unsigned int		chunk_size;
 	unsigned int		bytes_sent;
 	unsigned int		offset;
-	enum ucode_state	state;
 };
 
 #define DEFAULT_UCODE_TOTALSIZE (DEFAULT_UCODE_DATASIZE + MC_HEADER_SIZE)
@@ -425,7 +423,7 @@ static inline unsigned int calc_next_chunk_size(unsigned int ucode_len, unsigned
  * Update the chunk size and decide whether another chunk can be sent.
  * This accounts for remaining data and retry limits.
  */
-static bool can_send_next_chunk(struct staging_state *ss)
+static bool can_send_next_chunk(struct staging_state *ss, int *err)
 {
 	ss->chunk_size = calc_next_chunk_size(ss->ucode_len, ss->offset);
 	/*
@@ -445,10 +443,11 @@ static bool can_send_next_chunk(struct staging_state *ss)
 	 * likely stuck and mark the state as timeout.
 	 */
 	if (ss->bytes_sent + ss->chunk_size > ss->ucode_len * 2) {
-		ss->state = UCODE_TIMEOUT;
+		*err = -ETIMEDOUT;
 		return false;
 	}
 
+	*err = 0;
 	return true;
 }
 
@@ -465,9 +464,9 @@ static inline bool is_end_offset(u32 offset)
  * the end offset, or no more transactions are permitted (retry limit
  * reached).
  */
-static inline bool staging_is_complete(struct staging_state *ss)
+static inline bool staging_is_complete(struct staging_state *ss, int *err)
 {
-	return is_end_offset(ss->offset) || !can_send_next_chunk(ss);
+	return is_end_offset(ss->offset) || !can_send_next_chunk(ss, err);
 }
 
 /*
@@ -489,21 +488,16 @@ static int wait_for_transaction(struct staging_state *ss)
 	}
 
 	/* Check for explicit error response */
-	if (status & MASK_MBOX_STATUS_ERROR) {
-		ss->state = UCODE_ERROR;
-		return -EPROTO;
-	}
+	if (status & MASK_MBOX_STATUS_ERROR)
+		return -EIO;
 
 	/*
 	 * Hardware has neither responded to the action nor signaled any
 	 * error. Treat this as timeout.
 	 */
-	if (!(status & MASK_MBOX_STATUS_READY)) {
-		ss->state = UCODE_TIMEOUT;
+	if (!(status & MASK_MBOX_STATUS_READY))
 		return -ETIMEDOUT;
-	}
 
-	ss->state = UCODE_OK;
 	return 0;
 }
 
@@ -545,7 +539,6 @@ static int fetch_next_offset(struct staging_state *ss)
 	const u64 expected_header = MBOX_HEADER(MBOX_HEADER_SIZE + MBOX_RESPONSE_SIZE);
 	u32 offset, status;
 	u64 header;
-	int err;
 
 	/*
 	 * The 'response' mailbox returns three fields, in order:
@@ -558,55 +551,42 @@ static int fetch_next_offset(struct staging_state *ss)
 	status = read_mbox_dword(ss->mmio_base);
 
 	/* All valid responses must start with the expected header. */
-	if (header != expected_header) {
-		pr_err_once("staging: invalid response header\n");
-		err = -EINVAL;
-		goto err_out;
-	}
+	if (header != expected_header)
+		return -EBADR;
 
 	/*
 	 * Verify the offset: If not at the end marker, it must not
 	 * exceed the microcode image length
 	 */
-	if (!is_end_offset(offset) && offset > ss->ucode_len) {
-		pr_err_once("staging: invalid response offset\n");
-		err = -EINVAL;
-		goto err_out;
-	}
+	if (!is_end_offset(offset) && offset > ss->ucode_len)
+		return -EINVAL;
 
 	/* Hardware may report errors explicitly in the status field */
-	if (status & MASK_MBOX_RESP_ERROR) {
-		err = -EPROTO;
-		goto err_out;
-	}
+	if (status & MASK_MBOX_RESP_ERROR)
+		return -EPROTO;
 
 	ss->offset = offset;
-	ss->state  = UCODE_OK;
 	return 0;
-
-err_out:
-	ss->state = UCODE_ERROR;
-	return err;
 }
 
 /*
  * Handle the staging process using the mailbox MMIO interface. The
  * microcode image is transferred in chunks until completion. Return the
- * result state.
+ * error code.
  */
-static enum ucode_state do_stage(u64 mmio_pa)
+static int do_stage(u64 mmio_pa)
 {
 	struct staging_state ss = {};
 	int err;
 
 	ss.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
 	if (WARN_ON_ONCE(!ss.mmio_base))
-		return UCODE_ERROR;
+		return -EADDRNOTAVAIL;
 
 	init_stage(&ss);
 
 	/* Perform the staging process while within the retry limit */
-	while (!staging_is_complete(&ss)) {
+	while (!staging_is_complete(&ss, &err)) {
 		/* Send a chunk of microcode each time: */
 		err = send_data_chunk(&ss, ucode_patch_late);
 		if (err)
@@ -622,17 +602,25 @@ static enum ucode_state do_stage(u64 mmio_pa)
 
 	iounmap(ss.mmio_base);
 
-	/*
-	 * The helpers update ss.state on error. The final state is
-	 * returned to the caller.
-	 */
-	return ss.state;
+	return err;
+}
+
+static const char *staging_errstr(int code)
+{
+	switch (code) {
+	case -EBADR:		return "invalid response header";
+	case -EINVAL:		return "invalid next offset";
+	case -EIO:		return "transaction error";
+	case -ETIMEDOUT:	return "timeout";
+	case -EPROTO:		return "response error";
+	case -EADDRNOTAVAIL:	return "ioremap() failure";
+	default:		return "unknown error";
+	}
 }
 
 static void stage_microcode(void)
 {
 	unsigned int pkg_id = UINT_MAX;
-	enum ucode_state ret;
 	int cpu, err;
 	u64 mmio_pa;
 
@@ -656,11 +644,10 @@ static void stage_microcode(void)
 		if (WARN_ON_ONCE(err))
 			return;
 
-		ret = do_stage(mmio_pa);
-		if (ret != UCODE_OK) {
+		err = do_stage(mmio_pa);
+		if (err) {
 			pr_err("Error: staging failed with %s for CPU%d at package %u.\n",
-			       ret == UCODE_TIMEOUT ? "timeout" : "error state",
-			       cpu, pkg_id);
+			       staging_errstr(err), cpu, pkg_id);
 			return;
 		}
 	}
Re: [PATCH v5 3/7] x86/microcode/intel: Establish staging control logic
Posted by Borislav Petkov 4 weeks, 1 day ago
On Thu, Sep 04, 2025 at 05:04:24PM -0700, Chang S. Bae wrote:
> Okay, I think this 'MCU' looks a bit fuzzy. 

I mean we should try to sort them by MSR *number* so that they're easier to
find. I know, I know, the whole file needs sorting but we don't need the churn
right now. So let's put new ones where they should be, i.e.:

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 6eb0c62c7fbd..87be1f815211 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -922,8 +922,6 @@
 #define MSR_IA32_UCODE_WRITE		0x00000079
 #define MSR_IA32_UCODE_REV		0x0000008b
 
-#define MSR_IA32_MCU_STAGING_MBOX_ADDR	0x000007a5
-
 /* Intel SGX Launch Enclave Public Key Hash MSRs */
 #define MSR_IA32_SGXLEPUBKEYHASH0	0x0000008C
 #define MSR_IA32_SGXLEPUBKEYHASH1	0x0000008D
@@ -1219,6 +1217,8 @@
 #define MSR_IA32_VMX_VMFUNC             0x00000491
 #define MSR_IA32_VMX_PROCBASED_CTLS3	0x00000492
 
+#define MSR_IA32_MCU_STAGING_MBOX_ADDR	0x000007a5
+
 /* Resctrl MSRs: */
 /* - Intel: */
 #define MSR_IA32_L3_QOS_CFG		0xc81

> The next two patches in this series introduce helpers that return an error
> code and *also* update this ucode_state. The latter could go away. Instead,
> the error code could be just down through here and decoded like:

You don't need the strings decoding... yet. This stuff is going to be looked
at only by us so there's no problem with simply dumping the errval and then
going to consult the sources where that value comes from.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v5 3/7] x86/microcode/intel: Establish staging control logic
Posted by Chang S. Bae 4 weeks, 1 day ago
On 9/5/2025 4:13 AM, Borislav Petkov wrote:
> 
> So let's put new ones where they should be

I see.

> You don't need the strings decoding... yet. This stuff is going to be looked
> at only by us so there's no problem with simply dumping the errval and then
> going to consult the sources where that value comes from.

Makes sense. Then, it could be just like this:

   pr_err("Error: staging failed (code = %d) for CPU%d at package %u.\n",
          err, cpu, pkg_id);

Thanks,
Chang
[PATCH v5 4/7] x86/microcode/intel: Define staging state struct
Posted by Chang S. Bae 1 month, 1 week ago
To facilitate a structured staging handler, a set of functions will be
introduced. Define staging_state struct to simplify function prototypes
by consolidating relevant data, instead of passing multiple local
variables.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Anselm Busse <abusse@amazon.de>
---
V4 -> V5: Drop the ucode_ptr field (Dave)
V1 -> V2: New patch

Prior to V2, local variables were used to track state values, with the
intention of improving readability by explicitly passing them between
functions. However, given the feedback, introducing a dedicated data
structure provides a benefit by simplifying the main loop.
---
 arch/x86/kernel/cpu/microcode/intel.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index d309fb1f058f..3ca22457d839 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -54,6 +54,25 @@ struct extended_sigtable {
 	struct extended_signature	sigs[];
 };
 
+/**
+ * struct staging_state - Tracks the current staging process state
+ *
+ * @mmio_base:		MMIO base address for staging
+ * @ucode_len:		Total size of the microcode image
+ * @chunk_size:		Size of each data piece
+ * @bytes_sent:		Total bytes transmitted so far
+ * @offset:		Current offset in the microcode image
+ * @state:		Current state of the staging process
+ */
+struct staging_state {
+	void __iomem		*mmio_base;
+	unsigned int		ucode_len;
+	unsigned int		chunk_size;
+	unsigned int		bytes_sent;
+	unsigned int		offset;
+	enum ucode_state	state;
+};
+
 #define DEFAULT_UCODE_TOTALSIZE (DEFAULT_UCODE_DATASIZE + MC_HEADER_SIZE)
 #define EXT_HEADER_SIZE		(sizeof(struct extended_sigtable))
 #define EXT_SIGNATURE_SIZE	(sizeof(struct extended_signature))
-- 
2.48.1
Re: [PATCH v5 4/7] x86/microcode/intel: Define staging state struct
Posted by Borislav Petkov 1 month ago
On Sat, Aug 23, 2025 at 08:52:07AM -0700, Chang S. Bae wrote:
> To facilitate a structured staging handler, a set of functions will be
> introduced.

No need for that sentence.

> Define staging_state struct to simplify function prototypes
> by consolidating relevant data, instead of passing multiple local
> variables.

That's clear enough.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v5 4/7] x86/microcode/intel: Define staging state struct
Posted by Chang S. Bae 4 weeks, 1 day ago
On 9/4/2025 6:48 AM, Borislav Petkov wrote:
> On Sat, Aug 23, 2025 at 08:52:07AM -0700, Chang S. Bae wrote:
>> To facilitate a structured staging handler, a set of functions will be
>> introduced.
> 
> No need for that sentence.

Fixed. Thanks.
[PATCH v5 5/7] x86/microcode/intel: Implement staging handler
Posted by Chang S. Bae 1 month, 1 week ago
Previously, per-package staging invocations and their associated state
data were established. The next step is to implement the actual staging
handler according to the specified protocol. Below are key aspects to
note:

  (a)  Each staging process must begin by resetting the staging hardware.

  (b)  The staging hardware processes up to a page-sized chunk of the
       microcode image per iteration, requiring software to submit data
       incrementally.

  (c)  Once a data chunk is processed, the hardware responds with an
       offset in the image for the next chunk.

  (d)  The offset may indicate completion or request retransmission of
       an already transferred chunk. As long as the total transferred
       data remains within the predefined limit (twice the image size),
       retransmissions should be acceptable. (A trace illustrating the
       resulting flow follows this list.)
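
At a high level, a successful staging session then proceeds like this
(an illustrative trace, not part of the patch; the end marker is the
sentinel offset returned by the hardware):

  reset (abort)
  -> send chunk at offset 0      -> hardware returns the next offset
  -> send chunk at that offset   -> ... (chunks may be re-requested)
  -> hardware returns end marker -> staging complete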

With that, incorporate these code sequences into the staging handler:

  1.  Initialization: Map the MMIO space via ioremap(). Reset the staging
      hardware and initialize software state, ensuring a fresh staging
      process aligned with (a).

  2.  Processing Loop: Introduce a loop iterating over data chunks,
      following (b), with proper termination conditions established from
      (d) -- stop staging when the hardware signals completion, or if the
      total transmitted data exceeds the predefined limit.

  3.  Loop Body: Finally, compose the loop body with two steps --
      transmitting a data chunk and retrieving the next offset from the
      hardware response, aligning with (b) and (c).

Since data transmission and mailbox format handling require additional
details, they are implemented separately in subsequent changes.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Anselm Busse <abusse@amazon.de>
---
V4 -> V5:
* Convert helper functions to return error codes (Dave)
* Consolidate loop-control logic
* Refactor next-chunk calculation/check for clarity
* Remove offset sanity check (moved to next patch)

V2 -> V3:
* Rework code to eliminate global variables (Dave)
* Remove redundant variable resets (Chao)

V1 -> V2:
* Re-write the changelog for clarity (Dave).
* Move staging handling code into intel.c (Boris).
* Add extensive comments to clarify staging logic and hardware
  interactions, along with function renaming (Dave).
---
 arch/x86/kernel/cpu/microcode/intel.c | 137 +++++++++++++++++++++++++-
 1 file changed, 133 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 3ca22457d839..a1b13202330d 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -20,6 +20,8 @@
 #include <linux/cpu.h>
 #include <linux/uio.h>
 #include <linux/mm.h>
+#include <linux/delay.h>
+#include <linux/io.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/processor.h>
@@ -33,6 +35,16 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
 
 #define UCODE_BSP_LOADED	((struct microcode_intel *)0x1UL)
 
+/* Defines for the microcode staging mailbox interface */
+
+#define MBOX_REG_NUM		4
+#define MBOX_REG_SIZE		sizeof(u32)
+
+#define MBOX_CONTROL_OFFSET	0x0
+#define MBOX_STATUS_OFFSET	0x4
+
+#define MASK_MBOX_CTRL_ABORT	BIT(0)
+
 /* Current microcode patch used in early patching on the APs. */
 static struct microcode_intel *ucode_patch_va __read_mostly;
 static struct microcode_intel *ucode_patch_late __read_mostly;
@@ -319,13 +331,130 @@ static __init struct microcode_intel *scan_microcode(void *data, size_t size,
 }
 
 /*
- * Handle the staging process using the mailbox MMIO interface.
- * Return the result state.
+ * Prepare for a new microcode transfer: reset hardware and record the
+ * image size.
+ */
+static void init_stage(struct staging_state *ss)
+{
+	ss->ucode_len = get_totalsize(&ucode_patch_late->hdr);
+
+	/*
+	 * Abort any ongoing process, effectively resetting the device.
+	 * Unlike regular mailbox data processing requests, this
+	 * operation does not require a status check.
+	 */
+	writel(MASK_MBOX_CTRL_ABORT, ss->mmio_base + MBOX_CONTROL_OFFSET);
+}
+
+/*
+ * Return PAGE_SIZE, or remaining bytes if this is the final chunk
+ */
+static inline unsigned int calc_next_chunk_size(unsigned int ucode_len, unsigned int offset)
+{
+	return min(PAGE_SIZE, ucode_len - offset);
+}
+
+/*
+ * Update the chunk size and decide whether another chunk can be sent.
+ * This accounts for remaining data and retry limits.
+ */
+static bool can_send_next_chunk(struct staging_state *ss)
+{
+	ss->chunk_size = calc_next_chunk_size(ss->ucode_len, ss->offset);
+	/*
+	 * Each microcode image is divided into chunks, each at most
+	 * one page size. A 10-chunk  image would typically require 10
+	 * transactions.
+	 *
+	 * However, the hardware managing the mailbox has limited
+	 * resources and may not cache the entire image, potentially
+	 * requesting the same chunk multiple times.
+	 *
+	 * To tolerate this behavior, allow up to twice the expected
+	 * number of transactions (i.e., a 10-chunk image can take up to
+	 * 20 attempts).
+	 *
+	 * If the number of attempts exceeds this limit, the hardware is
+	 * likely stuck and mark the state as timeout.
+	 */
+	if (ss->bytes_sent + ss->chunk_size > ss->ucode_len * 2) {
+		ss->state = UCODE_TIMEOUT;
+		return false;
+	}
+
+	return true;
+}
+
+/*
+ * Determine whether staging is complete: either the hardware signaled
+ * the end offset, or no more transactions are permitted (retry limit
+ * reached).
+ */
+static inline bool staging_is_complete(struct staging_state *ss)
+{
+	return (ss->offset == UINT_MAX) || !can_send_next_chunk(ss);
+}
+
+/*
+ * Transmit a chunk of the microcode image to the hardware.
+ * Return 0 on success, or an error code on failure.
+ */
+static int send_data_chunk(struct staging_state *ss, void *ucode_ptr __maybe_unused)
+{
+	pr_debug_once("Staging mailbox loading code needs to be implemented.\n");
+	ss->state = UCODE_ERROR;
+	return -EPROTONOSUPPORT;
+}
+
+/*
+ * Retrieve the next offset from the hardware response.
+ * Return 0 on success, or an error code on failure.
+ */
+static int fetch_next_offset(struct staging_state *ss)
+{
+	pr_debug_once("Staging mailbox response handling code needs to be implemented.\n\n");
+	ss->state = UCODE_ERROR;
+	return -EPROTONOSUPPORT;
+}
+
+/*
+ * Handle the staging process using the mailbox MMIO interface. The
+ * microcode image is transferred in chunks until completion. Return the
+ * result state.
  */
 static enum ucode_state do_stage(u64 mmio_pa)
 {
-	pr_debug_once("Staging implementation is pending.\n");
-	return UCODE_ERROR;
+	struct staging_state ss = {};
+	int err;
+
+	ss.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
+	if (WARN_ON_ONCE(!ss.mmio_base))
+		return UCODE_ERROR;
+
+	init_stage(&ss);
+
+	/* Perform the staging process while within the retry limit */
+	while (!staging_is_complete(&ss)) {
+		/* Send a chunk of microcode each time: */
+		err = send_data_chunk(&ss, ucode_patch_late);
+		if (err)
+			break;
+		/*
+		 * Then, ask the hardware which piece of the image it
+		 * needs next. The same piece may be sent more than once.
+		 */
+		err = fetch_next_offset(&ss);
+		if (err)
+			break;
+	}
+
+	iounmap(ss.mmio_base);
+
+	/*
+	 * The helpers update ss.state on error. The final state is
+	 * returned to the caller.
+	 */
+	return ss.state;
 }
 
 static void stage_microcode(void)
-- 
2.48.1
Re: [PATCH v5 5/7] x86/microcode/intel: Implement staging handler
Posted by Borislav Petkov 3 weeks, 2 days ago
On Sat, Aug 23, 2025 at 08:52:08AM -0700, Chang S. Bae wrote:
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
> index 3ca22457d839..a1b13202330d 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -20,6 +20,8 @@
>  #include <linux/cpu.h>
>  #include <linux/uio.h>
>  #include <linux/mm.h>
> +#include <linux/delay.h>
> +#include <linux/io.h>

You do see those are sorted by header name length in a reverse order, right?

>  
>  #include <asm/cpu_device_id.h>
>  #include <asm/processor.h>
> @@ -33,6 +35,16 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
>  
>  #define UCODE_BSP_LOADED	((struct microcode_intel *)0x1UL)
>  
> +/* Defines for the microcode staging mailbox interface */
> +

^ Superfluous newline.

> +#define MBOX_REG_NUM		4
> +#define MBOX_REG_SIZE		sizeof(u32)
> +
> +#define MBOX_CONTROL_OFFSET	0x0
> +#define MBOX_STATUS_OFFSET	0x4
> +
> +#define MASK_MBOX_CTRL_ABORT	BIT(0)
> +
>  /* Current microcode patch used in early patching on the APs. */
>  static struct microcode_intel *ucode_patch_va __read_mostly;
>  static struct microcode_intel *ucode_patch_late __read_mostly;

...

> +/*
> + * Return PAGE_SIZE, or remaining bytes if this is the final chunk
> + */
> +static inline unsigned int calc_next_chunk_size(unsigned int ucode_len, unsigned int offset)
> +{
> +	return min(PAGE_SIZE, ucode_len - offset);
> +}

That oneliner looks useless - sticking a comment over the min() and putting it
at the single callsite below is good enough.

> +
> +/*
> + * Update the chunk size and decide whether another chunk can be sent.
> + * This accounts for remaining data and retry limits.
> + */
> +static bool can_send_next_chunk(struct staging_state *ss)
> +{
> +	ss->chunk_size = calc_next_chunk_size(ss->ucode_len, ss->offset);
> +	/*
> +	 * Each microcode image is divided into chunks, each at most
> +	 * one page size. A 10-chunk  image would typically require 10
				   ^^^^

> +	 * transactions.
> +	 *
> +	 * However, the hardware managing the mailbox has limited
> +	 * resources and may not cache the entire image, potentially
> +	 * requesting the same chunk multiple times.
> +	 *
> +	 * To tolerate this behavior, allow up to twice the expected
> +	 * number of transactions (i.e., a 10-chunk image can take up to
> +	 * 20 attempts).

Looks quirky but ok, let's try it in practice first...

> +	 *
> +	 * If the number of attempts exceeds this limit, the hardware is
> +	 * likely stuck and mark the state as timeout.
> +	 */
> +	if (ss->bytes_sent + ss->chunk_size > ss->ucode_len * 2) {
> +		ss->state = UCODE_TIMEOUT;
> +		return false;
> +	}
> +
> +	return true;
> +}

...

>  static enum ucode_state do_stage(u64 mmio_pa)
>  {
> -	pr_debug_once("Staging implementation is pending.\n");
> -	return UCODE_ERROR;
> +	struct staging_state ss = {};
> +	int err;
> +
> +	ss.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
> +	if (WARN_ON_ONCE(!ss.mmio_base))
> +		return UCODE_ERROR;
> +
> +	init_stage(&ss);
> +
> +	/* Perform the staging process while within the retry limit */
> +	while (!staging_is_complete(&ss)) {
> +		/* Send a chunk of microcode each time: */
> +		err = send_data_chunk(&ss, ucode_patch_late);
> +		if (err)
> +			break;
> +		/*
> +		 * Then, ask the hardware which piece of the image it
> +		 * needs next. The same piece may be sent more than once.

If this is part of normal operation, your send-max-2x-the-size heuristic might
fail quickly here. I'd track the number of chunks it wants you to send and
then set a per-chunk limit and when it reaches that limit, then cancel the
transaction. Dunno, let's try the simple scheme first...
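
Something along these lines, perhaps -- untested, and the last_offset,
attempts, and MAX_PER_CHUNK_ATTEMPTS names below are hypothetical, not
in this series:

	/* in struct staging_state: u32 last_offset; unsigned int attempts; */
	if (ss->offset == ss->last_offset) {
		/* Same chunk requested again: count it against a per-chunk cap */
		if (++ss->attempts > MAX_PER_CHUNK_ATTEMPTS)
			return false;	/* looks stuck, cancel the transaction */
	} else {
		ss->last_offset = ss->offset;
		ss->attempts = 1;
	}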

> +		 */
> +		err = fetch_next_offset(&ss);
> +		if (err)
> +			break;
> +	}
> +
> +	iounmap(ss.mmio_base);
> +
> +	/*
> +	 * The helpers update ss.state on error. The final state is
> +	 * returned to the caller.
> +	 */
> +	return ss.state;
>  }
>  
>  static void stage_microcode(void)
> -- 
> 2.48.1
> 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v5 5/7] x86/microcode/intel: Implement staging handler
Posted by Chang S. Bae 3 weeks, 2 days ago
On 9/10/2025 11:33 AM, Borislav Petkov wrote:
> On Sat, Aug 23, 2025 at 08:52:08AM -0700, Chang S. Bae wrote:
>> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
>> index 3ca22457d839..a1b13202330d 100644
>> --- a/arch/x86/kernel/cpu/microcode/intel.c
>> +++ b/arch/x86/kernel/cpu/microcode/intel.c
>> @@ -20,6 +20,8 @@
>>   #include <linux/cpu.h>
>>   #include <linux/uio.h>
>>   #include <linux/mm.h>
>> +#include <linux/delay.h>
>> +#include <linux/io.h>
> 
> You do see those are sorted by header name length in a reverse order, right?

Okay, fixed -- and I adjusted patch6 as well:

  #include <linux/initrd.h>
  #include <linux/io.h>
  #include <linux/kernel.h>
+#include <linux/pci_ids.h>
  #include <linux/slab.h>
  #include <linux/cpu.h>
  #include <linux/uio.h>

>> +/* Defines for the microcode staging mailbox interface */
>> +
> 
> ^ Superfluous newline.

Dropped.

>> +/*
>> + * Return PAGE_SIZE, or remaining bytes if this is the final chunk
>> + */
>> +static inline unsigned int calc_next_chunk_size(unsigned int ucode_len, unsigned int offset)
>> +{
>> +	return min(PAGE_SIZE, ucode_len - offset);
>> +}
> 
> That oneliner looks useless - sticking a comment over tne min() and putting it
> at the single callsite below is good enough.

Agreed -- removed the helper and moved them.

>> +/*
>> + * Update the chunk size and decide whether another chunk can be sent.
>> + * This accounts for remaining data and retry limits.
>> + */
>> +static bool can_send_next_chunk(struct staging_state *ss)
>> +{
>> +	ss->chunk_size = calc_next_chunk_size(ss->ucode_len, ss->offset);
>> +	/*
>> +	 * Each microcode image is divided into chunks, each at most
>> +	 * one page size. A 10-chunk  image would typically require 10
> 				   ^^^^

Fixed.

Just to make sure, I've included the diff here.

Thanks for the careful review and for sticking with this set.

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index c7a75afd2b9a..4d663cab4f48 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -12,16 +12,16 @@
  */
 #define pr_fmt(fmt) "microcode: " fmt
 #include <linux/earlycpio.h>
+#include <linux/delay.h>
 #include <linux/firmware.h>
 #include <linux/uaccess.h>
 #include <linux/initrd.h>
+#include <linux/io.h>
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/cpu.h>
 #include <linux/uio.h>
 #include <linux/mm.h>
-#include <linux/delay.h>
-#include <linux/io.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/processor.h>
@@ -36,7 +36,6 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
 #define UCODE_BSP_LOADED	((struct microcode_intel *)0x1UL)
 
 /* Defines for the microcode staging mailbox interface */
-
 #define MBOX_REG_NUM		4
 #define MBOX_REG_SIZE		sizeof(u32)
 
@@ -344,24 +343,18 @@ static void init_stage(struct staging_state *ss)
 	writel(MASK_MBOX_CTRL_ABORT, ss->mmio_base + MBOX_CONTROL_OFFSET);
 }
 
-/*
- * Return PAGE_SIZE, or remaining bytes if this is the final chunk
- */
-static inline unsigned int calc_next_chunk_size(unsigned int ucode_len, unsigned int offset)
-{
-	return min(PAGE_SIZE, ucode_len - offset);
-}
-
 /*
  * Update the chunk size and decide whether another chunk can be sent.
  * This accounts for remaining data and retry limits.
  */
 static bool can_send_next_chunk(struct staging_state *ss, int *err)
 {
-	ss->chunk_size = calc_next_chunk_size(ss->ucode_len, ss->offset);
+	/* a page size or remaining bytes if this is the final chunk */
+	ss->chunk_size = min(PAGE_SIZE, ss->ucode_len - ss->offset);
+
 	/*
 	 * Each microcode image is divided into chunks, each at most
-	 * one page size. A 10-chunk  image would typically require 10
+	 * one page size. A 10-chunk image would typically require 10
 	 * transactions.
 	 *
 	 * However, the hardware managing the mailbox has limited
[PATCH v5 6/7] x86/microcode/intel: Support mailbox transfer
Posted by Chang S. Bae 1 month, 1 week ago
Previously, the functions for sending microcode data and retrieving the
next offset were placeholders, as they required handling the specific
mailbox format. Implement them as follows:

== Mailbox Format ==

The staging mailbox consists of two primary sections: 'header' and
'data'. While the microcode must be transferred following this format,
the actual data transfer mechanism involves reading and writing to
specific MMIO registers.
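
As a worked example (not part of the patch, using the MBOX_HEADER()
macro added below and assuming a full 4 KiB data chunk):

	mbox_size = 2 * MBOX_HEADER_SIZE + PAGE_SIZE;	/* 16 + 4096 = 4112 bytes */

	MBOX_HEADER(4112) == 0x8086			/* PCI_VENDOR_ID_INTEL */
			   | (0xb << 16)		/* MBOX_OBJ_STAGING */
			   | ((u64)(4112 / 4) << 32)	/* dword count = 0x404 */
			   == 0x00000404000b8086;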

== Mailbox Data Registers ==

Unlike conventional interfaces that allocate MMIO space for each data
chunk, the staging hardware provides a "narrow" interface, using only
two dword-sized registers for read and write operations.

For example, consider writing 2 dwords of data to a device. Typically,
the device would expose 2 dwords of "wide" MMIO space. To send the data
to the device:

	writel(buf[0], io_addr + 0);
	writel(buf[1], io_addr + 1);

But, this interface is a bit different. Instead of having a "wide"
interface where there is separate MMIO space for each word in a
transaction, it has a "narrow" interface where several words are written
to the same spot in MMIO space:

	writel(buf[0], io_addr);
	writel(buf[1], io_addr);

The same goes for the read side.
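
For illustration, reading 2 dwords back polls the same spot in the same
way (this sketch omits the read-acknowledge write that the actual
read_mbox_dword() helper below performs after each read):

	buf[0] = readl(io_addr);
	buf[1] = readl(io_addr);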

== Implementation Summary ==

Given that, first introduce two layers of helper functions:

  * Low-level helpers for reading and writing to data registers directly.
  * Wrapper functions for handling mailbox header and data sections.

Using them, implement the send_data_chunk() and fetch_next_offset()
functions. Add explicit error and timeout handling in
wait_for_transaction() to finish up the transfer.

Both hardware error states and implicit errors -- invalid header or
offset -- result in UCODE_ERROR. Emit a clear message for the latter.

Note: The kernel has support for similar mailboxes. But none of them are
compatible with this one. Trying to share code resulted in a bloated
mess, so this code is standalone.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Tested-by: Anselm Busse <abusse@amazon.de>
---
V4 -> V5: Addressed Dave's feedback
* fetch_next_offset():
  - Make dword reads explicit
  - Consolidate offset validation -- adding another user for the
    end-offset checker
  - Convert WARN_* with pr_err_once()
* Simplify transaction waiting logic a bit

V2 -> V3:
* Update code to reflect the removal of a global variable (Dave).

V1 -> V2:
* Add lots of code comments and edit the changelog (Dave).
* Encapsulate register read/write operations for processing header and
  data sections.
---
 arch/x86/kernel/cpu/microcode/intel.c | 186 +++++++++++++++++++++++++-
 1 file changed, 179 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index a1b13202330d..f6b365eba6a2 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/delay.h>
 #include <linux/io.h>
+#include <linux/pci_ids.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/processor.h>
@@ -42,8 +43,31 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
 
 #define MBOX_CONTROL_OFFSET	0x0
 #define MBOX_STATUS_OFFSET	0x4
+#define MBOX_WRDATA_OFFSET	0x8
+#define MBOX_RDDATA_OFFSET	0xc
 
 #define MASK_MBOX_CTRL_ABORT	BIT(0)
+#define MASK_MBOX_CTRL_GO	BIT(31)
+
+#define MASK_MBOX_STATUS_ERROR	BIT(2)
+#define MASK_MBOX_STATUS_READY	BIT(31)
+
+#define MASK_MBOX_RESP_SUCCESS	BIT(0)
+#define MASK_MBOX_RESP_PROGRESS	BIT(1)
+#define MASK_MBOX_RESP_ERROR	BIT(2)
+
+#define MBOX_CMD_LOAD		0x3
+#define MBOX_OBJ_STAGING	0xb
+#define MBOX_HEADER(size)	((PCI_VENDOR_ID_INTEL)    | \
+				 (MBOX_OBJ_STAGING << 16) | \
+				 ((u64)((size) / sizeof(u32)) << 32))
+
+/* The size of each mailbox header */
+#define MBOX_HEADER_SIZE	sizeof(u64)
+/* The size of staging hardware response */
+#define MBOX_RESPONSE_SIZE	sizeof(u64)
+
+#define MBOX_XACTION_TIMEOUT_MS	(10 * MSEC_PER_SEC)
 
 /* Current microcode patch used in early patching on the APs. */
 static struct microcode_intel *ucode_patch_va __read_mostly;
@@ -330,6 +354,49 @@ static __init struct microcode_intel *scan_microcode(void *data, size_t size,
 	return size ? NULL : patch;
 }
 
+static inline u32 read_mbox_dword(void __iomem *mmio_base)
+{
+	u32 dword = readl(mmio_base + MBOX_RDDATA_OFFSET);
+
+	/* Acknowledge read completion to the staging hardware */
+	writel(0, mmio_base + MBOX_RDDATA_OFFSET);
+	return dword;
+}
+
+static inline void write_mbox_dword(void __iomem *mmio_base, u32 dword)
+{
+	writel(dword, mmio_base + MBOX_WRDATA_OFFSET);
+}
+
+static inline u64 read_mbox_header(void __iomem *mmio_base)
+{
+	u32 high, low;
+
+	low  = read_mbox_dword(mmio_base);
+	high = read_mbox_dword(mmio_base);
+
+	return ((u64)high << 32) | low;
+}
+
+static inline void write_mbox_header(void __iomem *mmio_base, u64 value)
+{
+	write_mbox_dword(mmio_base, value);
+	write_mbox_dword(mmio_base, value >> 32);
+}
+
+static void write_mbox_data(void __iomem *mmio_base, u32 *chunk, unsigned int chunk_bytes)
+{
+	int i;
+
+	/*
+	 * The MMIO space is mapped as Uncached (UC). Each write arrives
+	 * at the device as an individual transaction in program order.
+	 * The device can then reassemble the sequence accordingly.
+	 */
+	for (i = 0; i < chunk_bytes / sizeof(u32); i++)
+		write_mbox_dword(mmio_base, chunk[i]);
+}
+
 /*
  * Prepare for a new microcode transfer: reset hardware and record the
  * image size.
@@ -385,6 +452,14 @@ static bool can_send_next_chunk(struct staging_state *ss)
 	return true;
 }
 
+/*
+ * The hardware indicates completion by returning a sentinel end offset
+ */
+static inline bool is_end_offset(u32 offset)
+{
+	return offset == UINT_MAX;
+}
+
 /*
  * Determine whether staging is complete: either the hardware signaled
  * the end offset, or no more transactions are permitted (retry limit
@@ -392,18 +467,73 @@ static bool can_send_next_chunk(struct staging_state *ss)
  */
 static inline bool staging_is_complete(struct staging_state *ss)
 {
-	return (ss->offset == UINT_MAX) || !can_send_next_chunk(ss);
+	return is_end_offset(ss->offset) || !can_send_next_chunk(ss);
+}
+
+/*
+ * Wait for the hardware to complete a transaction.
+ * Return 0 on success, or an error code on failure.
+ */
+static int wait_for_transaction(struct staging_state *ss)
+{
+	u32 timeout, status;
+
+	/* Allow time for hardware to complete the operation: */
+	for (timeout = 0; timeout < MBOX_XACTION_TIMEOUT_MS; timeout++) {
+		msleep(1);
+
+		status = readl(ss->mmio_base + MBOX_STATUS_OFFSET);
+		/* Break out early if the hardware is ready: */
+		if (status & MASK_MBOX_STATUS_READY)
+			break;
+	}
+
+	/* Check for explicit error response */
+	if (status & MASK_MBOX_STATUS_ERROR) {
+		ss->state = UCODE_ERROR;
+		return -EPROTO;
+	}
+
+	/*
+	 * Hardware is neither responded to the action nor signaled any
+	 * error. Treat this as timeout.
+	 */
+	if (!(status & MASK_MBOX_STATUS_READY)) {
+		ss->state = UCODE_TIMEOUT;
+		return -ETIMEDOUT;
+	}
+
+	ss->state = UCODE_OK;
+	return 0;
 }
 
 /*
  * Transmit a chunk of the microcode image to the hardware.
  * Return 0 on success, or an error code on failure.
  */
-static int send_data_chunk(struct staging_state *ss, void *ucode_ptr __maybe_unused)
+static int send_data_chunk(struct staging_state *ss, void *ucode_ptr)
 {
-	pr_debug_once("Staging mailbox loading code needs to be implemented.\n");
-	ss->state = UCODE_ERROR;
-	return -EPROTONOSUPPORT;
+	u32 *src_chunk = ucode_ptr + ss->offset;
+	u16 mbox_size;
+
+	/*
+	 * Write a 'request' mailbox object in this order:
+	 *  1. Mailbox header includes total size
+	 *  2. Command header specifies the load operation
+	 *  3. Data section contains a microcode chunk
+	 *
+	 * Thus, the mailbox size is two headers plus the chunk size.
+	 */
+	mbox_size = MBOX_HEADER_SIZE * 2 + ss->chunk_size;
+	write_mbox_header(ss->mmio_base, MBOX_HEADER(mbox_size));
+	write_mbox_header(ss->mmio_base, MBOX_CMD_LOAD);
+	write_mbox_data(ss->mmio_base, src_chunk, ss->chunk_size);
+	ss->bytes_sent += ss->chunk_size;
+
+	/* Notify the hardware that the mailbox is ready for processing. */
+	writel(MASK_MBOX_CTRL_GO, ss->mmio_base + MBOX_CONTROL_OFFSET);
+
+	return wait_for_transaction(ss);
 }
 
 /*
@@ -412,9 +542,51 @@ static int send_data_chunk(struct staging_state *ss, void *ucode_ptr __maybe_unu
  */
 static int fetch_next_offset(struct staging_state *ss)
 {
-	pr_debug_once("Staging mailbox response handling code needs to be implemented.\n\n");
+	const u64 expected_header = MBOX_HEADER(MBOX_HEADER_SIZE + MBOX_RESPONSE_SIZE);
+	u32 offset, status;
+	u64 header;
+	int err;
+
+	/*
+	 * The 'response' mailbox returns three fields, in order:
+	 *  1. Header
+	 *  2. Next offset in the microcode image
+	 *  3. Status flags
+	 */
+	header = read_mbox_header(ss->mmio_base);
+	offset = read_mbox_dword(ss->mmio_base);
+	status = read_mbox_dword(ss->mmio_base);
+
+	/* All valid responses must start with the expected header. */
+	if (header != expected_header) {
+		pr_err_once("staging: invalid response header\n");
+		err = -EINVAL;
+		goto err_out;
+	}
+
+	/*
+	 * Verify the offset: If not at the end marker, it must not
+	 * exceed the microcode image length.
+	 */
+	if (!is_end_offset(offset) && offset > ss->ucode_len) {
+		pr_err_once("staging: invalid response offset\n");
+		err = -EINVAL;
+		goto err_out;
+	}
+
+	/* Hardware may report errors explicitly in the status field */
+	if (status & MASK_MBOX_RESP_ERROR) {
+		err = -EPROTO;
+		goto err_out;
+	}
+
+	ss->offset = offset;
+	ss->state  = UCODE_OK;
+	return 0;
+
+err_out:
 	ss->state = UCODE_ERROR;
-	return -EPROTONOSUPPORT;
+	return err;
 }
 
 /*
-- 
2.48.1
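
For readers following the diff, here is a minimal sketch of how one
request/response round trip composes from the helpers above. The
staging_send_round() wrapper is hypothetical, for illustration only;
the actual driving loop was introduced earlier in the series:

	/*
	 * Hypothetical wrapper, not part of the patch: send_data_chunk()
	 * writes the request object (mailbox header, LOAD command header,
	 * data chunk), sets the GO bit, and waits for the hardware;
	 * fetch_next_offset() then reads back and validates the response
	 * object (header, next offset, status flags).
	 */
	static int staging_send_round(struct staging_state *ss, void *ucode_ptr)
	{
		int err;

		err = send_data_chunk(ss, ucode_ptr);
		if (err)
			return err;

		return fetch_next_offset(ss);
	}

For reference, the leading u64 built by MBOX_HEADER() packs the Intel
PCI vendor ID (0x8086) into bits 15:0, the staging object type (0xb)
into bits 31:16, and the total object size in dwords into bits 63:32.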
Re: [PATCH v5 6/7] x86/microcode/intel: Support mailbox transfer
Posted by Borislav Petkov 3 weeks, 1 day ago
On Sat, Aug 23, 2025 at 08:52:09AM -0700, Chang S. Bae wrote:
> Previously, the functions for sending microcode data and retrieving the
> next offset were placeholders, as they required handling the specific
> mailbox format. Implement them as follows:
> 
> == Mailbox Format ==
> 
> The staging mailbox consists of two primary sections: 'header' and
> 'data'. While the microcode must be transferred following this format,
> the actual data transfer mechanism involves reading and writing to
> specific MMIO registers.
> 
> == Mailbox Data Registers ==
> 
> Unlike conventional interfaces that allocate MMIO space for each data
> chunk, the staging interface features a "narrow" interface, using only
> two dword-sized registers for read and write operations.
> 
> For example, if writing 2 dwords of data to a device. Typically, the
						      ^
						      |

abrupt sentence end. Looks like the idea was to not have a fullstop there but
continue...

> device would expose 2 dwords of "wide" MMIO space. To send the data to
> the device:
> 
> 	writel(buf[0], io_addr + 0);
> 	writel(buf[1], io_addr + 1);
> 
> But, this interface is a bit different. Instead of having a "wide"
> interface where there is separate MMIO space for each word in a
> transaction, it has a "narrow" interface where several words are written
> to the same spot in MMIO space:
> 
> 	writel(buf[0], io_addr);
> 	writel(buf[1], io_addr);
> 
> The same goes for the read side.

I don't understand what the point of this explanation is if all you end up
doing is writing to "io_addr". Why introduce the unnecessary confusion?

> 
> == Implementation Summary ==
> 
> Given that, introduce two layers of helper functions at first:
> 
>   * Low-level helpers for reading and writing to data registers directly.
>   * Wrapper functions for handling mailbox header and data sections.
> 
> Using them, implement send_data_chunk() and fetch_next_offset()
> functions. Add explicit error and timeout handling routine in
> wait_for_transaction(), finishing up the transfer.
>
> Both hardware error states and implicit errors -- invalid header or
> offset -- result in UCODE_ERROR. Emit a clear message for the latter.

Can we pu-lease stop explaining what the code does?!

Do not talk about *what* the patch is doing in the commit message - that
should be obvious from the diff itself. Rather, concentrate on the *why*
it needs to be done.
 
> Note: The kernel has support for similar mailboxes. But none of them are
> compatible with this one. Trying to share code resulted in a bloated
> mess, so this code is standalone.

Now that belongs in a commit message. Stuff which is non-obvious etc.

> +static void write_mbox_data(void __iomem *mmio_base, u32 *chunk, unsigned int chunk_bytes)
> +{
> +	int i;
> +
> +	/*
> +	 * The MMIO space is mapped as Uncached (UC). Each write arrives
> +	 * at the device as an individual transaction in program order.
> +	 * The device can then resemble the sequence accordingly.

reassemble?

> +	 */
> +	for (i = 0; i < chunk_bytes / sizeof(u32); i++)
> +		write_mbox_dword(mmio_base, chunk[i]);
> +}
> +
>  /*
>   * Prepare for a new microcode transfer: reset hardware and record the
>   * image size.

...

> +static int wait_for_transaction(struct staging_state *ss)
> +{
> +	u32 timeout, status;
> +
> +	/* Allow time for hardware to complete the operation: */
> +	for (timeout = 0; timeout < MBOX_XACTION_TIMEOUT_MS; timeout++) {
> +		msleep(1);
> +
> +		status = readl(ss->mmio_base + MBOX_STATUS_OFFSET);
> +		/* Break out early if the hardware is ready: */
> +		if (status & MASK_MBOX_STATUS_READY)
> +			break;
> +	}
> +
> +	/* Check for explicit error response */
> +	if (status & MASK_MBOX_STATUS_ERROR) {
> +		ss->state = UCODE_ERROR;
> +		return -EPROTO;
> +	}
> +
> +	/*
> +	 * Hardware is neither responded to the action nor signaled any

s/is/has/

> +	 * error. Treat this as timeout.
> +	 */
> +	if (!(status & MASK_MBOX_STATUS_READY)) {
> +		ss->state = UCODE_TIMEOUT;
> +		return -ETIMEDOUT;
> +	}
> +
> +	ss->state = UCODE_OK;
> +	return 0;
>  }

...

> @@ -412,9 +542,51 @@ static int send_data_chunk(struct staging_state *ss, void *ucode_ptr __maybe_unu
>   */
>  static int fetch_next_offset(struct staging_state *ss)
>  {
> -	pr_debug_once("Staging mailbox response handling code needs to be implemented.\n\n");
> +	const u64 expected_header = MBOX_HEADER(MBOX_HEADER_SIZE + MBOX_RESPONSE_SIZE);
> +	u32 offset, status;
> +	u64 header;
> +	int err;
> +
> +	/*
> +	 * The 'response' mailbox returns three fields, in order:
> +	 *  1. Header
> +	 *  2. Next offset in the microcode image
> +	 *  3. Status flags
> +	 */
> +	header = read_mbox_header(ss->mmio_base);
> +	offset = read_mbox_dword(ss->mmio_base);
> +	status = read_mbox_dword(ss->mmio_base);
> +
> +	/* All valid responses must start with the expected header. */
> +	if (header != expected_header) {
> +		pr_err_once("staging: invalid response header\n");
> +		err = -EINVAL;
> +		goto err_out;
> +	}
> +
> +	/*
> +	 * Verify the offset: If not at the end marker, it must not
> +	 * exceed the microcode image length.
> +	 */
> +	if (!is_end_offset(offset) && offset > ss->ucode_len) {
> +		pr_err_once("staging: invalid response offset\n");

You might want to dump some of the actual values in those pr_* statements so
that it dumps more breadcrumbs and those prints are more useful when debugging
issues.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v5 6/7] x86/microcode/intel: Support mailbox transfer
Posted by Chang S. Bae 3 weeks ago
On 9/12/2025 9:34 AM, Borislav Petkov wrote:
> On Sat, Aug 23, 2025 at 08:52:09AM -0700, Chang S. Bae wrote:
> 
> I don't understand what the point of this explanation is if all you end up
> doing is writing to "io_addr". Why introduce the unnecessary confusion?

Yikes, it looks like describing the format and interface went into too 
much detail when that's already covered in the spec.

>> Note: The kernel has support for similar mailboxes. But none of them are
>> compatible with this one. Trying to share code resulted in a bloated
>> mess, so this code is standalone.
> 
> Now that belongs in a commit message. Stuff which is non-obvious etc.

Okay, maybe the changelog could read something like this, more concisely:

The functions for sending microcode data and retrieving the next offset
were previously placeholders, as they need to handle a specific mailbox
format.

While the kernel supports similar mailboxes, none of them are compatible
with this one. Attempts to share code led to unnecessary complexity, so
add a dedicated implementation instead.


>> +	/*
>> +	 * The MMIO space is mapped as Uncached (UC). Each write arrives
>> +	 * at the device as an individual transaction in program order.
>> +	 * The device can then resemble the sequence accordingly.
> 
> reassemble?

Yep, fixed.

>> +	/*
>> +	 * Hardware is neither responded to the action nor signaled any
> 
> s/is/has/

Also fixed -- and I'll do a more careful pass to catch typos like that.

> You might want to dump some of the actual values in those pr_* statements so
> that it dumps more breadcrumbs and those prints are more useful when debugging
> issues.

Yeah, good suggestion -- maybe it could dump invalid header and offset 
values like these:

   pr_err_once("staging: invalid response header (0x%llx)\n", header);

   pr_err_once("staging: invalid offset (%u) after the image end (%u)",
               offset, ss->ucode_len);
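
Folded into the validation paths, the checks would then read something
like this (a sketch against the v5 code above; the exact v6 wording may
differ):

	/* All valid responses must start with the expected header. */
	if (header != expected_header) {
		pr_err_once("staging: invalid response header (0x%llx)\n",
			    header);
		err = -EINVAL;
		goto err_out;
	}

	/* A non-end offset must stay within the microcode image. */
	if (!is_end_offset(offset) && offset > ss->ucode_len) {
		pr_err_once("staging: invalid offset (%u) after the image end (%u)\n",
			    offset, ss->ucode_len);
		err = -EINVAL;
		goto err_out;
	}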
Re: [PATCH v5 6/7] x86/microcode/intel: Support mailbox transfer
Posted by Borislav Petkov 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 05:51:49PM -0700, Chang S. Bae wrote:
> The functions for sending microcode data and retrieving the next offset
> were previously placeholders, as they need to handle a specific mailbox
> format.
> 
> While the kernel supports similar mailboxes, none of them are compatible
> with this one. Attempts to share code led to unnecessary complexity, so
> add a dedicated implementation instead.

Better.

> Yeah, good suggestion -- maybe it could dump invalid header and offset
> values like these:
> 
>   pr_err_once("staging: invalid response header (0x%llx)\n", header);
> 
>   pr_err_once("staging: invalid offset (%u) after the image end (%u)",
>               offset, ss->ucode_len);

Should be more helpful than before, yeah.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
[PATCH v5 7/7] x86/microcode/intel: Enable staging when available
Posted by Chang S. Bae 1 month, 1 week ago
With the staging code now in place, check for staging availability by
reading the following MSRs:

  * IA32_ARCH_CAPABILITIES (bit 16) for the presence of
    IA32_MCU_ENUMERATION

  * IA32_MCU_ENUMERATION (bit 4) for the staging feature availability.

When available, enable staging by setting the use_staging flag.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Tested-by: Anselm Busse <abusse@amazon.de>
---
V4 -> V5:
* Collect Chao's review tag
* rdmsrl() -> rdmsrq() (Chao)

V1 -> V2:
* Fold MSR definitions (Boris).
---
 arch/x86/include/asm/msr-index.h      |  7 +++++++
 arch/x86/kernel/cpu/microcode/intel.c | 17 +++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 0356155f9264..2b7b598c414f 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -166,6 +166,10 @@
 						 * Processor MMIO stale data
 						 * vulnerabilities.
 						 */
+#define ARCH_CAP_MCU_ENUM		BIT(16) /*
+						 * Indicates the presence of microcode update
+						 * feature enumeration and status information.
+						 */
 #define ARCH_CAP_FB_CLEAR		BIT(17)	/*
 						 * VERW clears CPU fill buffer
 						 * even on MDS_NO CPUs.
@@ -913,6 +917,9 @@
 #define MSR_IA32_UCODE_WRITE		0x00000079
 #define MSR_IA32_UCODE_REV		0x0000008b
 
+#define MSR_IA32_MCU_ENUMERATION	0x0000007b
+#define MCU_STAGING			BIT(4)
+
 #define MSR_IA32_MCU_STAGING_MBOX_ADDR	0x000007a5
 
 /* Intel SGX Launch Enclave Public Key Hash MSRs */
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index f6b365eba6a2..7d6582f8a80f 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -1007,6 +1007,18 @@ static __init void calc_llc_size_per_core(struct cpuinfo_x86 *c)
 	llc_size_per_core = (unsigned int)llc_size;
 }
 
+static __init bool staging_available(void)
+{
+	u64 val;
+
+	val = x86_read_arch_cap_msr();
+	if (!(val & ARCH_CAP_MCU_ENUM))
+		return false;
+
+	rdmsrq(MSR_IA32_MCU_ENUMERATION, val);
+	return !!(val & MCU_STAGING);
+}
+
 struct microcode_ops * __init init_intel_microcode(void)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
@@ -1017,6 +1029,11 @@ struct microcode_ops * __init init_intel_microcode(void)
 		return NULL;
 	}
 
+	if (staging_available()) {
+		microcode_intel_ops.use_staging = true;
+		pr_info("Enabled staging feature.\n");
+	}
+
 	calc_llc_size_per_core(c);
 
 	return &microcode_intel_ops;
-- 
2.48.1
Re: [PATCH v4 0/6] x86: Support for Intel Microcode Staging Feature
Posted by Dave Hansen 1 month, 3 weeks ago
On 8/13/25 10:26, Chang S. Bae wrote:
> The use case for this feature is late-loading with increased load times,
> most likely in cloud service environments. I reached out to potential
> users for reviews and feedback, and Anselm Busse has confirmed his
> testing.

I'm glad someone tested this, but it would also be ideal to have some
code reviews from the folks that are actually going to use this. From
the lack of reviews, I'm going to assume that folks don't really want
this merged any time soon.