[v2] x86: Support for Intel Microcode Staging Feature

[PATCH v2 4/6] x86/microcode/intel: Implement staging handler

Posted by Chang S. Bae 10 months, 3 weeks ago

Previously, per-package staging invocations were established for each
MMIO space. The next step is to implement the staging handler according
to the specified protocol. Below are key aspects to note:

  (a)  Each staging process must begin by resetting the staging hardware.

  (b)  The staging hardware processes up to a page-sized chunk of the
       microcode image per iteration, requiring software to submit data
       incrementally.

  (c)  Once a data chunk is processed, the hardware responds with an
       offset in the image for the next chunk.

  (d)  The offset may indicate completion or request retransmission of an
       already transferred chunk. As long as the total transferred data
       remains within the predefined limit (twice the image size),
       retransmissions should be acceptable.

With that, incorporate these code sequences to the staging handler:

  1.  Initialization: Map the MMIO space via ioremap(). Reset the staging
      hardware and initialize software state, ensuring a fresh staging
      process aligned with (a).

  2.  Processing Loop: Introduce a loop iterating over data chunk,
      following (b), with proper termination conditions established from
      (d) -- stop staging when the hardware signals completion, or if the
      total transmitted data exceeds the predefined limit.

  3.  Loop Body: Finally, compose the loop body with two steps --
      transmitting a data chunk and retrieving the next offset from the
      hardware response, aligning with (b) and (c).

Since data transmission and mailbox format handling require additional
details, they are implemented separately in next changes.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
V1 -> V2:
* Re-write the changelog for clarity (Dave).
* Move staging handling code into intel.c (Boris).
* Add extensive comments to clarify staging logic and hardware
  interactions, along with function renaming renaming (Dave).

 A key change to highlight is the updated main loop in do_stage(). With
 the introduction of global staging data, sub-functions now focus on
 being more self-explanatory through meaningful naming.

RFC-V1 -> V1:
* Rename the function name and change the return type.
---
 arch/x86/kernel/cpu/microcode/intel.c | 116 +++++++++++++++++++++++++-
 1 file changed, 113 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 14c20b53f14d..05b5b73e525a 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -20,6 +20,8 @@
 #include <linux/cpu.h>
 #include <linux/uio.h>
 #include <linux/mm.h>
+#include <linux/delay.h>
+#include <linux/io.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/processor.h>
@@ -33,6 +35,29 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
 
 #define UCODE_BSP_LOADED	((struct microcode_intel *)0x1UL)
 
+/* Defines for the microcode staging mailbox interface */
+
+#define MBOX_REG_NUM		4
+#define MBOX_REG_SIZE		sizeof(u32)
+
+#define MBOX_CONTROL_OFFSET	0x0
+#define MBOX_STATUS_OFFSET	0x4
+
+#define MASK_MBOX_CTRL_ABORT	BIT(0)
+
+/*
+ * Each microcode image is divided into chunks, each at most
+ * MBOX_XACTION_SIZE in size. A 10-chunk image would typically require
+ * 10 transactions. However, the hardware managing the mailbox has
+ * limited resources and may not cache the entire image, potentially
+ * requesting the same chunk multiple times.
+ *
+ * To accommodate this behavior, allow up to twice the expected number of
+ * transactions (i.e., a 10-chunk image can take up to 20 attempts).
+ */
+#define MBOX_XACTION_SIZE	PAGE_SIZE
+#define MBOX_XACTION_MAX(imgsz)	((imgsz) * 2)
+
 /* Current microcode patch used in early patching on the APs. */
 static struct microcode_intel *ucode_patch_va __read_mostly;
 static struct microcode_intel *ucode_patch_late __read_mostly;
@@ -323,14 +348,99 @@ static __init struct microcode_intel *scan_microcode(void *data, size_t size,
 	return size ? NULL : patch;
 }
 
+/*
+ * Prepare for a new microcode transfer by resetting both hardware and
+ * software states.
+ */
+static inline void reset_stage(void)
+{
+	/* Reset tracking variables */
+	staging.offset = 0;
+	staging.bytes_sent = 0;
+
+	/*
+	 * Abort any ongoing process, effectively resetting the device.
+	 * Unlike regular mailbox data processing requests, this
+	 * operation does not require a status check.
+	 */
+	writel(MASK_MBOX_CTRL_ABORT, staging.mmio_base + MBOX_CONTROL_OFFSET);
+}
+
+/*
+ * Check if the staging process has completed. The hardware signals
+ * completion by setting a unique end offset.
+ */
+static inline bool is_stage_complete(void)
+{
+	return staging.offset == UINT_MAX;
+}
+
+/*
+ * Determine if the next data chunk can be sent. Each chunk is typically
+ * one page unless the remaining data is smaller. If the total
+ * transmitted data exceeds the defined limit, a timeout occurs.
+ */
+static bool can_send_next_chunk(void)
+{
+	WARN_ON_ONCE(staging.ucode_len < staging.offset);
+	staging.chunk_size = min(MBOX_XACTION_SIZE, staging.ucode_len - staging.offset);
+
+	if (staging.bytes_sent + staging.chunk_size > MBOX_XACTION_MAX(staging.ucode_len)) {
+		staging.state = UCODE_TIMEOUT;
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Transmit a chunk of the microcode image to the hardware.
+ * Return true if the chunk is processed successfully.
+ */
+static bool send_data_chunk(void)
+{
+	pr_debug_once("Staging mailbox loading code needs to be implemented.\n");
+	return false;
+}
+
+/*
+ * Retrieve the next offset from the hardware response.
+ * Return true if the response is valid, false otherwise.
+ */
+static bool fetch_next_offset(void)
+{
+	pr_debug_once("Staging mailbox response handling code needs to be implemented.\n\n");
+	return false;
+}
+
 /*
  * Handle the staging process using the mailbox MMIO interface. The
- * caller is expected to check the result in staging.state.
+ * microcode image is transferred in chunks until completion. The caller
+ * is expected to check the result in staging.state.
  */
 static void do_stage(u64 mmio_pa)
 {
-	pr_debug_once("Staging implementation is pending.\n");
-	staging.state = UCODE_ERROR;
+	staging.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
+	if (WARN_ON_ONCE(!staging.mmio_base)) {
+		staging.state = UCODE_ERROR;
+		return;
+	}
+
+	reset_stage();
+
+	/* Perform the staging process while within the retry limit */
+	while (!is_stage_complete() && can_send_next_chunk()) {
+		/* Send a chunk of microcode each time: */
+		if (!send_data_chunk())
+			break;
+		/*
+		 * Then, ask the hardware which piece of the image it
+		 * needs next. The same piece may be sent more than once.
+		 */
+		if (!fetch_next_offset())
+			break;
+	}
+
+	iounmap(staging.mmio_base);
 }
 
 static void stage_microcode(void)
-- 
2.45.2

Re: [PATCH v2 4/6] x86/microcode/intel: Implement staging handler

Posted by Dave Hansen 10 months, 3 weeks ago

On 3/20/25 16:40, Chang S. Bae wrote:
>  static void do_stage(u64 mmio_pa)
>  {
> -	pr_debug_once("Staging implementation is pending.\n");
> -	staging.state = UCODE_ERROR;
> +	staging.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
> +	if (WARN_ON_ONCE(!staging.mmio_base)) {
> +		staging.state = UCODE_ERROR;
> +		return;
> +	}
> +
> +	reset_stage();

This is purely a style thing, but I really dislike using global
variables like this.

What is reset_stage() doing? What state is it resetting? What is locking
this? What is its scope?

Now, imagine you have a function variable. It can be on the stack or static:

static void do_stage(u64 mmio_pa)
{
	struct staging_state staging = {};

Then you pass it around:

	staging.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * ...
	if (WARN_ON_ONCE(!staging.mmio_base)) {
		staging.state = UCODE_ERROR;
		return;
	}

	reset_stage(&staging);

Yeah, it means passing around _one_ function argument to a function
that's not currently taking an argument. But it makes it a heck of a lot
more clear what is going on. It also makes the lifetime and
initialization state of 'staging' *MUCH* more obvious.

It also makes it clear what state reset_stage() needs to do its job.

Re: [PATCH v2 4/6] x86/microcode/intel: Implement staging handler

Posted by Chang S. Bae 10 months, 3 weeks ago

On 3/20/2025 5:15 PM, Dave Hansen wrote:
> 
> Yeah, it means passing around _one_ function argument to a function
> that's not currently taking an argument. But it makes it a heck of a lot
> more clear what is going on. It also makes the lifetime and
> initialization state of 'staging' *MUCH* more obvious.

I see -- that seemed like a case of over-simplification.

I've updated the relevant patches after passing the staging test. I've 
also thought reordering patches (patch2 <-> patch3), perhaps when 
posting v3.

Thanks,
Chang

[PATCH v2a 4/6] x86/microcode/intel: Implement staging handler

Posted by Chang S. Bae 10 months, 3 weeks ago

Previously, per-package staging invocations were established for each
MMIO space. The next step is to implement the staging handler according
to the specified protocol. Below are key aspects to note:

  (a)  Each staging process must begin by resetting the staging hardware.

  (b)  The staging hardware processes up to a page-sized chunk of the
       microcode image per iteration, requiring software to submit data
       incrementally.

  (c)  Once a data chunk is processed, the hardware responds with an
       offset in the image for the next chunk.

  (d)  The offset may indicate completion or request retransmission of an
       already transferred chunk. As long as the total transferred data
       remains within the predefined limit (twice the image size),
       retransmissions should be acceptable.

With that, incorporate these code sequences to the staging handler:

  1.  Initialization: Map the MMIO space via ioremap(). Reset the staging
      hardware and initialize software state, ensuring a fresh staging
      process aligned with (a).

  2.  Processing Loop: Introduce a loop iterating over data chunk,
      following (b), with proper termination conditions established from
      (d) -- stop staging when the hardware signals completion, or if the
      total transmitted data exceeds the predefined limit.

  3.  Loop Body: Finally, compose the loop body with two steps --
      transmitting a data chunk and retrieving the next offset from the
      hardware response, aligning with (b) and (c).

Since data transmission and mailbox format handling require additional
details, they are implemented separately in next changes.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
---
V2 -> V2a:
* Adjust the code with the global variable removal (Dave [1]).

Note: this quick revision is just intended to ensure that the feedback
has been properly addressed.

[1]: https://lore.kernel.org/lkml/b01224ee-c935-4b08-a76f-5dc49341182d@intel.com/
---
 arch/x86/kernel/cpu/microcode/intel.c | 122 +++++++++++++++++++++++++-
 1 file changed, 118 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 5d0216e9aee5..44b94d4d05f7 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -20,6 +20,8 @@
 #include <linux/cpu.h>
 #include <linux/uio.h>
 #include <linux/mm.h>
+#include <linux/delay.h>
+#include <linux/io.h>
 
 #include <asm/cpu_device_id.h>
 #include <asm/processor.h>
@@ -33,6 +35,29 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
 
 #define UCODE_BSP_LOADED	((struct microcode_intel *)0x1UL)
 
+/* Defines for the microcode staging mailbox interface */
+
+#define MBOX_REG_NUM		4
+#define MBOX_REG_SIZE		sizeof(u32)
+
+#define MBOX_CONTROL_OFFSET	0x0
+#define MBOX_STATUS_OFFSET	0x4
+
+#define MASK_MBOX_CTRL_ABORT	BIT(0)
+
+/*
+ * Each microcode image is divided into chunks, each at most
+ * MBOX_XACTION_SIZE in size. A 10-chunk image would typically require
+ * 10 transactions. However, the hardware managing the mailbox has
+ * limited resources and may not cache the entire image, potentially
+ * requesting the same chunk multiple times.
+ *
+ * To accommodate this behavior, allow up to twice the expected number of
+ * transactions (i.e., a 10-chunk image can take up to 20 attempts).
+ */
+#define MBOX_XACTION_SIZE	PAGE_SIZE
+#define MBOX_XACTION_MAX(imgsz)	((imgsz) * 2)
+
 /* Current microcode patch used in early patching on the APs. */
 static struct microcode_intel *ucode_patch_va __read_mostly;
 static struct microcode_intel *ucode_patch_late __read_mostly;
@@ -321,13 +346,102 @@ static __init struct microcode_intel *scan_microcode(void *data, size_t size,
 }
 
 /*
- * Handle the staging process using the mailbox MMIO interface.
- * Return the result state.
+ * Prepare for a new microcode transfer by resetting hardware and
+ * initializing software states.
+ */
+static void init_stage(struct staging_state *ss)
+{
+	ss->ucode_ptr = ucode_patch_late;
+	ss->ucode_len = get_totalsize(&ucode_patch_late->hdr);
+
+	/* Reset tracking variables */
+	ss->offset = 0;
+	ss->bytes_sent = 0;
+
+	/*
+	 * Abort any ongoing process, effectively resetting the device.
+	 * Unlike regular mailbox data processing requests, this
+	 * operation does not require a status check.
+	 */
+	writel(MASK_MBOX_CTRL_ABORT, ss->mmio_base + MBOX_CONTROL_OFFSET);
+}
+
+/*
+ * Check if the staging process has completed. The hardware signals
+ * completion by setting a unique end offset.
+ */
+static inline bool is_stage_complete(unsigned int offset)
+{
+	return offset == UINT_MAX;
+}
+
+/*
+ * Determine if the next data chunk can be sent. Each chunk is typically
+ * one page unless the remaining data is smaller. If the total
+ * transmitted data exceeds the defined limit, a timeout occurs.
+ */
+static bool can_send_next_chunk(struct staging_state *ss)
+{
+	WARN_ON_ONCE(ss->ucode_len < ss->offset);
+	ss->chunk_size = min(MBOX_XACTION_SIZE, ss->ucode_len - ss->offset);
+
+	if (ss->bytes_sent + ss->chunk_size > MBOX_XACTION_MAX(ss->ucode_len)) {
+		ss->state = UCODE_TIMEOUT;
+		return false;
+	}
+	return true;
+}
+
+/*
+ * Transmit a chunk of the microcode image to the hardware.
+ * Return true if the chunk is processed successfully.
+ */
+static bool send_data_chunk(struct staging_state *unused)
+{
+	pr_debug_once("Staging mailbox loading code needs to be implemented.\n");
+	return false;
+}
+
+/*
+ * Retrieve the next offset from the hardware response.
+ * Return true if the response is valid, false otherwise.
+ */
+static bool fetch_next_offset(struct staging_state *unused)
+{
+	pr_debug_once("Staging mailbox response handling code needs to be implemented.\n\n");
+	return false;
+}
+
+/*
+ * Handle the staging process using the mailbox MMIO interface. The
+ * microcode image is transferred in chunks until completion. Return the
+ * result state.
  */
 static enum ucode_state do_stage(u64 mmio_pa)
 {
-	pr_debug_once("Staging implementation is pending.\n");
-	return UCODE_ERROR;
+	struct staging_state ss = {};
+
+	ss.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
+	if (WARN_ON_ONCE(!ss.mmio_base))
+		return UCODE_ERROR;
+
+	init_stage(&ss);
+
+	/* Perform the staging process while within the retry limit */
+	while (!is_stage_complete(ss.offset) && can_send_next_chunk(&ss)) {
+		/* Send a chunk of microcode each time: */
+		if (!send_data_chunk(&ss))
+			break;
+		/*
+		 * Then, ask the hardware which piece of the image it
+		 * needs next. The same piece may be sent more than once.
+		 */
+		if (!fetch_next_offset(&ss))
+			break;
+	}
+
+	iounmap(ss.mmio_base);
+	return ss.state;
 }
 
 static void stage_microcode(void)
-- 
2.45.2

Re: [PATCH v2a 4/6] x86/microcode/intel: Implement staging handler

Posted by Chao Gao 10 months, 2 weeks ago

> /*
>- * Handle the staging process using the mailbox MMIO interface.
>- * Return the result state.
>+ * Prepare for a new microcode transfer by resetting hardware and
>+ * initializing software states.
>+ */
>+static void init_stage(struct staging_state *ss)
>+{
>+	ss->ucode_ptr = ucode_patch_late;
>+	ss->ucode_len = get_totalsize(&ucode_patch_late->hdr);
>+
>+	/* Reset tracking variables */
>+	ss->offset = 0;
>+	ss->bytes_sent = 0;

Nit: no need to reset them, as

>+	struct staging_state ss = {};

in do_stage() will zero the whole structure.

>+/*
>+ * Handle the staging process using the mailbox MMIO interface. The
>+ * microcode image is transferred in chunks until completion. Return the
>+ * result state.
>  */
> static enum ucode_state do_stage(u64 mmio_pa)
> {
>-	pr_debug_once("Staging implementation is pending.\n");
>-	return UCODE_ERROR;
>+	struct staging_state ss = {};
>+
>+	ss.mmio_base = ioremap(mmio_pa, MBOX_REG_NUM * MBOX_REG_SIZE);
>+	if (WARN_ON_ONCE(!ss.mmio_base))
>+		return UCODE_ERROR;
>+
>+	init_stage(&ss);
>+
>+	/* Perform the staging process while within the retry limit */
>+	while (!is_stage_complete(ss.offset) && can_send_next_chunk(&ss)) {
>+		/* Send a chunk of microcode each time: */
>+		if (!send_data_chunk(&ss))
>+			break;
>+		/*
>+		 * Then, ask the hardware which piece of the image it
>+		 * needs next. The same piece may be sent more than once.
>+		 */
>+		if (!fetch_next_offset(&ss))
>+			break;

why send_data_chunk() and fetch_next_offset() return a boolean instead of
an error or ucode_state?

Using the return value to indicate just success or failure, while relying
on another variable to report detailed error/state, seems a bit clumsy to
me.

>+	}
>+
>+	iounmap(ss.mmio_base);
>+	return ss.state;
> }
> 
> static void stage_microcode(void)
>-- 
>2.45.2
>

Re: [PATCH v2a 4/6] x86/microcode/intel: Implement staging handler

Posted by Chang S. Bae 10 months, 2 weeks ago

On 3/26/2025 1:34 AM, Chao Gao wrote:
>> +
>> +	/* Reset tracking variables */
>> +	ss->offset = 0;
>> +	ss->bytes_sent = 0;
> 
> Nit: no need to reset them, as
> 
>> +	struct staging_state ss = {};
> 
> in do_stage() will zero the whole structure.

I initially wanted to explicitly highlight where these variables are 
reset, but you’re right — in practice, they can be removed.

> why send_data_chunk() and fetch_next_offset() return a boolean instead of
> an error or ucode_state?
> 
> Using the return value to indicate just success or failure, while relying
> on another variable to report detailed error/state, seems a bit clumsy to
> me.

The error state is interpreted at the call site, where the final result 
is returned to the caller.

At the end of each step, all the loop needs to decide is whether to 
continue or break, which naturally fits a boolean return. I don’t see a 
strong reason to change this approach.

Thanks,
Chang

[PATCH v2 1/6] x86/microcode: Introduce staging step to reduce late-loading time
[PATCH v2 2/6] x86/microcode/intel: Define staging state struct
[PATCH v2 3/6] x86/microcode/intel: Establish staging control logic
[PATCH v2 4/6] x86/microcode/intel: Implement staging handler
[PATCH v2 5/6] x86/microcode/intel: Support mailbox transfer
[PATCH v2 6/6] x86/microcode/intel: Enable staging when available