[v1] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

[PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Florent Revest 11 months, 1 week ago

Currently, load_microcode_amd() iterates over all NUMA nodes, retrieves
their CPU masks and unconditonally accesses per-CPU data for the first
CPU of each mask.

According to Documentation/admin-guide/mm/numaperf.rst: "Some memory may
share the same node as a CPU, and others are provided as memory only
nodes." Therefore, some node CPU masks may be empty and wouldn't have a
"first CPU".

On a machine with far memory (and therefore CPU-less NUMA nodes):
- cpumask_of_node(nid) is 0
- cpumask_first(0) is CONFIG_NR_CPUS
- cpu_data(CONFIG_NR_CPUS) accesses the cpu_info per-CPU array at an
  index that is 1 out of bounds

This does not have any security implications since flashing microcode is
a privileged operation but I believe this has reliability implications
by potentially corrupting memory while flashing a microcode update.

When booting with CONFIG_UBSAN_BOUNDS=y on an AMD machine that flashes a
microcode update. I get the following splat:

UBSAN: array-index-out-of-bounds in arch/x86/kernel/cpu/microcode/amd.c:X:Y
index 512 is out of range for type 'unsigned long[512]'
[...]
Call Trace:
 dump_stack+0xdb/0x143
 __ubsan_handle_out_of_bounds+0xf5/0x120
 load_microcode_amd+0x58f/0x6b0
 request_microcode_amd+0x17c/0x250
 reload_store+0x174/0x2b0
 kernfs_fop_write_iter+0x227/0x2d0
 vfs_write+0x322/0x510
 ksys_write+0xb5/0x160
 do_syscall_64+0x6b/0xa0
 entry_SYSCALL_64_after_hwframe+0x67/0xd1

This patch checks that a NUMA node has CPUs before attempting to update
its first CPU's microcode.

Fixes: 7ff6edf4fef3 ("x86/microcode/AMD: Fix mixed steppings support")
Signed-off-by: Florent Revest <revest@chromium.org>
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/cpu/microcode/amd.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 95ac1c6a84fbe..7c06425edc1b5 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -1059,6 +1059,7 @@ static enum ucode_state _load_microcode_amd(u8 family, const u8 *data, size_t si
 
 static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size)
 {
+	const struct cpumask *mask;
 	struct cpuinfo_x86 *c;
 	unsigned int nid, cpu;
 	struct ucode_patch *p;
@@ -1069,7 +1070,11 @@ static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t siz
 		return ret;
 
 	for_each_node(nid) {
-		cpu = cpumask_first(cpumask_of_node(nid));
+		mask = cpumask_of_node(nid);
+		if (cpumask_empty(mask))
+			continue;
+
+		cpu = cpumask_first(mask);
 		c = &cpu_data(cpu);
 
 		p = find_patch(cpu);
-- 
2.49.0.rc0.332.g42c0ae87b1-goog

Re: [PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Markus Elfring 11 months, 1 week ago

…
> This patch checks that a NUMA node …

Please improve such a change description another bit.

See also:
https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.14-rc5#n94

Regards,
Markus

Re: [PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Greg KH 11 months, 1 week ago

On Fri, Mar 07, 2025 at 08:18:09PM +0100, Markus Elfring wrote:
> …
> > This patch checks that a NUMA node …
> 
> Please improve such a change description another bit.
> 
> See also:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.14-rc5#n94
> 
> Regards,
> Markus
> 

Hi,

This is the semi-friendly patch-bot of Greg Kroah-Hartman.

Markus, you seem to have sent a nonsensical or otherwise pointless
review comment to a patch submission on a Linux kernel developer mailing
list.  I strongly suggest that you not do this anymore.  Please do not
bother developers who are actively working to produce patches and
features with comments that, in the end, are a waste of time.

Patch submitter, please ignore Markus's suggestion; you do not need to
follow it at all.  The person/bot/AI that sent it is being ignored by
almost all Linux kernel maintainers for having a persistent pattern of
behavior of producing distracting and pointless commentary, and
inability to adapt to feedback.  Please feel free to also ignore emails
from them.

thanks,

greg k-h's patch email bot

Re: [PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Dave Hansen 11 months, 1 week ago

On 3/7/25 05:12, Florent Revest wrote:
>  	for_each_node(nid) {
> -		cpu = cpumask_first(cpumask_of_node(nid));
> +		mask = cpumask_of_node(nid);
> +		if (cpumask_empty(mask))
> +			continue;
> +
> +		cpu = cpumask_first(mask);

Would for_each_node_with_cpus() trim this down a bit?

Re: [PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Florent Revest 11 months, 1 week ago

On Fri, Mar 7, 2025 at 3:55 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 3/7/25 05:12, Florent Revest wrote:
> >       for_each_node(nid) {
> > -             cpu = cpumask_first(cpumask_of_node(nid));
> > +             mask = cpumask_of_node(nid);
> > +             if (cpumask_empty(mask))
> > +                     continue;
> > +
> > +             cpu = cpumask_first(mask);
>
> Would for_each_node_with_cpus() trim this down a bit?

Oh nice, I didn't notice this macro, thanks for pointing it out! :)
I'm happy to respin a v2 using for_each_node_with_cpus(), I'll just
leave a bit more time in case there are other comments.

One thing I'm not entirely sure about is that
for_each_node_with_cpus() is implemented on top of
for_each_online_node(). This differs from the current code which uses
for_each_node(). I can't tell if iterating over offline nodes is a bug
or a feature of load_microcode_amd() so this would be an extra change
to the business logic which I can't really explain/justify.

Re: [PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Dave Hansen 11 months, 1 week ago

On 3/7/25 07:58, Florent Revest wrote:
> One thing I'm not entirely sure about is that
> for_each_node_with_cpus() is implemented on top of
> for_each_online_node(). This differs from the current code which uses
> for_each_node(). I can't tell if iterating over offline nodes is a bug
> or a feature of load_microcode_amd() so this would be an extra change
> to the business logic which I can't really explain/justify.

Actually, the per-node caches seem to have gone away at some point too.
Boris would know the history. This might need a a cleanup like Boris
alluded to in 05e91e7211383. This might not even need a nid loop.

Re: [PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Borislav Petkov 11 months, 1 week ago

On Fri, Mar 07, 2025 at 08:32:20AM -0800, Dave Hansen wrote:
> On 3/7/25 07:58, Florent Revest wrote:
> > One thing I'm not entirely sure about is that
> > for_each_node_with_cpus() is implemented on top of
> > for_each_online_node(). This differs from the current code which uses
> > for_each_node(). I can't tell if iterating over offline nodes is a bug

You better not have offlined nodes when applying microcode. The path you're
landing in here has already hotplug disabled, tho.

> > or a feature of load_microcode_amd() so this would be an extra change
> > to the business logic which I can't really explain/justify.
> 
> Actually, the per-node caches seem to have gone away at some point too.
> Boris would know the history. This might need a a cleanup like Boris
> alluded to in 05e91e7211383. This might not even need a nid loop.

Nah, the cache is still there. For now...

for_each_node_with_cpus() should simply work unless I'm missing some other
angle...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

Re: [PATCH] x86/microcode/AMD: Fix out-of-bounds on systems with CPU-less NUMA nodes

Posted by Florent Revest 11 months, 1 week ago

On Fri, Mar 7, 2025 at 5:44 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Mar 07, 2025 at 08:32:20AM -0800, Dave Hansen wrote:
> > On 3/7/25 07:58, Florent Revest wrote:
> > > One thing I'm not entirely sure about is that
> > > for_each_node_with_cpus() is implemented on top of
> > > for_each_online_node(). This differs from the current code which uses
> > > for_each_node(). I can't tell if iterating over offline nodes is a bug
>
> You better not have offlined nodes when applying microcode. The path you're
> landing in here has already hotplug disabled, tho.
>
> > > or a feature of load_microcode_amd() so this would be an extra change
> > > to the business logic which I can't really explain/justify.
> >
> > Actually, the per-node caches seem to have gone away at some point too.
> > Boris would know the history. This might need a a cleanup like Boris
> > alluded to in 05e91e7211383. This might not even need a nid loop.
>
> Nah, the cache is still there. For now...
>
> for_each_node_with_cpus() should simply work unless I'm missing some other
> angle...

Awesome - thank you both! I'll send a v2 using
for_each_node_with_cpus() ... On Monday :) Have a good weekend!