[PATCH] x86/amd_nb: fix NULL deref in amd64_agp

Posted by René Rebe 2 weeks ago
Commit bc7b2e629e0c ("x86/amd_nb: Use topology info to get AMD node count")
broke amd_cache_northbridges(): on older systems, the number of
northbridges found by iterating with next_northbridge() is not always
identical to amd_num_nodes().

Among other things, this causes nforce3_agp_init() in amd64_agp to oops
with a NULL pointer dereference:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 2579067 P4D 2579067 PUD 2578067 PMD 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 56 Comm: kworker/0:2 Not tainted 6.15.0-t2 #1 PREEMPT(lazy)
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./ALiveDual-eSATA2, BIOS P1.80 09/11/2009
Workqueue: events work_for_cpu_fn
RIP: 0010:amd64_fetch_size+0x1f/0xb0 [amd64_agp]
Code: 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 53 48 83 ec 10 65 48 8b 05 47 e7 05 e3 48 89 44 24 08 31 db 31 ff e8 e1 30 cd e1 <48> 8b 38 48 85 ff 74 5e 48 8d 54 24 04 c7 02 00 00 00 00 be 90 00
RSP: 0018:ffffa1574019bd08 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8b0241365100 RDI: 0000000000000000
RBP: 00000000000000c0 R08: 0000000000000004 R09: ffffa1574019bd54
R10: 00000000ffffef01 R11: ffffffffa2818aa0 R12: ffff8b02419cd870
R13: ffff8b024189d400 R14: ffff8b0241094000 R15: ffff8b0241094000
FS:  0000000000000000(0000) GS:ffff8b02ba601000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000257a000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 nforce3_agp_init+0x23/0x1d0 [amd64_agp]
 agp_amd64_probe+0x3dd/0x470 [amd64_agp]

Fix this by erroring out only when the first node has no misc device;
for any later node, limit amd_northbridges.num to the number of nodes
actually found.

Fixes: bc7b2e629e0c ("x86/amd_nb: Use topology info to get AMD node count")
Signed-off-by: Rene Rebe <rene@exactco.de>

---
Tested on AM2+ ASRock ALiveDual-eSATA2.

--- a/arch/x86/kernel/amd_nb.c	2025-05-29 11:53:25.952929235 +0200
+++ b/arch/x86/kernel/amd_nb.c	2025-05-29 13:00:02.191707970 +0200
@@ -80,9 +82,13 @@
 		 * If not, then uninitialize everything.
 		 */
 		if (!node_to_amd_nb(i)->misc) {
-			amd_northbridges.num = 0;
-			kfree(nb);
-			return -ENODEV;
+			if (i == 0) {
+				kfree(nb);
+				return -ENODEV;
+			}
+			pr_info("next amd_northbridge not found, limiting to: %d\n", i);
+			amd_northbridges.num = i;
+			break;
 		}
 
 		node_to_amd_nb(i)->link = amd_node_get_func(i, 4);

-- 
  René Rebe, ExactCODE GmbH, Berlin, Germany
  https://exactco.de | https://t2linux.com | https://rene.rebe.de
Re: [PATCH] x86/amd_nb: fix NULL deref in amd64_agp
Posted by Borislav Petkov 2 weeks ago
On Mon, Nov 17, 2025 at 08:12:13PM +0100, René Rebe wrote:
> bc7b2e629e0c ("x86/amd_nb: Use topology info to get AMD node count")

Can you pls try the current tip/master branch from here:

https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git

and see if your issue is fixed too?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH] x86/amd_nb: fix NULL deref in amd64_agp
Posted by René Rebe 2 weeks ago
On Mon, 17 Nov 2025 20:31:29 +0100,
Borislav Petkov <bp@alien8.de> wrote:

> On Mon, Nov 17, 2025 at 08:12:13PM +0100, René Rebe wrote:
> > bc7b2e629e0c ("x86/amd_nb: Use topology info to get AMD node count")
> 
> Can you pls try the current tip/master branch from here:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
> 
> and see if your issue is fixed too?

You say this because the patch had an old date? We are still shipping
the patch in production, applied on top of 6.17.x.

So you say I should double check tip w/o the patch still oopses?

   René

-- 
  René Rebe, ExactCODE GmbH, Berlin, Germany
  https://exactco.de | https://t2linux.com | https://rene.rebe.de
Re: [PATCH] x86/amd_nb: fix NULL deref in amd64_agp
Posted by Borislav Petkov 2 weeks ago
On Mon, Nov 17, 2025 at 08:39:43PM +0100, René Rebe wrote:
> So you say I should double check tip w/o the patch still oopses?

We have another fix there:

https://git.kernel.org/tip/845ed7e04d9ae0146d5e003a5defd90eb95535fc

which should fix your issue too, I believe.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH] x86/amd_nb: fix NULL deref in amd64_agp
Posted by René Rebe 2 weeks ago
On Mon, 17 Nov 2025 20:42:11 +0100, Borislav Petkov <bp@alien8.de> wrote:

> On Mon, Nov 17, 2025 at 08:39:43PM +0100, René Rebe wrote:
> > So you say I should double check tip w/o the patch still oopses?
> 
> We have another fix there:
> 
> https://git.kernel.org/tip/845ed7e04d9ae0146d5e003a5defd90eb95535fc
> 
> which should fix your issue too, I believe.

Interesting, it appears that might have fixed it, too.

Will report if this appears again.

Thanks!

	René

-- 
  René Rebe, ExactCODE GmbH, Berlin, Germany
  https://exactco.de | https://t2linux.com | https://rene.rebe.de
Re: [PATCH] x86/amd_nb: fix NULL deref in amd64_agp
Posted by Borislav Petkov 2 weeks ago
On Mon, Nov 17, 2025 at 09:47:24PM +0100, René Rebe wrote:
> Interesting, it appears that might have fixed it, too.
> 
> Will report if this appears again.

Good, thanks for testing!

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette