The memory hot-plug and resource management code needs to know the
largest address which can fit in the linear map, so set
PHYSMEM_END for that purpose.
This fixes a crash[1] at boot when amdgpu tries to create
DEVICE_PRIVATE_MEMORY and is given a physical address by the
resource management code which is outside the range which can have
a `struct page`.
The Fixes: commit listed below isn't actually broken, but the
reorganization of vmemmap causes the improper DEVICE_PRIVATE_MEMORY address
to go from a warning to a crash.
[1]: Unable to handle kernel paging request at virtual address
000001ffa6000034
Mem abort info:
ESR = 0x0000000096000044
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
CM = 0, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000008000287c000
[000001ffa6000034] pgd=0000000000000000, p4d=0000000000000000
Call trace:
__init_zone_device_page.constprop.0+0x2c/0xa8
memmap_init_zone_device+0xf0/0x210
pagemap_range+0x1e0/0x410
memremap_pages+0x18c/0x2e0
devm_memremap_pages+0x30/0x90
kgd2kfd_init_zone_device+0xf0/0x200 [amdgpu]
amdgpu_device_ip_init+0x674/0x888 [amdgpu]
amdgpu_device_init+0x7a4/0xea0 [amdgpu]
amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu]
amdgpu_pci_probe+0x1a0/0x560 [amdgpu]
local_pci_probe+0x48/0xb8
work_for_cpu_fn+0x24/0x40
process_one_work+0x170/0x3e0
worker_thread+0x2ac/0x3e0
kthread+0xf4/0x108
ret_from_fork+0x10/0x20
Fixes: 32697ff38287 ("arm64: vmemmap: Avoid base2 order of struct page size to dimension region")
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Cc: stable@vger.kernel.org
---
Link to v2: https://lore.kernel.org/all/20240709002757.2431399-1-scott@os.amperecomputing.com/
Changes since v2:
- Change approach again to defining the newly created PHYSMEM_END in
arch/arm64/include/asm/memory.h
Link to v1: https://lore.kernel.org/all/20240703210707.1986816-1-scott@os.amperecomputing.com/
Changes since v1:
- Change from fiddling the architecture's MAX_PHYSMEM_BITS to checking
arch_get_mappable_range().
arch/arm64/include/asm/memory.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 54fb014eba05..0480c61dbb4f 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -110,6 +110,8 @@
 #define PAGE_END (_PAGE_END(VA_BITS_MIN))
 #endif /* CONFIG_KASAN */
 
+#define PHYSMEM_END __pa(PAGE_END - 1)
+
 #define MIN_THREAD_SHIFT (14 + KASAN_THREAD_SHIFT)
 
 /*
--
2.46.0
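For readers following the arithmetic behind `__pa(PAGE_END - 1)`, here is a rough userspace sketch of how the last linear-map virtual address might translate to a physical limit. It is not the kernel code. It assumes a fixed 48-bit VA configuration without KASAN, and it models `PHYS_OFFSET` as a plain parameter, whereas the kernel determines it at boot. The helpers `lm_to_phys()`, `physmem_end()` and `range_fits_linear_map()` are hypothetical names introduced only for this illustration.

```c
#include <stdint.h>

/* Assumption: 48-bit kernel VAs, no KASAN. Real kernels derive this from Kconfig. */
#define VA_BITS 48

/* Start of the linear map: the bottom of the kernel half of the VA space. */
#define PAGE_OFFSET (-(UINT64_C(1) << VA_BITS))

/* End of the linear map in this model: halfway through the kernel VA space. */
#define PAGE_END (-(UINT64_C(1) << (VA_BITS - 1)))

/*
 * Hypothetical stand-in for the linear-map half of __pa(): strip the
 * PAGE_OFFSET bits and add the physical address where RAM is assumed to start.
 */
static uint64_t lm_to_phys(uint64_t va, uint64_t phys_offset)
{
	return (va & ~PAGE_OFFSET) + phys_offset;
}

/* Model of PHYSMEM_END: physical address of the last byte in the linear map. */
static uint64_t physmem_end(uint64_t phys_offset)
{
	return lm_to_phys(PAGE_END - 1, phys_offset);
}

/*
 * Hypothetical check: does [start, start + size) lie entirely inside the
 * physical range covered by the linear map? Written to avoid overflow in
 * start + size.
 */
static int range_fits_linear_map(uint64_t start, uint64_t size,
				 uint64_t phys_offset)
{
	uint64_t end = physmem_end(phys_offset);

	if (size == 0 || start < phys_offset || start > end)
		return 0;
	return size - 1 <= end - start;
}
```

Under these assumptions the linear map spans 2^47 bytes, so with RAM starting at 0x80000000 the modelled limit is 0x80007fffffff. A region allocator that clamps its candidates to this value, rather than to a limit derived from `MAX_PHYSMEM_BITS`, would not hand out an address beyond the range the linear map covers.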
On Tue, 03 Sep 2024 09:45:32 -0700, D Scott Phillips wrote:
> The memory hot-plug and resource management code needs to know the
> largest address which can fit in the linear map, so set
> PHYSMEM_END for that purpose.
>
> This fixes a crash[1] at boot when amdgpu tries to create
> DEVICE_PRIVATE_MEMORY and is given a physical address by the
> resource management code which is outside the range which can have
> a `struct page`
>
> [...]
Applied to arm64 (for-next/mm), thanks!
I dropped the cc: stable, however, as PHYSMEM_END looks like it only
exists in linux-next.
[1/1] arm64: Expose the end of the linear map in PHYSMEM_END
https://git.kernel.org/arm64/c/eeb8fdfcf090
Cheers,
--
Will
https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
On Tue, Sep 03, 2024 at 09:45:32AM -0700, D Scott Phillips wrote:
> The memory hot-plug and resource management code needs to know the
> largest address which can fit in the linear map, so set
> PHYSMEM_END for that purpose.
>
> This fixes a crash[1] at boot when amdgpu tries to create
> DEVICE_PRIVATE_MEMORY and is given a physical address by the
> resource management code which is outside the range which can have
> a `struct page`
>
> The Fixes: commit listed below isn't actually broken, but the
> reorganization of vmemmap causes the improper DEVICE_PRIVATE_MEMORY address
> to go from a warning to a crash.
>
> [1]: Unable to handle kernel paging request at virtual address
No need to have [1]: prefix here and also read this
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages
and amend commit message accordingly.
> 000001ffa6000034
> Mem abort info:
> ESR = 0x0000000096000044
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x04: level 0 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
> CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=000008000287c000
> [000001ffa6000034] pgd=0000000000000000, p4d=0000000000000000
> Call trace:
> __init_zone_device_page.constprop.0+0x2c/0xa8
> memmap_init_zone_device+0xf0/0x210
> pagemap_range+0x1e0/0x410
> memremap_pages+0x18c/0x2e0
> devm_memremap_pages+0x30/0x90
> kgd2kfd_init_zone_device+0xf0/0x200 [amdgpu]
> amdgpu_device_ip_init+0x674/0x888 [amdgpu]
> amdgpu_device_init+0x7a4/0xea0 [amdgpu]
> amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu]
> amdgpu_pci_probe+0x1a0/0x560 [amdgpu]
> local_pci_probe+0x48/0xb8
> work_for_cpu_fn+0x24/0x40
> process_one_work+0x170/0x3e0
> worker_thread+0x2ac/0x3e0
> kthread+0xf4/0x108
> ret_from_fork+0x10/0x20
--
With Best Regards,
Andy Shevchenko