[PATCH v2] x86/kaslr: Revisit entropy when CONFIG_PCI_P2PDMA is enabled

Balbir Singh posted 1 patch 10 months, 1 week ago
arch/x86/mm/kaslr.c | 10 ++++++++--
drivers/pci/Kconfig |  6 ++++++
2 files changed, 14 insertions(+), 2 deletions(-)
Posted by Balbir Singh 10 months, 1 week ago
When CONFIG_PCI_P2PDMA is enabled, the P2PDMA code maps the BAR's
PFNs via a ZONE_DEVICE mapping using devm_memremap_pages(). The
mapped virtual address range starts at the direct map address of
pci_resource_start() for the BAR and spans the BAR length.

When KASLR is enabled, the direct map range of the kernel is
reduced to the size of physical memory plus additional padding.
If the BAR address is beyond this limit, PCI peer-to-peer DMA
mappings fail.

Fix this by not shrinking the size of the direct map when
CONFIG_PCI_P2PDMA is enabled. This reduces the total available
entropy, but it's better than the current workaround of having to
disable KASLR completely.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/

Signed-off-by: Balbir Singh <balbirs@nvidia.com>
---
Changelog v2
  - Add information about entropy drop when PCI_P2PDMA is
    selected

Testing:

  commit 0483e1fa6e09d ("x86/mm: Implement ASLR for kernel memory regions")
  mentions that the following problems need to be addressed:

  1 The three target memory sections are never at the same place between
    boots.
  2 The physical memory mapping can use a virtual address not aligned on
    the PGD page table.
  3 Have good entropy early at boot before get_random_bytes is available.
  4 Add optional padding for memory hotplug compatibility.

  Ran an automated test to ensure that (1) holds true across several
  iterations of automated reboot testing. 2, 3 and 4 are not impacted
  by this patch.

  Manual Testing on a system where the problem reproduces
  
  1. With KASLR

     Hotplug memory [0x240000000000-0x242000000000] exceeds maximum addressable range [0x0-0xaffffffffff]
     ------------[ cut here ]------------
  2. With the fixes

     added peer-to-peer DMA memory 0x240000000000-0x241fffffffff

     KASLR is still enabled as seen by kaslr_offset() (difference
     between __START_KERNEL and _stext)
  3. Without the fixes and KASLR disabled


 arch/x86/mm/kaslr.c | 10 ++++++++--
 drivers/pci/Kconfig |  6 ++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 11a93542d198..3c306de52fd4 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -113,8 +113,14 @@ void __init kernel_randomize_memory(void)
 	memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
 		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
-	/* Adapt physical memory region size based on available memory */
-	if (memory_tb < kaslr_regions[0].size_tb)
+	/*
+	 * Adapt physical memory region size based on available memory,
+	 * except when CONFIG_PCI_P2PDMA is enabled. P2PDMA exposes the
+	 * device BAR space assuming the direct map space is large enough
+	 * for creating a ZONE_DEVICE mapping in the direct map corresponding
+	 * to the physical BAR address.
+	 */
+	if (!IS_ENABLED(CONFIG_PCI_P2PDMA) && (memory_tb < kaslr_regions[0].size_tb))
 		kaslr_regions[0].size_tb = memory_tb;
 
 	/*
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 2fbd379923fd..5c3054aaec8c 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -203,6 +203,12 @@ config PCI_P2PDMA
 	  P2P DMA transactions must be between devices behind the same root
 	  port.
 
+	  Enabling this option will reduce the entropy of x86 KASLR memory
+	  regions. For example - on a 46 bit system, the entropy goes down
+	  from 16 bits to 15 bits. The actual reduction in entropy depends
+	  on the physical address bits, on processor features, kernel config
+	  (5 level page table) and physical memory present on the system.
+
 	  If unsure, say N.
 
 config PCI_LABEL
-- 
2.48.1
Re: [PATCH v2] x86/kaslr: Revisit entropy when CONFIG_PCI_P2PDMA is enabled
Posted by Bjorn Helgaas 9 months, 3 weeks ago
On Fri, Feb 07, 2025 at 10:42:34AM +1100, Balbir Singh wrote:
> When CONFIG_PCI_P2PDMA is enabled, it maps the PFN's via a
> ZONE_DEVICE mapping using devm_memremap_pages(). The mapped
> virtual address range corresponds to the pci_resource_start()
> of the BAR address and size corresponding to the BAR length.
> 
> When KASLR is enabled, the direct map range of the kernel is
> reduced to the size of physical memory plus additional padding.
> If the BAR address is beyond this limit, PCI peer to peer DMA
> mappings fail.
> 
> Fix this by not shrinking the size of direct map when CONFIG_PCI_P2PDMA
> is enabled. This reduces the total available entropy, but it's
> better than the current work around of having to disable KASLR
> completely.
> 
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Kees Cook <kees@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/
> 
> Signed-off-by: Balbir Singh <balbirs@nvidia.com>

Acked-by: Bjorn Helgaas <bhelgaas@google.com>	# drivers/pci/Kconfig

Re: [PATCH v2] x86/kaslr: Revisit entropy when CONFIG_PCI_P2PDMA is enabled
Posted by Kees Cook 10 months, 1 week ago
On Fri, Feb 07, 2025 at 10:42:34AM +1100, Balbir Singh wrote:
> When CONFIG_PCI_P2PDMA is enabled, it maps the PFN's via a
> ZONE_DEVICE mapping using devm_memremap_pages(). The mapped
> virtual address range corresponds to the pci_resource_start()
> of the BAR address and size corresponding to the BAR length.
> 
> When KASLR is enabled, the direct map range of the kernel is
> reduced to the size of physical memory plus additional padding.
> If the BAR address is beyond this limit, PCI peer to peer DMA
> mappings fail.
> 
> Fix this by not shrinking the size of direct map when CONFIG_PCI_P2PDMA
> is enabled. This reduces the total available entropy, but it's
> better than the current work around of having to disable KASLR
> completely.
> 
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Kees Cook <kees@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/
> 
> Signed-off-by: Balbir Singh <balbirs@nvidia.com>

Thanks for the update!

Reviewed-by: Kees Cook <kees@kernel.org>

-- 
Kees Cook
Re: [PATCH v2] x86/kaslr: Revisit entropy when CONFIG_PCI_P2PDMA is enabled
Posted by Balbir Singh 9 months, 4 weeks ago
On 2/7/25 12:00, Kees Cook wrote:
> On Fri, Feb 07, 2025 at 10:42:34AM +1100, Balbir Singh wrote:
>> When CONFIG_PCI_P2PDMA is enabled, it maps the PFN's via a
>> ZONE_DEVICE mapping using devm_memremap_pages(). The mapped
>> virtual address range corresponds to the pci_resource_start()
>> of the BAR address and size corresponding to the BAR length.
>>
>> When KASLR is enabled, the direct map range of the kernel is
>> reduced to the size of physical memory plus additional padding.
>> If the BAR address is beyond this limit, PCI peer to peer DMA
>> mappings fail.
>>
>> Fix this by not shrinking the size of direct map when CONFIG_PCI_P2PDMA
>> is enabled. This reduces the total available entropy, but it's
>> better than the current work around of having to disable KASLR
>> completely.
>>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: Andy Lutomirski <luto@kernel.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: Kees Cook <kees@kernel.org>
>> Cc: Bjorn Helgaas <bhelgaas@google.com>
>> Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/
>>
>> Signed-off-by: Balbir Singh <balbirs@nvidia.com>
> 
> Thanks for the update!
> 
> Reviewed-by: Kees Cook <kees@kernel.org>
> 

Thank you Kees! 

I wanted to request that we pick this up in linux-next via the relevant
subtree for further testing.

Balbir Singh
Re: [PATCH v2] x86/kaslr: Revisit entropy when CONFIG_PCI_P2PDMA is enabled
Posted by Dave Hansen 10 months, 1 week ago
On 2/6/25 15:42, Balbir Singh wrote:
> Fix this by not shrinking the size of direct map when CONFIG_PCI_P2PDMA
> is enabled. This reduces the total available entropy, but it's
> better than the current work around of having to disable KASLR
> completely.

Is the size of these P2PDMA mappings known up front? Or do you just need
them to be as large as possible?
Re: [PATCH v2] x86/kaslr: Revisit entropy when CONFIG_PCI_P2PDMA is enabled
Posted by Balbir Singh 10 months, 1 week ago
On 2/7/25 11:27, Dave Hansen wrote:
> On 2/6/25 15:42, Balbir Singh wrote:
>> Fix this by not shrinking the size of direct map when CONFIG_PCI_P2PDMA
>> is enabled. This reduces the total available entropy, but it's
>> better than the current work around of having to disable KASLR
>> completely.
> 
> Is the size of these P2PDMA mappings known up front? Or do you just need
> them to be as large as possible?

The size is not known up front; it depends on the system configuration.
Yes, we need them to be as large as possible.

Balbir
[tip: x86/mm] x86/kaslr: Reduce KASLR entropy on most x86 systems
Posted by tip-bot2 for Balbir Singh 9 months, 3 weeks ago
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     7ffb791423c7c518269a9aad35039ef824a40adb
Gitweb:        https://git.kernel.org/tip/7ffb791423c7c518269a9aad35039ef824a40adb
Author:        Balbir Singh <balbirs@nvidia.com>
AuthorDate:    Fri, 07 Feb 2025 10:42:34 +11:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Sat, 22 Feb 2025 12:25:57 +01:00

x86/kaslr: Reduce KASLR entropy on most x86 systems

When CONFIG_PCI_P2PDMA=y (which is basically enabled on all
large x86 distros), the P2PDMA code maps the BAR's PFNs via a
ZONE_DEVICE mapping using devm_memremap_pages(). The mapped
virtual address range starts at the direct map address of
pci_resource_start() for the BAR and spans the BAR length.

When KASLR is enabled, the direct map range of the kernel is
reduced to the size of physical memory plus additional padding.
If the BAR address is beyond this limit, PCI peer-to-peer DMA
mappings fail.

Fix this by not shrinking the size of the direct map when
CONFIG_PCI_P2PDMA=y.

This reduces the total available entropy, but it's better than
the current workaround of having to disable KASLR completely.

[ mingo: Clarified the changelog to point out the broad impact ... ]

Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/Kconfig
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/
Link: https://lore.kernel.org/r/20250206234234.1912585-1-balbirs@nvidia.com
---
 arch/x86/mm/kaslr.c | 10 ++++++++--
 drivers/pci/Kconfig |  6 ++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 11a9354..3c306de 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -113,8 +113,14 @@ void __init kernel_randomize_memory(void)
 	memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
 		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
-	/* Adapt physical memory region size based on available memory */
-	if (memory_tb < kaslr_regions[0].size_tb)
+	/*
+	 * Adapt physical memory region size based on available memory,
+	 * except when CONFIG_PCI_P2PDMA is enabled. P2PDMA exposes the
+	 * device BAR space assuming the direct map space is large enough
+	 * for creating a ZONE_DEVICE mapping in the direct map corresponding
+	 * to the physical BAR address.
+	 */
+	if (!IS_ENABLED(CONFIG_PCI_P2PDMA) && (memory_tb < kaslr_regions[0].size_tb))
 		kaslr_regions[0].size_tb = memory_tb;
 
 	/*
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 2fbd379..5c3054a 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -203,6 +203,12 @@ config PCI_P2PDMA
 	  P2P DMA transactions must be between devices behind the same root
 	  port.
 
+	  Enabling this option will reduce the entropy of x86 KASLR memory
+	  regions. For example - on a 46 bit system, the entropy goes down
+	  from 16 bits to 15 bits. The actual reduction in entropy depends
+	  on the physical address bits, on processor features, kernel config
+	  (5 level page table) and physical memory present on the system.
+
 	  If unsure, say N.
 
 config PCI_LABEL
[tip: x86/mm] x86/kaslr: Reduce KASLR entropy on most x86 systems
Posted by tip-bot2 for Balbir Singh 9 months, 3 weeks ago
The following commit has been merged into the x86/mm branch of tip:

Commit-ID:     4816f6361fffb172c04e702c9af3f8aa80962cea
Gitweb:        https://git.kernel.org/tip/4816f6361fffb172c04e702c9af3f8aa80962cea
Author:        Balbir Singh <balbirs@nvidia.com>
AuthorDate:    Fri, 07 Feb 2025 10:42:34 +11:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 21 Feb 2025 17:08:26 +01:00

x86/kaslr: Reduce KASLR entropy on most x86 systems

When CONFIG_PCI_P2PDMA=y (which is basically enabled on all
large x86 distros), the P2PDMA code maps the BAR's PFNs via a
ZONE_DEVICE mapping using devm_memremap_pages(). The mapped
virtual address range starts at the direct map address of
pci_resource_start() for the BAR and spans the BAR length.

When KASLR is enabled, the direct map range of the kernel is
reduced to the size of physical memory plus additional padding.
If the BAR address is beyond this limit, PCI peer-to-peer DMA
mappings fail.

Fix this by not shrinking the size of the direct map when
CONFIG_PCI_P2PDMA=y.

This reduces the total available entropy, but it's better than
the current workaround of having to disable KASLR completely.

[ mingo: Clarified the changelog to point out the broad impact ... ]

Signed-off-by: Balbir Singh <balbirs@nvidia.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/
Link: https://lore.kernel.org/r/20250206234234.1912585-1-balbirs@nvidia.com
---
 arch/x86/mm/kaslr.c | 10 ++++++++--
 drivers/pci/Kconfig |  6 ++++++
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 11a9354..3c306de 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -113,8 +113,14 @@ void __init kernel_randomize_memory(void)
 	memory_tb = DIV_ROUND_UP(max_pfn << PAGE_SHIFT, 1UL << TB_SHIFT) +
 		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
-	/* Adapt physical memory region size based on available memory */
-	if (memory_tb < kaslr_regions[0].size_tb)
+	/*
+	 * Adapt physical memory region size based on available memory,
+	 * except when CONFIG_PCI_P2PDMA is enabled. P2PDMA exposes the
+	 * device BAR space assuming the direct map space is large enough
+	 * for creating a ZONE_DEVICE mapping in the direct map corresponding
+	 * to the physical BAR address.
+	 */
+	if (!IS_ENABLED(CONFIG_PCI_P2PDMA) && (memory_tb < kaslr_regions[0].size_tb))
 		kaslr_regions[0].size_tb = memory_tb;
 
 	/*
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 2fbd379..5c3054a 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -203,6 +203,12 @@ config PCI_P2PDMA
 	  P2P DMA transactions must be between devices behind the same root
 	  port.
 
+	  Enabling this option will reduce the entropy of x86 KASLR memory
+	  regions. For example - on a 46 bit system, the entropy goes down
+	  from 16 bits to 15 bits. The actual reduction in entropy depends
+	  on the physical address bits, on processor features, kernel config
+	  (5 level page table) and physical memory present on the system.
+
 	  If unsure, say N.
 
 config PCI_LABEL