[PATCH] perf/x86/intel/pt: Update topa_entry base len to support 52-bit physical addresses

Marco Cavenati posted 1 patch 1 year, 7 months ago
arch/x86/events/intel/pt.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Posted by Marco Cavenati 1 year, 7 months ago
Increase topa_entry base to 40 bits to accommodate page addresses in
systems with 52-bit physical addresses.
The Base Physical Address field (base) has a length of MAXPHYADDR - 12 as
stated in Intel's SDM chapter 33.2.7.2.
The maximum MAXPHYADDR is 52 as stated in SDM 4.1.4.
Therefore, the maximum base bit length is 40.

Signed-off-by: Marco Cavenati <cavenati.marco@gmail.com>
---
 arch/x86/events/intel/pt.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
index 96906a62aacd..f5e46c04c145 100644
--- a/arch/x86/events/intel/pt.h
+++ b/arch/x86/events/intel/pt.h
@@ -33,8 +33,8 @@ struct topa_entry {
 	u64	rsvd2	: 1;
 	u64	size	: 4;
 	u64	rsvd3	: 2;
-	u64	base	: 36;
-	u64	rsvd4	: 16;
+	u64	base	: 40;
+	u64	rsvd4	: 12;
 };
 
 /* TSC to Core Crystal Clock Ratio */
-- 
2.39.2
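
The width arithmetic in the commit message can be checked with a small standalone sketch. This is an illustration only, not the kernel's definition: the struct name `topa_entry_sketch`, the `low_bits` placeholder (standing in for the five bits that precede `rsvd2` and are not visible in the diff), and the helpers `pfn_bits()` and `pfn_fits()` are all hypothetical. It assumes a compiler that accepts 64-bit bit-field types, as GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>

#define TOPA_SHIFT 12	/* base holds a 4 KiB page frame number */

/*
 * Simplified stand-in for struct topa_entry. Only rsvd2, size, rsvd3,
 * base and rsvd4 appear in the diff; low_bits is a hypothetical
 * placeholder for the five bits that come before them.
 */
struct topa_entry_sketch {
	uint64_t low_bits : 5;
	uint64_t rsvd2    : 1;
	uint64_t size     : 4;
	uint64_t rsvd3    : 2;
	uint64_t base     : 40;
	uint64_t rsvd4    : 12;
};

/* Widening base must be paid for by shrinking rsvd4: 64 bits in total. */
_Static_assert(sizeof(struct topa_entry_sketch) == 8,
	       "entry must stay 64 bits wide");

/* Bits needed to hold the largest pfn for a given MAXPHYADDR. */
static unsigned int pfn_bits(unsigned int maxphyaddr)
{
	return maxphyaddr - TOPA_SHIFT;
}

/* Nonzero if pfn can be stored in a field of field_bits bits unchanged. */
static int pfn_fits(uint64_t pfn, unsigned int field_bits)
{
	uint64_t mask = (field_bits >= 64) ? ~0ULL : ((1ULL << field_bits) - 1);

	return (pfn & mask) == pfn;
}
```

With MAXPHYADDR = 52, `pfn_bits(52)` is 40, and the highest pfn, `((1ULL << 52) - 1) >> TOPA_SHIFT`, passes `pfn_fits(pfn, 40)` but fails `pfn_fits(pfn, 36)`, which is the 4-bit shortfall the patch addresses.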
Re: [PATCH] perf/x86/intel/pt: Update topa_entry base len to support 52-bit physical addresses
Posted by Dave Hansen 1 year, 7 months ago
On 6/18/24 04:06, Marco Cavenati wrote:
> Increase topa_entry base to 40 bits to accommodate page addresses in
> systems with 52-bit physical addresses.
> The Base Physical Address field (base) has a length of MAXPHYADDR - 12 as
> stated in Intel's SDM chapter 33.2.7.2.
> The maximum MAXPHYADDR is 52 as stated in SDM 4.1.4.
> Therefore, the maximum base bit length is 40.

This makes it sound like it's _adding_ support for larger physical
addresses.  It really was a bug from day one.  MAXPHYADDR has been
defined to be "at most 52" for a long, long time.  I think it was well
before 5-level paging came on the scene and actual MAXPHYADDR=52 systems
came along.

It probably needs to say something more along the lines of:

	topa_entry->base needs to store a pfn.  It obviously needs to be
	large enough to store the largest possible x86 pfn which is
	MAXPHYADDR-PAGE_SHIFT (52-12).  So it is 4 bits too small.

This isn't the only bug in the area:

> static void *pt_buffer_region(struct pt_buffer *buf)
> {
>         return phys_to_virt(TOPA_ENTRY(buf->cur, buf->cur_idx)->base << TOPA_SHIFT);
> }

At this point, ->base is still a 40-bit (or 36-bit before this patch)
type.  If it has anything in the high 12 bits, a <<TOPA_SHIFT will just
lose those bits.

But maybe I'm reading it wrong.  If I'm right, this malfunctions at pfns
over 36-12=24 bits, or 64GB of RAM.  Is it possible nobody has ever
allocated a 'struct pt_buffer' over 64GB?  Or is this somehow tolerant
of reading garbage?
Re: [PATCH] perf/x86/intel/pt: Update topa_entry base len to support 52-bit physical addresses
Posted by Adrian Hunter 1 year, 7 months ago
On 18/06/24 20:59, Dave Hansen wrote:
> On 6/18/24 04:06, Marco Cavenati wrote:
>> Increase topa_entry base to 40 bits to accommodate page addresses in
>> systems with 52-bit physical addresses.
>> The Base Physical Address field (base) has a length of MAXPHYADDR - 12 as
>> stated in Intel's SDM chapter 33.2.7.2.
>> The maximum MAXPHYADDR is 52 as stated in SDM 4.1.4.
>> Therefore, the maximum base bit length is 40.
> 
> This makes it sound like it's _adding_ support for larger physical
> addresses.  It really was a bug from day one.  MAXPHYADDR has been
> defined to be "at most 52" for a long, long time.  I think it was well
> before 5-level paging came on the scene and actual MAXPHYADDR=52 systems
> came along.
> 
> It probably needs to say something more along the lines of:
> 
> 	topa_entry->base needs to store a pfn.  It obviously needs to be
> 	large enough to store the largest possible x86 pfn which is
> 	MAXPHYADDR-PAGE_SHIFT (52-12).  So it is 4 bits too small.
> 
> This isn't the only bug in the area:
> 
>> static void *pt_buffer_region(struct pt_buffer *buf)
>> {
>>         return phys_to_virt(TOPA_ENTRY(buf->cur, buf->cur_idx)->base << TOPA_SHIFT);
>> }
> 
> At this point, ->base is still a 40-bit (or 36-bit before this patch)
> type.  If it has anything in the high 12 bits, a <<TOPA_SHIFT will just
> lose those bits.

Yes

> 
> But maybe I'm reading it wrong.  If I'm right, this malfunctions at pfns
> over 36-12=24 bits, or 64GB of RAM.  Is it possible nobody has ever
> allocated a 'struct pt_buffer' over 64GB?  Or is this somehow tolerant
> of reading garbage?

Yes, it will go wrong with any physical address above 64GB - 1.
i.e. the machine just needs more than 64GB of memory.

However that code is used only in one place which is conditional on
!intel_pt_validate_hw_cap(PT_CAP_topa_multiple_entries) which is true
only for Broadwell.  Also "snapshot" (and sampling) modes are
unaffected.

Testing on a Broadwell with 400GB of memory confirmed the issue.
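
One way to sidestep the truncation discussed above is to widen the bit-field to a full 64-bit value before shifting. The following is a minimal sketch under stated assumptions, not the fix that was applied to the driver: `entry_sketch`, `entry_phys_addr()` and `addr_limit()` are hypothetical names, and the struct is reduced to the one field that matters here. Whether the unwidened expression `e->base << TOPA_SHIFT` actually truncates depends on how the compiler types a bit-field wider than int, so the sketch only relies on the widened form.

```c
#include <assert.h>
#include <stdint.h>

#define TOPA_SHIFT 12

/* Hypothetical reduced entry: only the field under discussion matters. */
struct entry_sketch {
	uint64_t low  : 12;
	uint64_t base : 40;
	uint64_t rsvd : 12;
};

/*
 * Convert the bit-field to a full 64-bit value *before* shifting, so the
 * shift cannot be evaluated at the bit-field's own width.
 */
static uint64_t entry_phys_addr(const struct entry_sketch *e)
{
	return (uint64_t)e->base << TOPA_SHIFT;
}

/*
 * First byte address that is no longer representable when only kept_bits
 * bits of the shifted result survive. With a 36-bit field this is
 * 2^36 bytes, i.e. the 64GB boundary mentioned in the thread.
 */
static uint64_t addr_limit(unsigned int kept_bits)
{
	return 1ULL << kept_bits;
}
```

For example, an entry whose `base` is `0x12345678AB` yields `0x12345678AB000` from `entry_phys_addr()`, and `addr_limit(36)` equals `64ULL << 30`.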
Re: [PATCH] perf/x86/intel/pt: Update topa_entry base len to support 52-bit physical addresses
Posted by Marco 1 year, 7 months ago
Hi Dave,

> On 18 Jun 2024, at 19:59, Dave Hansen <dave.hansen@intel.com> wrote:
> 
> On 6/18/24 04:06, Marco Cavenati wrote:
>> Increase topa_entry base to 40 bits to accommodate page addresses in
>> systems with 52-bit physical addresses.
>> The Base Physical Address field (base) has a length of MAXPHYADDR - 12 as
>> stated in Intel's SDM chapter 33.2.7.2.
>> The maximum MAXPHYADDR is 52 as stated in SDM 4.1.4.
>> Therefore, the maximum base bit length is 40.
> 
> This makes it sound like it's _adding_ support for larger physical
> addresses.  It really was a bug from day one.  MAXPHYADDR has been
> defined to be "at most 52" for a long, long time.  I think it was well
> before 5-level paging came on the scene and actual MAXPHYADDR=52 systems
> came along.

Thank you for pointing this out, I wasn't sure about the history of
MAXPHYADDR.

> It probably needs to say something more along the lines of:
> 
> topa_entry->base needs to store a pfn.  It obviously needs to be
> large enough to store the largest possible x86 pfn which is
> MAXPHYADDR-PAGE_SHIFT (52-12).  So it is 4 bits too small.
> 
> This isn't the only bug in the area:
> 
>> static void *pt_buffer_region(struct pt_buffer *buf)
>> {
>>        return phys_to_virt(TOPA_ENTRY(buf->cur, buf->cur_idx)->base << TOPA_SHIFT);
>> }
> 
> At this point, ->base is still a 40-bit (or 36-bit before this patch)
> type.  If it has anything in the high 12 bits, a <<TOPA_SHIFT will just
> lose those bits.
> 
> But maybe I'm reading it wrong.  If I'm right, this malfunctions at pfns
> over 36-12=24 bits, or 64GB of RAM.  Is it possible nobody has ever
> allocated a 'struct pt_buffer' over 64GB?  Or is this somehow tolerant
> of reading garbage?

I might be wrong but I don't think this is the case, integral promotion
should make this work fine with no truncation.

Regards,
Marco
Re: [PATCH] perf/x86/intel/pt: Update topa_entry base len to support 52-bit physical addresses
Posted by Adrian Hunter 1 year, 7 months ago
On 18/06/24 14:06, Marco Cavenati wrote:
> Increase topa_entry base to 40 bits to accommodate page addresses in
> systems with 52-bit physical addresses.
> The Base Physical Address field (base) has a length of MAXPHYADDR - 12 as
> stated in Intel's SDM chapter 33.2.7.2.
> The maximum MAXPHYADDR is 52 as stated in SDM 4.1.4.
> Therefore, the maximum base bit length is 40.
> 
> Signed-off-by: Marco Cavenati <cavenati.marco@gmail.com>

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>

Getting 'base' physical address wrong would presumably
be bad, so:

Fixes: 52ca9ced3f70 ("perf/x86/intel/pt: Add Intel PT PMU driver")
Cc: stable@vger.kernel.org

> ---
>  arch/x86/events/intel/pt.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h
> index 96906a62aacd..f5e46c04c145 100644
> --- a/arch/x86/events/intel/pt.h
> +++ b/arch/x86/events/intel/pt.h
> @@ -33,8 +33,8 @@ struct topa_entry {
>  	u64	rsvd2	: 1;
>  	u64	size	: 4;
>  	u64	rsvd3	: 2;
> -	u64	base	: 36;
> -	u64	rsvd4	: 16;
> +	u64	base	: 40;
> +	u64	rsvd4	: 12;
>  };
>  
>  /* TSC to Core Crystal Clock Ratio */