[PATCH] perf/x86/intel/pt: Fix sampling using single range output

Adrian Hunter posted 1 patch 3 years, 5 months ago
There is a newer version of this series
arch/x86/events/intel/pt.c | 9 +++++++++
1 file changed, 9 insertions(+)
[PATCH] perf/x86/intel/pt: Fix sampling using single range output
Posted by Adrian Hunter 3 years, 5 months ago
Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
Data When Configured With Single Range Output Larger Than 4KB" by
disabling single range output whenever larger than 4KB.

Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/events/intel/pt.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 82ef87e9a897..42a55794004a 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
 	if (1 << order != nr_pages)
 		goto out;
 
+	/*
+	 * Some processors cannot always support single range for more than
+	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
+	 * also be affected, so for now rather than trying to keep track of
+	 * which ones, just disable it for all.
+	 */
+	if (nr_pages > 1)
+		goto out;
+
 	buf->single = true;
 	buf->nr_pages = nr_pages;
 	ret = 0;
-- 
2.34.1
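
The effect of the added check can be sketched as a small userspace model. This is only an illustration, not the kernel code: the helper name single_range_allowed() is hypothetical, and the kernel's comparison against the backing page's allocation order is approximated here by a plain power-of-two test. It also assumes 4KB pages, so that one page corresponds to the 4KB limit named in the errata.

```c
#include <stdbool.h>

/*
 * Simplified model of the size checks in pt_buffer_try_single() after
 * this patch. Hypothetical helper for illustration only; the real
 * function derives "order" from the backing page, which is
 * approximated here by a power-of-two test (assumption).
 */
static bool single_range_allowed(int nr_pages)
{
	/* Models "1 << order != nr_pages": size must be a power of two. */
	if (nr_pages <= 0 || (nr_pages & (nr_pages - 1)) != 0)
		return false;

	/* Errata workaround: refuse anything larger than one page. */
	if (nr_pages > 1)
		return false;

	return true;
}
```

Under this model only a one-page buffer still selects single range output; every larger buffer falls back to the other output path.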
Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
Posted by Peter Zijlstra 3 years, 5 months ago
On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
> Data When Configured With Single Range Output Larger Than 4KB" by
> disabling single range output whenever larger than 4KB.
> 
> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
> Cc: stable@vger.kernel.org
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  arch/x86/events/intel/pt.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> index 82ef87e9a897..42a55794004a 100644
> --- a/arch/x86/events/intel/pt.c
> +++ b/arch/x86/events/intel/pt.c
> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>  	if (1 << order != nr_pages)
>  		goto out;
>  
> +	/*
> +	 * Some processors cannot always support single range for more than
> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
> +	 * also be affected, so for now rather than trying to keep track of
> +	 * which ones, just disable it for all.
> +	 */
> +	if (nr_pages > 1)
> +		goto out;

This effectively declares single-output-mode dead? Because I don't think
anybody uses PT with a single 4K buffer.
Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
Posted by Adrian Hunter 3 years, 5 months ago
On 14/11/22 12:51, Peter Zijlstra wrote:
> On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
>> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
>> Data When Configured With Single Range Output Larger Than 4KB" by
>> disabling single range output whenever larger than 4KB.
>>
>> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  arch/x86/events/intel/pt.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
>> index 82ef87e9a897..42a55794004a 100644
>> --- a/arch/x86/events/intel/pt.c
>> +++ b/arch/x86/events/intel/pt.c
>> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>>  	if (1 << order != nr_pages)
>>  		goto out;
>>  
>> +	/*
>> +	 * Some processors cannot always support single range for more than
>> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
>> +	 * also be affected, so for now rather than trying to keep track of
>> +	 * which ones, just disable it for all.
>> +	 */
>> +	if (nr_pages > 1)
>> +		goto out;
> 
> This effectively declares single-output-mode dead? Because I don't think
> anybody uses PT with a single 4K buffer.

4K is the default size for "sample mode", i.e. stuffing 4KB of Intel PT trace
data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX set.

e.g.

$ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
Linux
$ grep aux_sample_size err.txt
  aux_sample_size                  4096
$
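
For readers not using the perf tool, the attribute settings behind the command above might look roughly like the following. This is a hedged sketch, not what perf itself does line for line: the helper name init_cycles_aux_sample_attr() and the sample period are assumptions, and the sketch only fills in the attribute for the sampling event (cycles:u). Actually opening it requires a group led by the intel_pt event, whose type number has to be read from sysfs on the running system.

```c
#include <string.h>
#include <linux/perf_event.h>

/*
 * Fill in a perf_event_attr for a user-space cycles event that asks
 * for 4096 bytes of AUX (e.g. Intel PT) data in each sample.
 * Hypothetical helper; the event still has to be opened in a group
 * whose leader is the AUX-capable event.
 */
static void init_cycles_aux_sample_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->exclude_kernel = 1;        /* the ":u" modifier */
	attr->sample_period = 100000;    /* arbitrary value for illustration */
	attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
			    PERF_SAMPLE_AUX;
	attr->aux_sample_size = 4096;    /* matches the value printed above */
}
```

With a 4096-byte sample size, the sampling buffer is a single 4KB page on systems with 4KB pages, which is the case the patch leaves in single range output mode.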
Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
Posted by Peter Zijlstra 3 years, 4 months ago
On Mon, Nov 14, 2022 at 01:10:38PM +0200, Adrian Hunter wrote:
> On 14/11/22 12:51, Peter Zijlstra wrote:
> > On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
> >> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
> >> Data When Configured With Single Range Output Larger Than 4KB" by
> >> disabling single range output whenever larger than 4KB.
> >>
> >> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
> >> Cc: stable@vger.kernel.org
> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> >> ---
> >>  arch/x86/events/intel/pt.c | 9 +++++++++
> >>  1 file changed, 9 insertions(+)
> >>
> >> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> >> index 82ef87e9a897..42a55794004a 100644
> >> --- a/arch/x86/events/intel/pt.c
> >> +++ b/arch/x86/events/intel/pt.c
> >> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
> >>  	if (1 << order != nr_pages)
> >>  		goto out;
> >>  
> >> +	/*
> >> +	 * Some processors cannot always support single range for more than
> >> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
> >> +	 * also be affected, so for now rather than trying to keep track of
> >> +	 * which ones, just disable it for all.
> >> +	 */
> >> +	if (nr_pages > 1)
> >> +		goto out;
> > 
> > This effectively declares single-output-mode dead? Because I don't think
> > anybody uses PT with a single 4K buffer.
> 
> 4K is the default size for "sample mode" i.e. stuffing 4KB of Intel PT trace
> data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX
> 
> e.g.
> 
> $ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
> Linux
> $ grep aux_sample_size err.txt
>   aux_sample_size                  4096

Ah, ok. Not as bad then. Anyway, I'll go queue it for perf/urgent I
suppose.
Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
Posted by Andi Kleen 3 years, 4 months ago
Peter Zijlstra <peterz@infradead.org> writes:

> On Mon, Nov 14, 2022 at 01:10:38PM +0200, Adrian Hunter wrote:
>> On 14/11/22 12:51, Peter Zijlstra wrote:
>> > On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
>> >> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
>> >> Data When Configured With Single Range Output Larger Than 4KB" by
>> >> disabling single range output whenever larger than 4KB.
>> >>
>> >> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
>> >> Cc: stable@vger.kernel.org
>> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> >> ---
>> >>  arch/x86/events/intel/pt.c | 9 +++++++++
>> >>  1 file changed, 9 insertions(+)
>> >>
>> >> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
>> >> index 82ef87e9a897..42a55794004a 100644
>> >> --- a/arch/x86/events/intel/pt.c
>> >> +++ b/arch/x86/events/intel/pt.c
>> >> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>> >>  	if (1 << order != nr_pages)
>> >>  		goto out;
>> >>  
>> >> +	/*
>> >> +	 * Some processors cannot always support single range for more than
>> >> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
>> >> +	 * also be affected, so for now rather than trying to keep track of
>> >> +	 * which ones, just disable it for all.
>> >> +	 */
>> >> +	if (nr_pages > 1)
>> >> +		goto out;
>> > 
>> > This effectively declares single-output-mode dead? Because I don't think
>> > anybody uses PT with a single 4K buffer.
>> 
>> 4K is the default size for "sample mode" i.e. stuffing 4KB of Intel PT trace
>> data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX
>> 
>> e.g.
>> 
>> $ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
>> Linux
>> $ grep aux_sample_size err.txt
>>   aux_sample_size                  4096
>
> Ah, ok. Not as bad then. Anyway, I'll go queue it for perf/urgent I
> suppose.

It would be better to only limit on the CPUs with the bug because
switching buffers causes some extra latencies. So this patch may regress
PT overhead or tail latencies.

-Andi
Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
Posted by Adrian Hunter 3 years, 4 months ago
On 15/11/22 21:46, Andi Kleen wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> 
>> On Mon, Nov 14, 2022 at 01:10:38PM +0200, Adrian Hunter wrote:
>>> On 14/11/22 12:51, Peter Zijlstra wrote:
>>>> On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
>>>>> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
>>>>> Data When Configured With Single Range Output Larger Than 4KB" by
>>>>> disabling single range output whenever larger than 4KB.
>>>>>
>>>>> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
>>>>> Cc: stable@vger.kernel.org
>>>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>>>>> ---
>>>>>  arch/x86/events/intel/pt.c | 9 +++++++++
>>>>>  1 file changed, 9 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
>>>>> index 82ef87e9a897..42a55794004a 100644
>>>>> --- a/arch/x86/events/intel/pt.c
>>>>> +++ b/arch/x86/events/intel/pt.c
>>>>> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>>>>>  	if (1 << order != nr_pages)
>>>>>  		goto out;
>>>>>  
>>>>> +	/*
>>>>> +	 * Some processors cannot always support single range for more than
>>>>> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
>>>>> +	 * also be affected, so for now rather than trying to keep track of
>>>>> +	 * which ones, just disable it for all.
>>>>> +	 */
>>>>> +	if (nr_pages > 1)
>>>>> +		goto out;
>>>>
>>>> This effectively declares single-output-mode dead? Because I don't think
>>>> anybody uses PT with a single 4K buffer.
>>>
>>> 4K is the default size for "sample mode" i.e. stuffing 4KB of Intel PT trace
>>> data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX
>>>
>>> e.g.
>>>
>>> $ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
>>> Linux
>>> $ grep aux_sample_size err.txt
>>>   aux_sample_size                  4096
>>
>> Ah, ok. Not as bad then. Anyway, I'll go queue it for perf/urgent I
>> suppose.
> 
> It would be better to only limit on the CPUs with the bug because
> switching buffers causes some extra latencies. So this patch may regress
> PT overhead or tail latencies.

I could whitelist CPUs that do not have the issue, because a blacklist
would keep expanding, which would be a bit of a pain to maintain.
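
The allowlist idea could be sketched along these lines. This is purely illustrative and was not part of the posted patch: the function names are hypothetical, the model numbers are placeholders rather than real CPU models, and kernel code would more likely use its existing CPU matching tables than an open-coded loop.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder model IDs, NOT real CPU model numbers (assumption). */
static const unsigned int large_single_range_ok[] = { 0x01, 0x02 };

/* True if the (hypothetical) model is known not to have the erratum. */
static bool model_allows_large_single_range(unsigned int model)
{
	size_t n = sizeof(large_single_range_ok) /
		   sizeof(large_single_range_ok[0]);

	for (size_t i = 0; i < n; i++)
		if (large_single_range_ok[i] == model)
			return true;
	return false;
}

/* Size check with the 4KB limit applied only to unlisted models. */
static bool single_range_allowed_on(unsigned int model, int nr_pages)
{
	if (nr_pages <= 0 || (nr_pages & (nr_pages - 1)) != 0)
		return false;
	if (nr_pages > 1 && !model_allows_large_single_range(model))
		return false;
	return true;
}
```

The design point is that an unknown or future model defaults to the safe one-page limit, so the list only needs updating when a model is confirmed to be unaffected.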
[tip: perf/urgent] perf/x86/intel/pt: Fix sampling using single range output
Posted by tip-bot2 for Adrian Hunter 3 years, 4 months ago
The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     ce0d998be9274dd3a3d971cbeaa6fe28fd2c3062
Gitweb:        https://git.kernel.org/tip/ce0d998be9274dd3a3d971cbeaa6fe28fd2c3062
Author:        Adrian Hunter <adrian.hunter@intel.com>
AuthorDate:    Sat, 12 Nov 2022 17:15:08 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 16 Nov 2022 10:12:59 +01:00

perf/x86/intel/pt: Fix sampling using single range output

Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
Data When Configured With Single Range Output Larger Than 4KB" by
disabling single range output whenever larger than 4KB.

Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20221112151508.13768-1-adrian.hunter@intel.com
---
 arch/x86/events/intel/pt.c |  9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 82ef87e..42a5579 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
 	if (1 << order != nr_pages)
 		goto out;
 
+	/*
+	 * Some processors cannot always support single range for more than
+	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
+	 * also be affected, so for now rather than trying to keep track of
+	 * which ones, just disable it for all.
+	 */
+	if (nr_pages > 1)
+		goto out;
+
 	buf->single = true;
 	buf->nr_pages = nr_pages;
 	ret = 0;