The length of Physical Address in General Media Event Record/DRAM Event
Record is 64-bit, so the field mask should be defined as such length.
Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
mask off the upper-32-bits of DPA addresses. The cxl_poison event is
unaffected.
If userspace was doing its own DPA-to-HPA translation this could lead to
incorrect page retirement decisions, but there is no known consumer
(like rasdaemon) of this event today.
Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
Cc: <stable@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
---
drivers/cxl/core/trace.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
index e5f13260fc52..cdfce932d5b1 100644
--- a/drivers/cxl/core/trace.h
+++ b/drivers/cxl/core/trace.h
@@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
* DRAM Event Record
* CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
*/
-#define CXL_DPA_FLAGS_MASK 0x3F
+#define CXL_DPA_FLAGS_MASK 0x3FULL
#define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
#define CXL_DPA_VOLATILE BIT(0)
--
2.34.1
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
>
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
>
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Hi Ruan,
This fixup is important for the Event DPA->HPA translation work, so I
grabbed it, updated it with most* of the review comments, and posted
with that set. I expect you saw that in your mailbox.
DaveJ queued it in a topic branch for 6.10 here:
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/dpa-to-hpa
*I did not create a common mask for events and poison because I wanted to
limit the changes. If you'd like to make that change it would be welcomed.
-- Alison
> ---
> drivers/cxl/core/trace.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> * DRAM Event Record
> * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> */
> -#define CXL_DPA_FLAGS_MASK 0x3F
> +#define CXL_DPA_FLAGS_MASK 0x3FULL
> #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>
> #define CXL_DPA_VOLATILE BIT(0)
> --
> 2.34.1
>
在 2024/5/1 5:00, Alison Schofield 写道:
> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
>> The length of Physical Address in General Media Event Record/DRAM Event
>> Record is 64-bit, so the field mask should be defined as such length.
>> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
>> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
>> unaffected.
>>
>> If userspace was doing its own DPA-to-HPA translation this could lead to
>> incorrect page retirement decisions, but there is no known consumer
>> (like rasdaemon) of this event today.
>>
>> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
>> Cc: <stable@vger.kernel.org>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Davidlohr Bueso <dave@stgolabs.net>
>> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Cc: Ira Weiny <ira.weiny@intel.com>
>> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
>
> Hi Ruan,
>
> This fixup is important for the Event DPA->HPA translation work, so I
> grabbed it, updated it with most* of the review comments, and posted
> with that set. I expect you saw that in your mailbox.
Yes, I saw that. Nice fix!
>
> DaveJ queued it in a topic branch for 6.10 here:
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.10/dpa-to-hpa
>
That's good!
> *I did not create a common mask for events and poison because I wanted to
> limit the changes. If you'd like to make that change it would be welcomed.
Ok~
--
Thanks,
Ruan.
>
> -- Alison
>
>> ---
>> drivers/cxl/core/trace.h | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
>> index e5f13260fc52..cdfce932d5b1 100644
>> --- a/drivers/cxl/core/trace.h
>> +++ b/drivers/cxl/core/trace.h
>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
>> * DRAM Event Record
>> * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
>> */
>> -#define CXL_DPA_FLAGS_MASK 0x3F
>> +#define CXL_DPA_FLAGS_MASK 0x3FULL
>> #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>>
>> #define CXL_DPA_VOLATILE BIT(0)
>> --
>> 2.34.1
>>
On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
>
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
>
So, an invalid DPA is emitted in the trace event log and that could
lead to 'incorrect page retirement decisions...'
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
> drivers/cxl/core/trace.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> * DRAM Event Record
> * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> */
> -#define CXL_DPA_FLAGS_MASK 0x3F
> +#define CXL_DPA_FLAGS_MASK 0x3FULL
> #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>
> #define CXL_DPA_VOLATILE BIT(0)
This works but I'm thinking this is the time to convene on one
CXL_EVENT_DPA_MASK for both all CXL events, rather than having
cxl_poison event be different.
I prefer how poison defines it:
cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6)
Can we rename that CXL_EVENT_DPA_MASK and use for all events?
--Alison
> --
> 2.34.1
>
Alison Schofield wrote: > On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote: [snip] > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > > index e5f13260fc52..cdfce932d5b1 100644 > > --- a/drivers/cxl/core/trace.h > > +++ b/drivers/cxl/core/trace.h > > @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event, > > * DRAM Event Record > > * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44 > > */ > > -#define CXL_DPA_FLAGS_MASK 0x3F > > +#define CXL_DPA_FLAGS_MASK 0x3FULL > > #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) > > > > #define CXL_DPA_VOLATILE BIT(0) > > This works but I'm thinking this is the time to convene on one > CXL_EVENT_DPA_MASK for both all CXL events, rather than having > cxl_poison event be different. > > I prefer how poison defines it: > > cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6) > > Can we rename that CXL_EVENT_DPA_MASK and use for all events? Ah! Great catch. I dont' know why I only masked off the 2 used bits. That was short sighted of me. Yes we should consolidate these. Ira
在 2024/4/24 5:04, Ira Weiny 写道: > Alison Schofield wrote: >> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote: > > [snip] > >>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h >>> index e5f13260fc52..cdfce932d5b1 100644 >>> --- a/drivers/cxl/core/trace.h >>> +++ b/drivers/cxl/core/trace.h >>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event, >>> * DRAM Event Record >>> * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44 >>> */ >>> -#define CXL_DPA_FLAGS_MASK 0x3F >>> +#define CXL_DPA_FLAGS_MASK 0x3FULL >>> #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) >>> >>> #define CXL_DPA_VOLATILE BIT(0) >> >> This works but I'm thinking this is the time to convene on one >> CXL_EVENT_DPA_MASK for both all CXL events, rather than having >> cxl_poison event be different. >> >> I prefer how poison defines it: >> >> cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6) >> >> Can we rename that CXL_EVENT_DPA_MASK and use for all events? cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here is for events. They have different meaning though their values are same. So, I don't think we should consolidate them. > > Ah! Great catch. I dont' know why I only masked off the 2 used bits. Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile or not" and "not repairable". So there is no mistake here. > That was short sighted of me. > > Yes we should consolidate these. > Ira -- Thanks, Ruan.
Shiyang Ruan wrote: > > > 在 2024/4/24 5:04, Ira Weiny 写道: > > Alison Schofield wrote: > >> On Wed, Apr 17, 2024 at 03:50:52PM +0800, Shiyang Ruan wrote: > > > > [snip] > > > >>> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > >>> index e5f13260fc52..cdfce932d5b1 100644 > >>> --- a/drivers/cxl/core/trace.h > >>> +++ b/drivers/cxl/core/trace.h > >>> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event, > >>> * DRAM Event Record > >>> * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44 > >>> */ > >>> -#define CXL_DPA_FLAGS_MASK 0x3F > >>> +#define CXL_DPA_FLAGS_MASK 0x3FULL > >>> #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK) > >>> > >>> #define CXL_DPA_VOLATILE BIT(0) > >> > >> This works but I'm thinking this is the time to convene on one > >> CXL_EVENT_DPA_MASK for both all CXL events, rather than having > >> cxl_poison event be different. > >> > >> I prefer how poison defines it: > >> > >> cxlmem.h:#define CXL_POISON_START_MASK GENMASK_ULL(63, 6) > >> > >> Can we rename that CXL_EVENT_DPA_MASK and use for all events? > > cxlmem.h:CXL_POISON_START_MASK is defined for MBOX commands (poison > record, the lower 3 bits indicate "Error Source"), but CXL_DPA_MASK here > is for events. They have different meaning though their values are > same. So, I don't think we should consolidate them. By definition the DPA in all these payloads can't use the lower 6 bits. We are defining a mask to get the DPA. This has nothing to do with what may be stored in the other 6 bits. Defining a common DPA mask is correct per both sections of the spec. > > > > > Ah! Great catch. I dont' know why I only masked off the 2 used bits. > > Per spec, the lowest 2 bits of CXL event's DPA are defined for "Volatile > or not" and "not repairable". So there is no mistake here. I appreciate your not calling out my code as a bug. :-D However, bits [5:2] are also Reserved. So it is incorrect to mask off only the lower 2 bits. Even though the reserved bits must be 0 per the spec, it is still better to properly mask all reserved bits because they are by definition not part of the DPA. Could you create a common macro for the next version of the patch? Thanks, Ira > > > That was short sighted of me. > > > > Yes we should consolidate these. > > Ira > > -- > Thanks, > Ruan. >
Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
>
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
>
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
> drivers/cxl/core/trace.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> * DRAM Event Record
> * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> */
> -#define CXL_DPA_FLAGS_MASK 0x3F
> +#define CXL_DPA_FLAGS_MASK 0x3FULL
> #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>
> #define CXL_DPA_VOLATILE BIT(0)
Looks good,
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Shiyang Ruan wrote:
> The length of Physical Address in General Media Event Record/DRAM Event
> Record is 64-bit, so the field mask should be defined as such length.
> Otherwise, this causes cxl_general_media and cxl_dram tracepoints to
> mask off the upper-32-bits of DPA addresses. The cxl_poison event is
> unaffected.
>
> If userspace was doing its own DPA-to-HPA translation this could lead to
> incorrect page retirement decisions, but there is no known consumer
> (like rasdaemon) of this event today.
>
> Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record")
> Cc: <stable@vger.kernel.org>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Cc: Ira Weiny <ira.weiny@intel.com>
Apologies I thought I saw this go in before. But perhaps it was a
different mask.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
> ---
> drivers/cxl/core/trace.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index e5f13260fc52..cdfce932d5b1 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -253,7 +253,7 @@ TRACE_EVENT(cxl_generic_event,
> * DRAM Event Record
> * CXL rev 3.0 section 8.2.9.2.1.2; Table 8-44
> */
> -#define CXL_DPA_FLAGS_MASK 0x3F
> +#define CXL_DPA_FLAGS_MASK 0x3FULL
> #define CXL_DPA_MASK (~CXL_DPA_FLAGS_MASK)
>
> #define CXL_DPA_VOLATILE BIT(0)
> --
> 2.34.1
>
© 2016 - 2026 Red Hat, Inc.