[RFC PATCH 8/8] Use Linux's PAT

Demi Marie Obenour posted 8 patches 3 years, 2 months ago
There is a newer version of this series
[RFC PATCH 8/8] Use Linux's PAT
Posted by Demi Marie Obenour 3 years, 2 months ago
This is purely for testing, to see if it works around a bug in i915.  It
is not intended to be merged.

NOT-signed-off-by: DO NOT MERGE
---
 xen/arch/x86/include/asm/page.h      |  4 ++--
 xen/arch/x86/include/asm/processor.h | 12 ++++++------
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/include/asm/page.h b/xen/arch/x86/include/asm/page.h
index b585235d064a567082582c8e92a4e8283fd949ca..ab9b46f1d0901e50a83fd035ff28d1bda0b781a2 100644
--- a/xen/arch/x86/include/asm/page.h
+++ b/xen/arch/x86/include/asm/page.h
@@ -333,11 +333,11 @@ void efi_update_l4_pgtable(unsigned int l4idx, l4_pgentry_t);
 
 /* Memory types, encoded under Xen's choice of MSR_PAT. */
 #define _PAGE_WB         (                                0)
-#define _PAGE_WT         (                        _PAGE_PWT)
+#define _PAGE_WC         (                        _PAGE_PWT)
 #define _PAGE_UCM        (            _PAGE_PCD            )
 #define _PAGE_UC         (            _PAGE_PCD | _PAGE_PWT)
-#define _PAGE_WC         (_PAGE_PAT                        )
 #define _PAGE_WP         (_PAGE_PAT |             _PAGE_PWT)
+#define _PAGE_WT         (_PAGE_PAT | _PAGE_PCD | _PAGE_PWT)
 
 /*
  * Debug option: Ensure that granted mappings are not implicitly unmapped.
diff --git a/xen/arch/x86/include/asm/processor.h b/xen/arch/x86/include/asm/processor.h
index 64b75e444947c64e2e5eba457deec92a873d7a63..7a27d19d1dc8aa9c102683985c30fb85864303fa 100644
--- a/xen/arch/x86/include/asm/processor.h
+++ b/xen/arch/x86/include/asm/processor.h
@@ -107,18 +107,18 @@
  * MSR_PAT value, and is an ABI with PV guests.
  */
 #define XEN_MSR_PAT (MSR_PAT_WB  << 0x00 | \
-                     MSR_PAT_WT  << 0x08 | \
+                     MSR_PAT_WC  << 0x08 | \
                      MSR_PAT_UCM << 0x10 | \
                      MSR_PAT_UC  << 0x18 | \
-                     MSR_PAT_WC  << 0x20 | \
+                     MSR_PAT_WB  << 0x20 | \
                      MSR_PAT_WP  << 0x28 | \
-                     MSR_PAT_UC  << 0x30 | \
-                     MSR_PAT_UC  << 0x38 | \
+                     MSR_PAT_UCM << 0x30 | \
+                     MSR_PAT_WT  << 0x38 | \
                      0)
 
 #ifndef __ASSEMBLY__
-_Static_assert(XEN_MSR_PAT == 0x050100070406ULL,
-               "wrong XEN_MSR_PAT breaks PV guests");
+_Static_assert(XEN_MSR_PAT == 0x0407050600070106ULL,
+               "wrong XEN_MSR_PAT breaks i915 driver on PV Linux");
 
 struct domain;
 struct vcpu;
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
Re: [RFC PATCH 8/8] Use Linux's PAT
Posted by Andrew Cooper 3 years, 2 months ago
On 06/12/2022 04:33, Demi Marie Obenour wrote:
> This is purely for testing, to see if it works around a bug in i915.  It
> is not intended to be merged.
>
> NOT-signed-off-by: DO NOT MERGE

Following up on Marek's report on IRC/Matrix, you're saying that this
change does actually fix screen corruption issues on AlderLake, and
something on TigerLake too?

If that is actually the case, then one of two things is happening.  Either,

1) Drivers in Linux are bypassing the regular caching APIs, or
2) The translation logic between Linux's idea of cacheability and Xen's
PAT values is buggy.

~Andrew
Re: [RFC PATCH 8/8] Use Linux's PAT
Posted by Demi Marie Obenour 3 years, 2 months ago
On Tue, Dec 06, 2022 at 11:38:03AM +0000, Andrew Cooper wrote:
> On 06/12/2022 04:33, Demi Marie Obenour wrote:
> > This is purely for testing, to see if it works around a bug in i915.  It
> > is not intended to be merged.
> >
> > NOT-signed-off-by: DO NOT MERGE
> 
> Following up on Marek's report on IRC/Matrix, you're saying that this
> change does actually fix screen corruption issues on AlderLake, and
> something on TigerLake too?

Correct

> If that is actually the case, then one of two things is happening.  Either,
> 
> 1) Drivers in Linux are bypassing the regular caching APIs, or

This would not surprise me at all.

> 2) The translation logic between Linux's idea of cacheability and Xen's
> PAT values is buggy.

How could I check for this?
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
Re: [RFC PATCH 8/8] Use Linux's PAT
Posted by Marek Marczykowski-Górecki 3 years, 2 months ago
On Tue, Dec 06, 2022 at 01:01:41PM -0500, Demi Marie Obenour wrote:
> On Tue, Dec 06, 2022 at 11:38:03AM +0000, Andrew Cooper wrote:
> > On 06/12/2022 04:33, Demi Marie Obenour wrote:
> > > This is purely for testing, to see if it works around a bug in i915.  It
> > > is not intended to be merged.
> > >
> > > NOT-signed-off-by: DO NOT MERGE
> > 
> > Following up on Marek's report on IRC/Matrix, you're saying that this
> > change does actually fix screen corruption issues on AlderLake, and
> > something on TigerLake too?
> 
> Correct
> 
> > If that is actually the case, then one of two things is happening.  Either,
> > 
> > 1) Drivers in Linux are bypassing the regular caching APIs, or
> 
> This would not surprise me at all.
> 
> > 2) The translation logic between Linux's idea of cacheability and Xen's
> > PAT values is buggy.
> 
> How could I check for this?

See Andy's unit test idea on #xendevel:

    as a pretty simple "unit" test in dom0, it might be a good idea to
    have a module which watches the PTE in question, and cycles through
    various of the memremap_*() APIs and checks the raw PTE that gets
    written after Linux and Xen are done fighting with it

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
Re: [RFC PATCH 8/8] Use Linux's PAT
Posted by Demi Marie Obenour 3 years, 1 month ago
On Tue, Dec 06, 2022 at 07:12:06PM +0100, Marek Marczykowski-Górecki wrote:
> On Tue, Dec 06, 2022 at 01:01:41PM -0500, Demi Marie Obenour wrote:
> > On Tue, Dec 06, 2022 at 11:38:03AM +0000, Andrew Cooper wrote:
> > > On 06/12/2022 04:33, Demi Marie Obenour wrote:
> > > > This is purely for testing, to see if it works around a bug in i915.  It
> > > > is not intended to be merged.
> > > >
> > > > NOT-signed-off-by: DO NOT MERGE
> > > 
> > > Following up on Marek's report on IRC/Matrix, you're saying that this
> > > change does actually fix screen corruption issues on AlderLake, and
> > > something on TigerLake too?
> > 
> > Correct
> > 
> > > If that is actually the case, then one of two things is happening.  Either,
> > > 
> > > 1) Drivers in Linux are bypassing the regular caching APIs, or
> > 
> > This would not surprise me at all.
> > 
> > > 2) The translation logic between Linux's idea of cacheability and Xen's
> > > PAT values is buggy.
> > 
> > How could I check for this?
> 
> See Andy's unit test idea on #xendevel:
> 
>     as a pretty simple "unit" test in dom0, it might be a good idea to
>     have a module which watches the PTE in question, and cycles through
>     various of the memremap_*() APIs and checks the raw PTE that gets
>     written after Linux and Xen are done fighting with it

This confirmed that the translation logic is correct, which means that
i195 is buggy.  I filed https://gitlab.freedesktop.org/drm/intel/-/issues/7648
for that.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
Re: [RFC PATCH 8/8] Use Linux's PAT
Posted by Demi Marie Obenour 3 years, 2 months ago
On Tue, Dec 06, 2022 at 07:12:06PM +0100, Marek Marczykowski-Górecki wrote:
> On Tue, Dec 06, 2022 at 01:01:41PM -0500, Demi Marie Obenour wrote:
> > On Tue, Dec 06, 2022 at 11:38:03AM +0000, Andrew Cooper wrote:
> > > On 06/12/2022 04:33, Demi Marie Obenour wrote:
> > > > This is purely for testing, to see if it works around a bug in i915.  It
> > > > is not intended to be merged.
> > > >
> > > > NOT-signed-off-by: DO NOT MERGE
> > > 
> > > Following up on Marek's report on IRC/Matrix, you're saying that this
> > > change does actually fix screen corruption issues on AlderLake, and
> > > something on TigerLake too?
> > 
> > Correct
> > 
> > > If that is actually the case, then one of two things is happening.  Either,
> > > 
> > > 1) Drivers in Linux are bypassing the regular caching APIs, or
> > 
> > This would not surprise me at all.
> > 
> > > 2) The translation logic between Linux's idea of cacheability and Xen's
> > > PAT values is buggy.
> > 
> > How could I check for this?
> 
> See Andy's unit test idea on #xendevel:
> 
>     as a pretty simple "unit" test in dom0, it might be a good idea to
>     have a module which watches the PTE in question, and cycles through
>     various of the memremap_*() APIs and checks the raw PTE that gets
>     written after Linux and Xen are done fighting with it

Which PTEs should I check?
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
Re: [RFC PATCH 8/8] Use Linux's PAT
Posted by Marek Marczykowski-Górecki 3 years, 2 months ago
On Tue, Dec 06, 2022 at 02:47:42PM -0500, Demi Marie Obenour wrote:
> On Tue, Dec 06, 2022 at 07:12:06PM +0100, Marek Marczykowski-Górecki wrote:
> > On Tue, Dec 06, 2022 at 01:01:41PM -0500, Demi Marie Obenour wrote:
> > > On Tue, Dec 06, 2022 at 11:38:03AM +0000, Andrew Cooper wrote:
> > > > 2) The translation logic between Linux's idea of cacheability and Xen's
> > > > PAT values is buggy.
> > > 
> > > How could I check for this?
> > 
> > See Andy's unit test idea on #xendevel:
> > 
> >     as a pretty simple "unit" test in dom0, it might be a good idea to
> >     have a module which watches the PTE in question, and cycles through
> >     various of the memremap_*() APIs and checks the raw PTE that gets
> >     written after Linux and Xen are done fighting with it
> 
> Which PTEs should I check?

The ones adjusted by the memremap_*() call.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab