x86/p2m: Some P2M refactoring

[RFC PATCH 0/4] x86/p2m: Some P2M refactoring

Posted by Teddy Astie 2 months, 1 week ago

The first patch removes a shadow mode check from a function that can't be
called in shadow mode (it is only called with EPT, hence HAP).

The 3 other patches drop guest_tlb_flush_mask by removing all of its users:
in the shadow paging case by migrating them to a shadow variant of it
(sh_flush_tlb_mask), and in the HAP case by moving the flush into the
p2m->flush_tlb logic.

One of the goals is to prepare for adding HAP-specific optimizations to the
TLB flushing code without risking regressions in the shadow paging code.
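
As a rough illustration of the deferred-flush idea behind patches 3 and 4,
here is a minimal, self-contained C sketch.  It is not Xen code: the toy_p2m
type and every toy_* helper are hypothetical, and only the need_flush and
flush_tlb names are taken from the patch titles above.  An update marks the
P2M as needing a flush only if the old entry was present (and so may be
cached in a TLB), and a later sync point performs the flush through the
flush_tlb hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a P2M; not the real struct p2m_domain. */
struct toy_p2m {
    bool need_flush;                        /* a flush is pending */
    unsigned int flushes;                   /* flushes performed so far */
    void (*flush_tlb)(struct toy_p2m *p2m); /* per-implementation hook */
};

/* Toy hook: just counts how often it was invoked. */
static void toy_flush(struct toy_p2m *p2m)
{
    p2m->flushes++;
}

void toy_p2m_init(struct toy_p2m *p2m)
{
    assert(p2m);
    p2m->need_flush = false;
    p2m->flushes = 0;
    p2m->flush_tlb = toy_flush;
}

/*
 * Update one entry, modelled here as a single "present" bit.  Only an
 * entry that was present before can have been cached, so only that
 * case requests a flush.
 */
void toy_set_entry(struct toy_p2m *p2m, bool *present, bool new_present)
{
    if ( *present )
        p2m->need_flush = true;
    *present = new_present;
}

/* Sync point: perform the pending flush, if any, exactly once. */
void toy_flush_sync(struct toy_p2m *p2m)
{
    if ( p2m->need_flush )
    {
        p2m->need_flush = false;
        p2m->flush_tlb(p2m);
    }
}
```

In this sketch, populating a previously empty entry followed by
toy_flush_sync() performs no flush, whereas changing an entry that was
already present performs exactly one.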

Teddy Astie (4):
  x86/ept: Drop shadow mode check in ept_sync_domain()
  x86/shadow: Replace guest_tlb_flush_mask with sh_flush_tlb_mask
  x86/p2m-pt: Set p2m->need_flush if page was present before
  x86/hap: Migrate tlb flush logic to p2m->flush_tlb

 xen/arch/x86/flushtlb.c             | 15 ---------------
 xen/arch/x86/include/asm/flushtlb.h |  3 ---
 xen/arch/x86/include/asm/p2m.h      |  3 ---
 xen/arch/x86/mm/hap/hap.c           | 14 +++-----------
 xen/arch/x86/mm/hap/nested_hap.c    |  7 +------
 xen/arch/x86/mm/nested.c            |  2 +-
 xen/arch/x86/mm/p2m-ept.c           |  5 +++--
 xen/arch/x86/mm/p2m-pt.c            | 13 +++++--------
 xen/arch/x86/mm/p2m.c               |  8 ++++----
 xen/arch/x86/mm/shadow/common.c     | 12 ++++++------
 xen/arch/x86/mm/shadow/hvm.c        |  8 ++++----
 xen/arch/x86/mm/shadow/multi.c      | 18 ++++++------------
 xen/arch/x86/mm/shadow/private.h    | 22 ++++++++++++++++++++++
 13 files changed, 55 insertions(+), 75 deletions(-)

-- 
2.51.2



--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech

Re: [RFC PATCH 0/4] x86/p2m: Some P2M refactoring

Posted by Andrew Cooper 2 months, 1 week ago

On 27/11/2025 1:39 pm, Teddy Astie wrote:
> The first patch removes a shadow mode check from a function that can't be
> called in shadow mode (it is only called with EPT, hence HAP).
>
> The 3 other patches drop guest_tlb_flush_mask by removing all of its users:
> in the shadow paging case by migrating them to a shadow variant of it
> (sh_flush_tlb_mask), and in the HAP case by moving the flush into the
> p2m->flush_tlb logic.
>
> One of the goals is to prepare for adding HAP-specific optimizations to the
> TLB flushing code without risking regressions in the shadow paging code.

You need a clearer background to try and explain the changes.  I've
discussed some of it with you, but it needs describing here for everyone
else.

From memory, encrypted AMD VMs cannot use "asid++" to flush TLBs, and
must use VMCB->tlb_ctrl instead.
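
A toy model of the two flush styles may make the distinction concrete.
This is only a sketch under stated assumptions, not how Xen or the hardware
interface is actually laid out: toy_vcpu, toy_tlb_ctrl, asid_fixed and
toy_flush_guest_tlb() are all hypothetical names.  A vCPU whose ASID may
change is flushed by giving it a fresh ASID ("asid++"), while a vCPU whose
ASID must stay fixed instead requests an explicit flush of its current ASID
at the next VM entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flush request, consumed at the next VM entry. */
enum toy_tlb_ctrl {
    TOY_TLB_NONE,            /* nothing to do */
    TOY_TLB_FLUSH_THIS_ASID, /* flush entries tagged with the current ASID */
};

struct toy_vcpu {
    uint32_t asid;
    bool asid_fixed;            /* assumption: ASID must not change */
    enum toy_tlb_ctrl tlb_ctrl;
};

/* Toy allocator: hands out increasing ASIDs, ignoring wrap-around. */
static uint32_t toy_next_asid = 1000;

void toy_flush_guest_tlb(struct toy_vcpu *v)
{
    assert(v);

    if ( v->asid_fixed )
        /* Cannot move to a new ASID, so ask for an explicit flush. */
        v->tlb_ctrl = TOY_TLB_FLUSH_THIS_ASID;
    else
        /* "asid++": stale entries stay tagged with the old ASID. */
        v->asid = toy_next_asid++;
}
```

The same split would apply to any other case where the ASID has to stay
stable across flushes.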

On top of that, Xen does not have a correct abstraction for "flush guest
VA space" vs "flush guest PA space" vs "flush Xen's mappings of guest VA
space".  This comes about because the shadow code originally had all
3 things together, and HAP didn't clean this up when the need first arose.

This causes over-flushing (Tamas found and reported this on Intel), and
FLUSH_HVM_ASID_CORE isn't an adequate abstraction either.

All of that said, it would help to have a sketch of what you want the
logic to look like in the end.

"flush guest VA space" and "flush guest PA space" originate from
different actions.  VA flushes always originate from emulation of a guest
action, whereas PA flushes originate from modifications to the P2M.  When
shadow is in use, both of these escalate into a full local flush because of
Xen's use of shadow linear mappings, but this escalation should be inside
the shadow code, not the top level primitive.
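
One possible shape for that split, as a minimal C sketch.  Every name below
(toy_flush_scope, toy_paging_mode, the TOY_DO_* bits and the toy_* functions)
is hypothetical and exists only for illustration; none of it is existing Xen
code.  The top level primitive only dispatches on the paging mode, and the
escalation to a full local flush lives entirely in the shadow path.

```c
#include <assert.h>

/* What the caller wants flushed. */
enum toy_flush_scope {
    TOY_FLUSH_GUEST_VA,   /* from emulating a guest action */
    TOY_FLUSH_GUEST_PA,   /* from a modification to the P2M */
    TOY_FLUSH_XEN_LINEAR, /* Xen's own mappings of guest VA space */
};

enum toy_paging_mode { TOY_MODE_HAP, TOY_MODE_SHADOW };

/* What ends up being flushed, as a bitmask. */
#define TOY_DO_GUEST_VA   (1u << 0)
#define TOY_DO_GUEST_PA   (1u << 1)
#define TOY_DO_FULL_LOCAL (1u << 2)

/* HAP: VA and PA flushes stay separate, nothing escalates. */
static unsigned int toy_hap_flush(enum toy_flush_scope scope)
{
    /* Assumption of this sketch: HAP has no shadow linear mappings. */
    assert(scope != TOY_FLUSH_XEN_LINEAR);

    return scope == TOY_FLUSH_GUEST_VA ? TOY_DO_GUEST_VA : TOY_DO_GUEST_PA;
}

/* Shadow: every scope escalates, and it does so here, not in the caller. */
static unsigned int toy_shadow_flush(enum toy_flush_scope scope)
{
    (void)scope;

    return TOY_DO_FULL_LOCAL;
}

/* Top level primitive: dispatch only, no policy. */
unsigned int toy_guest_flush(enum toy_paging_mode mode,
                             enum toy_flush_scope scope)
{
    return mode == TOY_MODE_SHADOW ? toy_shadow_flush(scope)
                                   : toy_hap_flush(scope);
}
```

With this layout a HAP-only optimisation touches toy_hap_flush() alone, and
the shadow behaviour cannot regress as a side effect.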

Have I missed anything else?

~Andrew

Re: [RFC PATCH 0/4] x86/p2m: Some P2M refactoring

Posted by Teddy Astie 2 months, 1 week ago

On 27/11/2025 at 20:52, Andrew Cooper wrote:
> On 27/11/2025 1:39 pm, Teddy Astie wrote:
>> The first patch removes a shadow mode check from a function that can't be
>> called in shadow mode (it is only called with EPT, hence HAP).
>>
>> The 3 other patches drop guest_tlb_flush_mask by removing all of its users:
>> in the shadow paging case by migrating them to a shadow variant of it
>> (sh_flush_tlb_mask), and in the HAP case by moving the flush into the
>> p2m->flush_tlb logic.
>>
>> One of the goals is to prepare for adding HAP-specific optimizations to the
>> TLB flushing code without risking regressions in the shadow paging code.
> 
> You need a clearer background to try and explain the changes.  I've
> discussed some of it with you, but it needs describing here for everyone
> else.
> 
> From memory, encrypted AMD VMs cannot use "asid++" to flush TLBs, and
> must use VMCB->tlb_ctrl instead.
> 

It's not only encrypted VMs: broadcast TLB invalidations (like AMD INVLPGB
and Intel RAR) also require this.

I'm also wondering if the current model works well when Xen is running 
as a nested guest (as the L0 may get confused about the ASID changing 
constantly).

> On top of that, Xen does not have a correct abstraction for "flush guest
> VA space" vs "flush guest PA space" vs "flush Xen's mappings of guest VA
> space".  This comes about because the shadow code originally had all
> 3 things together, and HAP didn't clean this up when the need first arose.
> 
> This causes over-flushing (Tamas found and reported this on Intel), and
> FLUSH_HVM_ASID_CORE isn't an adequate abstraction either.
> 

I guess we also want a way to optionally specify the address to flush
(whether it is a gva or a gpa), so that the flush can be lowered to things
like single-address invalidation (instead of flushing everything) when
that is possible and meaningful.
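
To make that concrete, here is a minimal C sketch of such a request.  It is
only an illustration under my own assumptions: toy_flush_req, toy_flush_caps
and toy_lower_flush() are hypothetical names, and the capability flags do
not describe any real CPU feature bits.  The address is optional, and the
request falls back to a full flush whenever a targeted one is not available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_addr_space { TOY_SPACE_GVA, TOY_SPACE_GPA };
enum toy_flush_op { TOY_OP_FLUSH_ALL, TOY_OP_FLUSH_RANGE };

/* A flush request with an optional target range. */
struct toy_flush_req {
    enum toy_addr_space space;
    bool has_range;     /* false: no address given, flush everything */
    uint64_t addr;      /* start of the range, if has_range */
    unsigned int order; /* range covers 2^order pages */
};

/* What the (hypothetical) platform can invalidate selectively. */
struct toy_flush_caps {
    bool ranged_gva;
    bool ranged_gpa;
    unsigned int max_order; /* largest range worth flushing selectively */
};

/* Lower a request to the cheapest operation the platform supports. */
enum toy_flush_op toy_lower_flush(const struct toy_flush_req *req,
                                  const struct toy_flush_caps *caps)
{
    bool supported;

    assert(req && caps);

    supported = req->space == TOY_SPACE_GVA ? caps->ranged_gva
                                            : caps->ranged_gpa;

    if ( !req->has_range || !supported || req->order > caps->max_order )
        return TOY_OP_FLUSH_ALL;

    return TOY_OP_FLUSH_RANGE;
}
```

Callers that do not know the address simply leave has_range clear, so
existing "flush everything" users would not need to change.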

Having a clearer model would definitely help clarify
p2m_tlb_flush_sync vs paging_flush_tlb (which sound like they do the
same thing, but actually don't).

> 
> All of that said, it would help to have a sketch of what you want the
> logic to look like in the end.
> 

Yes, that's something I planned to do.

> "flush guest VA space" and "flush guest PA space" originate from
> different actions.  VA flushes always originate from emulation of a guest
> action, whereas PA flushes originate from modifications to the P2M.  When
> shadow is in use, both of these escalate into a full local flush because of
> Xen's use of shadow linear mappings, but this escalation should be inside
> the shadow code, not the top level primitive.
>
> Have I missed anything else?
> 

Nested virtualization will also want clearly defined p2m semantics to
avoid falling into obscure corner cases.

> ~Andrew
> 


