We don't permit use of uncachable memory types elsewhere unless a domain
meets certain criteria. Enforce this also during registration of pinned
cache attribute ranges.
Furthermore restrict cache flushing to just uncachable range registration.
While there, also (mainly by calling memory_type_changed())
- take CPU self-snoop as well as IOMMU snoop into account (albeit the
  latter still is a global property rather than a per-domain one),
- avoid flushes when the domain isn't running yet (which ought to be the
  common case).
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
At the expense of yet larger a diff it would be possible to get away
without any "goto", by moving the whole "new entry" handling into the
switch(). Personally I'd prefer that, but the larger diff may be
unwelcome.
I have to admit that I can't spot what part of epte_get_entry_emt() the
comment refers to that is being deleted. The function does use
hvm_get_mem_pinned_cacheattr(), yes, but there's nothing there that talks
about cache flushes (and their avoiding) in any way.
Is it really sensible to add/remove ranges once the guest is already
running? (If it is, limiting the scope of the flush would be nice, but
would require knowing dirtiness for the domain wrt the caches, which
currently we don't track.)
This is kind of amending XSA-428.
---
v2: Use memory_type_changed() and conditionalize call to
    p2m_memory_type_changed().
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -582,6 +582,7 @@ int hvm_set_mem_pinned_cacheattr(struct
 {
     struct hvm_mem_pinned_cacheattr_range *range, *newr;
     unsigned int nr = 0;
+    bool flush = false;
     int rc = 1;
 
     if ( !is_hvm_domain(d) )
@@ -605,31 +606,35 @@ int hvm_set_mem_pinned_cacheattr(struct
 
                 type = range->type;
                 call_rcu(&range->rcu, free_pinned_cacheattr_entry);
-                p2m_memory_type_changed(d);
                 switch ( type )
                 {
-                case X86_MT_UCM:
+                case X86_MT_WB:
+                case X86_MT_WP:
+                case X86_MT_WT:
                     /*
-                     * For EPT we can also avoid the flush in this case;
-                     * see epte_get_entry_emt().
+                     * Flush since we don't know what the cachability is going
+                     * to be.
                      */
-                    if ( hap_enabled(d) && cpu_has_vmx )
-                case X86_MT_UC:
-                        break;
-                    /* fall through */
-                default:
-                    flush_all(FLUSH_CACHE);
+                    if ( is_iommu_enabled(d) || cache_flush_permitted(d) )
+                        flush = true;
                     break;
                 }
-                return 0;
+                rc = 0;
+                goto finish;
             }
         domain_unlock(d);
         return -ENOENT;
 
     case X86_MT_UCM:
     case X86_MT_UC:
-    case X86_MT_WB:
     case X86_MT_WC:
+        /* Flush since we don't know what the cachability was. */
+        if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
+            return -EPERM;
+        flush = true;
+        break;
+
+    case X86_MT_WB:
     case X86_MT_WP:
     case X86_MT_WT:
         break;
@@ -682,9 +687,11 @@ int hvm_set_mem_pinned_cacheattr(struct
 
     xfree(newr);
 
-    p2m_memory_type_changed(d);
-    if ( type != X86_MT_WB )
-        flush_all(FLUSH_CACHE);
+ finish:
+    if ( flush )
+        memory_type_changed(d);
+    else if ( d->vcpu && d->vcpu[0] )
+        p2m_memory_type_changed(d);
 
     return rc;
 }

On Wed, Jun 04, 2025 at 11:48:00AM +0200, Jan Beulich wrote:
> We don't permit use of uncachable memory types elsewhere unless a domain
> meets certain criteria. Enforce this also during registration of pinned
> cache attribute ranges.
> 
> Furthermore restrict cache flushing to just uncachable range registration.
> While there, also (mainly be calling memory_type_changed())
> - take CPU self-snoop as well as IOMMU snoop into account (albeit the
>   latter still is a global property rather than a per-domain one),
> - avoid flushes when the domain isn't running yet (which ought to be the
>   common case).
> 
> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> At the expense of yet larger a diff it would be possible to get away
> without any "goto", by moving the whole "new entry" handling into the
> switch(). Personally I'd prefer that, but the larger diff may be
> unwelcome.
> 
> I have to admit that I can't spot what part of epte_get_entry_emt() the
> comment refers to that is being deleted. The function does use
> hvm_get_mem_pinned_cacheattr(), yes, but there's nothing there that talks
> about cache flushes (and their avoiding) in any way.
> 
> Is it really sensible to add/remove ranges once the guest is already
> running? (If it is, limiting the scope of the flush would be nice, but
> would require knowing dirtyness for the domain wrt the caches, which
> currently we don't track.)
> 
> This is kind of amending XSA-428.
> ---
> v2: Use memory_type_changed() and conditionalize call to
>     p2m_memory_type_changed().
> 
> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -582,6 +582,7 @@ int hvm_set_mem_pinned_cacheattr(struct
>  {
>      struct hvm_mem_pinned_cacheattr_range *range, *newr;
>      unsigned int nr = 0;
> +    bool flush = false;
>      int rc = 1;
>  
>      if ( !is_hvm_domain(d) )
> @@ -605,31 +606,35 @@ int hvm_set_mem_pinned_cacheattr(struct
>  
>                  type = range->type;
>                  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
> -                p2m_memory_type_changed(d);
>                  switch ( type )
>                  {
> -                case X86_MT_UCM:
> +                case X86_MT_WB:
> +                case X86_MT_WP:
> +                case X86_MT_WT:
>                      /*
> -                     * For EPT we can also avoid the flush in this case;
> -                     * see epte_get_entry_emt().
> +                     * Flush since we don't know what the cachability is going
> +                     * to be.
>                       */
> -                    if ( hap_enabled(d) && cpu_has_vmx )
> -                case X86_MT_UC:
> -                        break;
> -                    /* fall through */
> -                default:
> -                    flush_all(FLUSH_CACHE);
> +                    if ( is_iommu_enabled(d) || cache_flush_permitted(d) )
> +                        flush = true;
Is the check here required?  memory_type_changed() will already check
for is_iommu_enabled() and cache_flush_permitted(), and hence you
could just set flush to true unconditionally here IMO.
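For reference, the gating I'm referring to is roughly the following (a
paraphrase of memory_type_changed() for this discussion, not the
verbatim implementation):

    void memory_type_changed(struct domain *d)
    {
        /*
         * Only act for domains that can meaningfully use non-WB memory
         * types and that are constructed far enough to have a vCPU; the
         * real function further limits the flush when snooping already
         * guarantees coherency.
         */
        if ( (is_iommu_enabled(d) || cache_flush_permitted(d)) &&
             d->vcpu && d->vcpu[0] )
        {
            p2m_memory_type_changed(d);
            flush_all(FLUSH_CACHE);
        }
    }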
>                      break;
>                  }
> -                return 0;
> +                rc = 0;
> +                goto finish;
>              }
>          domain_unlock(d);
>          return -ENOENT;
>  
>      case X86_MT_UCM:
>      case X86_MT_UC:
> -    case X86_MT_WB:
>      case X86_MT_WC:
> +        /* Flush since we don't know what the cachability was. */
> +        if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
> +            return -EPERM;
> +        flush = true;
> +        break;
> +
> +    case X86_MT_WB:
>      case X86_MT_WP:
>      case X86_MT_WT:
>          break;
> @@ -682,9 +687,11 @@ int hvm_set_mem_pinned_cacheattr(struct
>  
>      xfree(newr);
>  
> -    p2m_memory_type_changed(d);
> -    if ( type != X86_MT_WB )
> -        flush_all(FLUSH_CACHE);
> + finish:
> +    if ( flush )
> +        memory_type_changed(d);
> +    else if ( d->vcpu && d->vcpu[0] )
> +        p2m_memory_type_changed(d);
FWIW, I would just call memory_type_changed() unconditionally
regardless of the change.  We suspect the hypercall is only used at
domain creation time (where memory_type_changed() won't do a cache
flush anyway).
Thanks, Roger.

On 09.06.2025 12:36, Roger Pau Monné wrote:
> On Wed, Jun 04, 2025 at 11:48:00AM +0200, Jan Beulich wrote:
>> @@ -605,31 +606,35 @@ int hvm_set_mem_pinned_cacheattr(struct
>>  
>>                  type = range->type;
>>                  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
>> -                p2m_memory_type_changed(d);
>>                  switch ( type )
>>                  {
>> -                case X86_MT_UCM:
>> +                case X86_MT_WB:
>> +                case X86_MT_WP:
>> +                case X86_MT_WT:
>>                      /*
>> -                     * For EPT we can also avoid the flush in this case;
>> -                     * see epte_get_entry_emt().
>> +                     * Flush since we don't know what the cachability is going
>> +                     * to be.
>>                       */
>> -                    if ( hap_enabled(d) && cpu_has_vmx )
>> -                case X86_MT_UC:
>> -                        break;
>> -                    /* fall through */
>> -                default:
>> -                    flush_all(FLUSH_CACHE);
>> +                    if ( is_iommu_enabled(d) || cache_flush_permitted(d) )
>> +                        flush = true;
> 
> Is the check here required?  memory_type_changed() will already check
> for is_iommu_enabled() and cache_flush_permitted(), and hence you
> could just set flush to true unconditionally here IMO.
The behavioral difference is when both predicates are false: The way I have
it now, p2m_memory_type_changed() will then still be called (conditionally),
better matching prior behavior.
>>                      break;
>>                  }
>> -                return 0;
>> +                rc = 0;
>> +                goto finish;
>>              }
>>          domain_unlock(d);
>>          return -ENOENT;
>>  
>>      case X86_MT_UCM:
>>      case X86_MT_UC:
>> -    case X86_MT_WB:
>>      case X86_MT_WC:
>> +        /* Flush since we don't know what the cachability was. */
>> +        if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
>> +            return -EPERM;
>> +        flush = true;
>> +        break;
>> +
>> +    case X86_MT_WB:
>>      case X86_MT_WP:
>>      case X86_MT_WT:
>>          break;
>> @@ -682,9 +687,11 @@ int hvm_set_mem_pinned_cacheattr(struct
>>  
>>      xfree(newr);
>>  
>> -    p2m_memory_type_changed(d);
>> -    if ( type != X86_MT_WB )
>> -        flush_all(FLUSH_CACHE);
>> + finish:
>> +    if ( flush )
>> +        memory_type_changed(d);
>> +    else if ( d->vcpu && d->vcpu[0] )
>> +        p2m_memory_type_changed(d);
> 
> FWIW, I would just call memory_type_changed() unconditionally
> regardless of the change.
In which case the need for the "flush" local var would go away, if I
understand your suggestion correctly. Like above, there'll then be
more of a behavioral change than intended. In particular ...
>  We suspect the hypercall is only used at
> domain creation time (where memory_type_changed() won't do a cache
> flush anyway).
... "suspect" is not enough for my taste. The only alternative there
that I see (as mentioned in a post-commit-message remark) is to
refuse such "late" changes altogether. Yet for that we need to be
sure, which it looks like no-one of us is.
Jan

On Tue, Jun 10, 2025 at 09:40:38AM +0200, Jan Beulich wrote:
> On 09.06.2025 12:36, Roger Pau Monné wrote:
> > On Wed, Jun 04, 2025 at 11:48:00AM +0200, Jan Beulich wrote:
> >> @@ -605,31 +606,35 @@ int hvm_set_mem_pinned_cacheattr(struct
> >>  
> >>                  type = range->type;
> >>                  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
> >> -                p2m_memory_type_changed(d);
> >>                  switch ( type )
> >>                  {
> >> -                case X86_MT_UCM:
> >> +                case X86_MT_WB:
> >> +                case X86_MT_WP:
> >> +                case X86_MT_WT:
> >>                      /*
> >> -                     * For EPT we can also avoid the flush in this case;
> >> -                     * see epte_get_entry_emt().
> >> +                     * Flush since we don't know what the cachability is going
> >> +                     * to be.
> >>                       */
> >> -                    if ( hap_enabled(d) && cpu_has_vmx )
> >> -                case X86_MT_UC:
> >> -                        break;
> >> -                    /* fall through */
> >> -                default:
> >> -                    flush_all(FLUSH_CACHE);
> >> +                    if ( is_iommu_enabled(d) || cache_flush_permitted(d) )
> >> +                        flush = true;
> > 
> > Is the check here required?  memory_type_changed() will already check
> > for is_iommu_enabled() and cache_flush_permitted(), and hence you
> > could just set flush to true unconditionally here IMO.
> 
> The behavioral difference is when both predicates are false: The way I have
> it now, p2m_memory_type_changed() will then still be called (conditionally),
> better matching prior behavior.
I see.  Yes, p2m_memory_type_changed() needs to be called.
> 
> >>                      break;
> >>                  }
> >> -                return 0;
> >> +                rc = 0;
> >> +                goto finish;
> >>              }
> >>          domain_unlock(d);
> >>          return -ENOENT;
> >>  
> >>      case X86_MT_UCM:
> >>      case X86_MT_UC:
> >> -    case X86_MT_WB:
> >>      case X86_MT_WC:
> >> +        /* Flush since we don't know what the cachability was. */
> >> +        if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
> >> +            return -EPERM;
When assigning IO resources without an IOMMU enabled we likely need
to allow the pinned cache attributes to be set, but there's no need to
propagate the changes to the p2m, as the EMT calculation won't take
into account the pinned attributes.
IOW: I don't think we can safely short-circuit and return -EPERM here
without agreeing that it's a behavioral difference from the previous
implementation.
> >> +        flush = true;
> >> +        break;
> >> +
> >> +    case X86_MT_WB:
> >>      case X86_MT_WP:
> >>      case X86_MT_WT:
> >>          break;
> >> @@ -682,9 +687,11 @@ int hvm_set_mem_pinned_cacheattr(struct
> >>  
> >>      xfree(newr);
> >>  
> >> -    p2m_memory_type_changed(d);
> >> -    if ( type != X86_MT_WB )
> >> -        flush_all(FLUSH_CACHE);
> >> + finish:
> >> +    if ( flush )
> >> +        memory_type_changed(d);
> >> +    else if ( d->vcpu && d->vcpu[0] )
> >> +        p2m_memory_type_changed(d);
> > 
> > FWIW, I would just call memory_type_changed() unconditionally
> > regardless of the change.
> 
> In which case the need for the "flush" local var would go away, if I
> understand your suggestion correctly. Like above, there'll then be
> more of a behavioral change than intended. In particular ...
There will be a behavioral change, but not one that the guest would
notice IMO.
> >  We suspect the hypercall is only used at
> > domain creation time (where memory_type_changed() won't do a cache
> > flush anyway).
> 
> ... "suspect" is not enough for my taste. The only alternative there
> that I see (as mentioned in a post-commit-message remark) is to
> refuse such "late" changes altogether. Yet for that we need to be
> sure, which it looks like no-one of us is.
Why do you say only alternative?
Calling memory_type_changed() unconditionally (without taking into
account the previous or new cache attributes) would also be an
acceptable solution, that might widen the cache flushing a bit, but
would still be correct and much simpler IMO.
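To illustrate, the tail of hvm_set_mem_pinned_cacheattr() could then
shrink to something like this (just a sketch of the suggestion, assuming
the rest of the function stays as in your patch and the "flush" local is
dropped):

 finish:
    /* Let memory_type_changed() decide whether any flush is needed. */
    memory_type_changed(d);

    return rc;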
Thanks, Roger.

On 10.06.2025 12:44, Roger Pau Monné wrote:
> On Tue, Jun 10, 2025 at 09:40:38AM +0200, Jan Beulich wrote:
>> On 09.06.2025 12:36, Roger Pau Monné wrote:
>>> On Wed, Jun 04, 2025 at 11:48:00AM +0200, Jan Beulich wrote:
>>>> @@ -605,31 +606,35 @@ int hvm_set_mem_pinned_cacheattr(struct
>>>>  
>>>>                  type = range->type;
>>>>                  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
>>>> -                p2m_memory_type_changed(d);
>>>>                  switch ( type )
>>>>                  {
>>>> -                case X86_MT_UCM:
>>>> +                case X86_MT_WB:
>>>> +                case X86_MT_WP:
>>>> +                case X86_MT_WT:
>>>>                      /*
>>>> -                     * For EPT we can also avoid the flush in this case;
>>>> -                     * see epte_get_entry_emt().
>>>> +                     * Flush since we don't know what the cachability is going
>>>> +                     * to be.
>>>>                       */
>>>> -                    if ( hap_enabled(d) && cpu_has_vmx )
>>>> -                case X86_MT_UC:
>>>> -                        break;
>>>> -                    /* fall through */
>>>> -                default:
>>>> -                    flush_all(FLUSH_CACHE);
>>>> +                    if ( is_iommu_enabled(d) || cache_flush_permitted(d) )
>>>> +                        flush = true;
>>>
>>> Is the check here required?  memory_type_changed() will already check
>>> for is_iommu_enabled() and cache_flush_permitted(), and hence you
>>> could just set flush to true unconditionally here IMO.
>>
>> The behavioral difference is when both predicates are false: The way I have
>> it now, p2m_memory_type_changed() will then still be called (conditionally),
>> better matching prior behavior.
> 
> I see.  Yes, p2m_memory_type_changed() needs to be called.
> 
>>
>>>>                      break;
>>>>                  }
>>>> -                return 0;
>>>> +                rc = 0;
>>>> +                goto finish;
>>>>              }
>>>>          domain_unlock(d);
>>>>          return -ENOENT;
>>>>  
>>>>      case X86_MT_UCM:
>>>>      case X86_MT_UC:
>>>> -    case X86_MT_WB:
>>>>      case X86_MT_WC:
>>>> +        /* Flush since we don't know what the cachability was. */
>>>> +        if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
>>>> +            return -EPERM;
> 
> When assigning IO resources without an IOMMU enabled we likely need
> to allow the pinned cache attributes to be set, but there's no need to
> propagate the changes to the p2m, as the EMT calculation won't take
> into account the pinned attributes.
Why would it not do so? Am I overlooking a conditional there that would
cause hvm_get_mem_pinned_cacheattr() to not be called? The only related
one I see is
    if ( type != p2m_mmio_direct && !is_iommu_enabled(d) &&
         !cache_flush_permitted(d) )
covering the without-IOMMU case just the same as the "with" one. (The
"without" case looks dubious to me, as I don't think we arrange for
any identity mapping, but that's a separate topic.)
> IOW: I don't think we can safely short-circuit and return -EPERM here
> without agreeing that it's a behavioral difference form the previous
> implementation.
There's no question there is a behavioral change here. Without I/O
resources (and without IOMMU) we simply don't accept cache attributes
other than WB elsewhere; the change is to avoid doing so here as well,
to get things to be consistent. Hence the -EPERM return.
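(For completeness, the fuller shape of that conditional in
epte_get_entry_emt() is, as far as I recall - treat the exact *ipat
handling as an approximation from memory:

    if ( type != p2m_mmio_direct && !is_iommu_enabled(d) &&
         !cache_flush_permitted(d) )
    {
        *ipat = true;
        return X86_MT_WB;
    }

i.e. such a domain is forced to WB there no matter what was pinned,
which is what the -EPERM here is meant to align with.)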
>>>> @@ -682,9 +687,11 @@ int hvm_set_mem_pinned_cacheattr(struct
>>>>  
>>>>      xfree(newr);
>>>>  
>>>> -    p2m_memory_type_changed(d);
>>>> -    if ( type != X86_MT_WB )
>>>> -        flush_all(FLUSH_CACHE);
>>>> + finish:
>>>> +    if ( flush )
>>>> +        memory_type_changed(d);
>>>> +    else if ( d->vcpu && d->vcpu[0] )
>>>> +        p2m_memory_type_changed(d);
>>>
>>> FWIW, I would just call memory_type_changed() unconditionally
>>> regardless of the change.
>>
>> In which case the need for the "flush" local var would go away, if I
>> understand your suggestion correctly. Like above, there'll then be
>> more of a behavioral change than intended. In particular ...
> 
> There will be a behavioral change, but not one that the guest would
> notice IMO.
> 
>>>  We suspect the hypercall is only used at
>>> domain creation time (where memory_type_changed() won't do a cache
>>> flush anyway).
>>
>> ... "suspect" is not enough for my taste. The only alternative there
>> that I see (as mentioned in a post-commit-message remark) is to
>> refuse such "late" changes altogether. Yet for that we need to be
>> sure, which it looks like no-one of us is.
> 
> Why do you say only alternative?
Oh, sorry, I meant "only" just in regard to options keeping the main
code structure of the change. I agree ...
> Calling memory_type_changed() unconditionally (without taking into
> account the previous or new cache attributes) would also be an
> acceptable solution, that might wide the cache flushing a bit, but
> would still be correct and much simpler IMO.
... that this, too, is a possibility. It would, however, go against the
stated purpose of the change (in the subject "... as well as associated
flushing"), which - after all - was the main goal here, seeing the
series this was originally part of.
Jan

On Tue, Jun 10, 2025 at 01:59:52PM +0200, Jan Beulich wrote:
> On 10.06.2025 12:44, Roger Pau Monné wrote:
> > On Tue, Jun 10, 2025 at 09:40:38AM +0200, Jan Beulich wrote:
> >> On 09.06.2025 12:36, Roger Pau Monné wrote:
> >>> On Wed, Jun 04, 2025 at 11:48:00AM +0200, Jan Beulich wrote:
> >>>> @@ -605,31 +606,35 @@ int hvm_set_mem_pinned_cacheattr(struct
> >>>>  
> >>>>                  type = range->type;
> >>>>                  call_rcu(&range->rcu, free_pinned_cacheattr_entry);
> >>>> -                p2m_memory_type_changed(d);
> >>>>                  switch ( type )
> >>>>                  {
> >>>> -                case X86_MT_UCM:
> >>>> +                case X86_MT_WB:
> >>>> +                case X86_MT_WP:
> >>>> +                case X86_MT_WT:
> >>>>                      /*
> >>>> -                     * For EPT we can also avoid the flush in this case;
> >>>> -                     * see epte_get_entry_emt().
> >>>> +                     * Flush since we don't know what the cachability is going
> >>>> +                     * to be.
> >>>>                       */
> >>>> -                    if ( hap_enabled(d) && cpu_has_vmx )
> >>>> -                case X86_MT_UC:
> >>>> -                        break;
> >>>> -                    /* fall through */
> >>>> -                default:
> >>>> -                    flush_all(FLUSH_CACHE);
> >>>> +                    if ( is_iommu_enabled(d) || cache_flush_permitted(d) )
> >>>> +                        flush = true;
> >>>
> >>> Is the check here required?  memory_type_changed() will already check
> >>> for is_iommu_enabled() and cache_flush_permitted(), and hence you
> >>> could just set flush to true unconditionally here IMO.
> >>
> >> The behavioral difference is when both predicates are false: The way I have
> >> it now, p2m_memory_type_changed() will then still be called (conditionally),
> >> better matching prior behavior.
> > 
> > I see.  Yes, p2m_memory_type_changed() needs to be called.
This is complicated.  Looking at memory_type_changed() it's true that
p2m_memory_type_changed() will only be called when
cache_flush_permitted() == true, however by cache_flush_permitted()
returning false we could also imply that there are no p2m_mmio_direct
entries in the p2m, and hence effectively nothing to flush?
> >>
> >>>>                      break;
> >>>>                  }
> >>>> -                return 0;
> >>>> +                rc = 0;
> >>>> +                goto finish;
> >>>>              }
> >>>>          domain_unlock(d);
> >>>>          return -ENOENT;
> >>>>  
> >>>>      case X86_MT_UCM:
> >>>>      case X86_MT_UC:
> >>>> -    case X86_MT_WB:
> >>>>      case X86_MT_WC:
> >>>> +        /* Flush since we don't know what the cachability was. */
> >>>> +        if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
> >>>> +            return -EPERM;
> > 
> > When assigning IO resources without an IOMMU enabled we likely need
> > to allow the pinned cache attributes to be set, but there's no need to
> > propagate the changes to the p2m, as the EMT calculation won't take
> > into account the pinned attributes.
> 
> Why would it not do so? Am I overlooking a conditional there that would
> cause hvm_get_mem_pinned_cacheattr() to not be called? The only related
> one I see is
> 
>     if ( type != p2m_mmio_direct && !is_iommu_enabled(d) &&
>          !cache_flush_permitted(d) )
Sorry, what I wrote is nonsense.  I wanted to note that it should be
possible to set cache attributes ahead of assigning IO resources, and
so returning -EPERM might alter existing users of the interface.
However setting cache attributes ahead of having actually assigned IO
resources would seem like out-of-order operation logic, and hence
there's no point in supporting it.
> covering the without-IOMMU case just the same as the "with" one. (The
> "without" case looks dubious to me, as I don't think we arrange for
> any identity mapping, but that's a separate topic.)
> 
> > IOW: I don't think we can safely short-circuit and return -EPERM here
> > without agreeing that it's a behavioral difference form the previous
> > implementation.
> 
> There's no question there is a behavioral change here. Without I/O
> resources (and without IOMMU) we simply don't accept cache attributes
> other then WB elsewhere; the change is to avoid doing so here as well,
> to get things to be consistent. Hence the -EPERM return.
> 
> >>>> @@ -682,9 +687,11 @@ int hvm_set_mem_pinned_cacheattr(struct
> >>>>  
> >>>>      xfree(newr);
> >>>>  
> >>>> -    p2m_memory_type_changed(d);
> >>>> -    if ( type != X86_MT_WB )
> >>>> -        flush_all(FLUSH_CACHE);
> >>>> + finish:
> >>>> +    if ( flush )
> >>>> +        memory_type_changed(d);
> >>>> +    else if ( d->vcpu && d->vcpu[0] )
> >>>> +        p2m_memory_type_changed(d);
> >>>
> >>> FWIW, I would just call memory_type_changed() unconditionally
> >>> regardless of the change.
> >>
> >> In which case the need for the "flush" local var would go away, if I
> >> understand your suggestion correctly. Like above, there'll then be
> >> more of a behavioral change than intended. In particular ...
> > 
> > There will be a behavioral change, but not one that the guest would
> > notice IMO.
> > 
> >>>  We suspect the hypercall is only used at
> >>> domain creation time (where memory_type_changed() won't do a cache
> >>> flush anyway).
> >>
> >> ... "suspect" is not enough for my taste. The only alternative there
> >> that I see (as mentioned in a post-commit-message remark) is to
> >> refuse such "late" changes altogether. Yet for that we need to be
> >> sure, which it looks like no-one of us is.
> > 
> > Why do you say only alternative?
> 
> Oh, sorry, I meant "only" just in regard to options keeping the main
> code structure of the change. I agree ...
> 
> > Calling memory_type_changed() unconditionally (without taking into
> > account the previous or new cache attributes) would also be an
> > acceptable solution, that might wide the cache flushing a bit, but
> > would still be correct and much simpler IMO.
> 
> ... that this, too, is a possibility. It would, however, go against the
> stated purpose of the change (in the subject "... as well as associated
> flushing"), which - after all - was the main goal here, seeing the
> series this was originally part of.
Given the recently added logic in memory_type_changed() to limit the
flushing, I'm not sure it would go against the patch subject.  Just
using memory_type_changed() as-is will already prevent flushing when
the guest is not yet started, and would also limit the flushing to the
cases where snooping accesses cannot be forced from the IOMMU.
It will most likely result in a reduction of the flushing, even if
the attribute type checking was removed.
IMO the added complexity here is not worth the performance
improvement, not without a clear justification that it's needed.
Thanks, Roger.

On 10.06.2025 15:24, Roger Pau Monné wrote:
> IMO the added complexity here is not worth the performance
> improvement, not without a clear justification that it's needed.

Well, okay, I'll simply consider the patch in this shape rejected then.
I don't see much value in wasting further time on it.

Jan

On Tue, Jun 10, 2025 at 04:13:40PM +0200, Jan Beulich wrote:
> On 10.06.2025 15:24, Roger Pau Monné wrote:
> > IMO the added complexity here is not worth the performance
> > improvement, not without a clear justification that it's needed.
>
> Well, okay, I'll simply consider the patch in this shape rejected then.
> I don't see much value in wasting further time on it.

The code is already there, so I think there's value in this patch.
Did you see my suggestion in the email you are replying to about not
needing to add the is_iommu_enabled(d) || cache_flush_permitted(d)
checks?

With that dropped (if it's indeed OK), I would be fine with Acking the
patch.

Thanks, Roger.

On 10.06.2025 17:54, Roger Pau Monné wrote:
> On Tue, Jun 10, 2025 at 04:13:40PM +0200, Jan Beulich wrote:
>> On 10.06.2025 15:24, Roger Pau Monné wrote:
>>> IMO the added complexity here is not worth the performance
>>> improvement, not without a clear justification that it's needed.
>>
>> Well, okay, I'll simply consider the patch in this shape rejected then.
>> I don't see much value in wasting further time on it.
>
> The code is already there, so I think there's value in this patch.
> Did you see my suggestion in the email you are replying to about not
> needing to add the is_iommu_enabled(d) || cache_flush_permitted(d)
> checks?

I did, but it didn't look like we were able to come to an agreement.

> With that dropped (if it's indeed OK), I would be fine with Acking the
> patch.

I guess I'll make one more try then, removing the instance where the
setting of the "flush" local is made conditional, but retaining the
-EPERM return path as was.

Jan