[PATCH v4] memory: Optimize replay of guest mapping

Zhenzhong Duan posted 1 patch 1 year ago
Failed in applying to current master (apply log)
There is a newer version of this series
hw/i386/intel_iommu.c |  2 +-
include/exec/memory.h | 13 +++++++------
softmmu/memory.c      |  5 +++--
3 files changed, 11 insertions(+), 9 deletions(-)
[PATCH v4] memory: Optimize replay of guest mapping
Posted by Zhenzhong Duan 1 year ago
On x86, there are two notifiers registered due to vtd-ir memory
region splitting the entire address space. During replay of the
address space for each notifier, the whole address space is
scanned which is unnecessary. We only need to scan the space
belong to notifier monitored space.

While on x86 IOMMU memory region spans over entire address space,
but on some other platforms(e.g. arm mps3-an547), IOMMU memory
region is only a window in the entire address space. User could
register a notifier with arbitrary scope beyond IOMMU memory
region. Though in current implementation replay is only triggered
by VFIO and dirty page sync with notifiers derived from memory
region section, but this isn't guaranteed in the future.

So, we replay the intersection part of IOMMU memory region and
notifier's scope in memory_region_iommu_replay(). Update doc
comment to match this change.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
v4: Update doc comment per Perter Maydell
v3: Fix assert failure on mps3-an547, fix typos in log per Michael
v2: Add an assert per Peter Xu
Tested on x86 with a net card passed to guest(kvm/tcg), ping/ssh pass.
Also did simple bootup test with mps3-an547

 hw/i386/intel_iommu.c |  2 +-
 include/exec/memory.h | 13 +++++++------
 softmmu/memory.c      |  5 +++--
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a62896759c78..faade7def867 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3850,7 +3850,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
                 .domain_id = vtd_get_domain_id(s, &ce, vtd_as->pasid),
             };
 
-            vtd_page_walk(s, &ce, 0, ~0ULL, &info, vtd_as->pasid);
+            vtd_page_walk(s, &ce, n->start, n->end, &info, vtd_as->pasid);
         }
     } else {
         trace_vtd_replay_ce_invalid(bus_n, PCI_SLOT(vtd_as->devfn),
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 15ade918baa4..61da32d8a428 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -425,12 +425,13 @@ struct IOMMUMemoryRegionClass {
      * Called to handle memory_region_iommu_replay().
      *
      * The default implementation of memory_region_iommu_replay() is to
-     * call the IOMMU translate method for every page in the address space
+     * call the IOMMU translate method for every page falling in the
+     * intersection part of IOMMU memory region and notifier's scope
      * with flag == IOMMU_NONE and then call the notifier if translate
      * returns a valid mapping. If this method is implemented then it
      * overrides the default behaviour, and must provide the full semantics
      * of memory_region_iommu_replay(), by calling @notifier for every
-     * translation present in the IOMMU.
+     * translation present in the intersection part.
      *
      * Optional method -- an IOMMU only needs to provide this method
      * if the default is inefficient or produces undesirable side effects.
@@ -1760,9 +1761,10 @@ int memory_region_register_iommu_notifier(MemoryRegion *mr,
                                           IOMMUNotifier *n, Error **errp);
 
 /**
- * memory_region_iommu_replay: replay existing IOMMU translations to
+ * memory_region_iommu_replay: replay existing IOMMU translations in
+ * intersection part of IOMMU memory region and notifier's scope to
  * a notifier with the minimum page granularity returned by
- * mr->iommu_ops->get_page_size().
+ * memory_region_iommu_get_min_page_size().
  *
  * Note: this is not related to record-and-replay functionality.
  *
@@ -1775,8 +1777,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n);
  * memory_region_unregister_iommu_notifier: unregister a notifier for
  * changes to IOMMU translation entries.
  *
- * @mr: the memory region which was observed and for which notity_stopped()
- *      needs to be called
+ * @mr: the memory region which was observed.
  * @n: the notifier to be removed.
  */
 void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
diff --git a/softmmu/memory.c b/softmmu/memory.c
index b1a6cae6f583..f7af691991de 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1925,7 +1925,7 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 {
     MemoryRegion *mr = MEMORY_REGION(iommu_mr);
     IOMMUMemoryRegionClass *imrc = IOMMU_MEMORY_REGION_GET_CLASS(iommu_mr);
-    hwaddr addr, granularity;
+    hwaddr addr, end, granularity;
     IOMMUTLBEntry iotlb;
 
     /* If the IOMMU has its own replay callback, override */
@@ -1935,8 +1935,9 @@ void memory_region_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
     }
 
     granularity = memory_region_iommu_get_min_page_size(iommu_mr);
+    end = MIN(n->end, memory_region_size(mr));
 
-    for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
+    for (addr = n->start; addr < end; addr += granularity) {
         iotlb = imrc->translate(iommu_mr, addr, IOMMU_NONE, n->iommu_idx);
         if (iotlb.perm != IOMMU_NONE) {
             n->notify(n, &iotlb);
-- 
2.25.1
Re: [PATCH v4] memory: Optimize replay of guest mapping
Posted by Peter Maydell 1 year ago
On Wed, 19 Apr 2023 at 11:41, Zhenzhong Duan <zhenzhong.duan@intel.com> wrote:
>
> On x86, there are two notifiers registered due to vtd-ir memory
> region splitting the entire address space. During replay of the
> address space for each notifier, the whole address space is
> scanned which is unnecessary. We only need to scan the space
> belong to notifier monitored space.
>
> While on x86 IOMMU memory region spans over entire address space,
> but on some other platforms(e.g. arm mps3-an547), IOMMU memory
> region is only a window in the entire address space. User could
> register a notifier with arbitrary scope beyond IOMMU memory
> region. Though in current implementation replay is only triggered
> by VFIO and dirty page sync with notifiers derived from memory
> region section, but this isn't guaranteed in the future.
>
> So, we replay the intersection part of IOMMU memory region and
> notifier's scope in memory_region_iommu_replay(). Update doc
> comment to match this change.

Hi; I have a couple of minor wording tweaks, and one question
about the docs:

> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 15ade918baa4..61da32d8a428 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -425,12 +425,13 @@ struct IOMMUMemoryRegionClass {
>       * Called to handle memory_region_iommu_replay().
>       *
>       * The default implementation of memory_region_iommu_replay() is to
> -     * call the IOMMU translate method for every page in the address space
> +     * call the IOMMU translate method for every page falling in the
> +     * intersection part of IOMMU memory region and notifier's scope

"falling in the intersection of the IOMMU MemoryRegion and the
MemoryRegion which the notifier was registered for"

>       * with flag == IOMMU_NONE and then call the notifier if translate
>       * returns a valid mapping. If this method is implemented then it
>       * overrides the default behaviour, and must provide the full semantics
>       * of memory_region_iommu_replay(), by calling @notifier for every
> -     * translation present in the IOMMU.
> +     * translation present in the intersection part.

"present in the IOMMU that is within the MemoryRegion the
notifier was registered for."

Question: is it OK for an implementation of this method to call
the notifier for translations that are in the IOMMU and which
are not in the scope of the notifier (ie which are outside
the intersection) ? Or must it specifically restrict itself
to only calling the notifier for translations which are
inside the notifier's range ?

If the latter, we need to check all 4 existing implementations
of this method to ensure that they are not sending notifications
they should not; if the former, we should document that
implementations have that leeway.

thanks
-- PMM
RE: [PATCH v4] memory: Optimize replay of guest mapping
Posted by Duan, Zhenzhong 1 year ago

>-----Original Message-----
>From: Peter Maydell <peter.maydell@linaro.org>
>Sent: Thursday, April 20, 2023 12:10 AM
>To: Duan, Zhenzhong <zhenzhong.duan@intel.com>
>Cc: qemu-devel@nongnu.org; mst@redhat.com; peterx@redhat.com;
>jasowang@redhat.com; marcel.apfelbaum@gmail.com;
>pbonzini@redhat.com; richard.henderson@linaro.org; eduardo@habkost.net;
>david@redhat.com; philmd@linaro.org; Peng, Chao P
><chao.p.peng@intel.com>
>Subject: Re: [PATCH v4] memory: Optimize replay of guest mapping
>
>On Wed, 19 Apr 2023 at 11:41, Zhenzhong Duan <zhenzhong.duan@intel.com>
>wrote:
>>
>> On x86, there are two notifiers registered due to vtd-ir memory region
>> splitting the entire address space. During replay of the address space
>> for each notifier, the whole address space is scanned which is
>> unnecessary. We only need to scan the space belong to notifier
>> monitored space.
>>
>> While on x86 IOMMU memory region spans over entire address space, but
>> on some other platforms(e.g. arm mps3-an547), IOMMU memory region is
>> only a window in the entire address space. User could register a
>> notifier with arbitrary scope beyond IOMMU memory region. Though in
>> current implementation replay is only triggered by VFIO and dirty page
>> sync with notifiers derived from memory region section, but this isn't
>> guaranteed in the future.
>>
>> So, we replay the intersection part of IOMMU memory region and
>> notifier's scope in memory_region_iommu_replay(). Update doc comment
>> to match this change.
>
>Hi; I have a couple of minor wording tweaks, and one question about the docs:
>
>> diff --git a/include/exec/memory.h b/include/exec/memory.h index
>> 15ade918baa4..61da32d8a428 100644
>> --- a/include/exec/memory.h
>> +++ b/include/exec/memory.h
>> @@ -425,12 +425,13 @@ struct IOMMUMemoryRegionClass {
>>       * Called to handle memory_region_iommu_replay().
>>       *
>>       * The default implementation of memory_region_iommu_replay() is to
>> -     * call the IOMMU translate method for every page in the address space
>> +     * call the IOMMU translate method for every page falling in the
>> +     * intersection part of IOMMU memory region and notifier's scope
>
>"falling in the intersection of the IOMMU MemoryRegion and the
>MemoryRegion which the notifier was registered for"
>
>>       * with flag == IOMMU_NONE and then call the notifier if translate
>>       * returns a valid mapping. If this method is implemented then it
>>       * overrides the default behaviour, and must provide the full semantics
>>       * of memory_region_iommu_replay(), by calling @notifier for every
>> -     * translation present in the IOMMU.
>> +     * translation present in the intersection part.
>
>"present in the IOMMU that is within the MemoryRegion the notifier was
>registered for."
>
>Question: is it OK for an implementation of this method to call the notifier for
>translations that are in the IOMMU and which are not in the scope of the
>notifier (ie which are outside the intersection) ? Or must it specifically restrict
>itself to only calling the notifier for translations which are inside the notifier's
>range ?
In the calling path to notifier->notify(), memory_region_notify_iommu_one()
ensures passing an IOMMUTLBEntry in the notifier's range.
I think it's ok for an implementation of replay() to walk over the entire
IOMMU MemoryRegion because unrelated entries are filtered as above.
It's just less optimistic which this patch try to address for intel_iommu.

>
>If the latter, we need to check all 4 existing implementations of this method to
>ensure that they are not sending notifications they should not; if the former,
>we should document that implementations have that leeway.
I think the latter and memory_region_notify_iommu_one() already ensure it.

Thanks
Zhenzhong
Re: [PATCH v4] memory: Optimize replay of guest mapping
Posted by Peter Maydell 1 year ago
On Thu, 20 Apr 2023 at 04:13, Duan, Zhenzhong <zhenzhong.duan@intel.com> wrote:
> >From: Peter Maydell <peter.maydell@linaro.org>

> >Question: is it OK for an implementation of this method to call the notifier for
> >translations that are in the IOMMU and which are not in the scope of the
> >notifier (ie which are outside the intersection) ? Or must it specifically restrict
> >itself to only calling the notifier for translations which are inside the notifier's
> >range ?
> In the calling path to notifier->notify(), memory_region_notify_iommu_one()
> ensures passing an IOMMUTLBEntry in the notifier's range.
> I think it's ok for an implementation of replay() to walk over the entire
> IOMMU MemoryRegion because unrelated entries are filtered as above.
> It's just less optimistic which this patch try to address for intel_iommu.

OK, great.

-- PMM
Re: [PATCH v4] memory: Optimize replay of guest mapping
Posted by Peter Xu 1 year ago
On Wed, Apr 19, 2023 at 06:29:40PM +0800, Zhenzhong Duan wrote:
> On x86, there are two notifiers registered due to vtd-ir memory
> region splitting the entire address space. During replay of the
> address space for each notifier, the whole address space is
> scanned which is unnecessary. We only need to scan the space
> belong to notifier monitored space.
> 
> While on x86 IOMMU memory region spans over entire address space,
> but on some other platforms(e.g. arm mps3-an547), IOMMU memory
> region is only a window in the entire address space. User could
> register a notifier with arbitrary scope beyond IOMMU memory
> region. Though in current implementation replay is only triggered
> by VFIO and dirty page sync with notifiers derived from memory
> region section, but this isn't guaranteed in the future.
> 
> So, we replay the intersection part of IOMMU memory region and
> notifier's scope in memory_region_iommu_replay(). Update doc
> comment to match this change.
> 
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

-- 
Peter Xu