[Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables

Paul Durrant posted 6 patches 6 years, 6 months ago
[Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Paul Durrant 6 years, 6 months ago
Now that there is a per-domain IOMMU enable flag, which should be enabled if
any device is going to be passed through, stop deferring page table
construction until the assignment is done. Also don't tear down the tables
again when the last device is de-assigned; defer that task until domain
destruction.

This allows the has_iommu_pt() helper and iommu_status enumeration to be
removed. Calls to has_iommu_pt() are simply replaced by calls to
is_iommu_enabled(). Remaining open-code tests of iommu_hap_pt_share can also
be replaced by calls to iommu_use_hap_pt().
The arch_iommu_populate_page_table() and iommu_construct() functions become
redundant, as does the 'strict mode' dom0 page_list mapping code in
iommu_hwdom_init(), and iommu_teardown() can be made static is its only
remaining caller, iommu_domain_destroy(), is within the same source
module.

All in all, about 220 lines of code are rmeoved.

NOTE: This patch will cause a small amount of extra resource to be used
      to accommodate IOMMU page tables that may never be used, since the
      per-domain IOMMU flag enable flag is currently set to the value
      of the global iommu_enable flag. A subsequent patch will add an
      option to the toolstack to allow it to be turned off if there is
      no intention to assign passthrough hardware to the domain.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wl@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Alexandru Isaila <aisaila@bitdefender.com>
Cc: Petre Pircalabu <ppircalabu@bitdefender.com>
---
 xen/arch/arm/p2m.c                    |   2 +-
 xen/arch/x86/dom0_build.c             |   2 +-
 xen/arch/x86/hvm/mtrr.c               |   5 +-
 xen/arch/x86/mm/mem_sharing.c         |   2 +-
 xen/arch/x86/mm/paging.c              |   2 +-
 xen/arch/x86/x86_64/mm.c              |   2 +-
 xen/common/memory.c                   |   4 +-
 xen/common/vm_event.c                 |   2 +-
 xen/drivers/passthrough/device_tree.c |  11 ---
 xen/drivers/passthrough/iommu.c       | 134 ++++++--------------------
 xen/drivers/passthrough/pci.c         |  12 ---
 xen/drivers/passthrough/vtd/iommu.c   |  10 +-
 xen/drivers/passthrough/x86/iommu.c   |  95 ------------------
 xen/include/asm-arm/iommu.h           |   2 +-
 xen/include/asm-x86/iommu.h           |   2 +-
 xen/include/xen/iommu.h               |  16 ---
 xen/include/xen/sched.h               |   2 -
 17 files changed, 42 insertions(+), 263 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index c5cea25caa..b091c64e1e 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1056,7 +1056,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
          !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
         p2m_free_entry(p2m, orig_pte, level);
 
-    if ( has_iommu_pt(p2m->domain) &&
+    if ( is_iommu_enabled(p2m->domain) &&
          (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
     {
         unsigned int flush_flags = 0;
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index d381784edd..7cfab2dc25 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -365,7 +365,7 @@ unsigned long __init dom0_compute_nr_pages(
     }
 
     need_paging = is_hvm_domain(d) &&
-        (!iommu_hap_pt_share || !paging_mode_hap(d));
+        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));
     for ( ; ; need_paging = false )
     {
         nr_pages = get_memsize(&dom0_size, avail);
diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index 7ccd85bcea..829b089e79 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -783,7 +783,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr, 1,
 
 void memory_type_changed(struct domain *d)
 {
-    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
+    if ( (is_iommu_enabled(d) || cache_flush_permitted(d)) && d->vcpu &&
+         d->vcpu[0] )
     {
         p2m_memory_type_changed(d);
         flush_all(FLUSH_CACHE);
@@ -831,7 +832,7 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
         return MTRR_TYPE_UNCACHABLE;
     }
 
-    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
+    if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
     {
         *ipat = 1;
         return MTRR_TYPE_WRBACK;
diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
index a5fe89e339..efb8821768 100644
--- a/xen/arch/x86/mm/mem_sharing.c
+++ b/xen/arch/x86/mm/mem_sharing.c
@@ -1664,7 +1664,7 @@ int mem_sharing_domctl(struct domain *d, struct xen_domctl_mem_sharing_op *mec)
         case XEN_DOMCTL_MEM_SHARING_CONTROL:
         {
             rc = 0;
-            if ( unlikely(has_iommu_pt(d) && mec->u.enable) )
+            if ( unlikely(is_iommu_enabled(d) && mec->u.enable) )
                 rc = -EXDEV;
             else
                 d->arch.hvm.mem_sharing_enabled = mec->u.enable;
diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index 69aa228e46..d9a52c4db4 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -213,7 +213,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global)
 {
     int ret;
 
-    if ( has_iommu_pt(d) && log_global )
+    if ( is_iommu_enabled(d) && log_global )
     {
         /*
          * Refuse to turn on global log-dirty mode
diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
index 1919cae18b..827b3f5e27 100644
--- a/xen/arch/x86/x86_64/mm.c
+++ b/xen/arch/x86/x86_64/mm.c
@@ -1433,7 +1433,7 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
      * shared or being kept in sync then newly added memory needs to be
      * mapped here.
      */
-    if ( has_iommu_pt(hardware_domain) &&
+    if ( is_iommu_enabled(hardware_domain) &&
          !iommu_use_hap_pt(hardware_domain) &&
          !need_iommu_pt_sync(hardware_domain) )
     {
diff --git a/xen/common/memory.c b/xen/common/memory.c
index d9b35a608c..71445c2f53 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -823,7 +823,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
     xatp->gpfn += start;
     xatp->size -= start;
 
-    if ( has_iommu_pt(d) )
+    if ( is_iommu_enabled(d) )
        this_cpu(iommu_dont_flush_iotlb) = 1;
 
     while ( xatp->size > done )
@@ -844,7 +844,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
         }
     }
 
-    if ( has_iommu_pt(d) )
+    if ( is_iommu_enabled(d) )
     {
         int ret;
 
diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
index 2a1c87e44b..3b18195ebf 100644
--- a/xen/common/vm_event.c
+++ b/xen/common/vm_event.c
@@ -630,7 +630,7 @@ int vm_event_domctl(struct domain *d, struct xen_domctl_vm_event_op *vec)
 
             /* No paging if iommu is used */
             rc = -EMLINK;
-            if ( unlikely(has_iommu_pt(d)) )
+            if ( unlikely(is_iommu_enabled(d)) )
                 break;
 
             rc = -EXDEV;
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 12f2c4c3f2..ea9fd54e3b 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -40,17 +40,6 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
     if ( !list_empty(&dev->domain_list) )
         goto fail;
 
-    /*
-     * The hwdom is forced to use IOMMU for protecting assigned
-     * device. Therefore the IOMMU data is already set up.
-     */
-    ASSERT(!is_hardware_domain(d) ||
-           hd->status == IOMMU_STATUS_initialized);
-
-    rc = iommu_construct(d);
-    if ( rc )
-        goto fail;
-
     /* The flag field doesn't matter to DT device. */
     rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), 0);
 
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index d0200d82f0..30976b4406 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -146,6 +146,17 @@ static int __init parse_dom0_iommu_param(const char *s)
 }
 custom_param("dom0-iommu", parse_dom0_iommu_param);
 
+static void __hwdom_init check_hwdom_reqs(struct domain *d)
+{
+    if ( iommu_hwdom_none || !paging_mode_translate(d) )
+        return;
+
+    arch_iommu_check_autotranslated_hwdom(d);
+
+    iommu_hwdom_passthrough = false;
+    iommu_hwdom_strict = true;
+}
+
 int iommu_domain_init(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
@@ -159,129 +170,44 @@ int iommu_domain_init(struct domain *d)
         return ret;
 
     hd->platform_ops = iommu_get_ops();
-    return hd->platform_ops->init(d);
-}
+    ret = hd->platform_ops->init(d);
+    if ( ret )
+        return ret;
 
-static void __hwdom_init check_hwdom_reqs(struct domain *d)
-{
-    if ( iommu_hwdom_none || !paging_mode_translate(d) )
-        return;
+    /*
+     * NB: 'relaxed' h/w domains don't need the IOMMU mappings to be kept
+     *     in-sync with their assigned pages because all host RAM will be
+     *     mapped during hwdom_init().
+     */
+    if ( is_hardware_domain(d) )
+        check_hwdom_reqs(d); /* may modify iommu_hwdom_strict */
 
-    arch_iommu_check_autotranslated_hwdom(d);
+    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
+        hd->need_sync = !iommu_use_hap_pt(d);
 
-    iommu_hwdom_passthrough = false;
-    iommu_hwdom_strict = true;
+    return 0;
 }
 
 void __hwdom_init iommu_hwdom_init(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
 
-    check_hwdom_reqs(d);
-
     if ( !is_iommu_enabled(d) )
         return;
 
     register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
 
-    hd->status = IOMMU_STATUS_initializing;
-    /*
-     * NB: relaxed hw domains don't need sync because all ram is already
-     * mapped in the iommu page tables.
-     */
-    hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
-    if ( need_iommu_pt_sync(d) )
-    {
-        struct page_info *page;
-        unsigned int i = 0, flush_flags = 0;
-        int rc = 0;
-
-        page_list_for_each ( page, &d->page_list )
-        {
-            unsigned long mfn = mfn_x(page_to_mfn(page));
-            unsigned long dfn = mfn_to_gmfn(d, mfn);
-            unsigned int mapping = IOMMUF_readable;
-            int ret;
-
-            if ( ((page->u.inuse.type_info & PGT_count_mask) == 0) ||
-                 ((page->u.inuse.type_info & PGT_type_mask)
-                  == PGT_writable_page) )
-                mapping |= IOMMUF_writable;
-
-            ret = iommu_map(d, _dfn(dfn), _mfn(mfn), 0, mapping,
-                            &flush_flags);
-
-            if ( !rc )
-                rc = ret;
-
-            if ( !(i++ & 0xfffff) )
-                process_pending_softirqs();
-        }
-
-        /* Use while-break to avoid compiler warning */
-        while ( iommu_iotlb_flush_all(d, flush_flags) )
-            break;
-
-        if ( rc )
-            printk(XENLOG_WARNING "d%d: IOMMU mapping failed: %d\n",
-                   d->domain_id, rc);
-    }
-
     hd->platform_ops->hwdom_init(d);
-
-    hd->status = IOMMU_STATUS_initialized;
 }
 
-void iommu_teardown(struct domain *d)
+static void iommu_teardown(struct domain *d)
 {
     struct domain_iommu *hd = dom_iommu(d);
 
-    hd->status = IOMMU_STATUS_disabled;
     hd->platform_ops->teardown(d);
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
-int iommu_construct(struct domain *d)
-{
-    struct domain_iommu *hd = dom_iommu(d);
-
-    if ( hd->status == IOMMU_STATUS_initialized )
-        return 0;
-
-    hd->status = IOMMU_STATUS_initializing;
-
-    if ( !iommu_use_hap_pt(d) )
-    {
-        int rc;
-
-        hd->need_sync = true;
-
-        rc = arch_iommu_populate_page_table(d);
-        if ( rc )
-        {
-            if ( rc != -ERESTART )
-            {
-                hd->need_sync = false;
-                hd->status = IOMMU_STATUS_disabled;
-            }
-
-            return rc;
-        }
-    }
-
-    hd->status = IOMMU_STATUS_initialized;
-
-    /*
-     * There may be dirty cache lines when a device is assigned
-     * and before has_iommu_pt(d) becoming true, this will cause
-     * memory_type_changed lose effect if memory type changes.
-     * Call memory_type_changed here to amend this.
-     */
-    memory_type_changed(d);
-
-    return 0;
-}
-
 void iommu_domain_destroy(struct domain *d)
 {
     if ( !is_iommu_enabled(d) )
@@ -574,11 +500,8 @@ int iommu_do_domctl(
 void iommu_share_p2m_table(struct domain* d)
 {
     ASSERT(hap_enabled(d));
-    /*
-     * iommu_use_hap_pt(d) cannot be used here because during domain
-     * construction has_iommu_pt(d) will always return false here.
-     */
-    if ( is_iommu_enabled(d) && iommu_hap_pt_share )
+
+    if ( iommu_use_hap_pt(d) )
         iommu_get_ops()->share_p2m(d);
 }
 
@@ -625,8 +548,7 @@ static void iommu_dump_p2m_table(unsigned char key)
     ops = iommu_get_ops();
     for_each_domain(d)
     {
-        if ( is_hardware_domain(d) ||
-             dom_iommu(d)->status < IOMMU_STATUS_initialized )
+        if ( !is_iommu_enabled(d) )
             continue;
 
         if ( iommu_use_hap_pt(d) )
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 61b5b330ca..25ff10f4cb 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1434,13 +1434,6 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     if ( !pcidevs_trylock() )
         return -ERESTART;
 
-    rc = iommu_construct(d);
-    if ( rc )
-    {
-        pcidevs_unlock();
-        return rc;
-    }
-
     pdev = pci_get_pdev_by_domain(hardware_domain, seg, bus, devfn);
     if ( !pdev )
     {
@@ -1469,8 +1462,6 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
     }
 
  done:
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
-        iommu_teardown(d);
     pcidevs_unlock();
 
     return rc;
@@ -1519,9 +1510,6 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     pdev->fault.count = 0;
 
-    if ( !has_arch_pdevs(d) && has_iommu_pt(d) )
-        iommu_teardown(d);
-
     return ret;
 }
 
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index 01f0bc4689..4ac5da197a 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1759,15 +1759,7 @@ static void iommu_domain_teardown(struct domain *d)
 
     ASSERT(is_iommu_enabled(d));
 
-    /*
-     * We can't use iommu_use_hap_pt here because either IOMMU state
-     * is already changed to IOMMU_STATUS_disabled at this point or
-     * has always been IOMMU_STATUS_disabled.
-     *
-     * We also need to test if HAP is enabled because PV guests can
-     * enter this path too.
-     */
-    if ( hap_enabled(d) && iommu_hap_pt_share )
+    if ( iommu_use_hap_pt(d) )
         return;
 
     spin_lock(&hd->arch.mapping_lock);
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index 9879558c17..47a3e55213 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -81,101 +81,6 @@ int __init iommu_setup_hpet_msi(struct msi_desc *msi)
     return ops->setup_hpet_msi ? ops->setup_hpet_msi(msi) : -ENODEV;
 }
 
-int arch_iommu_populate_page_table(struct domain *d)
-{
-    struct page_info *page;
-    int rc = 0, n = 0;
-
-    spin_lock(&d->page_alloc_lock);
-
-    if ( unlikely(d->is_dying) )
-        rc = -ESRCH;
-
-    while ( !rc && (page = page_list_remove_head(&d->page_list)) )
-    {
-        if ( is_hvm_domain(d) ||
-            (page->u.inuse.type_info & PGT_type_mask) == PGT_writable_page )
-        {
-            unsigned long mfn = mfn_x(page_to_mfn(page));
-            unsigned long gfn = mfn_to_gmfn(d, mfn);
-            unsigned int flush_flags = 0;
-
-            if ( gfn != gfn_x(INVALID_GFN) )
-            {
-                ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
-                BUG_ON(SHARED_M2P(gfn));
-                rc = iommu_map(d, _dfn(gfn), _mfn(mfn), PAGE_ORDER_4K,
-                               IOMMUF_readable | IOMMUF_writable,
-                               &flush_flags);
-
-                /*
-                 * We may be working behind the back of a running guest, which
-                 * may change the type of a page at any time.  We can't prevent
-                 * this (for instance, by bumping the type count while mapping
-                 * the page) without causing legitimate guest type-change
-                 * operations to fail.  So after adding the page to the IOMMU,
-                 * check again to make sure this is still valid.  NB that the
-                 * writable entry in the iommu is harmless until later, when
-                 * the actual device gets assigned.
-                 */
-                if ( !rc && !is_hvm_domain(d) &&
-                     ((page->u.inuse.type_info & PGT_type_mask) !=
-                      PGT_writable_page) )
-                {
-                    rc = iommu_unmap(d, _dfn(gfn), PAGE_ORDER_4K, &flush_flags);
-                    /* If the type changed yet again, simply force a retry. */
-                    if ( !rc && ((page->u.inuse.type_info & PGT_type_mask) ==
-                                 PGT_writable_page) )
-                        rc = -ERESTART;
-                }
-            }
-            if ( rc )
-            {
-                page_list_add(page, &d->page_list);
-                break;
-            }
-        }
-        page_list_add_tail(page, &d->arch.relmem_list);
-        if ( !(++n & 0xff) && !page_list_empty(&d->page_list) &&
-             hypercall_preempt_check() )
-            rc = -ERESTART;
-    }
-
-    if ( !rc )
-    {
-        /*
-         * The expectation here is that generally there are many normal pages
-         * on relmem_list (the ones we put there) and only few being in an
-         * offline/broken state. The latter ones are always at the head of the
-         * list. Hence we first move the whole list, and then move back the
-         * first few entries.
-         */
-        page_list_move(&d->page_list, &d->arch.relmem_list);
-        while ( !page_list_empty(&d->page_list) &&
-                (page = page_list_first(&d->page_list),
-                 (page->count_info & (PGC_state|PGC_broken))) )
-        {
-            page_list_del(page, &d->page_list);
-            page_list_add_tail(page, &d->arch.relmem_list);
-        }
-    }
-
-    spin_unlock(&d->page_alloc_lock);
-
-    if ( !rc )
-        /*
-         * flush_flags are not tracked across hypercall pre-emption so
-         * assume a full flush is necessary.
-         */
-        rc = iommu_iotlb_flush_all(
-            d, IOMMU_FLUSHF_added | IOMMU_FLUSHF_modified);
-
-    if ( rc && rc != -ERESTART )
-        iommu_teardown(d);
-
-    return rc;
-}
-
 void __hwdom_init arch_iommu_check_autotranslated_hwdom(struct domain *d)
 {
     if ( !is_iommu_enabled(d) )
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 904c9aec11..e5050636d7 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -21,7 +21,7 @@ struct arch_iommu
 };
 
 /* Always share P2M Table between the CPU and the IOMMU */
-#define iommu_use_hap_pt(d) (has_iommu_pt(d))
+#define iommu_use_hap_pt(d) (is_iommu_enabled(d))
 
 const struct iommu_ops *iommu_get_ops(void);
 void iommu_set_ops(const struct iommu_ops *ops);
diff --git a/xen/include/asm-x86/iommu.h b/xen/include/asm-x86/iommu.h
index facf835ada..6d024d5c0e 100644
--- a/xen/include/asm-x86/iommu.h
+++ b/xen/include/asm-x86/iommu.h
@@ -81,7 +81,7 @@ extern const struct iommu_init_ops *iommu_init_ops;
 
 /* Are we using the domain P2M table as its IOMMU pagetable? */
 #define iommu_use_hap_pt(d) \
-    (hap_enabled(d) && has_iommu_pt(d) && iommu_hap_pt_share)
+    (hap_enabled(d) && is_iommu_enabled(d) && iommu_hap_pt_share)
 
 void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int iommu_read_apic_from_ire(unsigned int apic, unsigned int reg);
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 48f87480a7..5b9611a134 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -74,15 +74,9 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn);
 
 void arch_iommu_domain_destroy(struct domain *d);
 int arch_iommu_domain_init(struct domain *d);
-int arch_iommu_populate_page_table(struct domain *d);
 void arch_iommu_check_autotranslated_hwdom(struct domain *d);
 void arch_iommu_hwdom_init(struct domain *d);
 
-int iommu_construct(struct domain *d);
-
-/* Function used internally, use iommu_domain_destroy */
-void iommu_teardown(struct domain *d);
-
 /*
  * The following flags are passed to map operations and passed by lookup
  * operations.
@@ -249,13 +243,6 @@ struct iommu_ops {
 # define iommu_vcall iommu_call
 #endif
 
-enum iommu_status
-{
-    IOMMU_STATUS_disabled,
-    IOMMU_STATUS_initializing,
-    IOMMU_STATUS_initialized
-};
-
 struct domain_iommu {
     struct arch_iommu arch;
 
@@ -270,9 +257,6 @@ struct domain_iommu {
     /* Features supported by the IOMMU */
     DECLARE_BITMAP(features, IOMMU_FEAT_count);
 
-    /* Status of guest IOMMU mappings */
-    enum iommu_status status;
-
     /*
      * Does the guest reqire mappings to be synchonized, to maintain
      * the default dfn == pfn map. (See comment on dfn at the top of
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index bad9734626..538be7120b 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -964,10 +964,8 @@ static inline bool is_hwdom_pinned_vcpu(const struct vcpu *v)
 }
 
 #ifdef CONFIG_HAS_PASSTHROUGH
-#define has_iommu_pt(d) (dom_iommu(d)->status != IOMMU_STATUS_disabled)
 #define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)
 #else
-#define has_iommu_pt(d) false
 #define need_iommu_pt_sync(d) false
 #endif
 
-- 
2.20.1.2.gb21ebb671


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Alexandru Stefan ISAILA 6 years, 6 months ago

On 30.07.2019 16:44, Paul Durrant wrote:
> Now that there is a per-domain IOMMU enable flag, which should be enabled if
> any device is going to be passed through, stop deferring page table
> construction until the assignment is done. Also don't tear down the tables
> again when the last device is de-assigned; defer that task until domain
> destruction.
> 
> This allows the has_iommu_pt() helper and iommu_status enumeration to be
> removed. Calls to has_iommu_pt() are simply replaced by calls to
> is_iommu_enabled(). Remaining open-code tests of iommu_hap_pt_share can also
> be replaced by calls to iommu_use_hap_pt().
> The arch_iommu_populate_page_table() and iommu_construct() functions become
> redundant, as does the 'strict mode' dom0 page_list mapping code in
> iommu_hwdom_init(), and iommu_teardown() can be made static is its only
> remaining caller, iommu_domain_destroy(), is within the same source
> module.
> 
> All in all, about 220 lines of code are rmeoved.
> 
> NOTE: This patch will cause a small amount of extra resource to be used
>        to accommodate IOMMU page tables that may never be used, since the
>        per-domain IOMMU flag enable flag is currently set to the value
>        of the global iommu_enable flag. A subsequent patch will add an
>        option to the toolstack to allow it to be turned off if there is
>        no intention to assign passthrough hardware to the domain.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
For the vm_event part.

Reviewed-by: Alexandru Isaila <aisaila@bitdefender.com>

> ---
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> Cc: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: George Dunlap <George.Dunlap@eu.citrix.com>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Tim Deegan <tim@xen.org>
> Cc: Wei Liu <wl@xen.org>
> Cc: "Roger Pau Monné" <roger.pau@citrix.com>
> Cc: Tamas K Lengyel <tamas@tklengyel.com>
> Cc: George Dunlap <george.dunlap@eu.citrix.com>
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Cc: Alexandru Isaila <aisaila@bitdefender.com>
> Cc: Petre Pircalabu <ppircalabu@bitdefender.com>
> ---
>   xen/arch/arm/p2m.c                    |   2 +-
>   xen/arch/x86/dom0_build.c             |   2 +-
>   xen/arch/x86/hvm/mtrr.c               |   5 +-
>   xen/arch/x86/mm/mem_sharing.c         |   2 +-
>   xen/arch/x86/mm/paging.c              |   2 +-
>   xen/arch/x86/x86_64/mm.c              |   2 +-
>   xen/common/memory.c                   |   4 +-
>   xen/common/vm_event.c                 |   2 +-
>   xen/drivers/passthrough/device_tree.c |  11 ---
>   xen/drivers/passthrough/iommu.c       | 134 ++++++--------------------
>   xen/drivers/passthrough/pci.c         |  12 ---
>   xen/drivers/passthrough/vtd/iommu.c   |  10 +-
>   xen/drivers/passthrough/x86/iommu.c   |  95 ------------------
>   xen/include/asm-arm/iommu.h           |   2 +-
>   xen/include/asm-x86/iommu.h           |   2 +-
>   xen/include/xen/iommu.h               |  16 ---
>   xen/include/xen/sched.h               |   2 -
>   17 files changed, 42 insertions(+), 263 deletions(-)
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Jan Beulich 6 years, 6 months ago
On 30.07.2019 15:44, Paul Durrant wrote:
> NOTE: This patch will cause a small amount of extra resource to be used
>        to accommodate IOMMU page tables that may never be used, since the
>        per-domain IOMMU flag enable flag is currently set to the value
>        of the global iommu_enable flag. A subsequent patch will add an
>        option to the toolstack to allow it to be turned off if there is
>        no intention to assign passthrough hardware to the domain.

In particular if the default of this is going to be "true" (I
didn't look at that patch yet, but the wording above makes me
assume so), in auto-ballooning mode without shared page tables
more memory should imo be ballooned out of Dom0 now. It has
always been a bug that IOMMU page tables weren't accounted for,
but it would become even more prominent then.

> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -783,7 +783,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr, 1,
>   
>   void memory_type_changed(struct domain *d)
>   {
> -    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
> +    if ( (is_iommu_enabled(d) || cache_flush_permitted(d)) && d->vcpu &&
> +         d->vcpu[0] )

As a really minor comment - I think it wouldn't be bad for both
d->vcpu references to end up on the same line.

> @@ -625,8 +548,7 @@ static void iommu_dump_p2m_table(unsigned char key)
>       ops = iommu_get_ops();
>       for_each_domain(d)
>       {
> -        if ( is_hardware_domain(d) ||
> -             dom_iommu(d)->status < IOMMU_STATUS_initialized )
> +        if ( !is_iommu_enabled(d) )
>               continue;

Why do you drop the hwdom check here?

> --- a/xen/include/asm-arm/iommu.h
> +++ b/xen/include/asm-arm/iommu.h
> @@ -21,7 +21,7 @@ struct arch_iommu
>   };
>   
>   /* Always share P2M Table between the CPU and the IOMMU */
> -#define iommu_use_hap_pt(d) (has_iommu_pt(d))
> +#define iommu_use_hap_pt(d) (is_iommu_enabled(d))

I'd suggest dropping the stray outer pair of parentheses at the
same time.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Paul Durrant 6 years, 6 months ago
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 07 August 2019 11:32
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Julien Grall <julien.grall@arm.com>; Alexandru Isaila
> <aisaila@bitdefender.com>; Petre Pircalabu <ppircalabu@bitdefender.com>; Razvan Cojocaru
> <rcojocaru@bitdefender.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Roger Pau Monne
> <roger.pau@citrix.com>; VolodymyrBabchuk <Volodymyr_Babchuk@epam.com>; George Dunlap
> <George.Dunlap@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tamas K Lengyel
> <tamas@tklengyel.com>; Tim (Xen.org) <tim@xen.org>; Wei Liu <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> 
> On 30.07.2019 15:44, Paul Durrant wrote:
> > NOTE: This patch will cause a small amount of extra resource to be used
> >        to accommodate IOMMU page tables that may never be used, since the
> >        per-domain IOMMU flag enable flag is currently set to the value
> >        of the global iommu_enable flag. A subsequent patch will add an
> >        option to the toolstack to allow it to be turned off if there is
> >        no intention to assign passthrough hardware to the domain.
> 
> In particular if the default of this is going to be "true" (I
> didn't look at that patch yet, but the wording above makes me
> assume so), in auto-ballooning mode without shared page tables
> more memory should imo be ballooned out of Dom0 now. It has
> always been a bug that IOMMU page tables weren't accounted for,
> but it would become even more prominent then.

Ultimately, once the whole series is applied, then nothing much should change for those specifying passthrough h/w in an xl.cfg. The main difference will be that h/w cannot be passed through to a domain that was not originally created with IOMMU pagetables.
With patches applied up to this point then, yes, every domain will get IOMMU page tables. I guess I'll take a look at the auto-ballooning code and see what needs to be done.

> 
> > --- a/xen/arch/x86/hvm/mtrr.c
> > +++ b/xen/arch/x86/hvm/mtrr.c
> > @@ -783,7 +783,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr, 1,
> >
> >   void memory_type_changed(struct domain *d)
> >   {
> > -    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
> > +    if ( (is_iommu_enabled(d) || cache_flush_permitted(d)) && d->vcpu &&
> > +         d->vcpu[0] )
> 
> As a really minor comment - I think it wouldn't be bad for both
> d->vcpu references to end up on the same line.

Ok.

> 
> > @@ -625,8 +548,7 @@ static void iommu_dump_p2m_table(unsigned char key)
> >       ops = iommu_get_ops();
> >       for_each_domain(d)
> >       {
> > -        if ( is_hardware_domain(d) ||
> > -             dom_iommu(d)->status < IOMMU_STATUS_initialized )
> > +        if ( !is_iommu_enabled(d) )
> >               continue;
> 
> Why do you drop the hwdom check here?

Because is_iommu_enabled() for the h/w domain will always be true if iommu_enabled is true, so no need for a special case.

> 
> > --- a/xen/include/asm-arm/iommu.h
> > +++ b/xen/include/asm-arm/iommu.h
> > @@ -21,7 +21,7 @@ struct arch_iommu
> >   };
> >
> >   /* Always share P2M Table between the CPU and the IOMMU */
> > -#define iommu_use_hap_pt(d) (has_iommu_pt(d))
> > +#define iommu_use_hap_pt(d) (is_iommu_enabled(d))
> 
> I'd suggest dropping the stray outer pair of parentheses at the
> same time.

Ok, will do.

  Paul

> 
> Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Paul Durrant 6 years, 6 months ago
> -----Original Message-----
[snip]
> >
> > On 30.07.2019 15:44, Paul Durrant wrote:
> > > NOTE: This patch will cause a small amount of extra resource to be used
> > >        to accommodate IOMMU page tables that may never be used, since the
> > >        per-domain IOMMU flag enable flag is currently set to the value
> > >        of the global iommu_enable flag. A subsequent patch will add an
> > >        option to the toolstack to allow it to be turned off if there is
> > >        no intention to assign passthrough hardware to the domain.
> >
> > In particular if the default of this is going to be "true" (I
> > didn't look at that patch yet, but the wording above makes me
> > assume so), in auto-ballooning mode without shared page tables
> > more memory should imo be ballooned out of Dom0 now. It has
> > always been a bug that IOMMU page tables weren't accounted for,
> > but it would become even more prominent then.
> 
> Ultimately, once the whole series is applied, then nothing much should change for those specifying
> passthrough h/w in an xl.cfg. The main difference will be that h/w cannot be passed through to a
> domain that was not originally created with IOMMU pagetables.
> With patches applied up to this point then, yes, every domain will get IOMMU page tables. I guess I'll
> take a look at the auto-ballooning code and see what needs to be done.
> 

Ok, I've had a look...

I could make a rough calculation in libxl_domain_need_memory() based on the domain's max_memkb and an assumption of a 4 level translation with 512 PTEs per level, or would prefer such guestimation to be overridable using an xl.cfg parameter in a broadly similar way to shadow_memkb?

  Paul
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Paul Durrant 6 years, 5 months ago
> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Paul Durrant
> Sent: 12 August 2019 17:26
> To: 'Jan Beulich' <jbeulich@suse.com>
> Cc: 'Petre Pircalabu' <ppircalabu@bitdefender.com>; 'Stefano Stabellini' <sstabellini@kernel.org>;
> 'Wei Liu' <wl@xen.org>; 'Razvan Cojocaru' <rcojocaru@bitdefender.com>; 'Konrad Rzeszutek Wilk'
> <konrad.wilk@oracle.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Tim (Xen.org) <tim@xen.org>;
> George Dunlap <George.Dunlap@citrix.com>; 'Julien Grall' <julien.grall@arm.com>; 'Tamas K Lengyel'
> <tamas@tklengyel.com>; Ian Jackson <Ian.Jackson@citrix.com>; 'Alexandru Isaila'
> <aisaila@bitdefender.com>; 'xen-devel@lists.xenproject.org' <xen-devel@lists.xenproject.org>;
> 'VolodymyrBabchuk' <Volodymyr_Babchuk@epam.com>; Roger Pau Monne <roger.pau@citrix.com>
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> 
> > -----Original Message-----
> [snip]
> > >
> > > On 30.07.2019 15:44, Paul Durrant wrote:
> > > > NOTE: This patch will cause a small amount of extra resource to be used
> > > >        to accommodate IOMMU page tables that may never be used, since the
> > > >        per-domain IOMMU flag enable flag is currently set to the value
> > > >        of the global iommu_enable flag. A subsequent patch will add an
> > > >        option to the toolstack to allow it to be turned off if there is
> > > >        no intention to assign passthrough hardware to the domain.
> > >
> > > In particular if the default of this is going to be "true" (I
> > > didn't look at that patch yet, but the wording above makes me
> > > assume so), in auto-ballooning mode without shared page tables
> > > more memory should imo be ballooned out of Dom0 now. It has
> > > always been a bug that IOMMU page tables weren't accounted for,
> > > but it would become even more prominent then.
> >
> > Ultimately, once the whole series is applied, then nothing much should change for those specifying
> > passthrough h/w in an xl.cfg. The main difference will be that h/w cannot be passed through to a
> > domain that was not originally created with IOMMU pagetables.
> > With patches applied up to this point then, yes, every domain will get IOMMU page tables. I guess
> I'll
> > take a look at the auto-ballooning code and see what needs to be done.
> >
> 
> Ok, I've had a look...
> 
> I could make a rough calculation in libxl_domain_need_memory() based on the domain's max_memkb and an
> assumption of a 4 level translation with 512 PTEs per level, or would prefer such guestimation to be
> overridable using an xl.cfg parameter in a broadly similar way to shadow_memkb?
> 

I think I'm going to say no to this anyway since, as you say, the overhead as never been accounted for and having to make assumptions about the IOMMU table structure is not very attractive. Given that any issue is going to be transient (i.e. until patch #6 is committed) I don't think fixing auto-ballooning ought to be in scope for this series.

  Paul

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Jan Beulich 6 years, 5 months ago
On 14.08.2019 11:39, Paul Durrant wrote:
>> -----Original Message-----
>> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Paul Durrant
>> Sent: 12 August 2019 17:26
>> To: 'Jan Beulich' <jbeulich@suse.com>
>> Cc: 'Petre Pircalabu' <ppircalabu@bitdefender.com>; 'Stefano Stabellini' <sstabellini@kernel.org>;
>> 'Wei Liu' <wl@xen.org>; 'Razvan Cojocaru' <rcojocaru@bitdefender.com>; 'Konrad Rzeszutek Wilk'
>> <konrad.wilk@oracle.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Tim (Xen.org) <tim@xen.org>;
>> George Dunlap <George.Dunlap@citrix.com>; 'Julien Grall' <julien.grall@arm.com>; 'Tamas K Lengyel'
>> <tamas@tklengyel.com>; Ian Jackson <Ian.Jackson@citrix.com>; 'Alexandru Isaila'
>> <aisaila@bitdefender.com>; 'xen-devel@lists.xenproject.org' <xen-devel@lists.xenproject.org>;
>> 'VolodymyrBabchuk' <Volodymyr_Babchuk@epam.com>; Roger Pau Monne <roger.pau@citrix.com>
>> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
>>
>>> -----Original Message-----
>> [snip]
>>>>
>>>> On 30.07.2019 15:44, Paul Durrant wrote:
>>>>> NOTE: This patch will cause a small amount of extra resource to be used
>>>>>         to accommodate IOMMU page tables that may never be used, since the
>>>>>         per-domain IOMMU flag enable flag is currently set to the value
>>>>>         of the global iommu_enable flag. A subsequent patch will add an
>>>>>         option to the toolstack to allow it to be turned off if there is
>>>>>         no intention to assign passthrough hardware to the domain.
>>>>
>>>> In particular if the default of this is going to be "true" (I
>>>> didn't look at that patch yet, but the wording above makes me
>>>> assume so), in auto-ballooning mode without shared page tables
>>>> more memory should imo be ballooned out of Dom0 now. It has
>>>> always been a bug that IOMMU page tables weren't accounted for,
>>>> but it would become even more prominent then.
>>>
>>> Ultimately, once the whole series is applied, then nothing much should change for those specifying
>>> passthrough h/w in an xl.cfg. The main difference will be that h/w cannot be passed through to a
>>> domain that was not originally created with IOMMU pagetables.
>>> With patches applied up to this point then, yes, every domain will get IOMMU page tables. I guess
>> I'll
>>> take a look at the auto-ballooning code and see what needs to be done.
>>>
>>
>> Ok, I've had a look...
>>
>> I could make a rough calculation in libxl_domain_need_memory() based on the domain's max_memkb and an
>> assumption of a 4 level translation with 512 PTEs per level, or would prefer such guestimation to be
>> overridable using an xl.cfg parameter in a broadly similar way to shadow_memkb?
>>
> 
> I think I'm going to say no to this anyway since, as you say, the overhead as never been accounted for and having to make assumptions about the IOMMU table structure is not very attractive. Given that any issue is going to be transient (i.e. until patch #6 is committed) I don't think fixing auto-ballooning ought to be in scope for this series.

I'm afraid I disagree: The series extends a pre-existing problem
affecting some guests to all ones (at least by default).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Paul Durrant 6 years, 5 months ago
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 27 August 2019 08:49
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: 'JulienGrall' <julien.grall@arm.com>; 'Alexandru Isaila' <aisaila@bitdefender.com>; 'Petre
> Pircalabu' <ppircalabu@bitdefender.com>; 'Razvan Cojocaru' <rcojocaru@bitdefender.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; 'VolodymyrBabchuk'
> <Volodymyr_Babchuk@epam.com>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'xen-
> devel@lists.xenproject.org' <xen-devel@lists.xenproject.org>; 'Konrad Rzeszutek Wilk'
> <konrad.wilk@oracle.com>; 'Tamas K Lengyel' <tamas@tklengyel.com>; Tim (Xen.org) <tim@xen.org>; 'Wei
> Liu' <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> 
> On 14.08.2019 11:39, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of Paul Durrant
> >> Sent: 12 August 2019 17:26
> >> To: 'Jan Beulich' <jbeulich@suse.com>
> >> Cc: 'Petre Pircalabu' <ppircalabu@bitdefender.com>; 'Stefano Stabellini' <sstabellini@kernel.org>;
> >> 'Wei Liu' <wl@xen.org>; 'Razvan Cojocaru' <rcojocaru@bitdefender.com>; 'Konrad Rzeszutek Wilk'
> >> <konrad.wilk@oracle.com>; Andrew Cooper <Andrew.Cooper3@citrix.com>; Tim (Xen.org) <tim@xen.org>;
> >> George Dunlap <George.Dunlap@citrix.com>; 'Julien Grall' <julien.grall@arm.com>; 'Tamas K Lengyel'
> >> <tamas@tklengyel.com>; Ian Jackson <Ian.Jackson@citrix.com>; 'Alexandru Isaila'
> >> <aisaila@bitdefender.com>; 'xen-devel@lists.xenproject.org' <xen-devel@lists.xenproject.org>;
> >> 'VolodymyrBabchuk' <Volodymyr_Babchuk@epam.com>; Roger Pau Monne <roger.pau@citrix.com>
> >> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> >>
> >>> -----Original Message-----
> >> [snip]
> >>>>
> >>>> On 30.07.2019 15:44, Paul Durrant wrote:
> >>>>> NOTE: This patch will cause a small amount of extra resource to be used
> >>>>>         to accommodate IOMMU page tables that may never be used, since the
> >>>>>         per-domain IOMMU flag enable flag is currently set to the value
> >>>>>         of the global iommu_enable flag. A subsequent patch will add an
> >>>>>         option to the toolstack to allow it to be turned off if there is
> >>>>>         no intention to assign passthrough hardware to the domain.
> >>>>
> >>>> In particular if the default of this is going to be "true" (I
> >>>> didn't look at that patch yet, but the wording above makes me
> >>>> assume so), in auto-ballooning mode without shared page tables
> >>>> more memory should imo be ballooned out of Dom0 now. It has
> >>>> always been a bug that IOMMU page tables weren't accounted for,
> >>>> but it would become even more prominent then.
> >>>
> >>> Ultimately, once the whole series is applied, then nothing much should change for those specifying
> >>> passthrough h/w in an xl.cfg. The main difference will be that h/w cannot be passed through to a
> >>> domain that was not originally created with IOMMU pagetables.
> >>> With patches applied up to this point then, yes, every domain will get IOMMU page tables. I guess
> >> I'll
> >>> take a look at the auto-ballooning code and see what needs to be done.
> >>>
> >>
> >> Ok, I've had a look...
> >>
> >> I could make a rough calculation in libxl_domain_need_memory() based on the domain's max_memkb and
> an
> >> assumption of a 4 level translation with 512 PTEs per level, or would prefer such guestimation to
> be
> >> overridable using an xl.cfg parameter in a broadly similar way to shadow_memkb?
> >>
> >
> > I think I'm going to say no to this anyway since, as you say, the overhead as never been accounted
> for and having to make assumptions about the IOMMU table structure is not very attractive. Given that
> any issue is going to be transient (i.e. until patch #6 is committed) I don't think fixing auto-
> ballooning ought to be in scope for this series.
> 
> I'm afraid I disagree: The series extends a pre-existing problem
> affecting some guests to all ones (at least by default).

TBH I've seen sufficient numbers of domain create failures when using auto-ballooning that I stopped using it some time ago (besides the fact that it's slow). If you're happy with the simplistic double-the-page-table-reservation calculation then I can add that but IMO it would be better to add another patch to just remove auto-ballooning.

 Paul

> 
> Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Jan Beulich 6 years, 5 months ago
On 29.08.2019 11:33, Paul Durrant wrote:
> TBH I've seen sufficient numbers of domain create failures when using
> auto-ballooning that I stopped using it some time ago (besides the fact
> that it's slow). If you're happy with the simplistic
> double-the-page-table-reservation calculation then I can add that but
> IMO it would be better to add another patch to just remove auto-ballooning.

I'd not be overly happy, but I could live with this. But this needs
consent by others, not the least the tool stack maintainers.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Paul Durrant 6 years, 5 months ago
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 29 August 2019 10:52
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: 'JulienGrall' <julien.grall@arm.com>; 'Alexandru Isaila' <aisaila@bitdefender.com>; 'Petre
> Pircalabu' <ppircalabu@bitdefender.com>; 'Razvan Cojocaru' <rcojocaru@bitdefender.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; 'VolodymyrBabchuk'
> <Volodymyr_Babchuk@epam.com>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'xen-
> devel@lists.xenproject.org' <xen-devel@lists.xenproject.org>; 'Konrad Rzeszutek Wilk'
> <konrad.wilk@oracle.com>; 'Tamas K Lengyel' <tamas@tklengyel.com>; Tim (Xen.org) <tim@xen.org>; 'Wei
> Liu' <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> 
> On 29.08.2019 11:33, Paul Durrant wrote:
> > TBH I've seen sufficient numbers of domain create failures when using
> > auto-ballooning that I stopped using it some time ago (besides the fact
> > that it's slow). If you're happy with the simplistic
> > double-the-page-table-reservation calculation then I can add that but
> > IMO it would be better to add another patch to just remove auto-ballooning.
> 
> I'd not be overly happy, but I could live with this. But this needs
> consent by others, not the least the tool stack maintainers.
> 

Ok. I'll deal with that later. Looking again though, I'm not sure what you mean by 'the amount reserved for page tables'? Do you mean 'shadow_memkb' or something else?

  Paul
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Jan Beulich 6 years, 5 months ago
On 29.08.2019 12:20, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 29 August 2019 10:52
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: 'JulienGrall' <julien.grall@arm.com>; 'Alexandru Isaila' <aisaila@bitdefender.com>; 'Petre
>> Pircalabu' <ppircalabu@bitdefender.com>; 'Razvan Cojocaru' <rcojocaru@bitdefender.com>; Andrew Cooper
>> <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian Jackson
>> <Ian.Jackson@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; 'VolodymyrBabchuk'
>> <Volodymyr_Babchuk@epam.com>; 'Stefano Stabellini' <sstabellini@kernel.org>; 'xen-
>> devel@lists.xenproject.org' <xen-devel@lists.xenproject.org>; 'Konrad Rzeszutek Wilk'
>> <konrad.wilk@oracle.com>; 'Tamas K Lengyel' <tamas@tklengyel.com>; Tim (Xen.org) <tim@xen.org>; 'Wei
>> Liu' <wl@xen.org>
>> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
>>
>> On 29.08.2019 11:33, Paul Durrant wrote:
>>> TBH I've seen sufficient numbers of domain create failures when using
>>> auto-ballooning that I stopped using it some time ago (besides the fact
>>> that it's slow). If you're happy with the simplistic
>>> double-the-page-table-reservation calculation then I can add that but
>>> IMO it would be better to add another patch to just remove auto-ballooning.
>>
>> I'd not be overly happy, but I could live with this. But this needs
>> consent by others, not the least the tool stack maintainers.
>>
> 
> Ok. I'll deal with that later. Looking again though, I'm not sure what
> you mean by 'the amount reserved for page tables'? Do you mean
> 'shadow_memkb' or something else?

This one, I think, yes, or the value it gets defaulted to when not set
in the guest config file.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Jan Beulich 6 years, 5 months ago
On 12.08.2019 18:26, Paul Durrant wrote:
>> -----Original Message-----
> [snip]
>>>
>>> On 30.07.2019 15:44, Paul Durrant wrote:
>>>> NOTE: This patch will cause a small amount of extra resource to be used
>>>>         to accommodate IOMMU page tables that may never be used, since the
>>>>         per-domain IOMMU flag enable flag is currently set to the value
>>>>         of the global iommu_enable flag. A subsequent patch will add an
>>>>         option to the toolstack to allow it to be turned off if there is
>>>>         no intention to assign passthrough hardware to the domain.
>>>
>>> In particular if the default of this is going to be "true" (I
>>> didn't look at that patch yet, but the wording above makes me
>>> assume so), in auto-ballooning mode without shared page tables
>>> more memory should imo be ballooned out of Dom0 now. It has
>>> always been a bug that IOMMU page tables weren't accounted for,
>>> but it would become even more prominent then.
>>
>> Ultimately, once the whole series is applied, then nothing much should change for those specifying
>> passthrough h/w in an xl.cfg. The main difference will be that h/w cannot be passed through to a
>> domain that was not originally created with IOMMU pagetables.
>> With patches applied up to this point then, yes, every domain will get IOMMU page tables. I guess I'll
>> take a look at the auto-ballooning code and see what needs to be done.
>>
> 
> Ok, I've had a look...
> 
> I could make a rough calculation in libxl_domain_need_memory() based on
> the domain's max_memkb and an assumption of a 4 level translation with
> 512 PTEs per level, or would prefer such guestimation to be overridable
> using an xl.cfg parameter in a broadly similar way to shadow_memkb?

I've always been thinking that for the non-shared case the amount
reserved for page tables could simply be doubled.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Jan Beulich 6 years, 5 months ago
On 12.08.2019 17:41, Paul Durrant wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: 07 August 2019 11:32
>>
>> On 30.07.2019 15:44, Paul Durrant wrote:
>>> @@ -625,8 +548,7 @@ static void iommu_dump_p2m_table(unsigned char key)
>>>        ops = iommu_get_ops();
>>>        for_each_domain(d)
>>>        {
>>> -        if ( is_hardware_domain(d) ||
>>> -             dom_iommu(d)->status < IOMMU_STATUS_initialized )
>>> +        if ( !is_iommu_enabled(d) )
>>>                continue;
>>
>> Why do you drop the hwdom check here?
> 
> Because is_iommu_enabled() for the h/w domain will always be true if
> iommu_enabled is true, so no need for a special case.

But the effect of the extra check was to _skip_ Dom0. If you mean to
change this, then you should say so (and why) in the description.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
Posted by Paul Durrant 6 years, 5 months ago
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: 27 August 2019 08:46
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: JulienGrall <julien.grall@arm.com>; Alexandru Isaila <aisaila@bitdefender.com>; Petre Pircalabu
> <ppircalabu@bitdefender.com>; Razvan Cojocaru <rcojocaru@bitdefender.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Roger Pau Monne <roger.pau@citrix.com>; VolodymyrBabchuk
> <Volodymyr_Babchuk@epam.com>; Stefano Stabellini <sstabellini@kernel.org>; xen-
> devel@lists.xenproject.org; Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; Tamas K Lengyel
> <tamas@tklengyel.com>; Tim (Xen.org) <tim@xen.org>; Wei Liu <wl@xen.org>
> Subject: Re: [Xen-devel] [PATCH 3/6] remove late (on-demand) construction of IOMMU page tables
> 
> On 12.08.2019 17:41, Paul Durrant wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: 07 August 2019 11:32
> >>
> >> On 30.07.2019 15:44, Paul Durrant wrote:
> >>> @@ -625,8 +548,7 @@ static void iommu_dump_p2m_table(unsigned char key)
> >>>        ops = iommu_get_ops();
> >>>        for_each_domain(d)
> >>>        {
> >>> -        if ( is_hardware_domain(d) ||
> >>> -             dom_iommu(d)->status < IOMMU_STATUS_initialized )
> >>> +        if ( !is_iommu_enabled(d) )
> >>>                continue;
> >>
> >> Why do you drop the hwdom check here?
> >
> > Because is_iommu_enabled() for the h/w domain will always be true if
> > iommu_enabled is true, so no need for a special case.
> 
> But the effect of the extra check was to _skip_ Dom0. If you mean to
> change this, then you should say so (and why) in the description.
> 

Ah, yes, it does still need to remain.

  Paul

> Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel