[PATCH v3 00/23] IOMMU: superpage support when not sharing pagetables

Jan Beulich posted 23 patches 2 weeks ago
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/76cb9f26-e316-98a2-b1ba-e51e3d20f335@suse.com

For a long time we've been rather inefficient with IOMMU page table
management when not sharing page tables, i.e. in particular for PV (and
further specifically also for PV Dom0) and AMD (where nowadays we never
share page tables). While up to about 2.5 years ago AMD code had logic
to un-shatter page mappings, that logic was ripped out for being buggy
(XSA-275 plus follow-on).

This series enables use of large pages in AMD and Intel (VT-d) code;
Arm is presently not in need of any enabling as pagetables are always
shared there. It also augments PV Dom0 creation with suitable explicit
IOMMU mapping calls to facilitate use of large pages there. Depending
on the amount of memory handed to Dom0 this improves booting time
(latency until Dom0 actually starts) quite a bit; subsequent shattering
of some of the large pages may of course consume some of the saved time.

Known fallout has been spelled out here:

I'm inclined to say "of course" there are also a few seemingly unrelated
changes included here, which I just came to consider necessary or at
least desirable (in part for having been in need of adjustment for a
long time) along the way. Some of these changes are likely independent
of the bulk of the work here, and hence may be fine to go in ahead of
earlier patches.

v3, besides addressing review feedback, now also implements unshattering
of large pages. There are also a few other new small patches. See
individual patches for details.

01: AMD/IOMMU: have callers specify the target level for page table walks
02: VT-d: have callers specify the target level for page table walks
03: VT-d: limit page table population in domain_pgd_maddr()
04: IOMMU: have vendor code announce supported page sizes
05: IOMMU: simplify unmap-on-error in iommu_map()
06: IOMMU: add order parameter to ->{,un}map_page() hooks
07: IOMMU: have iommu_{,un}map() split requests into largest possible chunks
08: IOMMU/x86: restrict IO-APIC mappings for PV Dom0
09: IOMMU/x86: perform PV Dom0 mappings in batches
10: IOMMU/x86: support freeing of pagetables
11: AMD/IOMMU: drop stray TLB flush
12: AMD/IOMMU: walk trees upon page fault
13: AMD/IOMMU: return old PTE from {set,clear}_iommu_pte_present()
14: AMD/IOMMU: allow use of superpage mappings
15: VT-d: allow use of superpage mappings
16: IOMMU: fold flush-all hook into "flush one"
17: IOMMU/x86: prefill newly allocated page tables
18: x86: introduce helper for recording degree of contiguity in page tables
19: AMD/IOMMU: free all-empty page tables
20: VT-d: free all-empty page tables
21: AMD/IOMMU: replace all-contiguous page tables by superpage mappings
22: VT-d: replace all-contiguous page tables by superpage mappings
23: IOMMU/x86: add perf counters for page table splitting / coalescing
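
To illustrate the idea behind patches 04 and 07 (vendor code announcing
supported page sizes, and common code splitting requests into the largest
possible chunks), here is a minimal standalone sketch. It is not the code
from the series: pick_order(), split_request() and SIZES_4K_2M_1G are
hypothetical names, and the 4k/2M/1G bitmap is merely an assumed example
of what vendor code might announce.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12

/* Assumed capability bitmap: bit n set means 2^n-byte pages are supported. */
#define SIZES_4K_2M_1G ((1UL << 12) | (1UL << 21) | (1UL << 30))

/*
 * Hypothetical helper: return the largest order (in units of 4k pages)
 * usable for a mapping starting at frame numbers dfn/mfn with nr pages
 * still to be mapped.
 */
static unsigned int pick_order(unsigned long sizes, unsigned long dfn,
                               unsigned long mfn, unsigned long nr)
{
    unsigned int order = 0;
    unsigned int shift;

    assert(nr > 0);

    for (shift = PAGE_SHIFT + 1; shift < 8 * sizeof(sizes); ++shift) {
        unsigned long mask;

        if (!(sizes & (1UL << shift)))
            continue;

        mask = (1UL << (shift - PAGE_SHIFT)) - 1;
        /* Misaligned or too short for this size implies the same for all larger ones. */
        if (((dfn | mfn) & mask) || nr <= mask)
            break;

        order = shift - PAGE_SHIFT;
    }

    return order;
}

/*
 * Hypothetical helper: walk a request chunk by chunk, storing up to max
 * chunk orders in orders[]. Returns the total number of chunks.
 */
static size_t split_request(unsigned long sizes, unsigned long dfn,
                            unsigned long mfn, unsigned long nr,
                            unsigned int *orders, size_t max)
{
    size_t n = 0;

    while (nr) {
        unsigned int order = pick_order(sizes, dfn, mfn, nr);

        if (n < max)
            orders[n] = order;
        ++n;

        dfn += 1UL << order;
        mfn += 1UL << order;
        nr  -= 1UL << order;
    }

    return n;
}
```

With the assumed bitmap, a request for 0x402 pages starting at frame 0x1ff
would be split into one 4k chunk, two 2M chunks, and a final 4k chunk.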

While not directly related (except that making this mode work properly
here was a fair part of the overall work), on this occasion I'd also
like to renew my proposal to make "iommu=dom0-strict" the default going
forward. It already is not only the default, but the only possible mode
for PVH Dom0.