hw/i386/intel_iommu.c | 442 +++++++++++++++++++++++++++++++++++++++-- hw/i386/intel_iommu_internal.h | 1 + hw/i386/trace-events | 10 +- hw/vfio/common.c | 12 +- hw/virtio/vhost.c | 10 +- include/exec/memory.h | 49 ++++- include/hw/i386/intel_iommu.h | 10 + memory.c | 52 ++++- 8 files changed, 552 insertions(+), 34 deletions(-)
This is v8 of vt-d vfio enablement series. v8 - remove patches 1-9 since merged already - add David's r-b for all the patches - add Aviv's s-o-b in the last patch - rename iommu to iommu_dmar [Jason] - rename last patch subject to "remote IOTLB" [Jason] - pick up jason's two patches to fix vhost breakage - let vhost leverage the new IOMMU notifier interface v7: - for the two traces patches: Change subjects. Remove vtd_err() and vtd_err_nonzero_rsvd() tracers, instead using standalone trace for each of the places. Don't remove any DPRINTF() if there is no replacement. [Jason] - add r-b and a-b for Alex/David/Jason. - in patch "intel_iommu: renaming gpa to iova where proper", convert one more place where I missed [Jason] - fix the place where I should use "~0ULL" not "~0" [Jason] - squash patch 16 into 18 [Jason] v6: - do unmap in all cases when replay [Jason] - do global replay even if context entry is invalidated [Jason] - when iommu reset, send unmap to all registered notifiers [Jason] - use rcu read lock to protect the whole vfio_iommu_map_notify() [Alex, Paolo] v5: - fix patch 4 subject too long, and error spelling [Eric] - add ack-by for alex in patch 1 [Alex] - squashing patch 19/20 into patch 18 [Jason] - fix comments in vtd_page_walk() [Jason] - remove all error_report() [Jason] - add comment for patch 18, mention about that enabled vhost without ATS as well [Jason] - remove skipped debug thing during page walk [Jason] - remove duplicated page walk trace [Jason] - some tunings in vtd_address_space_unmap(), to provide correct iova and addr_mask. For this, I tuned this patch as well a bit: "memory: add section range info for IOMMU notifier" to loosen the range check v4: - convert all error_report()s into traces (in the two patches that did that) - rebased to Jason's DMAR series (master + one more patch: "[PATCH V4 net-next] vhost_net: device IOTLB support") - let vhost use the new api iommu_notifier_init() so it won't break vhost dmar [Jason] - touch commit message of the patch: "intel_iommu: provide its own replay() callback" old replay is not a dead loop, but it will just consume lots of time [Jason] - add comment for patch: "intel_iommu: do replay when context invalidate" telling why replay won't be a problem even without CM=1 [Jason] - remove a useless comment line [Jason] - remove dmar_enabled parameter for vtd_switch_address_space() and vtd_switch_address_space_all() [Mst, Jason] - merged the vfio patches in, to support unmap of big ranges at the beginning ("[PATCH RFC 0/3] vfio: allow to notify unmap for very big region") - using caching_mode instead of cache_mode_enabled, and "caching-mode" instead of "cache-mode" [Kevin] - when receive context entry invalidation, we unmap the entire region first, then replay [Alex] - fix commit message for patch: "intel_iommu: simplify irq region translation" [Kevin] - handle domain/global invalidation, and notify where proper [Jason, Kevin] v3: - fix style error reported by patchew - fix comment in domain switch patch: use "IOMMU address space" rather than "IOMMU region" [Kevin] - add ack-by for Paolo in patch: "memory: add section range info for IOMMU notifier" (this is seperately collected besides this thread) - remove 3 patches which are merged already (from Jason) - rebase to master b6c0897 v2: - change comment for "end" parameter in vtd_page_walk() [Tianyu] - change comment for "a iova" to "an iova" [Yi] - fix fault printed val for GPA address in vtd_page_walk_level (debug only) - rebased to master (rather than Aviv's v6 series) and merged Aviv's series v6: picked patch 1 (as patch 1 in this series), dropped patch 2, re-wrote patch 3 (as patch 17 of this series). - picked up two more bugfix patches from Jason's DMAR series - picked up the following patch as well: "[PATCH v3] intel_iommu: allow dynamic switch of IOMMU region" This RFC series is a re-work for Aviv B.D.'s vfio enablement series with vt-d: https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01452.html Aviv has done a great job there, and what we still lack there are mostly the following: (1) VFIO got duplicated IOTLB notifications due to splitted VT-d IOMMU memory region. (2) VT-d still haven't provide a correct replay() mechanism (e.g., when IOMMU domain switches, things will broke). This series should have solved the above two issues. Online repo: https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement-v8 I would be glad to hear about any review comments for above patches. ========= Test Done ========= Build test passed for x86_64/arm/ppc64. Simply tested with x86_64, assigning two PCI devices to a single VM, boot the VM using: bin=x86_64-softmmu/qemu-system-x86_64 $bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \ -device intel-iommu,intremap=on,eim=off,caching-mode=on \ -netdev user,id=net0,hostfwd=tcp::5555-:22 \ -device virtio-net-pci,netdev=net0 \ -device vfio-pci,host=03:00.0 \ -device vfio-pci,host=02:00.0 \ -trace events=".trace.vfio" \ /var/lib/libvirt/images/vm1.qcow2 pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio vtd_page_walk* vtd_replay* vtd_inv_desc* Then, in the guest, run the following tool: https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c With parameter: ./vfio-bind-group 00:03.0 00:04.0 Check host side trace log, I can see pages are replayed and mapped in 00:04.0 device address space, like: ... vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001 vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x8000000000 vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 0x8000000000 vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 0x40000000 vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 0x200000 vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x22e25000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x22e12000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x22e2d000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x12a49000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x129bb000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x128db000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x12a80000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x12a7e000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x12b22000 mask 0xfff perm 3 vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x12b41000 mask 0xfff perm 3 ... ========= Todo List ========= - error reporting for the assigned devices (as Tianyu has mentioned) - per-domain address-space: A better solution in the future may be - we maintain one address space per IOMMU domain in the guest (so multiple devices can share a same address space if they are sharing the same IOMMU domains in the guest), rather than one address space per device (which is current implementation of vt-d). However that's a step further than this series, and let's see whether we can first provide a workable version of device assignment with vt-d protection. - don't need to notify IOTLB (psi/gsi/global) invalidations to devices that with ATS enabled - investigate when guest map page while mask contains existing mapped pages (e.g. map 12k-16k first, then map 0-12k) - coalesce unmap during page walk (currently, we send it once per page) - when do PSI for unmap, whether we can send one notify directly instead of walking over the page table? - more to come... Thanks, Jason Wang (1): intel_iommu: use the correct memory region for device IOTLB notification Peter Xu (8): memory: add section range info for IOMMU notifier memory: provide IOMMU_NOTIFIER_FOREACH macro memory: provide iommu_replay_all() memory: introduce memory_region_notify_one() memory: add MemoryRegionIOMMUOps.replay() callback intel_iommu: provide its own replay() callback intel_iommu: allow dynamic switch of IOMMU region intel_iommu: enable remote IOTLB hw/i386/intel_iommu.c | 442 +++++++++++++++++++++++++++++++++++++++-- hw/i386/intel_iommu_internal.h | 1 + hw/i386/trace-events | 10 +- hw/vfio/common.c | 12 +- hw/virtio/vhost.c | 10 +- include/exec/memory.h | 49 ++++- include/hw/i386/intel_iommu.h | 10 + memory.c | 52 ++++- 8 files changed, 552 insertions(+), 34 deletions(-) -- 2.7.4
On Thu, Apr 06, 2017 at 03:08:35PM +0800, Peter Xu wrote: > This is v8 of vt-d vfio enablement series. > > v8 > - remove patches 1-9 since merged already > - add David's r-b for all the patches > - add Aviv's s-o-b in the last patch > - rename iommu to iommu_dmar [Jason] > - rename last patch subject to "remote IOTLB" [Jason] > - pick up jason's two patches to fix vhost breakage I only see one (6/9) - is a patch missing or misattributed? > - let vhost leverage the new IOMMU notifier interface Which patch does this? > v7: > - for the two traces patches: Change subjects. Remove vtd_err() and > vtd_err_nonzero_rsvd() tracers, instead using standalone trace for > each of the places. Don't remove any DPRINTF() if there is no > replacement. [Jason] > - add r-b and a-b for Alex/David/Jason. > - in patch "intel_iommu: renaming gpa to iova where proper", convert > one more place where I missed [Jason] > - fix the place where I should use "~0ULL" not "~0" [Jason] > - squash patch 16 into 18 [Jason] > > v6: > - do unmap in all cases when replay [Jason] > - do global replay even if context entry is invalidated [Jason] > - when iommu reset, send unmap to all registered notifiers [Jason] > - use rcu read lock to protect the whole vfio_iommu_map_notify() > [Alex, Paolo] > > v5: > - fix patch 4 subject too long, and error spelling [Eric] > - add ack-by for alex in patch 1 [Alex] > - squashing patch 19/20 into patch 18 [Jason] > - fix comments in vtd_page_walk() [Jason] > - remove all error_report() [Jason] > - add comment for patch 18, mention about that enabled vhost without > ATS as well [Jason] > - remove skipped debug thing during page walk [Jason] > - remove duplicated page walk trace [Jason] > - some tunings in vtd_address_space_unmap(), to provide correct iova > and addr_mask. For this, I tuned this patch as well a bit: > "memory: add section range info for IOMMU notifier" > to loosen the range check > > v4: > - convert all error_report()s into traces (in the two patches that did > that) > - rebased to Jason's DMAR series (master + one more patch: > "[PATCH V4 net-next] vhost_net: device IOTLB support") > - let vhost use the new api iommu_notifier_init() so it won't break > vhost dmar [Jason] > - touch commit message of the patch: > "intel_iommu: provide its own replay() callback" > old replay is not a dead loop, but it will just consume lots of time > [Jason] > - add comment for patch: > "intel_iommu: do replay when context invalidate" > telling why replay won't be a problem even without CM=1 [Jason] > - remove a useless comment line [Jason] > - remove dmar_enabled parameter for vtd_switch_address_space() and > vtd_switch_address_space_all() [Mst, Jason] > - merged the vfio patches in, to support unmap of big ranges at the > beginning ("[PATCH RFC 0/3] vfio: allow to notify unmap for very big > region") > - using caching_mode instead of cache_mode_enabled, and "caching-mode" > instead of "cache-mode" [Kevin] > - when receive context entry invalidation, we unmap the entire region > first, then replay [Alex] > - fix commit message for patch: > "intel_iommu: simplify irq region translation" [Kevin] > - handle domain/global invalidation, and notify where proper [Jason, > Kevin] > > v3: > - fix style error reported by patchew > - fix comment in domain switch patch: use "IOMMU address space" rather > than "IOMMU region" [Kevin] > - add ack-by for Paolo in patch: > "memory: add section range info for IOMMU notifier" > (this is seperately collected besides this thread) > - remove 3 patches which are merged already (from Jason) > - rebase to master b6c0897 > > v2: > - change comment for "end" parameter in vtd_page_walk() [Tianyu] > - change comment for "a iova" to "an iova" [Yi] > - fix fault printed val for GPA address in vtd_page_walk_level (debug > only) > - rebased to master (rather than Aviv's v6 series) and merged Aviv's > series v6: picked patch 1 (as patch 1 in this series), dropped patch > 2, re-wrote patch 3 (as patch 17 of this series). > - picked up two more bugfix patches from Jason's DMAR series > - picked up the following patch as well: > "[PATCH v3] intel_iommu: allow dynamic switch of IOMMU region" > > This RFC series is a re-work for Aviv B.D.'s vfio enablement series > with vt-d: > > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01452.html > > Aviv has done a great job there, and what we still lack there are > mostly the following: > > (1) VFIO got duplicated IOTLB notifications due to splitted VT-d IOMMU > memory region. > > (2) VT-d still haven't provide a correct replay() mechanism (e.g., > when IOMMU domain switches, things will broke). > > This series should have solved the above two issues. > > Online repo: > > https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement-v8 > > I would be glad to hear about any review comments for above patches. > > ========= > Test Done > ========= > > Build test passed for x86_64/arm/ppc64. > > Simply tested with x86_64, assigning two PCI devices to a single VM, > boot the VM using: > > bin=x86_64-softmmu/qemu-system-x86_64 > $bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \ > -device intel-iommu,intremap=on,eim=off,caching-mode=on \ > -netdev user,id=net0,hostfwd=tcp::5555-:22 \ > -device virtio-net-pci,netdev=net0 \ > -device vfio-pci,host=03:00.0 \ > -device vfio-pci,host=02:00.0 \ > -trace events=".trace.vfio" \ > /var/lib/libvirt/images/vm1.qcow2 > > pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio > vtd_page_walk* > vtd_replay* > vtd_inv_desc* > > Then, in the guest, run the following tool: > > https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c > > With parameter: > > ./vfio-bind-group 00:03.0 00:04.0 > > Check host side trace log, I can see pages are replayed and mapped in > 00:04.0 device address space, like: > > ... > vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001 > vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x8000000000 > vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 0x8000000000 > vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 0x40000000 > vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 0x200000 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x22e25000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x22e12000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x22e2d000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x12a49000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x129bb000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x128db000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x12a80000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x12a7e000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x12b22000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x12b41000 mask 0xfff perm 3 > ... > > ========= > Todo List > ========= > > - error reporting for the assigned devices (as Tianyu has mentioned) > > - per-domain address-space: A better solution in the future may be - > we maintain one address space per IOMMU domain in the guest (so > multiple devices can share a same address space if they are sharing > the same IOMMU domains in the guest), rather than one address space > per device (which is current implementation of vt-d). However that's > a step further than this series, and let's see whether we can first > provide a workable version of device assignment with vt-d > protection. > > - don't need to notify IOTLB (psi/gsi/global) invalidations to devices > that with ATS enabled > > - investigate when guest map page while mask contains existing mapped > pages (e.g. map 12k-16k first, then map 0-12k) > > - coalesce unmap during page walk (currently, we send it once per > page) > > - when do PSI for unmap, whether we can send one notify directly > instead of walking over the page table? > > - more to come... > > Thanks, > > Jason Wang (1): > intel_iommu: use the correct memory region for device IOTLB > notification > > Peter Xu (8): > memory: add section range info for IOMMU notifier > memory: provide IOMMU_NOTIFIER_FOREACH macro > memory: provide iommu_replay_all() > memory: introduce memory_region_notify_one() > memory: add MemoryRegionIOMMUOps.replay() callback > intel_iommu: provide its own replay() callback > intel_iommu: allow dynamic switch of IOMMU region > intel_iommu: enable remote IOTLB > > hw/i386/intel_iommu.c | 442 +++++++++++++++++++++++++++++++++++++++-- > hw/i386/intel_iommu_internal.h | 1 + > hw/i386/trace-events | 10 +- > hw/vfio/common.c | 12 +- > hw/virtio/vhost.c | 10 +- > include/exec/memory.h | 49 ++++- > include/hw/i386/intel_iommu.h | 10 + > memory.c | 52 ++++- > 8 files changed, 552 insertions(+), 34 deletions(-) > > -- > 2.7.4
On Thu, Apr 06, 2017 at 02:53:46PM +0300, Michael S. Tsirkin wrote: > On Thu, Apr 06, 2017 at 03:08:35PM +0800, Peter Xu wrote: > > This is v8 of vt-d vfio enablement series. > > > > v8 > > - remove patches 1-9 since merged already > > - add David's r-b for all the patches > > - add Aviv's s-o-b in the last patch > > - rename iommu to iommu_dmar [Jason] > > - rename last patch subject to "remote IOTLB" [Jason] > > - pick up jason's two patches to fix vhost breakage > > I only see one (6/9) - is a patch missing or misattributed? Oh sorry I should say "jason's one patch". The other patch has already been merged upstream, which is 375f74f47 ("vhost: generalize iommu memory region"). > > > - let vhost leverage the new IOMMU notifier interface > > Which patch does this? The first one ("memory: add section range info for IOMMU notifier"). Thanks, -- peterx
On Thu, Apr 06, 2017 at 03:08:35PM +0800, Peter Xu wrote: > This is v8 of vt-d vfio enablement series. > > v8 > - remove patches 1-9 since merged already > - add David's r-b for all the patches > - add Aviv's s-o-b in the last patch > - rename iommu to iommu_dmar [Jason] > - rename last patch subject to "remote IOTLB" [Jason] > - pick up jason's two patches to fix vhost breakage > - let vhost leverage the new IOMMU notifier interface Looks good to me except I'm still wondering why there's a single patch by Jason. Marcel - you mentioned that you intend to try to open and maintain pci-next. Would you like to pick this one up? Otherwise, pls repost after 2.10. > v7: > - for the two traces patches: Change subjects. Remove vtd_err() and > vtd_err_nonzero_rsvd() tracers, instead using standalone trace for > each of the places. Don't remove any DPRINTF() if there is no > replacement. [Jason] > - add r-b and a-b for Alex/David/Jason. > - in patch "intel_iommu: renaming gpa to iova where proper", convert > one more place where I missed [Jason] > - fix the place where I should use "~0ULL" not "~0" [Jason] > - squash patch 16 into 18 [Jason] > > v6: > - do unmap in all cases when replay [Jason] > - do global replay even if context entry is invalidated [Jason] > - when iommu reset, send unmap to all registered notifiers [Jason] > - use rcu read lock to protect the whole vfio_iommu_map_notify() > [Alex, Paolo] > > v5: > - fix patch 4 subject too long, and error spelling [Eric] > - add ack-by for alex in patch 1 [Alex] > - squashing patch 19/20 into patch 18 [Jason] > - fix comments in vtd_page_walk() [Jason] > - remove all error_report() [Jason] > - add comment for patch 18, mention about that enabled vhost without > ATS as well [Jason] > - remove skipped debug thing during page walk [Jason] > - remove duplicated page walk trace [Jason] > - some tunings in vtd_address_space_unmap(), to provide correct iova > and addr_mask. For this, I tuned this patch as well a bit: > "memory: add section range info for IOMMU notifier" > to loosen the range check > > v4: > - convert all error_report()s into traces (in the two patches that did > that) > - rebased to Jason's DMAR series (master + one more patch: > "[PATCH V4 net-next] vhost_net: device IOTLB support") > - let vhost use the new api iommu_notifier_init() so it won't break > vhost dmar [Jason] > - touch commit message of the patch: > "intel_iommu: provide its own replay() callback" > old replay is not a dead loop, but it will just consume lots of time > [Jason] > - add comment for patch: > "intel_iommu: do replay when context invalidate" > telling why replay won't be a problem even without CM=1 [Jason] > - remove a useless comment line [Jason] > - remove dmar_enabled parameter for vtd_switch_address_space() and > vtd_switch_address_space_all() [Mst, Jason] > - merged the vfio patches in, to support unmap of big ranges at the > beginning ("[PATCH RFC 0/3] vfio: allow to notify unmap for very big > region") > - using caching_mode instead of cache_mode_enabled, and "caching-mode" > instead of "cache-mode" [Kevin] > - when receive context entry invalidation, we unmap the entire region > first, then replay [Alex] > - fix commit message for patch: > "intel_iommu: simplify irq region translation" [Kevin] > - handle domain/global invalidation, and notify where proper [Jason, > Kevin] > > v3: > - fix style error reported by patchew > - fix comment in domain switch patch: use "IOMMU address space" rather > than "IOMMU region" [Kevin] > - add ack-by for Paolo in patch: > "memory: add section range info for IOMMU notifier" > (this is seperately collected besides this thread) > - remove 3 patches which are merged already (from Jason) > - rebase to master b6c0897 > > v2: > - change comment for "end" parameter in vtd_page_walk() [Tianyu] > - change comment for "a iova" to "an iova" [Yi] > - fix fault printed val for GPA address in vtd_page_walk_level (debug > only) > - rebased to master (rather than Aviv's v6 series) and merged Aviv's > series v6: picked patch 1 (as patch 1 in this series), dropped patch > 2, re-wrote patch 3 (as patch 17 of this series). > - picked up two more bugfix patches from Jason's DMAR series > - picked up the following patch as well: > "[PATCH v3] intel_iommu: allow dynamic switch of IOMMU region" > > This RFC series is a re-work for Aviv B.D.'s vfio enablement series > with vt-d: > > https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01452.html > > Aviv has done a great job there, and what we still lack there are > mostly the following: > > (1) VFIO got duplicated IOTLB notifications due to splitted VT-d IOMMU > memory region. > > (2) VT-d still haven't provide a correct replay() mechanism (e.g., > when IOMMU domain switches, things will broke). > > This series should have solved the above two issues. > > Online repo: > > https://github.com/xzpeter/qemu/tree/vtd-vfio-enablement-v8 > > I would be glad to hear about any review comments for above patches. > > ========= > Test Done > ========= > > Build test passed for x86_64/arm/ppc64. > > Simply tested with x86_64, assigning two PCI devices to a single VM, > boot the VM using: > > bin=x86_64-softmmu/qemu-system-x86_64 > $bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \ > -device intel-iommu,intremap=on,eim=off,caching-mode=on \ > -netdev user,id=net0,hostfwd=tcp::5555-:22 \ > -device virtio-net-pci,netdev=net0 \ > -device vfio-pci,host=03:00.0 \ > -device vfio-pci,host=02:00.0 \ > -trace events=".trace.vfio" \ > /var/lib/libvirt/images/vm1.qcow2 > > pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio > vtd_page_walk* > vtd_replay* > vtd_inv_desc* > > Then, in the guest, run the following tool: > > https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c > > With parameter: > > ./vfio-bind-group 00:03.0 00:04.0 > > Check host side trace log, I can see pages are replayed and mapped in > 00:04.0 device address space, like: > > ... > vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001 > vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x8000000000 > vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 0x8000000000 > vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 0x40000000 > vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 0x200000 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x22e25000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x22e12000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x22e2d000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x12a49000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x129bb000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x128db000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x12a80000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x12a7e000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x12b22000 mask 0xfff perm 3 > vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x12b41000 mask 0xfff perm 3 > ... > > ========= > Todo List > ========= > > - error reporting for the assigned devices (as Tianyu has mentioned) > > - per-domain address-space: A better solution in the future may be - > we maintain one address space per IOMMU domain in the guest (so > multiple devices can share a same address space if they are sharing > the same IOMMU domains in the guest), rather than one address space > per device (which is current implementation of vt-d). However that's > a step further than this series, and let's see whether we can first > provide a workable version of device assignment with vt-d > protection. > > - don't need to notify IOTLB (psi/gsi/global) invalidations to devices > that with ATS enabled > > - investigate when guest map page while mask contains existing mapped > pages (e.g. map 12k-16k first, then map 0-12k) > > - coalesce unmap during page walk (currently, we send it once per > page) > > - when do PSI for unmap, whether we can send one notify directly > instead of walking over the page table? > > - more to come... > > Thanks, > > Jason Wang (1): > intel_iommu: use the correct memory region for device IOTLB > notification > > Peter Xu (8): > memory: add section range info for IOMMU notifier > memory: provide IOMMU_NOTIFIER_FOREACH macro > memory: provide iommu_replay_all() > memory: introduce memory_region_notify_one() > memory: add MemoryRegionIOMMUOps.replay() callback > intel_iommu: provide its own replay() callback > intel_iommu: allow dynamic switch of IOMMU region > intel_iommu: enable remote IOTLB > > hw/i386/intel_iommu.c | 442 +++++++++++++++++++++++++++++++++++++++-- > hw/i386/intel_iommu_internal.h | 1 + > hw/i386/trace-events | 10 +- > hw/vfio/common.c | 12 +- > hw/virtio/vhost.c | 10 +- > include/exec/memory.h | 49 ++++- > include/hw/i386/intel_iommu.h | 10 + > memory.c | 52 ++++- > 8 files changed, 552 insertions(+), 34 deletions(-) > > -- > 2.7.4
On Thu, Apr 06, 2017 at 03:00:28PM +0300, Michael S. Tsirkin wrote: > On Thu, Apr 06, 2017 at 03:08:35PM +0800, Peter Xu wrote: > > This is v8 of vt-d vfio enablement series. > > > > v8 > > - remove patches 1-9 since merged already > > - add David's r-b for all the patches > > - add Aviv's s-o-b in the last patch > > - rename iommu to iommu_dmar [Jason] > > - rename last patch subject to "remote IOTLB" [Jason] > > - pick up jason's two patches to fix vhost breakage > > - let vhost leverage the new IOMMU notifier interface > > Looks good to me except I'm still wondering why there's a single patch > by Jason. Answered in the other thread (the other one has been picked up already). > Marcel - you mentioned that you intend to try to open and > maintain pci-next. Would you like to pick this one up? Otherwise, pls > repost after 2.10. Thanks, -- peterx
© 2016 - 2024 Red Hat, Inc.