include/sysemu/reset.h | 5 ++++ hw/core/reset.c | 67 ++++++++++++++++++++++++++++++------------ hw/i386/intel_iommu.c | 56 +++++++++++++++++++++++++++++++++-- 3 files changed, 107 insertions(+), 21 deletions(-)
From: Peter Xu <peterx@redhat.com> There're issue reported that when syetem_reset the VM with an intel iommu device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error. https://issues.redhat.com/browse/RHEL-7188 Alex quickly spot a possible issue on ordering of device resets. It's verified by our QE team then that it is indeed the root cause of the problem. Consider when vIOMMU is reset before a VFIO device in a system reset: the device can be doing DMAs even if the vIOMMU is gone; in this specific context it means the shadow mapping can already be completely destroyed. Host will see these DMAs as malicious and report. To fix it, we'll need to make sure all devices under the vIOMMU device hierachy will be reset before the vIOMMU itself. There's plenty of trick inside, one can get those by reading the last patch. I didn't check other vIOMMUs, but this series should fix the issue for VT-d as of now. The solution can be slightly ugly, but a beautiful one can be very non-trivial. Review comments welcomed, thanks. Peter Xu (4): reset: qemu_register_reset_one() reset: Allow multiple stages of system resets intel_iommu: Tear down address spaces before IOMMU reset intel_iommu: Reset vIOMMU at the last stage of system reset include/sysemu/reset.h | 5 ++++ hw/core/reset.c | 67 ++++++++++++++++++++++++++++++------------ hw/i386/intel_iommu.c | 56 +++++++++++++++++++++++++++++++++-- 3 files changed, 107 insertions(+), 21 deletions(-) -- 2.43.0
Hi, On 1/17/24 10:15 AM, peterx@redhat.com wrote: > From: Peter Xu <peterx@redhat.com> > > There're issue reported that when syetem_reset the VM with an intel iommu > device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error. > > https://issues.redhat.com/browse/RHEL-7188 > > Alex quickly spot a possible issue on ordering of device resets. > > It's verified by our QE team then that it is indeed the root cause of the > problem. Consider when vIOMMU is reset before a VFIO device in a system > reset: the device can be doing DMAs even if the vIOMMU is gone; in this > specific context it means the shadow mapping can already be completely > destroyed. Host will see these DMAs as malicious and report. > > To fix it, we'll need to make sure all devices under the vIOMMU device > hierachy will be reset before the vIOMMU itself. There's plenty of trick > inside, one can get those by reading the last patch. I have a case with intel iommu and vhost-net where I see that the vIOMMU gets disabled by the guest before vhost_dev_stop() causing some spurious lookup failures. This happens when rebooting the guest (see [PATCH] hw/virtio/vhost: Disable IOTLB callbacks when IOMMU gets disabled <https://lore.kernel.org/all/20250120173339.865681-1-eric.auger@redhat.com/> https://lore.kernel.org/all/20250120173339.865681-1-eric.auger@redhat.com/ for more context). I observe that 1) the guest disables the IOMMU through the vtd_handle_gcmd_write 2) vtd_reset (initiated from qemu_system_reset) 3) vhost_dev_stop (initiated from qemu_system_reset) So I can effectively see the viommu is reset before vhost-net stop. 2) is already an issue that looks the same as the one addressed in this series. Now I am also in trouble wrt 1). I don't know yet which chain of events causes the VTD GCMD TE bit to be written but this would also cause spurious vhost IOLTB misses. I haven't seen any follow-up on this series. Is anyone still looking at this issue? Peter gave some guidance about the way to rework the reset chain. Is it still up to date? Thanks Eric > > I didn't check other vIOMMUs, but this series should fix the issue for VT-d > as of now. The solution can be slightly ugly, but a beautiful one can be > very non-trivial. > > Review comments welcomed, thanks. > > Peter Xu (4): > reset: qemu_register_reset_one() > reset: Allow multiple stages of system resets > intel_iommu: Tear down address spaces before IOMMU reset > intel_iommu: Reset vIOMMU at the last stage of system reset > > include/sysemu/reset.h | 5 ++++ > hw/core/reset.c | 67 ++++++++++++++++++++++++++++++------------ > hw/i386/intel_iommu.c | 56 +++++++++++++++++++++++++++++++++-- > 3 files changed, 107 insertions(+), 21 deletions(-) >
On Thu, Jan 23, 2025 at 10:16:23AM +0100, Eric Auger wrote: > I haven't seen any follow-up on this series. Is anyone still looking at > this issue? Peter gave some guidance about the way to rework the reset > chain. Is it still up to date? I didn't continue looking at this issue since that time (and also stopped working on vIOMMU stuff). No plan to continue from my side.. I suppose nobody else has either, or I should have got some email like this. :) It may not be uptodate indeed, so may worth rechecking its validity. Thanks, -- Peter Xu
Hi Peter, On 1/23/25 6:57 PM, Peter Xu wrote: > On Thu, Jan 23, 2025 at 10:16:23AM +0100, Eric Auger wrote: >> I haven't seen any follow-up on this series. Is anyone still looking at >> this issue? Peter gave some guidance about the way to rework the reset >> chain. Is it still up to date? > I didn't continue looking at this issue since that time (and also stopped > working on vIOMMU stuff). No plan to continue from my side.. I suppose > nobody else has either, or I should have got some email like this. :) > > It may not be uptodate indeed, so may worth rechecking its validity. thanks for the update. I will try to pursue the efforts then Eric > > Thanks, >
Hi Peter, On 1/23/25 6:57 PM, Peter Xu wrote: > On Thu, Jan 23, 2025 at 10:16:23AM +0100, Eric Auger wrote: >> I haven't seen any follow-up on this series. Is anyone still looking at >> this issue? Peter gave some guidance about the way to rework the reset >> chain. Is it still up to date? > I didn't continue looking at this issue since that time (and also stopped > working on vIOMMU stuff). No plan to continue from my side.. I suppose > nobody else has either, or I should have got some email like this. :) > > It may not be uptodate indeed, so may worth rechecking its validity. > > Thanks, > Than you for the confirmation Eric
Hi Peter, On 1/17/24 10:15, peterx@redhat.com wrote: > From: Peter Xu <peterx@redhat.com> > > There're issue reported that when syetem_reset the VM with an intel iommu system_reset > device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error. > > https://issues.redhat.com/browse/RHEL-7188 > > Alex quickly spot a possible issue on ordering of device resets. > > It's verified by our QE team then that it is indeed the root cause of the > problem. Consider when vIOMMU is reset before a VFIO device in a system > reset: the device can be doing DMAs even if the vIOMMU is gone; in this > specific context it means the shadow mapping can already be completely > destroyed. Host will see these DMAs as malicious and report. That's curious we did not get this earlier? > > To fix it, we'll need to make sure all devices under the vIOMMU device > hierachy will be reset before the vIOMMU itself. There's plenty of trick > inside, one can get those by reading the last patch. Not sure what you meant here ;-) > > I didn't check other vIOMMUs, but this series should fix the issue for VT-d > as of now. The solution can be slightly ugly, but a beautiful one can be > very non-trivial. > > Review comments welcomed, thanks. Thanks Eric > > Peter Xu (4): > reset: qemu_register_reset_one() > reset: Allow multiple stages of system resets > intel_iommu: Tear down address spaces before IOMMU reset > intel_iommu: Reset vIOMMU at the last stage of system reset > > include/sysemu/reset.h | 5 ++++ > hw/core/reset.c | 67 ++++++++++++++++++++++++++++++------------ > hw/i386/intel_iommu.c | 56 +++++++++++++++++++++++++++++++++-- > 3 files changed, 107 insertions(+), 21 deletions(-) >
On Wed, Jan 17, 2024 at 11:29:08AM +0100, Eric Auger wrote: > Hi Peter, Hi, Eric, Thanks for the reviews! > > On 1/17/24 10:15, peterx@redhat.com wrote: > > From: Peter Xu <peterx@redhat.com> > > > > There're issue reported that when syetem_reset the VM with an intel iommu > system_reset > > device and MT2892 PF(mlx5_core driver), the host kernel throws DMAR error. > > > > https://issues.redhat.com/browse/RHEL-7188 > > > > Alex quickly spot a possible issue on ordering of device resets. > > > > It's verified by our QE team then that it is indeed the root cause of the > > problem. Consider when vIOMMU is reset before a VFIO device in a system > > reset: the device can be doing DMAs even if the vIOMMU is gone; in this > > specific context it means the shadow mapping can already be completely > > destroyed. Host will see these DMAs as malicious and report. > That's curious we did not get this earlier? I sincerely don't know. It could be that we just didn't test much on system resets. Or, we could have overlooked the host dmesgs; after all the error messages can be benign from functional pov. > > > > To fix it, we'll need to make sure all devices under the vIOMMU device > > hierachy will be reset before the vIOMMU itself. There's plenty of trick > > inside, one can get those by reading the last patch. > Not sure what you meant here ;-) I meant "how to make sure all the vIOMMU managed devices will be reset before the vIOMMU" is tricky on the implementation. I didn't reference any of those in the cover letter, because I think I stated mostly in patch 4, I want to reference that patch for the details. Since I think it's very tricky, I left that major comment in the code to persist. Thanks, -- Peter Xu
© 2016 - 2026 Red Hat, Inc.