From: YiFei Zhu <zhuyifei@google.com>
Iterate through all the IOAS objects and the underlying hwpt_paging
objects. Persist each iommu domain using API iommu_domain_preserve.
This is temporary as only the domains attached to the persisted devices
need to preserved.
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
---
drivers/iommu/iommufd/liveupdate.c | 47 ++++++++++++++++++++++++++++++
1 file changed, 47 insertions(+)
diff --git a/drivers/iommu/iommufd/liveupdate.c b/drivers/iommu/iommufd/liveupdate.c
index 1bdd5a82af90..0af0c6fadff1 100644
--- a/drivers/iommu/iommufd/liveupdate.c
+++ b/drivers/iommu/iommufd/liveupdate.c
@@ -8,9 +8,52 @@
#include <linux/kexec_handover.h>
#include <linux/liveupdate.h>
#include <linux/mm.h>
+#include <linux/pci.h>
#include "iommufd_private.h"
+static int iommufd_save_ioas(struct iommufd_ctx *ictx,
+ struct iommufd_lu *iommufd_lu)
+{
+ struct iommufd_hwpt_paging *hwpt_paging;
+ struct iommufd_ioas *ioas = NULL;
+ struct iommufd_object *obj;
+ unsigned long index;
+ int rc;
+
+ /* Iterate each ioas. */
+ xa_for_each(&ictx->objects, index, obj) {
+ if (obj->type != IOMMUFD_OBJ_IOAS)
+ continue;
+
+ ioas = (struct iommufd_ioas *)obj;
+ mutex_lock(&ioas->mutex);
+
+ /*
+ * TODO: Iterate over each device of this iommufd and only save
+ * hwpt/domain if the device is persisted.
+ */
+ list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) {
+ if (!hwpt_paging->common.domain)
+ continue;
+
+ rc = iommu_domain_preserve(hwpt_paging->common.domain);
+ if (rc)
+ goto err;
+ }
+
+ mutex_unlock(&ioas->mutex);
+ ioas = NULL;
+ }
+
+ return 0;
+
+err:
+ if (ioas)
+ mutex_unlock(&ioas->mutex);
+ return rc;
+}
+
static int iommufd_liveupdate_prepare(struct liveupdate_file_handler *handler,
struct file *file, u64 *data)
{
@@ -33,6 +76,10 @@ static int iommufd_liveupdate_prepare(struct liveupdate_file_handler *handler,
iommufd_lu = folio_address(folio_lu);
+ rc = iommufd_save_ioas(ictx, iommufd_lu);
+ if (rc)
+ goto err_folio_put;
+
rc = kho_preserve_folio(folio_lu);
if (rc)
goto err_folio_put;
--
2.51.0.536.g15c5d4f767-goog
On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > + struct iommufd_lu *iommufd_lu) > +{ > + struct iommufd_hwpt_paging *hwpt_paging; > + struct iommufd_ioas *ioas = NULL; > + struct iommufd_object *obj; > + unsigned long index; > + int rc; > + > + /* Iterate each ioas. */ > + xa_for_each(&ictx->objects, index, obj) { > + if (obj->type != IOMMUFD_OBJ_IOAS) > + continue; Wrong locking > + > + ioas = (struct iommufd_ioas *)obj; > + mutex_lock(&ioas->mutex); > + > + /* > + * TODO: Iterate over each device of this iommufd and only save > + * hwpt/domain if the device is persisted. > + */ > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > + if (!hwpt_paging->common.domain) > + continue; I don't think this should be automatic. The user should directly serialize/unserialize HWPTs by ID. Jason
On Mon, Sep 29, 2025 at 12:00 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > > + struct iommufd_lu *iommufd_lu) > > +{ > > + struct iommufd_hwpt_paging *hwpt_paging; > > + struct iommufd_ioas *ioas = NULL; > > + struct iommufd_object *obj; > > + unsigned long index; > > + int rc; > > + > > + /* Iterate each ioas. */ > > + xa_for_each(&ictx->objects, index, obj) { > > + if (obj->type != IOMMUFD_OBJ_IOAS) > > + continue; > > Wrong locking > > > + > > + ioas = (struct iommufd_ioas *)obj; > > + mutex_lock(&ioas->mutex); > > + > > + /* > > + * TODO: Iterate over each device of this iommufd and only save > > + * hwpt/domain if the device is persisted. > > + */ > > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > > + if (!hwpt_paging->common.domain) > > + continue; > > I don't think this should be automatic. The user should directly > serialize/unserialize HWPTs by ID. Why not? Live Updated uAPI is handled through FDs, and both iommufd and vfiofd have to be preserved; I assume we can automatically determine the hwpt to be preserved through dependencies. Why would we delegate this to the user? Pasha
On Tue, Sep 30, 2025 at 09:07:48AM -0400, Pasha Tatashin wrote: > On Mon, Sep 29, 2025 at 12:00 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > > > On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > > > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > > > + struct iommufd_lu *iommufd_lu) > > > +{ > > > + struct iommufd_hwpt_paging *hwpt_paging; > > > + struct iommufd_ioas *ioas = NULL; > > > + struct iommufd_object *obj; > > > + unsigned long index; > > > + int rc; > > > + > > > + /* Iterate each ioas. */ > > > + xa_for_each(&ictx->objects, index, obj) { > > > + if (obj->type != IOMMUFD_OBJ_IOAS) > > > + continue; > > > > Wrong locking > > > > > + > > > + ioas = (struct iommufd_ioas *)obj; > > > + mutex_lock(&ioas->mutex); > > > + > > > + /* > > > + * TODO: Iterate over each device of this iommufd and only save > > > + * hwpt/domain if the device is persisted. > > > + */ > > > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > > > + if (!hwpt_paging->common.domain) > > > + continue; > > > > I don't think this should be automatic. The user should directly > > serialize/unserialize HWPTs by ID. > > Why not? Live Updated uAPI is handled through FDs, and both iommufd > and vfiofd have to be preserved; I assume we can automatically > determine the hwpt to be preserved through dependencies. Why would we > delegate this to the user? There are HWPTs outside the IOAS so it is inconsisent. We are not going to reconstruct the IOAS. The IDR ids of the HWPT may not be available on restore (we cannot make this ABI), so without userspace expressly labeling them and recovering the new IDR ids it doesn't work. Finally we expect to discard the preserved HWPTs and replace them we rebuilt ones at least as a first step. Userspace needs to sequence all of this.. Jason
On Tue, Sep 30, 2025 at 6:59 AM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Tue, Sep 30, 2025 at 09:07:48AM -0400, Pasha Tatashin wrote: > > On Mon, Sep 29, 2025 at 12:00 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > > > > > On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > > > > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > > > > + struct iommufd_lu *iommufd_lu) > > > > +{ > > > > + struct iommufd_hwpt_paging *hwpt_paging; > > > > + struct iommufd_ioas *ioas = NULL; > > > > + struct iommufd_object *obj; > > > > + unsigned long index; > > > > + int rc; > > > > + > > > > + /* Iterate each ioas. */ > > > > + xa_for_each(&ictx->objects, index, obj) { > > > > + if (obj->type != IOMMUFD_OBJ_IOAS) > > > > + continue; > > > > > > Wrong locking > > > > > > > + > > > > + ioas = (struct iommufd_ioas *)obj; > > > > + mutex_lock(&ioas->mutex); > > > > + > > > > + /* > > > > + * TODO: Iterate over each device of this iommufd and only save > > > > + * hwpt/domain if the device is persisted. > > > > + */ > > > > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > > > > + if (!hwpt_paging->common.domain) > > > > + continue; > > > > > > I don't think this should be automatic. The user should directly > > > serialize/unserialize HWPTs by ID. > > > > Why not? Live Updated uAPI is handled through FDs, and both iommufd > > and vfiofd have to be preserved; I assume we can automatically > > determine the hwpt to be preserved through dependencies. Why would we > > delegate this to the user? > > There are HWPTs outside the IOAS so it is inconsisent. This makes sense. But if I understand correctly a HWPT should be associated one way or another to a preserved device or IOAS. Also the nested ones will have parent HWPT. Can we not look at the dependencies here and find the HWPTs that need to preserved. > > We are not going to reconstruct the IOAS. > > The IDR ids of the HWPT may not be available on restore (we cannot > make this ABI), so without userspace expressly labeling them and > recovering the new IDR ids it doesn't work. > > Finally we expect to discard the preserved HWPTs and replace them we > rebuilt ones at least as a first step. Userspace needs to sequence all > of this.. But if we discard the old HWPTs and replace them with the new ones, we shouldn't need labeling of the old HWPTs? We would definitely need to sequence the replacement and discard of the old ones, but that can also be inferred through the dependencies between the new HWPTs? > > Jason
On Tue, Sep 30, 2025 at 01:02:31PM -0700, Samiullah Khawaja wrote: > > There are HWPTs outside the IOAS so it is inconsisent. > > This makes sense. But if I understand correctly a HWPT should be > associated one way or another to a preserved device or IOAS. Also the > nested ones will have parent HWPT. Can we not look at the dependencies > here and find the HWPTs that need to preserved. Maybe in some capacity, but I would say more of don't allow preserving things that depend on things not already preserved somehow. > > Finally we expect to discard the preserved HWPTs and replace them we > > rebuilt ones at least as a first step. Userspace needs to sequence all > > of this.. > > But if we discard the old HWPTs and replace them with the new ones, we > shouldn't need labeling of the old HWPTs? We would definitely need to > sequence the replacement and discard of the old ones, but that can > also be inferred through the dependencies between the new HWPTs? It depends how this ends up being designed and who is responsible to free the restored iommu_domain. The iommu core code should be restoring the iommu_domain as soon as the attached device is plugged in and attaching the preserved domain instead of something else during the device probe sequence This logic should not be in drivers. From there you either put the hwpt back into iommufd and have it free the iommu_domain when it destroys the hwpt Or you have the iommu core code free the iommu_domain at some point after iommufd has replaced the attachment with a new iommu_domain? I'm not sure which is a better option.. Also there is an interesting behavior to note that if the iommu driver restores a domain then it will also prevent a non-vfio driver from binding to that device. Jason
On Tue, Sep 30, 2025 at 2:05 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Tue, Sep 30, 2025 at 01:02:31PM -0700, Samiullah Khawaja wrote: > > > There are HWPTs outside the IOAS so it is inconsisent. > > > > This makes sense. But if I understand correctly a HWPT should be > > associated one way or another to a preserved device or IOAS. Also the > > nested ones will have parent HWPT. Can we not look at the dependencies > > here and find the HWPTs that need to preserved. > > Maybe in some capacity, but I would say more of don't allow preserving > things that depend on things not already preserved somehow. I agree. I think this makes sense. Users can explicitly indicate that they want to preserve HWPTs and iommufd can enforce the dependencies. > > > > Finally we expect to discard the preserved HWPTs and replace them we > > > rebuilt ones at least as a first step. Userspace needs to sequence all > > > of this.. > > > > But if we discard the old HWPTs and replace them with the new ones, we > > shouldn't need labeling of the old HWPTs? We would definitely need to > > sequence the replacement and discard of the old ones, but that can > > also be inferred through the dependencies between the new HWPTs? > > It depends how this ends up being designed and who is responsible to > free the restored iommu_domain. Agreed. I think it depends on how much is restored from the previous kernel. Discussed further below inline. > > The iommu core code should be restoring the iommu_domain as soon as > the attached device is plugged in and attaching the preserved domain > instead of something else during the device probe sequence > > This logic should not be in drivers. > > From there you either put the hwpt back into iommufd and have it free > the iommu_domain when it destroys the hwpt > > Or you have the iommu core code free the iommu_domain at some point > after iommufd has replaced the attachment with a new iommu_domain? But we cannot do the replacement during domain attachment because userspace might not have fully prepared the new domain with all the required DMA mappings. Replace during LUO finish? This is actually very close to what I had in mind for the "Hotswap" model. My thought was: 1. During boot, the IOMMU core sets up a default domain but doesn't program the context entries for the preserved device. The hardware keeps on using the old preserved tables. 2. Userspace restores the iommufd, creates a new HWPT/domain and populates mappings. 3. On FINISH, the IOMMU core updates the context entries of preserved devices to point to the new domain. I have a sequence diagram for this in the cover letter also. I understand the desire to have the preserved iommu domain be restored during boot so the device has a default domain and there is an owner of the attached restored domain, but that would prevent the iommfud from cooking a clean new domain. Maybe we can refine the "Hotswap" model I had in mind. Basically on boot the core restores the preserved iommu domain, but core lets iommufd attach a new domain with preserved devices without replacing the underlying context entries? The core replaces the context entries when the iommufd indicates that the domain is fully prepared (during luo finish). > > I'm not sure which is a better option.. > > Also there is an interesting behavior to note that if the iommu driver > restores a domain then it will also prevent a non-vfio driver from > binding to that device. Agreed. I think in the "Hotswap" approach I discussed above, if we don't restore the domain, the core can just commit the context entries of the new default domain if a non-vfio driver is bound to the device. > > Jason
On Tue, Sep 30, 2025 at 04:15:43PM -0700, Samiullah Khawaja wrote: > > The iommu core code should be restoring the iommu_domain as soon as > > the attached device is plugged in and attaching the preserved domain > > instead of something else during the device probe sequence > > > > This logic should not be in drivers. > > > > From there you either put the hwpt back into iommufd and have it free > > the iommu_domain when it destroys the hwpt > > > > Or you have the iommu core code free the iommu_domain at some point > > after iommufd has replaced the attachment with a new iommu_domain? > > But we cannot do the replacement during domain attachment because > userspace might not have fully prepared the new domain with all the > required DMA mappings. Replace during LUO finish? The idea is the kernel will restore the iommu_domain during early boot in the iommu_core and then attach it. This should "rewrite" the IOMMU HW context for that device with identical content. Drivers must be enhanced to support this hitless rewrite (AMD and ARM are already done). At this point the kernel is operating normally with a normal domain and a normal driver, no special luo stuff. Later iommufd will come along and establish a HWPT that has an identical translation. Then we replace the luo domain with the new HWPT and free the luo domain. > 1. During boot, the IOMMU core sets up a default domain but doesn't > program the context entries for the preserved device. The hardware > keeps on using the old preserved tables. When the iommu driver first starts up it can take over the context memory from the predecessor kernel. But it has to go through it and clear out most of the context entries. Only context entries belonging to devices marked for preservation should be kept unchanged. Later we probe the struct device to the iommu and do as I said above to restore consistency. > 2. Userspace restores the iommufd, creates a new HWPT/domain and > populates mappings. Yes > 3. On FINISH, the IOMMU core updates the context entries of preserved > devices to point to the new domain. No, finish should never do anything on the restore path, IMHO. User should directly attach the newly created HWPT when it is ready. > I understand the desire to have the preserved iommu domain be restored > during boot so the device has a default domain and there is an owner > of the attached restored domain, but that would prevent the iommfud > from cooking a clean new domain. The "default domain" is the "DMA API domain" and it has to be created and setup always. The change here is instead of attaching the default domain we attach the luo restored domain at early boot. This sets the device into an "owned" mode but vfio can still attach and nothing prevents iommufd from building a new hwpt and attaching it. > Maybe we can refine the "Hotswap" model I had in mind. Basically on > boot the core restores the preserved iommu domain, but core lets > iommufd attach a new domain with preserved devices without replacing > the underlying context entries? Replace the context entries. If everything is working properly the preserved domain should compute an identical context entry, so no reason to not just "replace" it which should be a NOP. > > Also there is an interesting behavior to note that if the iommu driver > > restores a domain then it will also prevent a non-vfio driver from > > binding to that device. > > Agreed. I think in the "Hotswap" approach I discussed above, if we > don't restore the domain, the core can just commit the context entries > of the new default domain if a non-vfio driver is bound to the device. As I said, the owned nature of the device will prevent attaching a non-vfio driver in the first place. So the only path forward for userspace is to attach vfio, and then iommufd should take over that luo restored iommu_domain and eventually free it. You might consider that finish should de-own the device if vfio didn't claim it. But that is a bit tricky since it needs a FLR before the domains can be switched around. Jason
> > 3. On FINISH, the IOMMU core updates the context entries of preserved > > devices to point to the new domain. > > No, finish should never do anything on the restore path, IMHO. User > should directly attach the newly created HWPT when it is ready. But, finish is our indicator that a particular session (VM) is out of blackout, and now we are free to do slow things, such as re-allocating/recreating page tables. Why start it before a VM is out of blackout?
On Tue, Sep 30, 2025 at 9:59 AM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Tue, Sep 30, 2025 at 09:07:48AM -0400, Pasha Tatashin wrote: > > On Mon, Sep 29, 2025 at 12:00 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > > > > > On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > > > > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > > > > + struct iommufd_lu *iommufd_lu) > > > > +{ > > > > + struct iommufd_hwpt_paging *hwpt_paging; > > > > + struct iommufd_ioas *ioas = NULL; > > > > + struct iommufd_object *obj; > > > > + unsigned long index; > > > > + int rc; > > > > + > > > > + /* Iterate each ioas. */ > > > > + xa_for_each(&ictx->objects, index, obj) { > > > > + if (obj->type != IOMMUFD_OBJ_IOAS) > > > > + continue; > > > > > > Wrong locking > > > > > > > + > > > > + ioas = (struct iommufd_ioas *)obj; > > > > + mutex_lock(&ioas->mutex); > > > > + > > > > + /* > > > > + * TODO: Iterate over each device of this iommufd and only save > > > > + * hwpt/domain if the device is persisted. > > > > + */ > > > > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > > > > + if (!hwpt_paging->common.domain) > > > > + continue; > > > > > > I don't think this should be automatic. The user should directly > > > serialize/unserialize HWPTs by ID. > > > > Why not? Live Updated uAPI is handled through FDs, and both iommufd > > and vfiofd have to be preserved; I assume we can automatically > > determine the hwpt to be preserved through dependencies. Why would we > > delegate this to the user? > > There are HWPTs outside the IOAS so it is inconsisent. > > We are not going to reconstruct the IOAS. > > The IDR ids of the HWPT may not be available on restore (we cannot > make this ABI), so without userspace expressly labeling them and > recovering the new IDR ids it doesn't work. > > Finally we expect to discard the preserved HWPTs and replace them we > rebuilt ones at least as a first step. Userspace needs to sequence all > of this.. The way LUOv4 is implemented, "LUO sessions" are always participating LU. Once a user adds file descriptors to a session, that session and its contents are automatically carried across multiple consecutive live updates. The user only needs to act if they explicitly want to remove an FD and opt-out of preservation, or close session. This is consistent and convenient for long-running VM that should survive by default. I was hoping for a similar "preserve by default" or "opt-in-once" model for iommufd objects that are put into the LUO session to avoid a flurry of IOCTLs to re-register before every single live update. On the other hand, userspace still has to issue IOCTLs after retrieval to bring the restored FDs and associated objects back to a workable state. Perhaps, we could do something like "Yes, I'm actively using this object again, so please preserve it if another live update happens." ? Pasha
On Tue, Sep 30, 2025 at 11:09:59AM -0400, Pasha Tatashin wrote: > > The way LUOv4 is implemented, "LUO sessions" are always participating > LU. Once a user adds file descriptors to a session, that session and > its contents are automatically carried across multiple consecutive > live updates. The user only needs to act if they explicitly want to > remove an FD and opt-out of preservation, or close session. This is > consistent and convenient for long-running VM that should survive by > default. I don't think this is a good idea. Each kernel should decide on its own what and how things get included and manage the labels, from scratch. If you do this then alot more stuff becomes ABI and I think it will turn into a huge PITA. The userspace already has to have the code to setup the luo if it is on a clean reboot - what is the point of not running that every time? Jason
On Mon, Sep 29, 2025 at 9:00 AM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > > + struct iommufd_lu *iommufd_lu) > > +{ > > + struct iommufd_hwpt_paging *hwpt_paging; > > + struct iommufd_ioas *ioas = NULL; > > + struct iommufd_object *obj; > > + unsigned long index; > > + int rc; > > + > > + /* Iterate each ioas. */ > > + xa_for_each(&ictx->objects, index, obj) { > > + if (obj->type != IOMMUFD_OBJ_IOAS) > > + continue; > > Wrong locking > > > + > > + ioas = (struct iommufd_ioas *)obj; > > + mutex_lock(&ioas->mutex); > > + > > + /* > > + * TODO: Iterate over each device of this iommufd and only save > > + * hwpt/domain if the device is persisted. > > + */ > > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > > + if (!hwpt_paging->common.domain) > > + continue; > > I don't think this should be automatic. The user should directly > serialize/unserialize HWPTs by ID. Interesting. So the user should be able to serialize/unserialize HWPTs before the Live Update PREPARE event? But what if a device was marked for preservation but the user never serialized the attached HWPT, would that be considered an error during LUO PREPARE or should iommufd serialize the remaining HWPTs here? > > Jason
On Mon, Sep 29, 2025 at 1:32 PM Samiullah Khawaja <skhawaja@google.com> wrote: > > On Mon, Sep 29, 2025 at 9:00 AM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > > > On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > > > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > > > + struct iommufd_lu *iommufd_lu) > > > +{ > > > + struct iommufd_hwpt_paging *hwpt_paging; > > > + struct iommufd_ioas *ioas = NULL; > > > + struct iommufd_object *obj; > > > + unsigned long index; > > > + int rc; > > > + > > > + /* Iterate each ioas. */ > > > + xa_for_each(&ictx->objects, index, obj) { > > > + if (obj->type != IOMMUFD_OBJ_IOAS) > > > + continue; > > > > Wrong locking > > > > > + > > > + ioas = (struct iommufd_ioas *)obj; > > > + mutex_lock(&ioas->mutex); > > > + > > > + /* > > > + * TODO: Iterate over each device of this iommufd and only save > > > + * hwpt/domain if the device is persisted. > > > + */ > > > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > > > + if (!hwpt_paging->common.domain) > > > + continue; > > > > I don't think this should be automatic. The user should directly > > serialize/unserialize HWPTs by ID. > Interesting. So the user should be able to serialize/unserialize HWPTs > before the Live Update PREPARE event? But what if a device was marked > for preservation but the user never serialized the attached HWPT, > would that be considered an error during LUO PREPARE or should iommufd > serialize the remaining HWPTs here? Users ~can~ serialize their sessions before system-wide prepare event. During prepare event all unserialized sessions and their FDs are going to be serialized anyways. Pasha > > > > Jason
On Mon, Sep 29, 2025 at 10:32:22AM -0700, Samiullah Khawaja wrote: > On Mon, Sep 29, 2025 at 9:00 AM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > > > On Sun, Sep 28, 2025 at 07:06:21PM +0000, Samiullah Khawaja wrote: > > > +static int iommufd_save_ioas(struct iommufd_ctx *ictx, > > > + struct iommufd_lu *iommufd_lu) > > > +{ > > > + struct iommufd_hwpt_paging *hwpt_paging; > > > + struct iommufd_ioas *ioas = NULL; > > > + struct iommufd_object *obj; > > > + unsigned long index; > > > + int rc; > > > + > > > + /* Iterate each ioas. */ > > > + xa_for_each(&ictx->objects, index, obj) { > > > + if (obj->type != IOMMUFD_OBJ_IOAS) > > > + continue; > > > > Wrong locking > > > > > + > > > + ioas = (struct iommufd_ioas *)obj; > > > + mutex_lock(&ioas->mutex); > > > + > > > + /* > > > + * TODO: Iterate over each device of this iommufd and only save > > > + * hwpt/domain if the device is persisted. > > > + */ > > > + list_for_each_entry(hwpt_paging, &ioas->hwpt_list, hwpt_item) { > > > + if (!hwpt_paging->common.domain) > > > + continue; > > > > I don't think this should be automatic. The user should directly > > serialize/unserialize HWPTs by ID. > Interesting. So the user should be able to serialize/unserialize HWPTs > before the Live Update PREPARE event? But what if a device was marked > for preservation but the user never serialized the attached HWPT, > would that be considered an error during LUO PREPARE or should iommufd > serialize the remaining HWPTs here? yes that would be an error I also think your patch series is a bit upside down, you should present the iommufd and core pieces first, then come with a driver implementation last. It will be easier to understand the context that having a driver implementation appear out of no where with no callers.. And everything should be driven by iommufd in this step, the iommu driver should not be magically auto-preserving itself. Just preserve the drivers linked to devices being preserved by iommufd. Jason
© 2016 - 2025 Red Hat, Inc.