From nobody Fri Nov 29 18:27:50 2024 Received: from smtp-fw-80008.amazon.com (smtp-fw-80008.amazon.com [99.78.197.219]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D10B6136345; Mon, 16 Sep 2024 11:31:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.219 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486307; cv=none; b=Qi6ouis+s5PKO95vJR0zlMBwi/54t8CZsm3KcvK/JCd/3nJzYHc+nds0iqduhYHoY0s9l4hh6cBwEF4R7dYKZHfaAKT48kkd/fqsVTuWDdU2dZxixA+DI0r1ShbA8MXSzex8wpobJnFYnPRS2171vo4ZDdBZcGKQ4iAdPnPs3oI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486307; c=relaxed/simple; bh=9NXEqGFX9ne4oh9zK4Gzz+zWzFEI/zjLFeRQPJqT5Zc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=aLeJu0cOGG00Qu4kwNtj6qSKENUBC+i0bjQAsFFfrxZRJPRAe//3HWcBUb1otQcAVMrRiXr1Jm6Mi7cnEPjE0NLjPfR+yl6FrQy3icEF9RAIz/Hgs2d4sY/uaQCE7Uex81MMJP12smSQGa1toxIxBlJl5urph2tZSs3PKx3Tfl8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.com; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=ApkcddzU; arc=none smtp.client-ip=99.78.197.219 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="ApkcddzU" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1726486304; x=1758022304; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YBMtYNcR4Cg8RdM83OXxt0GVv0Yun58zU3Vf1MOia6s=; 
b=ApkcddzUlvUqRj5RUNfu825VQe6VonVs9daV3btHsdWqdklER4uoNptx 4AnVDEdgVgYSVW4j8yCfTdaXCKovm1ftPz1z4VeVgHjTt9gGr7guNaidC qKlMOmVY7ckKzu03lbz/5gyNq0hnbJ8+OKeWZYxDVSlchPtOwsqLNdnU1 Q=; X-IronPort-AV: E=Sophos;i="6.10,233,1719878400"; d="scan'208";a="126592449" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80008.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2024 11:31:42 +0000 Received: from EX19MTAEUC002.ant.amazon.com [10.0.43.254:26360] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.27.249:2525] with esmtp (Farcaster) id a8bee893-5a46-4c75-8e61-dea3d9a45d8b; Mon, 16 Sep 2024 11:31:40 +0000 (UTC) X-Farcaster-Flow-ID: a8bee893-5a46-4c75-8e61-dea3d9a45d8b Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUC002.ant.amazon.com (10.252.51.245) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:31:40 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.221) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:31:30 +0000 From: James Gowans To: CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , =?UTF-8?q?Krzysztof=20Wilczy=C5=84ski?= , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. 
Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas" Subject: [RFC PATCH 01/13] iommufd: Support marking and tracking persistent iommufds Date: Mon, 16 Sep 2024 13:30:50 +0200 Message-ID: <20240916113102.710522-2-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com> References: <20240916113102.710522-1-jgowans@amazon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: EX19D041UWB003.ant.amazon.com (10.13.139.176) To EX19D014EUC004.ant.amazon.com (10.252.51.182) Content-Type: text/plain; charset="utf-8" Introduce a new iommufd option to mark an iommufd as persistent. For now this allocates it a unique persistent ID from an xarray index and keeps a reference to the domain. This will be used so that at serialisation time the open iommufds can be iterated through and serialised. 
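As a rough illustration of the ID handling described above, here is a small userspace model in plain C. It is not kernel code: a fixed-size table stands in for the xarray, and every model_* name and the MODEL_MAX_IDS limit are hypothetical. The sketch only mirrors three properties of this patch: IDs start at 1 so that 0 can mean "not persistent", a context can only be marked while it has no objects, and the ID is released when the context goes away.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical small limit; the patch itself uses XA_LIMIT(1, UINT_MAX). */
#define MODEL_MAX_IDS 8

struct model_ctx {
	unsigned long persistent_id; /* 0 means "not persistent" */
	size_t nr_objects;           /* stand-in for the ictx->objects xarray */
};

/* Slot i holds the context that owns persistent ID i; slot 0 is never used. */
static struct model_ctx *model_ids[MODEL_MAX_IDS + 1];

/* Mark a context persistent, handing out the lowest free ID >= 1. */
static int model_mark_persistent(struct model_ctx *ctx)
{
	unsigned long id;

	/* Same error code the patch returns when the iommufd is already in use. */
	if (ctx->nr_objects != 0)
		return -EFAULT;

	for (id = 1; id <= MODEL_MAX_IDS; id++) {
		if (!model_ids[id]) {
			model_ids[id] = ctx;
			ctx->persistent_id = id;
			return 0;
		}
	}
	/* xa_alloc() reports -EBUSY when the limit has no free entries. */
	return -EBUSY;
}

/* Drop the ID again, as the release path does for a persistent context. */
static void model_release(struct model_ctx *ctx)
{
	if (ctx->persistent_id) {
		model_ids[ctx->persistent_id] = NULL;
		ctx->persistent_id = 0;
	}
}
```

In this model a released ID becomes available for reuse by the next context that is marked persistent.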
--- drivers/iommu/iommufd/iommufd_private.h | 1 + drivers/iommu/iommufd/main.c | 47 +++++++++++++++++++++++++ include/uapi/linux/iommufd.h | 5 +++ 3 files changed, 53 insertions(+) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommuf= d/iommufd_private.h index 92efe30a8f0d..b23f7766066c 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -28,6 +28,7 @@ struct iommufd_ctx { /* Compatibility with VFIO no iommu */ u8 no_iommu_mode; struct iommufd_ioas *vfio_ioas; + unsigned long persistent_id; }; =20 /* diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 83bbd7c5d160..6708ad629b1e 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -29,6 +29,8 @@ struct iommufd_object_ops { static const struct iommufd_object_ops iommufd_object_ops[]; static struct miscdevice vfio_misc_dev; =20 +static DEFINE_XARRAY_ALLOC(persistent_iommufds); + struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, enum iommufd_object_type type) @@ -287,10 +289,52 @@ static int iommufd_fops_release(struct inode *inode, = struct file *filp) break; } WARN_ON(!xa_empty(&ictx->groups)); + + rcu_read_lock(); + if (ictx->persistent_id) + xa_erase(&persistent_iommufds, ictx->persistent_id); + rcu_read_unlock(); kfree(ictx); return 0; } =20 +static int iommufd_option_persistent(struct iommufd_ucmd *ucmd) +{ + unsigned int persistent_id; + int rc; + struct iommu_option *cmd =3D ucmd->cmd; + struct iommufd_ctx *ictx =3D ucmd->ictx; + struct xa_limit id_limit =3D XA_LIMIT(1, UINT_MAX); + + if (cmd->op =3D=3D IOMMU_OPTION_OP_GET) { + cmd->val64 =3D ictx->persistent_id; + return 0; + } + + if (cmd->op =3D=3D IOMMU_OPTION_OP_SET) { + /* + * iommufds can only be marked persistent before they + * have been used for DMA mappings. HWPTs must be known + * to be persistent at creation time. 
+ */ + if (!xa_empty(&ictx->objects)) { + pr_warn("iommufd can only be marked persistent when unused\n"); + return -EFAULT; + } + + rc =3D xa_alloc(&persistent_iommufds, &persistent_id, ictx, id_limit, GF= P_KERNEL_ACCOUNT); + if (rc) { + pr_warn("Unable to keep track of iommufd object\n"); + return rc; + } + + ictx->persistent_id =3D persistent_id; + cmd->val64 =3D ictx->persistent_id; + return 0; + } + return -EOPNOTSUPP; +} + static int iommufd_option(struct iommufd_ucmd *ucmd) { struct iommu_option *cmd =3D ucmd->cmd; @@ -306,6 +350,9 @@ static int iommufd_option(struct iommufd_ucmd *ucmd) case IOMMU_OPTION_HUGE_PAGES: rc =3D iommufd_ioas_option(ucmd); break; + case IOMMU_OPTION_PERSISTENT: + rc =3D iommufd_option_persistent(ucmd); + break; default: return -EOPNOTSUPP; } diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 4dde745cfb7e..7d8cb242e9b0 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -276,10 +276,15 @@ struct iommu_ioas_unmap { * iommu mappings. Value 0 disables combining, everything is mapped to * PAGE_SIZE. This can be useful for benchmarking. This is a per-IOAS * option, the object_id must be the IOAS ID. + * @IOMMU_OPTION_PERSISTENT + * Value 1 sets this iommufd object as a persistent iommufd. Mappings w= ill + * survive across kexec. The returned value is the persistent ID which = can + * be used to restore the iommufd after kexec. 
*/ enum iommufd_option { IOMMU_OPTION_RLIMIT_MODE =3D 0, IOMMU_OPTION_HUGE_PAGES =3D 1, + IOMMU_OPTION_PERSISTENT =3D 2, }; =20 /** --=20 2.34.1 From nobody Fri Nov 29 18:27:50 2024 Received: from smtp-fw-9106.amazon.com (smtp-fw-9106.amazon.com [207.171.188.206]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 60B7114AD3F; Mon, 16 Sep 2024 11:31:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=207.171.188.206 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486319; cv=none; b=LM974H40RdjlEjlnAWe7jeFJ71fnA3B0N7kp5ngoLSe3Iik0bysaBZE8olQVsBSrRxcHkV6G6kiXXUqACVDCBfPq2r9avhcOJy5oTH7KB+My84E+wZ1vz/Jr0Feebql8nwbdw2IlgnSp/WVHR9Rf8Os4y+C9qFqcvi2K4raPEls= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486319; c=relaxed/simple; bh=50zHaqmrB4SYyMYyNydrR+zrxyQS+qWdmvPRgZ9y0gY=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=rssrUZA8DphDrmQag27819LxRKQCLNybHIB77zsj9l2IgEuRNX1YXdyMsF7PkpnfQCKJAc2mK13sPWcw2dSw/lSbyRzX07ZP0OCH9d8QZzqxVX2LhzuLzKjUaQtz4L50Rwd4cQ717mHO1TgIdHsRyqVHVK4GAeoWaIymDfX6Ak0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.com; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=IbR0D/vm; arc=none smtp.client-ip=207.171.188.206 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="IbR0D/vm" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1726486319; x=1758022319; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=M05pjVVQI9aTtoEIvcWtno5GUtF2a6TkNMVEd9muAz4=; b=IbR0D/vme83UwV8mbI3QAu55DpQ/zE5N5ODHiOVyu2WSIw+re7PHi72n CIL6tmQmunXQte1aTZKAal6hv7ZXXDYyTLeQEQ7dHzOJ1/lfbFviXJqyz lMzm5QFpSzOMitQA1UJCetLbi1Sf34j4WgZz6T/Uvw0O91PlfBTRf6yWj I=; X-IronPort-AV: E=Sophos;i="6.10,233,1719878400"; d="scan'208";a="760323496" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-9106.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2024 11:31:53 +0000 Received: from EX19MTAEUC001.ant.amazon.com [10.0.10.100:34005] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.1.23:2525] with esmtp (Farcaster) id 64df1e2b-6d4b-4ece-83c0-47b88ea24823; Mon, 16 Sep 2024 11:31:51 +0000 (UTC) X-Farcaster-Flow-ID: 64df1e2b-6d4b-4ece-83c0-47b88ea24823 Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUC001.ant.amazon.com (10.252.51.155) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:31:51 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.221) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:31:40 +0000 From: James Gowans To: CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , =?UTF-8?q?Krzysztof=20Wilczy=C5=84ski?= , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. 
Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas" Subject: [RFC PATCH 02/13] iommufd: Add plumbing for KHO (de)serialise Date: Mon, 16 Sep 2024 13:30:51 +0200 Message-ID: <20240916113102.710522-3-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com> References: <20240916113102.710522-1-jgowans@amazon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: EX19D041UWB003.ant.amazon.com (10.13.139.176) To EX19D014EUC004.ant.amazon.com (10.252.51.182) Content-Type: text/plain; charset="utf-8" To support serialising persistent iommufd objects to KHO, and to be able to restore the persisted data it is necessary to have a serialise hook on the KHO active path as well as a deserialise hook on module init. This commit adds those hooks and the new serialise.c file which will hold the logic here; for now it's just empty functions. 
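To make the shape of the serialise hook easier to picture, the following is a userspace sketch in plain C, not kernel code. The command values, the return codes and all model_* names are hypothetical stand-ins for the KHO notifier commands and the NOTIFY_* codes, and the state pointer takes the place of the fdt argument. It only shows the dispatch pattern: a chain of callbacks is invoked with a command, each callback handles "dump" and "abort", and an unknown command is reported as bad.

```c
#include <stddef.h>

/* Hypothetical values; the kernel defines its own KHO commands. */
enum model_kho_cmd { MODEL_KHO_DUMP = 1, MODEL_KHO_ABORT = 2 };

/* Hypothetical stand-ins for NOTIFY_DONE and NOTIFY_BAD. */
enum model_notify { MODEL_NOTIFY_DONE = 0, MODEL_NOTIFY_BAD = 1 };

struct model_kho_state {
	int serialised; /* 1 once state has been written out, 0 after rollback */
};

typedef int (*model_notifier_fn)(unsigned long cmd, void *data);

/* Model of the serialise hook: act on dump/abort, reject anything else. */
static int model_serialise_hook(unsigned long cmd, void *data)
{
	struct model_kho_state *st = data;

	switch (cmd) {
	case MODEL_KHO_DUMP:
		st->serialised = 1; /* would write iommufd state out here */
		return MODEL_NOTIFY_DONE;
	case MODEL_KHO_ABORT:
		st->serialised = 0; /* would roll the serialisation back here */
		return MODEL_NOTIFY_DONE;
	default:
		return MODEL_NOTIFY_BAD;
	}
}

/* Call every registered hook in order; stop at the first one that fails. */
static int model_call_chain(model_notifier_fn *chain, size_t n,
			    unsigned long cmd, void *data)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int rc = chain[i](cmd, data);

		if (rc == MODEL_NOTIFY_BAD)
			return rc;
	}
	return MODEL_NOTIFY_DONE;
}
```

The deserialise side has no counterpart in this sketch because, per the description above, it runs once from module init rather than from a notifier.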
--- drivers/iommu/iommufd/Makefile | 1 + drivers/iommu/iommufd/iommufd_private.h | 22 +++++++++++++++++++++ drivers/iommu/iommufd/main.c | 24 ++++++++++++++++++++++- drivers/iommu/iommufd/serialise.c | 26 +++++++++++++++++++++++++ 4 files changed, 72 insertions(+), 1 deletion(-) create mode 100644 drivers/iommu/iommufd/serialise.c diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile index cf4605962bea..80bc775c170d 100644 --- a/drivers/iommu/iommufd/Makefile +++ b/drivers/iommu/iommufd/Makefile @@ -13,3 +13,4 @@ iommufd-$(CONFIG_IOMMUFD_TEST) +=3D selftest.o =20 obj-$(CONFIG_IOMMUFD) +=3D iommufd.o obj-$(CONFIG_IOMMUFD_DRIVER) +=3D iova_bitmap.o +obj-$(CONFIG_KEXEC_KHO) +=3D serialise.o diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommuf= d/iommufd_private.h index b23f7766066c..a26728646a22 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -497,6 +497,28 @@ static inline void iommufd_hwpt_detach_device(struct i= ommufd_hw_pagetable *hwpt, iommu_detach_group(hwpt->domain, idev->igroup->group); } =20 +/* + * Serialise is invoked as a callback by KHO when changing KHO active stat= e, + * it stores current iommufd state into KHO's persistent store. + * Deserialise is run by the iommufd module when loaded to re-hydrate state + * carried across from the previous kernel. 
+ */ +#ifdef CONFIG_KEXEC_KHO +int iommufd_serialise_kho(struct notifier_block *self, unsigned long cmd, + void *fdt); +int __init iommufd_deserialise_kho(void); +#else +static inline int iommufd_serialise_kho(struct notifier_block *self, unsigned long cmd, + void *fdt) +{ + return 0; +} +static inline int iommufd_deserialise_kho(void) +{ + return 0; +} +#endif + static inline int iommufd_hwpt_replace_device(struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt, struct iommufd_hw_pagetable *old) diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 6708ad629b1e..fa4f0fe336ad 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -10,6 +10,7 @@ =20 #include #include +#include #include #include #include @@ -590,6 +591,10 @@ static struct miscdevice vfio_misc_dev =3D { .mode =3D 0666, }; =20 +static struct notifier_block serialise_kho_nb =3D { + .notifier_call =3D iommufd_serialise_kho, +}; + static int __init iommufd_init(void) { int ret; @@ -603,11 +608,26 @@ static int __init iommufd_init(void) if (ret) goto err_misc; } + + if (IS_ENABLED(CONFIG_KEXEC_KHO)) { + ret =3D register_kho_notifier(&serialise_kho_nb); + if (ret) + goto err_vfio_misc; + } + + ret =3D iommufd_deserialise_kho(); + if (ret) + goto err_kho; + ret =3D iommufd_test_init(); + if (ret) - goto err_vfio_misc; + goto err_kho; return 0; =20 +err_kho: + if (IS_ENABLED(CONFIG_KEXEC_KHO)) + unregister_kho_notifier(&serialise_kho_nb); err_vfio_misc: if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER)) misc_deregister(&vfio_misc_dev); @@ -621,6 +641,8 @@ static void __exit iommufd_exit(void) iommufd_test_exit(); if (IS_ENABLED(CONFIG_IOMMUFD_VFIO_CONTAINER)) misc_deregister(&vfio_misc_dev); + if (IS_ENABLED(CONFIG_KEXEC_KHO)) + unregister_kho_notifier(&serialise_kho_nb); misc_deregister(&iommu_misc_dev); } =20 diff --git a/drivers/iommu/iommufd/serialise.c b/drivers/iommu/iommufd/seri= alise.c new file mode 100644 index 000000000000..6e8bcc384771 --- /dev/null +++ 
b/drivers/iommu/iommufd/serialise.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include "iommufd_private.h" + +int iommufd_serialise_kho(struct notifier_block *self, unsigned long cmd, + void *fdt) +{ + pr_info("would serialise here\n"); + switch (cmd) { + case KEXEC_KHO_ABORT: + /* Would do serialise rollback here. */ + return NOTIFY_DONE; + case KEXEC_KHO_DUMP: + /* Would do serialise here. */ + return NOTIFY_DONE; + default: + return NOTIFY_BAD; + } +} + +int __init iommufd_deserialise_kho(void) +{ + pr_info("would deserialise here\n"); + return 0; +} --=20 2.34.1 From nobody Fri Nov 29 18:27:50 2024 Received: from smtp-fw-80007.amazon.com (smtp-fw-80007.amazon.com [99.78.197.218]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E6E0361FD8; Mon, 16 Sep 2024 11:32:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.218 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486357; cv=none; b=XFDJADcxmlpvQgNM1YTRUPx7wWQffnhT54bQ3JToauDJH16hQZytH6HK0ekJW5TiOXemh7UCGxMZgiyaAt9dINUXddhOtB/O7w89DXwLHy4nLx36IWzJwsn7jHLSTjMPipW1DF+QGfwzHjqA5VzaUHWNjQ3S39wnLTIuLsPEMgg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486357; c=relaxed/simple; bh=ks1Z6Zt5tZ7f32h9uedMQeu3E//CwrI54ZII4uARJM0=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=TZ5IYIo1l/WY4k4T1ZShLzAYrcMX8nppSR5gFcedsdzoW22d2uZPcD0X7RQqI+hW80GjeKueG0sSjwy10boYjfR1ym8slgtfc2iKGKenMn5CtQDK8Ktu7n5qHmYfYDjddY9ZrlB4qyJdp4+Bo4jBU4IoPTSQMcjmbu78h0hzdUs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.com; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=hk7ysDl6; arc=none smtp.client-ip=99.78.197.218 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="hk7ysDl6" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1726486356; x=1758022356; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cMe1v1ZokVNMH0b3UoTmzNhpYQFLD8f+J2f2juiv+X4=; b=hk7ysDl6yZ7tAMPvkReG+Wb9XkKXSgImgAqiWh1AK38nFLmRKTm+Dtc7 PX1r4yW0lNEYSdYM72S4pYr0f5fOzssTBDF4s5seoFU3dUDJXDqaomBWi 5nz2YaGf5/uZ/+dx+VuriQ41tD+inkDUHtW34v03TNWYuln+/R+nDdxth U=; X-IronPort-AV: E=Sophos;i="6.10,233,1719878400"; d="scan'208";a="331432668" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev) ([10.25.36.210]) by smtp-border-fw-80007.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2024 11:32:35 +0000 Received: from EX19MTAEUA002.ant.amazon.com [10.0.17.79:3128] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.42.185:2525] with esmtp (Farcaster) id 4e097c01-0ec2-475e-bf7b-c130f572e46b; Mon, 16 Sep 2024 11:32:34 +0000 (UTC) X-Farcaster-Flow-ID: 4e097c01-0ec2-475e-bf7b-c130f572e46b Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUA002.ant.amazon.com (10.252.50.126) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:32:34 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.221) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:32:23 +0000 From: James Gowans To: CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , 
=?UTF-8?q?Krzysztof=20Wilczy=C5=84ski?= , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas" Subject: [RFC PATCH 03/13] iommu/intel: zap context table entries on kexec Date: Mon, 16 Sep 2024 13:30:52 +0200 Message-ID: <20240916113102.710522-4-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com> References: <20240916113102.710522-1-jgowans@amazon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: EX19D044UWB004.ant.amazon.com (10.13.139.134) To EX19D014EUC004.ant.amazon.com (10.252.51.182) Content-Type: text/plain; charset="utf-8" Instead of fully shutting down the IOMMU on kexec, zap the context table entries for devices. This is the initial step towards being able to persist some domains. Once a struct iommu_domain can be marked persistent, those persistent domains will be skipped during IOMMU shutdown. 
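The zapping step can be pictured with the userspace model below, written in plain C. It is a sketch under simplifying assumptions, not the driver's implementation: the real hardware uses a root table indexed by bus plus per-bus context tables, whereas this model uses one flat table indexed by the 16-bit source ID, and all model_* names are hypothetical. What it does show is the order of operations: read the old domain ID before clearing the entry, clear it, then invalidate using that saved ID and the device's source ID (bus in the high byte, devfn in the low byte).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_context_entry {
	bool present;
	uint16_t domain_id;
};

struct model_device {
	uint8_t bus;
	uint8_t devfn;
};

/* Flat stand-in for the root/context tables, indexed by source ID. */
static struct model_context_entry model_table[1 << 16];

/* Records the domain ID of the most recent (pretend) invalidation. */
static uint16_t model_last_flushed_did;

/* Source ID as used for context-cache invalidation: bus << 8 | devfn. */
static uint16_t model_source_id(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)(((uint16_t)bus << 8) | devfn);
}

/* Clear every present entry for the given devices; return how many were zapped. */
static size_t model_zap(const struct model_device *devs, size_t n)
{
	size_t i, zapped = 0;

	for (i = 0; i < n; i++) {
		uint16_t sid = model_source_id(devs[i].bus, devs[i].devfn);
		struct model_context_entry *ce = &model_table[sid];
		uint16_t did_old;

		if (!ce->present)
			continue;

		/* Save the domain ID first: it is gone once the entry is cleared. */
		did_old = ce->domain_id;
		ce->present = false;
		ce->domain_id = 0;

		/* A real driver would flush the context cache and IOTLB here. */
		model_last_flushed_did = did_old;
		zapped++;
	}
	return zapped;
}
```

Running the zap a second time over the same devices finds nothing present and does no further invalidation.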
--- drivers/iommu/intel/dmar.c | 1 + drivers/iommu/intel/iommu.c | 34 ++++++++++++++++++++++++++++++---- drivers/iommu/intel/iommu.h | 2 ++ 3 files changed, 33 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 1c8d3141cb55..f79aba382e77 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1099,6 +1099,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd) spin_lock_init(&iommu->device_rbtree_lock); mutex_init(&iommu->iopf_lock); iommu->node =3D NUMA_NO_NODE; + INIT_LIST_HEAD(&iommu->domains); =20 ver =3D readl(iommu->reg + DMAR_VER_REG); pr_info("%s: reg_base_addr %llx ver %d:%d cap %llx ecap %llx\n", diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 9ff8b83c19a3..2297cbb0253f 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -1575,6 +1575,7 @@ int domain_attach_iommu(struct dmar_domain *domain, s= truct intel_iommu *iommu) goto err_clear; } domain_update_iommu_cap(domain); + list_add(&domain->domains, &iommu->domains); =20 spin_unlock(&iommu->lock); return 0; @@ -3185,6 +3186,33 @@ static void intel_disable_iommus(void) iommu_disable_translation(iommu); } =20 +static void zap_context_table_entries(struct intel_iommu *iommu) +{ + struct context_entry *context; + struct dmar_domain *domain; + struct device_domain_info *device; + int bus, devfn; + u16 did_old; + + list_for_each_entry(domain, &iommu->domains, domains) { + list_for_each_entry(device, &domain->devices, link) { + context =3D iommu_context_addr(iommu, device->bus, device->devfn, 0); + if (!context || !context_present(context)) + continue; + did_old =3D context_domain_id(context); + context_clear_entry(context); + __iommu_flush_cache(iommu, context, sizeof(*context)); + iommu->flush.flush_context(iommu, + did_old, + (((u16)device->bus) << 8) | device->devfn, + DMA_CCMD_MASK_NOBIT, + DMA_CCMD_DEVICE_INVL); + iommu->flush.flush_iotlb(iommu, did_old, 0, 0, + DMA_TLB_DSI_FLUSH); + } + } +} + void 
intel_iommu_shutdown(void) { struct dmar_drhd_unit *drhd; @@ -3197,10 +3225,8 @@ void intel_iommu_shutdown(void) =20 /* Disable PMRs explicitly here. */ for_each_iommu(iommu, drhd) - iommu_disable_protect_mem_regions(iommu); - - /* Make sure the IOMMUs are switched off */ - intel_disable_iommus(); + zap_context_table_entries(iommu); + return =20 up_write(&dmar_global_lock); } diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index b67c14da1240..cfd006588824 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -606,6 +606,7 @@ struct dmar_domain { spinlock_t lock; /* Protect device tracking lists */ struct list_head devices; /* all devices' list */ struct list_head dev_pasids; /* all attached pasids */ + struct list_head domains; /* all struct dmar_domains on this IOMMU */ =20 spinlock_t cache_lock; /* Protect the cache tag list */ struct list_head cache_tags; /* Cache tag list */ @@ -749,6 +750,7 @@ struct intel_iommu { void *perf_statistic; =20 struct iommu_pmu *pmu; + struct list_head domains; /* all struct dmar_domains on this IOMMU */ }; =20 /* PCI domain-device relationship */ --=20 2.34.1 From nobody Fri Nov 29 18:27:50 2024 Received: from smtp-fw-80008.amazon.com (smtp-fw-80008.amazon.com [99.78.197.219]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEF2214F9FA; Mon, 16 Sep 2024 11:32:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=99.78.197.219 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486369; cv=none; b=fI9YuFuWTHipb48cFm7vdwOuJslJf7wCJMcGZgl63QBeueApHFB8RNwMblSZ4FID26oP2LSdbGFiMtAPCl9E+B0fwbIqhsNTluSo4jV3rhlGPmwnDyUIpsttYA4Xo7OWF8Bi/20k9lJ0k6U+3jhWgcToj7vdexJ5TU+d9Qh3+S4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486369; c=relaxed/simple; 
bh=DdVs8r2AEn1WG2+lhc4MpBm5E2Yz6DBnB+uvR5C4mKI=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=T4dv9K/beS/gxVisMeVSw1bmbcM682KmqAW/zCpl7+hwluhKU5ivOlH5se5uzSDDmbacx/2sW6J0qgcA5Ot1BfpoOJ71bY6UM0fPbwe54hhvIGJGCye01x/pn8k0Q9jXXpNcMGDT181NdP1mYPYEkPCVJmtt9IUurNiSgBwPRK8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.com; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=uCOR3esL; arc=none smtp.client-ip=99.78.197.219 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="uCOR3esL" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1726486367; x=1758022367; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vf5kR9S6dXHOFbBgiKTEcIG+RP/hLGBiZ5b2JkKK8MA=; b=uCOR3esL6MABLwt1krLAfp26lzsN99OUuS/Lkn6oZjJ3dvWvmrFAVfU8 WDsJwwxQj4icHnHjKYkWcfaVsxUpKDTQ7cuACpyvGUFolTyZIlKGdLQSB AgC3tvumIxMzs86J1/DoyVdM0ch5+maatp7ZSXaPWuzswvtr6Jf0KPlIX 0=; X-IronPort-AV: E=Sophos;i="6.10,233,1719878400"; d="scan'208";a="126592765" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO smtpout.prod.us-east-1.prod.farcaster.email.amazon.dev) ([10.25.36.214]) by smtp-border-fw-80008.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2024 11:32:46 +0000 Received: from EX19MTAEUA002.ant.amazon.com [10.0.17.79:6868] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.20.15:2525] with esmtp (Farcaster) id 11e1b852-b74c-4c12-98c6-961bbe493613; Mon, 16 Sep 2024 11:32:45 +0000 (UTC) X-Farcaster-Flow-ID: 
11e1b852-b74c-4c12-98c6-961bbe493613 Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUA002.ant.amazon.com (10.252.50.126) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:32:45 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.221) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:32:34 +0000 From: James Gowans To: CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , =?UTF-8?q?Krzysztof=20Wilczy=C5=84ski?= , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas" Subject: [RFC PATCH 04/13] iommu: Support marking domains as persistent on alloc Date: Mon, 16 Sep 2024 13:30:53 +0200 Message-ID: <20240916113102.710522-5-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com> References: <20240916113102.710522-1-jgowans@amazon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: EX19D044UWB004.ant.amazon.com (10.13.139.134) To EX19D014EUC004.ant.amazon.com (10.252.51.182) Content-Type: text/plain; charset="utf-8" Add a persistent ID field to struct iommu_domain and allow it to be set from the domain_alloc_user callback, which is used by iommufd. The ID is so far unused, and for now it is only supported on the Intel IOMMU as a proof of concept. Going forward this ID will be used as a unique handle on the iommu_domain, so that after kexec the caller (iommufd) will be able to restore a reference to the struct iommu_domain which existed before kexec. 
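The contract this adds to the allocation callback can be sketched in userspace C as below. This is a simplified model, not the kernel interface: the real callback returns a struct iommu_domain pointer or an ERR_PTR and takes more arguments, while these hypothetical model_* functions return an int and fill a caller-provided struct. The point is the split between a driver that records the ID and a driver that has no persistence support and therefore rejects any non-zero ID.

```c
#include <errno.h>

struct model_domain {
	unsigned long persistent_id; /* 0 indicates non-persistent */
};

/* Driver without persistence support: only persistent_id == 0 is accepted. */
static int model_alloc_no_persist(unsigned long persistent_id,
				  struct model_domain *out)
{
	if (persistent_id)
		return -EOPNOTSUPP;
	out->persistent_id = 0;
	return 0;
}

/* Driver with persistence support: remember the ID on the new domain. */
static int model_alloc_persist(unsigned long persistent_id,
			       struct model_domain *out)
{
	out->persistent_id = persistent_id;
	return 0;
}
```

With this split, a caller that never marks its context persistent passes 0 and sees no behaviour change on any driver.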
---
 drivers/iommu/amd/iommu.c                   |  4 +++-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |  3 ++-
 drivers/iommu/intel/iommu.c                 |  2 ++
 drivers/iommu/iommufd/hw_pagetable.c        |  5 ++++-
 drivers/iommu/iommufd/selftest.c            |  1 +
 include/linux/iommu.h                       | 11 +++++++++--
 6 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b19e8c0f48fa..daeb609aa4f5 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2432,12 +2432,14 @@ static struct iommu_domain *amd_iommu_domain_alloc(unsigned int type)
 static struct iommu_domain *
 amd_iommu_domain_alloc_user(struct device *dev, u32 flags,
 			    struct iommu_domain *parent,
+			    unsigned long persistent_id,
 			    const struct iommu_user_data *user_data)
 
 {
 	unsigned int type = IOMMU_DOMAIN_UNMANAGED;
 
-	if ((flags & ~IOMMU_HWPT_ALLOC_DIRTY_TRACKING) || parent || user_data)
+	if ((flags & ~IOMMU_HWPT_ALLOC_DIRTY_TRACKING) || parent || user_data
+			|| persistent_id)
 		return ERR_PTR(-EOPNOTSUPP);
 
 	return do_iommu_domain_alloc(type, dev, flags);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index a31460f9f3d4..41c964891c84 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3049,6 +3049,7 @@ static struct iommu_domain arm_smmu_blocked_domain = {
 static struct iommu_domain *
 arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
 			   struct iommu_domain *parent,
+			   unsigned long persistent_id,
 			   const struct iommu_user_data *user_data)
 {
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
@@ -3058,7 +3059,7 @@ arm_smmu_domain_alloc_user(struct device *dev, u32 flags,
 
 	if (flags & ~PAGING_FLAGS)
 		return ERR_PTR(-EOPNOTSUPP);
-	if (parent || user_data)
+	if (parent || user_data || persistent_id)
 		return ERR_PTR(-EOPNOTSUPP);
 
 	smmu_domain = arm_smmu_domain_alloc();
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 2297cbb0253f..f473a8c008a7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3729,6 +3729,7 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 static struct iommu_domain *
 intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 			      struct iommu_domain *parent,
+			      unsigned long persistent_id,
 			      const struct iommu_user_data *user_data)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
@@ -3761,6 +3762,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 	domain->type = IOMMU_DOMAIN_UNMANAGED;
 	domain->owner = &intel_iommu_ops;
 	domain->ops = intel_iommu_ops.default_domain_ops;
+	domain->persistent_id = persistent_id;
 
 	if (nested_parent) {
 		dmar_domain->nested_parent = true;
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c
index aefde4443671..4bbf1dc98053 100644
--- a/drivers/iommu/iommufd/hw_pagetable.c
+++ b/drivers/iommu/iommufd/hw_pagetable.c
@@ -137,6 +137,7 @@ iommufd_hwpt_paging_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
 
 	if (ops->domain_alloc_user) {
 		hwpt->domain = ops->domain_alloc_user(idev->dev, flags, NULL,
+						      ictx->persistent_id,
 						      user_data);
 		if (IS_ERR(hwpt->domain)) {
 			rc = PTR_ERR(hwpt->domain);
@@ -239,7 +240,9 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx,
 
 	hwpt->domain = ops->domain_alloc_user(idev->dev,
 					      flags & ~IOMMU_HWPT_FAULT_ID_VALID,
-					      parent->common.domain, user_data);
+					      parent->common.domain,
+					      ictx->persistent_id,
+					      user_data);
 	if (IS_ERR(hwpt->domain)) {
 		rc = PTR_ERR(hwpt->domain);
 		hwpt->domain = NULL;
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 222cfc11ebfd..7a9a454369d5 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -318,6 +318,7 @@ __mock_domain_alloc_nested(struct mock_iommu_domain *mock_parent,
 static struct iommu_domain *
 mock_domain_alloc_user(struct device *dev, u32 flags,
 		       struct iommu_domain *parent,
+		       unsigned long persistent_id,
 		       const struct iommu_user_data *user_data)
 {
 	struct mock_iommu_domain *mock_parent;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 04cbdae0052e..a616e8702a1c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -215,6 +215,11 @@ struct iommu_domain {
 	struct iommu_dma_cookie *iova_cookie;
 	int (*iopf_handler)(struct iopf_group *group);
 	void *fault_data;
+	/*
+	 * Persisting and restoring across kexec via KHO.
+	 * 0 indicates non-persistent.
+	 */
+	unsigned long persistent_id;
 	union {
 		struct {
 			iommu_fault_handler_t handler;
@@ -518,7 +523,9 @@ static inline int __iommu_copy_struct_from_user_array(
  *                     IOMMU_DOMAIN_NESTED type; otherwise, the @parent must be
  *                     NULL while the @user_data can be optionally provided, the
  *                     new domain must support __IOMMU_DOMAIN_PAGING.
- *                     Upon failure, ERR_PTR must be returned.
+ *                     Upon failure, ERR_PTR must be returned. Persistent ID is
+ *                     used to save/restore across kexec; 0 indicates not
+ *                     persistent.
  * @domain_alloc_paging: Allocate an iommu_domain that can be used for
  *                       UNMANAGED, DMA, and DMA_FQ domain types.
  * @domain_alloc_sva: Allocate an iommu_domain for Shared Virtual Addressing.
@@ -564,7 +571,7 @@ struct iommu_ops {
 	struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type);
 	struct iommu_domain *(*domain_alloc_user)(
 		struct device *dev, u32 flags, struct iommu_domain *parent,
-		const struct iommu_user_data *user_data);
+		unsigned long persistent_id, const struct iommu_user_data *user_data);
 	struct iommu_domain *(*domain_alloc_paging)(struct device *dev);
 	struct iommu_domain *(*domain_alloc_sva)(struct device *dev,
 						 struct mm_struct *mm);
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
From: James Gowans
CC: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Krzysztof Wilczyński,
 Will Deacon, Robin Murphy, Mike Rapoport, "Madhavan T. Venkataraman",
 Sean Christopherson, Paolo Bonzini, David Woodhouse, Lu Baolu,
 Alexander Graf, "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 05/13] iommufd: Serialise persisted iommufds and ioas
Date: Mon, 16 Sep 2024 13:30:54 +0200
Message-ID: <20240916113102.710522-6-jgowans@amazon.com>
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>

Now actually implement the serialise callback for iommufd.
On KHO activate, iterate through all persisted domains and write their
metadata to the device tree format. For now just a few fields are
serialised to demonstrate the concept. To actually make this useful,
many more fields and related objects will need to be serialised too.
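
As a rough illustration of the node nesting described above, here is a
userspace sketch that builds the same per-iommufd hierarchy (persistent
ID, then "ioases", then one node per IOAS and one per area) as indented
text. It is not the kernel code: struct demo_ctx, struct demo_ioas,
struct demo_area and all demo_* helpers are hypothetical stand-ins, and
a plain text buffer replaces the FDT blob. It also shows the errors
accumulated with |= being handed back to the caller.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for iommufd_ctx, iommufd_ioas and iopt_area. */
struct demo_area { unsigned long iova_start, iova_len; int iommu_prot; };
struct demo_ioas { unsigned long obj_id; const struct demo_area *areas; int nr_areas; };
struct demo_ctx { unsigned long persistent_id; const struct demo_ioas *ioases; int nr_ioases; };

/* Text sink standing in for the FDT blob (cap must be at least 1). */
struct demo_writer { char *buf; size_t cap; size_t len; int depth; };

/* Append one indented line; returns 0 on success, -1 if it does not fit. */
static int demo_emit(struct demo_writer *w, const char *fmt, ...)
{
	va_list ap;
	int n;

	n = snprintf(w->buf + w->len, w->cap - w->len, "%*s", w->depth * 2, "");
	if (n < 0 || (size_t)n >= w->cap - w->len)
		return -1;
	w->len += (size_t)n;

	va_start(ap, fmt);
	n = vsnprintf(w->buf + w->len, w->cap - w->len, fmt, ap);
	va_end(ap);
	if (n < 0 || (size_t)n >= w->cap - w->len)
		return -1;
	w->len += (size_t)n;
	return 0;
}

static int demo_begin_node(struct demo_writer *w, const char *name)
{
	int err = demo_emit(w, "%s {\n", name);

	w->depth++;
	return err;
}

static int demo_end_node(struct demo_writer *w)
{
	w->depth--;
	return demo_emit(w, "};\n");
}

/* Mirrors the nesting of the patch; returns 0, or -1 if any step failed. */
int demo_serialise_ctx(struct demo_writer *w, const struct demo_ctx *ctx)
{
	char name[24];
	int err = 0;
	int i, j;

	snprintf(name, sizeof(name), "%lu", ctx->persistent_id);
	err |= demo_begin_node(w, name);
	err |= demo_begin_node(w, "ioases");
	for (i = 0; i < ctx->nr_ioases; i++) {
		const struct demo_ioas *ioas = &ctx->ioases[i];

		snprintf(name, sizeof(name), "%lu", ioas->obj_id);
		err |= demo_begin_node(w, name);
		for (j = 0; j < ioas->nr_areas; j++) {
			const struct demo_area *area = &ioas->areas[j];

			snprintf(name, sizeof(name), "%d", j);
			err |= demo_begin_node(w, name);
			err |= demo_emit(w, "iova-start = 0x%lx;\n", area->iova_start);
			err |= demo_emit(w, "iova-len = 0x%lx;\n", area->iova_len);
			err |= demo_emit(w, "iommu-prot = %d;\n", area->iommu_prot);
			err |= demo_end_node(w); /* area */
		}
		err |= demo_end_node(w); /* ioas */
	}
	err |= demo_end_node(w); /* ioases */
	err |= demo_end_node(w); /* persistent_id */
	return err ? -1 : 0; /* propagate the accumulated error */
}
```

A caller would fill a struct demo_ctx, point a struct demo_writer at a
buffer, and treat a non-zero return as "do not hand this tree to the
next kernel".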
---
 drivers/iommu/iommufd/iommufd_private.h |  2 +
 drivers/iommu/iommufd/main.c            |  2 +-
 drivers/iommu/iommufd/serialise.c       | 81 ++++++++++++++++++++++++-
 3 files changed, 81 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index a26728646a22..ad8d180269bd 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -18,6 +18,8 @@ struct iommu_group;
 struct iommu_option;
 struct iommufd_device;
 
+extern struct xarray persistent_iommufds;
+
 struct iommufd_ctx {
 	struct file *file;
 	struct xarray objects;
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index fa4f0fe336ad..21a7e1ad40d1 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -30,7 +30,7 @@ struct iommufd_object_ops {
 static const struct iommufd_object_ops iommufd_object_ops[];
 static struct miscdevice vfio_misc_dev;
 
-static DEFINE_XARRAY_ALLOC(persistent_iommufds);
+DEFINE_XARRAY_ALLOC(persistent_iommufds);
 
 struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
 					     size_t size,
diff --git a/drivers/iommu/iommufd/serialise.c b/drivers/iommu/iommufd/serialise.c
index 6e8bcc384771..6b4c306dce40 100644
--- a/drivers/iommu/iommufd/serialise.c
+++ b/drivers/iommu/iommufd/serialise.c
@@ -1,19 +1,94 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include
+#include
 #include "iommufd_private.h"
+#include "io_pagetable.h"
+
+/**
+ * Serialised format:
+ * /iommufd
+ *   compatible = "iommufd-v0",
+ *   iommufds = [
+ *     persistent_id = {
+ *       account_mode = u8
+ *       ioases = [
+ *         {
+ *           areas = [
+ *           ]
+ *         }
+ *       ]
+ *     }
+ *   ]
+ */
+static int serialise_iommufd(void *fdt, struct iommufd_ctx *ictx)
+{
+	int err = 0;
+	char name[24];
+	struct iommufd_object *obj;
+	unsigned long obj_idx;
+
+	snprintf(name, sizeof(name), "%lu", ictx->persistent_id);
+	err |= fdt_begin_node(fdt, name);
+	err |= fdt_begin_node(fdt, "ioases");
+	xa_for_each(&ictx->objects, obj_idx, obj) {
+		struct iommufd_ioas *ioas;
+		struct iopt_area *area;
+		int area_idx = 0;
+
+		if (obj->type != IOMMUFD_OBJ_IOAS)
+			continue;
+
+		ioas = (struct iommufd_ioas *) obj;
+		snprintf(name, sizeof(name), "%lu", obj_idx);
+		err |= fdt_begin_node(fdt, name);
+
+		for (area = iopt_area_iter_first(&ioas->iopt, 0, ULONG_MAX); area;
+				area = iopt_area_iter_next(area, 0, ULONG_MAX)) {
+			unsigned long iova_start, iova_len;
+
+			snprintf(name, sizeof(name), "%i", area_idx);
+			err |= fdt_begin_node(fdt, name);
+			iova_start = iopt_area_iova(area);
+			iova_len = iopt_area_length(area);
+			err |= fdt_property(fdt, "iova-start",
+					&iova_start, sizeof(iova_start));
+			err |= fdt_property(fdt, "iova-len",
+					&iova_len, sizeof(iova_len));
+			err |= fdt_property(fdt, "iommu-prot",
+					&area->iommu_prot, sizeof(area->iommu_prot));
+			err |= fdt_end_node(fdt); /* area_idx */
+			++area_idx;
+		}
+		err |= fdt_end_node(fdt); /* ioas obj_idx */
+	}
+	err |= fdt_end_node(fdt); /* ioases */
+	err |= fdt_end_node(fdt); /* ictx->persistent_id */
+	return err;
+}
 
 int iommufd_serialise_kho(struct notifier_block *self, unsigned long cmd,
 			  void *fdt)
 {
-	pr_info("would serialise here\n");
+	static const char compatible[] = "iommufd-v0";
+	struct iommufd_ctx *ictx;
+	unsigned long xa_idx;
+	int err = 0;
+
 	switch (cmd) {
 	case KEXEC_KHO_ABORT:
 		/* Would do serialise rollback here. */
 		return NOTIFY_DONE;
 	case KEXEC_KHO_DUMP:
-		/* Would do serialise here. */
-		return NOTIFY_DONE;
+		err |= fdt_begin_node(fdt, "iommufd");
+		err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible));
+		err |= fdt_begin_node(fdt, "iommufds");
+		xa_for_each(&persistent_iommufds, xa_idx, ictx) {
+			err |= serialise_iommufd(fdt, ictx);
+		}
+		err |= fdt_end_node(fdt); /* iommufds */
+		err |= fdt_end_node(fdt); /* iommufd */
+		return err ? NOTIFY_BAD : NOTIFY_DONE;
 	default:
 		return NOTIFY_BAD;
 	}
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
From: James Gowans
CC: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Krzysztof Wilczyński,
 Will Deacon, Robin Murphy, Mike Rapoport, "Madhavan T. Venkataraman",
 Sean Christopherson, Paolo Bonzini, David Woodhouse, Lu Baolu,
 Alexander Graf, "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 06/13] iommufd: Expose persistent iommufd IDs in sysfs
Date: Mon, 16 Sep 2024 13:30:55 +0200
Message-ID: <20240916113102.710522-7-jgowans@amazon.com>
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>

After kexec, userspace needs the ability to re-acquire a handle to the
IOMMUFD which it was using before kexec. To provide userspace the
ability to discover persisted domains and get a handle to them, expose
all of the persisted IDs in sysfs. Each persisted ID will create a
directory like:
/sys/kernel/iommufd_persisted/<persistent_id>/

In the next commit a file will be added to this directory to allow
actually restoring the IOMMUFD.
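
To make the discovery step concrete, here is a userspace sketch of how
a VMM might enumerate the persisted IDs after kexec. It assumes the
directory layout created by the diff in this patch (one numerically
named directory per persistent ID under a common parent);
demo_list_persisted_ids() and demo_cmp_ulong() are hypothetical names
and not part of the series.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

static int demo_cmp_ulong(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/*
 * Collect up to @max numeric directory names found under @base into @ids,
 * sorted ascending. Returns the number of IDs stored, or -1 if @base
 * cannot be opened.
 */
int demo_list_persisted_ids(const char *base, unsigned long *ids, size_t max)
{
	DIR *dir = opendir(base);
	struct dirent *de;
	size_t n = 0;

	if (!dir)
		return -1;

	while (n < max && (de = readdir(dir)) != NULL) {
		char *end;
		unsigned long id;

		/* Skip ".", ".." and anything that is not a plain decimal. */
		if (de->d_name[0] < '0' || de->d_name[0] > '9')
			continue;
		errno = 0;
		id = strtoul(de->d_name, &end, 10);
		if (errno || *end != '\0' || id == 0)
			continue; /* 0 means "not persistent" in this series */
		ids[n++] = id;
	}
	closedir(dir);

	/* readdir() order is unspecified, so sort for a stable result. */
	qsort(ids, n, sizeof(ids[0]), demo_cmp_ulong);
	return (int)n;
}
```

On a kernel carrying this series the base path would presumably be
/sys/kernel/iommufd_persisted, going by the kobject name in the diff.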
---
 drivers/iommu/iommufd/serialise.c | 48 ++++++++++++++++++++++++++++++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/serialise.c b/drivers/iommu/iommufd/serialise.c
index 6b4c306dce40..7f2e7b1eda13 100644
--- a/drivers/iommu/iommufd/serialise.c
+++ b/drivers/iommu/iommufd/serialise.c
@@ -21,6 +21,9 @@
  *     }
  *   ]
  */
+
+static struct kobject *persisted_dir_kobj;
+
 static int serialise_iommufd(void *fdt, struct iommufd_ctx *ictx)
 {
 	int err = 0;
@@ -94,8 +97,51 @@ int iommufd_serialise_kho(struct notifier_block *self, unsigned long cmd,
 	}
 }
 
+static ssize_t iommufd_show(struct kobject *kobj, struct kobj_attribute *attr,
+		char *buf)
+{
+	return 0;
+}
+
+static struct kobj_attribute persisted_attr =
+	__ATTR_RO_MODE(iommufd, 0440);
+
+static int deserialise_iommufds(const void *fdt, int root_off)
+{
+	int off;
+
+	/*
+	 * For each persisted iommufd id, create a directory
+	 * in sysfs with an iommufd file in it.
+	 */
+	fdt_for_each_subnode(off, fdt, root_off) {
+		struct kobject *kobj;
+		const char *name = fdt_get_name(fdt, off, NULL);
+		int rc;
+
+		kobj = kobject_create_and_add(name, persisted_dir_kobj);
+		rc = sysfs_create_file(kobj, &persisted_attr.attr);
+		if (rc)
+			pr_warn("Unable to create sysfs file for iommufd node %s\n", name);
+	}
+	return 0;
+}
+
 int __init iommufd_deserialise_kho(void)
 {
-	pr_info("would deserialise here\n");
+	const void *fdt = kho_get_fdt();
+	int off;
+
+	if (!fdt)
+		return 0;
+
+	/* Parent directory for persisted iommufd files. */
+	persisted_dir_kobj = kobject_create_and_add("iommufd_persisted", kernel_kobj);
+
+	off = fdt_path_offset(fdt, "/iommufd");
+	if (off <= 0)
+		return 0; /* No data in KHO */
+
+	deserialise_iommufds(fdt, fdt_subnode_offset(fdt, off, "iommufds"));
 	return 0;
 }
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
From: James Gowans
CC: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Krzysztof Wilczyński,
 Will Deacon, Robin Murphy, Mike Rapoport, "Madhavan T. Venkataraman",
 Sean Christopherson, Paolo Bonzini, David Woodhouse, Lu Baolu,
 Alexander Graf, "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 07/13] iommufd: Re-hydrate a usable iommufd ctx from sysfs
Date: Mon, 16 Sep 2024 13:30:56 +0200
Message-ID: <20240916113102.710522-8-jgowans@amazon.com>
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>

When the sysfs file is read, create an iommufd file descriptor, create a
fresh iommufd_ctx, and populate that ictx struct and related structs
with the data about mapped IOVA ranges from KHO.

This is done in a super yucky way by having the sysfs file's .show()
callback create a new file and then print out the new file's fd number.
Done this way because I couldn't figure out how to define a custom
.open() callback on a sysfs object.

An alternative would be to have a new iommufd pseudo-filesystem which
could be mounted somewhere and would have all of the relevant persistent
data in it.

Opinions/ideas on how best to expose persisted domains to userspace are
welcome.
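
For the userspace side of the .show() approach described above, a
minimal sketch of reading the attribute and turning its text into an fd
number could look like the following. demo_parse_fd_reply() and
demo_read_rehydrated_fd() are hypothetical helpers; the only assumption
taken from the patch is that the file contains a single decimal integer
followed by a newline, which is negative if re-hydration failed.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text produced by the sysfs attribute ("<int>\n").
 * Returns the integer (an fd, or a negative value reported by the
 * kernel), or -EINVAL if the text is not a single decimal integer.
 */
int demo_parse_fd_reply(const char *text)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text || val < INT_MIN || val > INT_MAX)
		return -EINVAL;
	if (*end != '\n' && *end != '\0')
		return -EINVAL;
	return (int)val;
}

/* Read @attr_path once and parse it; returns an fd number or -errno. */
int demo_read_rehydrated_fd(const char *attr_path)
{
	char buf[32];
	FILE *f = fopen(attr_path, "r");

	if (!f)
		return -errno;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -EIO;
	}
	fclose(f);
	return demo_parse_fd_reply(buf);
}
```

Since the diff installs the new file via anon_inode_getfd() in the
context of the task doing the read, the returned number should only be
meaningful inside the process that performed the read; reading the file
with cat from a shell would presumably leave the fd in cat's table.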
---
 drivers/iommu/iommufd/io_pagetable.c    |  2 +-
 drivers/iommu/iommufd/iommufd_private.h |  4 ++
 drivers/iommu/iommufd/main.c            |  4 +-
 drivers/iommu/iommufd/serialise.c       | 54 ++++++++++++++++++++++++-
 4 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommufd/io_pagetable.c b/drivers/iommu/iommufd/io_pagetable.c
index 05fd9d3abf1b..b4b75663d7cf 100644
--- a/drivers/iommu/iommufd/io_pagetable.c
+++ b/drivers/iommu/iommufd/io_pagetable.c
@@ -222,7 +222,7 @@ static int iopt_insert_area(struct io_pagetable *iopt, struct iopt_area *area,
 	return 0;
 }
 
-static struct iopt_area *iopt_area_alloc(void)
+struct iopt_area *iopt_area_alloc(void)
 {
 	struct iopt_area *area;
 
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index ad8d180269bd..94612cec2814 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -59,6 +59,10 @@ struct io_pagetable {
 	unsigned long iova_alignment;
 };
 
+extern const struct file_operations iommufd_fops;
+int iommufd_fops_open(struct inode *inode, struct file *filp);
+struct iopt_area *iopt_area_alloc(void);
+
 void iopt_init_table(struct io_pagetable *iopt);
 void iopt_destroy_table(struct io_pagetable *iopt);
 int iopt_get_pages(struct io_pagetable *iopt, unsigned long iova,
diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c
index 21a7e1ad40d1..f78a4cf23741 100644
--- a/drivers/iommu/iommufd/main.c
+++ b/drivers/iommu/iommufd/main.c
@@ -233,7 +233,7 @@ static int iommufd_destroy(struct iommufd_ucmd *ucmd)
 	return iommufd_object_remove(ucmd->ictx, NULL, cmd->id, 0);
 }
 
-static int iommufd_fops_open(struct inode *inode, struct file *filp)
+int iommufd_fops_open(struct inode *inode, struct file *filp)
 {
 	struct iommufd_ctx *ictx;
 
@@ -473,7 +473,7 @@ static long iommufd_fops_ioctl(struct file *filp, unsigned int cmd,
 	return ret;
 }
 
-static const struct file_operations iommufd_fops = {
+const struct file_operations iommufd_fops = {
 	.owner = THIS_MODULE,
 	.open = iommufd_fops_open,
 	.release = iommufd_fops_release,
diff --git a/drivers/iommu/iommufd/serialise.c b/drivers/iommu/iommufd/serialise.c
index 7f2e7b1eda13..9519969bd201 100644
--- a/drivers/iommu/iommufd/serialise.c
+++ b/drivers/iommu/iommufd/serialise.c
@@ -1,5 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
+#include
+#include
 #include
 #include
 #include "iommufd_private.h"
@@ -97,10 +99,60 @@ int iommufd_serialise_kho(struct notifier_block *self, unsigned long cmd,
 	}
 }
 
+static int rehydrate_iommufd(char *iommufd_name)
+{
+	struct file *file;
+	int fd;
+	int off;
+	struct iommufd_ctx *ictx;
+	struct files_struct *files = current->files; // Current process's files_struct
+	const void *fdt = kho_get_fdt();
+	char kho_path[42];
+
+	fd = anon_inode_getfd("iommufd", &iommufd_fops, NULL, O_RDWR);
+	if (fd < 0)
+		return fd;
+	file = files_lookup_fd_raw(files, fd);
+	iommufd_fops_open(NULL, file);
+	ictx = file->private_data;
+
+	snprintf(kho_path, sizeof(kho_path), "/iommufd/iommufds/%s/ioases", iommufd_name);
+	fdt_for_each_subnode(off, fdt, fdt_path_offset(fdt, kho_path)) {
+		struct iommufd_ioas *ioas;
+		int range_off;
+
+		ioas = iommufd_ioas_alloc(ictx);
+		iommufd_object_finalize(ictx, &ioas->obj);
+
+		fdt_for_each_subnode(range_off, fdt, off) {
+			const unsigned long *iova_start, *iova_len;
+			const int *iommu_prot;
+			int len;
+			struct iopt_area *area = iopt_area_alloc();
+
+			iova_start = fdt_getprop(fdt, range_off, "iova-start", &len);
+			iova_len = fdt_getprop(fdt, range_off, "iova-len", &len);
+			iommu_prot = fdt_getprop(fdt, range_off, "iommu-prot", &len);
+
+			area->iommu_prot = *iommu_prot;
+			area->node.start = *iova_start;
+			area->node.last = *iova_start + *iova_len - 1;
+			interval_tree_insert(&area->node, &ioas->iopt.area_itree);
+		}
+		/* TODO: restore link from ioas to hwpt. */
+	}
+
+	return fd;
+}
+
 static ssize_t iommufd_show(struct kobject *kobj, struct kobj_attribute *attr,
 		char *buf)
 {
-	return 0;
+	char fd_str[10];
+	ssize_t len;
+
+	len = snprintf(buf, sizeof(fd_str), "%i\n", rehydrate_iommufd("1"));
+	return len;
 }
 
 static struct kobj_attribute persisted_attr =
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
From: James Gowans
CC: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Krzysztof Wilczyński,
 Will Deacon, Robin Murphy, Mike Rapoport, "Madhavan T. Venkataraman",
 Sean Christopherson, Paolo Bonzini, David Woodhouse, Lu Baolu,
 Alexander Graf, "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 08/13] intel-iommu: Add serialise and deserialise boilerplate
Date: Mon, 16 Sep 2024 13:30:57 +0200
Message-ID: <20240916113102.710522-9-jgowans@amazon.com>
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>

Similar to how iommufd got serialise and deserialise hooks, now add
these to the platform iommu driver, in this case intel-iommu. Once again
this will be fleshed out in the next commits to actually serialise the
struct dmar_domain objects before kexec and restore them after kexec.
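
The shape of this boilerplate (track persistent domains by ID, then
walk them from a notifier when KHO asks for a dump) can be sketched in
plain C as below. Everything here is a hypothetical stand-in: struct
demo_registry replaces the xarray, the DEMO_* constants replace the KHO
commands and notifier return codes, and the duplicate-ID check is only
modelled on xa_insert() returning -EBUSY for an occupied index.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for KEXEC_KHO_ABORT/KEXEC_KHO_DUMP and NOTIFY_DONE/NOTIFY_BAD. */
enum demo_cmd { DEMO_KHO_ABORT, DEMO_KHO_DUMP };
enum { DEMO_NOTIFY_DONE = 0, DEMO_NOTIFY_BAD = 1 };

#define DEMO_MAX_DOMAINS 8

struct demo_domain {
	unsigned long persistent_id; /* 0 means non-persistent */
};

/* Fixed-size table standing in for the persistent_domains xarray. */
struct demo_registry {
	struct demo_domain *slots[DEMO_MAX_DOMAINS];
	size_t count;
};

/*
 * Track @d under its persistent ID. Non-persistent domains are ignored.
 * Returns 0, -EBUSY if the ID is already tracked, or -ENOMEM if full.
 */
int demo_track(struct demo_registry *r, struct demo_domain *d)
{
	size_t i;

	if (!d->persistent_id)
		return 0;
	for (i = 0; i < r->count; i++)
		if (r->slots[i]->persistent_id == d->persistent_id)
			return -EBUSY;
	if (r->count == DEMO_MAX_DOMAINS)
		return -ENOMEM;
	r->slots[r->count++] = d;
	return 0;
}

/*
 * Notifier-style callback: on DUMP, visit every tracked domain and
 * report how many were visited through @serialised.
 */
int demo_notify(struct demo_registry *r, enum demo_cmd cmd, size_t *serialised)
{
	size_t i, n = 0;

	switch (cmd) {
	case DEMO_KHO_ABORT:
		return DEMO_NOTIFY_DONE; /* nothing to roll back in this sketch */
	case DEMO_KHO_DUMP:
		for (i = 0; i < r->count; i++)
			n++; /* a real driver would serialise slots[i] here */
		*serialised = n;
		return DEMO_NOTIFY_DONE;
	default:
		return DEMO_NOTIFY_BAD;
	}
}
```

Unlike this sketch, the patch only warns when tracking fails and still
returns the domain to the caller; whether that should be a hard failure
is a design question for the series, not something the sketch settles.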
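---
 drivers/iommu/intel/Makefile    |  1 +
 drivers/iommu/intel/iommu.c     | 18 +++++++++++++++
 drivers/iommu/intel/iommu.h     | 18 +++++++++++++++
 drivers/iommu/intel/serialise.c | 40 +++++++++++++++++++++++++++++++++
 4 files changed, 77 insertions(+)
 create mode 100644 drivers/iommu/intel/serialise.c

diff --git a/drivers/iommu/intel/Makefile b/drivers/iommu/intel/Makefile
index c8beb0281559..ca9f73992620 100644
--- a/drivers/iommu/intel/Makefile
+++ b/drivers/iommu/intel/Makefile
@@ -9,3 +9,4 @@ ifdef CONFIG_INTEL_IOMMU
 obj-$(CONFIG_IRQ_REMAP) += irq_remapping.o
 endif
 obj-$(CONFIG_INTEL_IOMMU_PERF_EVENTS) += perfmon.o
+obj-$(CONFIG_KEXEC_KHO) += serialise.o
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f473a8c008a7..7e77b787148a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -65,6 +65,7 @@ static int rwbf_quirk;
 static int force_on = 0;
 static int intel_iommu_tboot_noforce;
 static int no_platform_optin;
+DEFINE_XARRAY(persistent_domains);
 
 #define ROOT_ENTRY_NR (VTD_PAGE_SIZE/sizeof(struct root_entry))
 
@@ -3393,6 +3394,10 @@ static __init int tboot_force_iommu(void)
 	return 1;
 }
 
+static struct notifier_block serialise_kho_nb = {
+	.notifier_call = intel_iommu_serialise_kho,
+};
+
 int __init intel_iommu_init(void)
 {
 	int ret = -ENODEV;
@@ -3432,6 +3437,12 @@ int __init intel_iommu_init(void)
 	if (!no_iommu)
 		intel_iommu_debugfs_init();
 
+	if (IS_ENABLED(CONFIG_KEXEC_KHO)) {
+		ret = register_kho_notifier(&serialise_kho_nb);
+		if (ret)
+			goto out_free_dmar;
+	}
+
 	if (no_iommu || dmar_disabled) {
 		/*
 		 * We exit the function here to ensure IOMMU's remapping and
@@ -3738,6 +3749,7 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 	struct intel_iommu *iommu = info->iommu;
 	struct dmar_domain *dmar_domain;
 	struct iommu_domain *domain;
+	int rc;
 
 	/* Must be NESTING domain */
 	if (parent) {
@@ -3778,6 +3790,12 @@ intel_iommu_domain_alloc_user(struct device *dev, u32 flags,
 		domain->dirty_ops = &intel_dirty_ops;
 	}
 
+	if (persistent_id) {
+		rc = xa_insert(&persistent_domains, persistent_id, domain, GFP_KERNEL_ACCOUNT);
+		if (rc)
+			pr_warn("Unable to track persistent domain %lu\n", persistent_id);
+	}
+
 	return domain;
 }
 
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index cfd006588824..7866342f0909 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -11,6 +11,7 @@
 #define _INTEL_IOMMU_H_
 
 #include
+#include
 #include
 #include
 #include
@@ -496,6 +497,7 @@ struct q_inval {
 #define PRQ_DEPTH ((0x1000 << PRQ_ORDER) >> 5)
 
 struct dmar_pci_notify_info;
+extern struct xarray persistent_domains;
 
 #ifdef CONFIG_IRQ_REMAP
 /* 1MB - maximum possible interrupt remapping table size */
@@ -1225,6 +1227,22 @@ static inline int iommu_calculate_max_sagaw(struct intel_iommu *iommu)
 #define intel_iommu_sm (0)
 #endif
 
+#ifdef CONFIG_KEXEC_KHO
+int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long cmd,
+			      void *fdt);
+int __init intel_iommu_deserialise_kho(void);
+#else
+static inline int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long cmd,
+			      void *fdt)
+{
+	return 0;
+}
+static inline int intel_iommu_deserialise_kho(void)
+{
+	return 0;
+}
+#endif /* CONFIG_KEXEC_KHO */
+
 static inline const char *decode_prq_descriptor(char *str, size_t size,
 		u64 dw0, u64 dw1, u64 dw2, u64 dw3)
 {
diff --git a/drivers/iommu/intel/serialise.c b/drivers/iommu/intel/serialise.c
new file mode 100644
index 000000000000..08a548b33703
--- /dev/null
+++ b/drivers/iommu/intel/serialise.c
@@ -0,0 +1,40 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "iommu.h"
+
+static int serialise_domain(void *fdt, struct iommu_domain *domain)
+{
+	return 0;
+}
+
+int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long cmd,
+			      void *fdt)
+{
+	static const char compatible[] = "intel-iommu-v0";
+	struct iommu_domain *domain;
+	unsigned long xa_idx;
+	int err = 0;
+
+	switch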
domain->dirty_ops =3D &intel_dirty_ops; } =20 + if (persistent_id) { + rc =3D xa_insert(&persistent_domains, persistent_id, domain, GFP_KERNEL_= ACCOUNT); + if (rc) + pr_warn("Unable to track persistent domain %lu\n", persistent_id); + } + return domain; } =20 diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h index cfd006588824..7866342f0909 100644 --- a/drivers/iommu/intel/iommu.h +++ b/drivers/iommu/intel/iommu.h @@ -11,6 +11,7 @@ #define _INTEL_IOMMU_H_ =20 #include +#include #include #include #include @@ -496,6 +497,7 @@ struct q_inval { #define PRQ_DEPTH ((0x1000 << PRQ_ORDER) >> 5) =20 struct dmar_pci_notify_info; +extern struct xarray persistent_domains; =20 #ifdef CONFIG_IRQ_REMAP /* 1MB - maximum possible interrupt remapping table size */ @@ -1225,6 +1227,22 @@ static inline int iommu_calculate_max_sagaw(struct i= ntel_iommu *iommu) #define intel_iommu_sm (0) #endif =20 +#ifdef CONFIG_KEXEC_KHO +int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long c= md, + void *fdt); +int __init intel_iommu_deserialise_kho(void); +#else +int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long c= md, + void *fdt) +{ + return 0; +} +int __init intel_iommu_deserialise_kho(void) +{ + return 0; +} +#endif /* CONFIG_KEXEC_KHO */ + static inline const char *decode_prq_descriptor(char *str, size_t size, u64 dw0, u64 dw1, u64 dw2, u64 dw3) { diff --git a/drivers/iommu/intel/serialise.c b/drivers/iommu/intel/serialis= e.c new file mode 100644 index 000000000000..08a548b33703 --- /dev/null +++ b/drivers/iommu/intel/serialise.c @@ -0,0 +1,40 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include "iommu.h" + +static int serialise_domain(void *fdt, struct iommu_domain *domain) +{ + return 0; +} + +int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long c= md, + void *fdt) +{ + static const char compatible[] =3D "intel-iommu-v0"; + struct iommu_domain *domain; + unsigned long xa_idx; + int err =3D 0; + + switch 
(cmd) { + case KEXEC_KHO_ABORT: + /* Would do serialise rollback here. */ + return NOTIFY_DONE; + case KEXEC_KHO_DUMP: + err |=3D fdt_begin_node(fdt, "intel-iommu"); + fdt_property(fdt, "compatible", compatible, sizeof(compatible)); + err |=3D fdt_begin_node(fdt, "domains"); + xa_for_each(&persistent_domains, xa_idx, domain) { + err |=3D serialise_domain(fdt, domain); + } + err |=3D fdt_end_node(fdt); /* domains */ + err |=3D fdt_end_node(fdt); /* intel-iommu*/ + return err? NOTIFY_BAD : NOTIFY_DONE; + default: + return NOTIFY_BAD; + } +} + +int __init intel_iommu_deserialise_kho(void) +{ + return 0; +} --=20 2.34.1 From nobody Fri Nov 29 18:27:50 2024 Received: from smtp-fw-52004.amazon.com (smtp-fw-52004.amazon.com [52.119.213.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F8E1154429; Mon, 16 Sep 2024 11:34:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.119.213.154 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486491; cv=none; b=Er1sY23pbg7CnoSqyHYdTkRB1tigzqlbe3ooDcqI+JB3W8DX7EJqtyoXdpHdq2S0n2oIvZ9a62f3Rsmc37ZOdrVdbWfjqTHyurW6oB6M01ul96NlJQNL+aJMBPe50eWYr5bJMlCzHO1P4txaQ7Zgrqq3gecjbg+nyty0Juz2sXw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486491; c=relaxed/simple; bh=2Z+qKyzqOqbutmfC3zXpXkioDGSuvF+lOnNL+tKeDWY=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Bh5Pb21gb441yL6l+sBjBr0NPzWWMdfghmvg+ZLU/q7phQWPEaej40fYS+fy2RPK1+epjIbjscVFfCa7bHpctDlInh45jrsnP9zhISCIaTPssVZ85MF4X/hK8dYkoQh8IRq7hVgJjcTld0xi4GvhFVAv/PU2cOhS1JuRLAAesxo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass smtp.mailfrom=amazon.com; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=caQYpc8V; arc=none 
smtp.client-ip=52.119.213.154 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="caQYpc8V" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1726486490; x=1758022490; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=d7SA1xd31CQwM5KfV5meLxmO5eWfDrrSDBGteh7IWAY=; b=caQYpc8V/YMvfAic+xKc3k5DqmY7+aRb4xQ/S1/GELaK8wox/PHO3hHY jZ6X3ep9CD0mMR4on5/5av56AuXp91EyQS+RCY024uMnuh03WhLuYqjTo 7eKp/bhycYLd3xFBB6oZ+Nb1cr9NOKzAEboFnzt+gPKjT0lZHVMQZCiuJ 4=; X-IronPort-AV: E=Sophos;i="6.10,233,1719878400"; d="scan'208";a="232155494" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-east-1.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-52004.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2024 11:34:49 +0000 Received: from EX19MTAEUA001.ant.amazon.com [10.0.17.79:46245] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.43.112:2525] with esmtp (Farcaster) id 86fb6cac-b9f4-49c1-bff6-bddf8af55ee2; Mon, 16 Sep 2024 11:34:47 +0000 (UTC) X-Farcaster-Flow-ID: 86fb6cac-b9f4-49c1-bff6-bddf8af55ee2 Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUA001.ant.amazon.com (10.252.50.223) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:34:43 +0000 Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.221) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:34:32 +0000 From: James Gowans To: CC: Jason Gunthorpe , Kevin Tian , "Joerg 
Roedel" , =?UTF-8?q?Krzysztof=20Wilczy=C5=84ski?= , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas" Subject: [RFC PATCH 09/13] intel-iommu: Serialise dmar_domain on KHO activaet Date: Mon, 16 Sep 2024 13:30:58 +0200 Message-ID: <20240916113102.710522-10-jgowans@amazon.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com> References: <20240916113102.710522-1-jgowans@amazon.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ClientProxiedBy: EX19D046UWB003.ant.amazon.com (10.13.139.174) To EX19D014EUC004.ant.amazon.com (10.252.51.182) Content-Type: text/plain; charset="utf-8" Add logic to iterate through persistent domains, add the page table pages to KHO persistent memory pages. Also serialise some metadata about the domains and attached PCI devices. By adding the page table pages to the `mem` attribute on the KHO object these pages will be carved out of system memory early in boot by KHO, guaranteeing that they will not be used for any other purpose by the new kernel. This persists the page tables across kexec. 
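
The page table walk described above can be sketched in userspace C. This is
only a toy model under stated assumptions, not the driver code: `toy_pte`,
`toy_mem` and `collect_table_pages()` are hypothetical names, each table has
4 entries rather than 512, and entries hold pointers instead of physical
addresses. It shows the idea of recording every table page reachable from the
root so that the whole set can be handed to a preservation mechanism.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ENTRIES   4      /* a real VT-d table page holds 512 entries */
#define TOY_PAGE_SIZE 4096u

/* Hypothetical page table entry: not the VT-d dma_pte layout. */
struct toy_pte {
	struct toy_pte *next; /* next-level table, or NULL if not present */
	int leaf;             /* nonzero: maps data directly (superpage) */
};

/* Hypothetical stand-in for a preserved-memory descriptor. */
struct toy_mem {
	uintptr_t addr;
	size_t len;
};

/*
 * Record `table` and every lower-level table reachable from it.
 * `n` is the number of records made so far; the updated count is returned.
 * Like snprintf(), counting continues past `cap` without writing, so the
 * caller can detect that `out` was too small.
 */
static size_t collect_table_pages(const struct toy_pte *table, int level,
				  struct toy_mem *out, size_t cap, size_t n)
{
	assert(table != NULL && level >= 1);

	if (n < cap) {
		out[n].addr = (uintptr_t)table;
		out[n].len = TOY_PAGE_SIZE;
	}
	n++;

	if (level == 1)
		return n; /* level-1 entries point at data pages, not tables */

	for (size_t i = 0; i < TOY_ENTRIES; i++) {
		const struct toy_pte *e = &table[i];

		/* Only descend into present entries that point at a table. */
		if (e->next && !e->leaf)
			n = collect_table_pages(e->next, level - 1, out, cap, n);
	}
	return n;
}
```

A caller would compare the returned count against `cap` and retry with a
larger array if it was exceeded, which avoids assuming that one fixed-size
buffer is always enough.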
---
 drivers/iommu/intel/iommu.c     |  9 ----
 drivers/iommu/intel/iommu.h     | 10 ++++
 drivers/iommu/intel/serialise.c | 92 ++++++++++++++++++++++++++++++++-
 3 files changed, 101 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7e77b787148a..0a2118a3b7c4 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -46,15 +46,6 @@
 
 #define DEFAULT_DOMAIN_ADDRESS_WIDTH 57
 
-#define __DOMAIN_MAX_PFN(gaw)  ((((uint64_t)1) << ((gaw) - VTD_PAGE_SHIFT)) - 1)
-#define __DOMAIN_MAX_ADDR(gaw) ((((uint64_t)1) << (gaw)) - 1)
-
-/* We limit DOMAIN_MAX_PFN to fit in an unsigned long, and DOMAIN_MAX_ADDR
-   to match. That way, we can use 'unsigned long' for PFNs with impunity. */
-#define DOMAIN_MAX_PFN(gaw)	((unsigned long) min_t(uint64_t, \
-				__DOMAIN_MAX_PFN(gaw), (unsigned long)-1))
-#define DOMAIN_MAX_ADDR(gaw)	(((uint64_t)__DOMAIN_MAX_PFN(gaw)) << VTD_PAGE_SHIFT)
-
 static void __init check_tylersburg_isoch(void);
 static int rwbf_quirk;
 
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 7866342f0909..cd932a97a9bc 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -38,6 +38,16 @@
 
 #define IOVA_PFN(addr)		((addr) >> PAGE_SHIFT)
 
+#define __DOMAIN_MAX_PFN(gaw)  ((((uint64_t)1) << ((gaw) - VTD_PAGE_SHIFT)) - 1)
+#define __DOMAIN_MAX_ADDR(gaw) ((((uint64_t)1) << (gaw)) - 1)
+
+/* We limit DOMAIN_MAX_PFN to fit in an unsigned long, and DOMAIN_MAX_ADDR
+   to match. That way, we can use 'unsigned long' for PFNs with impunity. */
+#define DOMAIN_MAX_PFN(gaw)	((unsigned long) min_t(uint64_t, \
+				__DOMAIN_MAX_PFN(gaw), (unsigned long)-1))
+#define DOMAIN_MAX_ADDR(gaw)	(((uint64_t)__DOMAIN_MAX_PFN(gaw)) << VTD_PAGE_SHIFT)
+
+
 #define VTD_STRIDE_SHIFT        (9)
 #define VTD_STRIDE_MASK         (((u64)-1) << VTD_STRIDE_SHIFT)
 
diff --git a/drivers/iommu/intel/serialise.c b/drivers/iommu/intel/serialise.c
index 08a548b33703..bc755e51732b 100644
--- a/drivers/iommu/intel/serialise.c
+++ b/drivers/iommu/intel/serialise.c
@@ -2,9 +2,99 @@
 
 #include "iommu.h"
 
+/*
+ * Serialised format:
+ * /intel-iommu
+ *     compatible = str
+ *     domains = {
+ *         persistent-id = {
+ *             mem = [ ... ] // page table pages
+ *             agaw = i32
+ *             pgd = u64
+ *             devices = {
+ *                 id = {
+ *                     u8 bus;
+ *                     u8 devfn
+ *                 },
+ *                 ...
+ *             }
+ *         }
+ *     }
+ */
+
+/*
+ * Adds all present PFNs on the PTE page to the kho_mem pointer and advances
+ * the pointer.
+ * Stolen from dma_pte_list_pagetables() */
+static void save_pte_pages(struct dmar_domain *domain, int level,
+			   struct dma_pte *pte, struct kho_mem **kho_mem)
+{
+	struct page *pg;
+
+	pg = pfn_to_page(dma_pte_addr(pte) >> PAGE_SHIFT);
+
+	if (level == 1)
+		return;
+
+	pte = page_address(pg);
+	do {
+		if (dma_pte_present(pte)) {
+			(*kho_mem)->addr = dma_pte_addr(pte);
+			(*kho_mem)->len = PAGE_SIZE;
+			(*kho_mem)++;
+			if (!dma_pte_superpage(pte))
+				save_pte_pages(domain, level - 1, pte, kho_mem);
+		}
+		pte++;
+	} while (!first_pte_in_page(pte));
+}
+
 static int serialise_domain(void *fdt, struct iommu_domain *domain)
 {
-	return 0;
+	struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+	/*
+	 * kho_mems_start points to the original allocated array; kho_mems
+	 * is incremented by the callee. Keep both to know how many were added.
+	 */
+	struct kho_mem *kho_mems, *kho_mems_start;
+	struct device_domain_info *info;
+	int err = 0;
+	char name[24];
+	int device_idx = 0;
+	phys_addr_t pgd;
+
+	/*
+	 * Assume just one page worth of kho_mem objects is enough.
+	 * Better would be to keep track of number of allocated pages in the domain.
+	 * */
+	kho_mems_start = kho_mems = kzalloc(PAGE_SIZE, GFP_KERNEL);
+
+	save_pte_pages(dmar_domain, agaw_to_level(dmar_domain->agaw),
+		       dmar_domain->pgd, &kho_mems);
+
+	snprintf(name, sizeof(name), "%lu", domain->persistent_id);
+	err |= fdt_begin_node(fdt, name);
+	err |= fdt_property(fdt, "mem", kho_mems_start,
+			sizeof(struct kho_mem) * (kho_mems - kho_mems_start));
+	err |= fdt_property(fdt, "persistent_id", &domain->persistent_id,
+			sizeof(domain->persistent_id));
+	pgd = virt_to_phys(dmar_domain->pgd);
+	err |= fdt_property(fdt, "pgd", &pgd, sizeof(pgd));
+	err |= fdt_property(fdt, "agaw", &dmar_domain->agaw,
+			sizeof(dmar_domain->agaw));
+
+	err |= fdt_begin_node(fdt, "devices");
+	list_for_each_entry(info, &dmar_domain->devices, link) {
+		snprintf(name, sizeof(name), "%i", device_idx++);
+		err |= fdt_begin_node(fdt, name);
+		err |= fdt_property(fdt, "bus", &info->bus, sizeof(info->bus));
+		err |= fdt_property(fdt, "devfn", &info->devfn, sizeof(info->devfn));
+		err |= fdt_end_node(fdt); /* device_idx */
+	}
+	err |= fdt_end_node(fdt); /* devices */
+	err |= fdt_end_node(fdt); /* domain->persistent_id */
+
+	return err;
 }
 
 int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long cmd,
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
From: James Gowans
To:
CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , Krzysztof Wilczyński , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 10/13] intel-iommu: Re-hydrate persistent domains after kexec
Date: Mon, 16 Sep 2024 13:30:59 +0200
Message-ID: <20240916113102.710522-11-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-ClientProxiedBy: EX19D046UWB003.ant.amazon.com (10.13.139.174) To EX19D014EUC004.ant.amazon.com (10.252.51.182)
Content-Type: text/plain; charset="utf-8"

Go through the domain data persisted in KHO, allocate fresh dmar_domain
structs and populate the structs with the persisted data.
Persisted page table pages in the "mem" field are also claimed to transfer
ownership of the pages from KHO back to the intel-iommu driver.

Once re-hydrated, the struct iommu_domain pointers are inserted into the
persistent_domains xarray so that they can be fetched later when they need
to be restored by iommufd. This will be done in the next commit.
---
 drivers/iommu/intel/iommu.c     |  9 ++++++-
 drivers/iommu/intel/iommu.h     |  1 +
 drivers/iommu/intel/serialise.c | 44 +++++++++++++++++++++++++++++++++
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 0a2118a3b7c4..8e0ed033b03f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1505,7 +1505,7 @@ static bool first_level_by_default(unsigned int type)
 	return type != IOMMU_DOMAIN_UNMANAGED;
 }
 
-static struct dmar_domain *alloc_domain(unsigned int type)
+struct dmar_domain *alloc_domain(unsigned int type)
 {
 	struct dmar_domain *domain;
 
@@ -3468,6 +3468,7 @@ int __init intel_iommu_init(void)
 
 	init_no_remapping_devices();
 
+	intel_iommu_deserialise_kho();
 	ret = init_dmars();
 	if (ret) {
 		if (force_on)
@@ -4127,6 +4128,12 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
 	}
 
 	dev_iommu_priv_set(dev, info);
+
+	/*
+	 * TODO: around here the device should be added to the persistent
+	 * domain if it is a persistent device.
+	 */
+
 	if (pdev && pci_ats_supported(pdev)) {
 		ret = device_rbtree_insert(iommu, info);
 		if (ret)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index cd932a97a9bc..7ee050ebfaca 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -1118,6 +1118,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
  */
 #define QI_OPT_WAIT_DRAIN		BIT(0)
 
+struct dmar_domain *alloc_domain(unsigned int type);
 void domain_update_iotlb(struct dmar_domain *domain);
 int domain_attach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu);
 void domain_detach_iommu(struct dmar_domain *domain, struct intel_iommu *iommu);
diff --git a/drivers/iommu/intel/serialise.c b/drivers/iommu/intel/serialise.c
index bc755e51732b..20f42b84d490 100644
--- a/drivers/iommu/intel/serialise.c
+++ b/drivers/iommu/intel/serialise.c
@@ -124,7 +124,51 @@ int intel_iommu_serialise_kho(struct notifier_block *self, unsigned long cmd,
 	}
 }
 
+static void deserialise_domains(const void *fdt, int root_off)
+{
+	int off;
+	struct dmar_domain *dmar_domain;
+
+	fdt_for_each_subnode(off, fdt, root_off) {
+		const struct kho_mem *kho_mems;
+		int len, idx;
+		const unsigned long *pgd_phys;
+		const int *agaw;
+		const unsigned long *persistent_id;
+		int rc;
+
+		dmar_domain = alloc_domain(IOMMU_DOMAIN_UNMANAGED);
+
+		kho_mems = fdt_getprop(fdt, off, "mem", &len);
+		for (idx = 0; idx * sizeof(struct kho_mem) < len; ++idx)
+			kho_claim_mem(&kho_mems[idx]);
+
+		pgd_phys = fdt_getprop(fdt, off, "pgd", &len);
+		dmar_domain->pgd = phys_to_virt(*pgd_phys);
+		agaw = fdt_getprop(fdt, off, "agaw", &len);
+		dmar_domain->agaw = *agaw;
+		persistent_id = fdt_getprop(fdt, off, "persistent_id", &len);
+		dmar_domain->domain.persistent_id = *persistent_id;
+
+		rc = xa_insert(&persistent_domains, *persistent_id,
+				&dmar_domain->domain, GFP_KERNEL);
+		if (rc)
+			pr_warn("Unable to re-insert persistent domain %lu\n", *persistent_id);
+	}
+}
+
 int __init intel_iommu_deserialise_kho(void)
 {
+	const void *fdt = kho_get_fdt();
+	int off;
+
+	if (!fdt)
+		return 0;
+
+	off = fdt_path_offset(fdt, "/intel-iommu");
+	if (off <= 0)
+		return 0; /* No data in KHO */
+
+	deserialise_domains(fdt, fdt_subnode_offset(fdt, off, "domains"));
 	return 0;
 }
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
From: James Gowans
To:
CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , Krzysztof Wilczyński , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 11/13] iommu: Add callback to restore persisted iommu_domain
Date: Mon, 16 Sep 2024 13:31:00 +0200
Message-ID: <20240916113102.710522-12-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-ClientProxiedBy: EX19D046UWB003.ant.amazon.com (10.13.139.174) To EX19D014EUC004.ant.amazon.com (10.252.51.182)
Content-Type: text/plain; charset="utf-8"

The previous commits re-hydrated the struct iommu_domain objects and added
them to the persistent_domains xarray. Now provide a callback to look up a
domain so that iommufd can restore its link to it.

Roughly where the restore would happen is called out in a comment, but
some more head scratching is needed to figure out how to actually do this.
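
The lookup-by-ID callback described above can be illustrated with a small
userspace sketch. Everything here is hypothetical (`toy_domain`,
`toy_registry_insert()`, `toy_ops`); a fixed-size linear table stands in for
the xarray and the device argument is omitted. It only shows the shape of the
idea: domains are registered under their persistent ID, and the core layer
reaches the driver's lookup through an ops table.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DOMAINS 8

/* Hypothetical stand-in for a re-hydrated domain. */
struct toy_domain {
	unsigned long persistent_id; /* 0 means "not persistent" */
	int agaw;
};

/* Linear table standing in for the driver's xarray. */
static struct toy_domain *toy_registry[TOY_MAX_DOMAINS];

/* Returns 0 on success, -1 if the ID is 0, already present, or no slot is free. */
static int toy_registry_insert(struct toy_domain *d)
{
	size_t free_slot = TOY_MAX_DOMAINS;

	assert(d != NULL);
	if (d->persistent_id == 0)
		return -1;
	for (size_t i = 0; i < TOY_MAX_DOMAINS; i++) {
		if (!toy_registry[i]) {
			if (free_slot == TOY_MAX_DOMAINS)
				free_slot = i; /* remember the first free slot */
		} else if (toy_registry[i]->persistent_id == d->persistent_id) {
			return -1; /* duplicate ID */
		}
	}
	if (free_slot == TOY_MAX_DOMAINS)
		return -1;
	toy_registry[free_slot] = d;
	return 0;
}

/* The restore callback: look the domain up by ID, NULL if unknown. */
static struct toy_domain *toy_domain_restore(unsigned long persistent_id)
{
	for (size_t i = 0; i < TOY_MAX_DOMAINS; i++)
		if (toy_registry[i] &&
		    toy_registry[i]->persistent_id == persistent_id)
			return toy_registry[i];
	return NULL;
}

/* Ops table through which a core layer would reach the driver. */
struct toy_ops {
	struct toy_domain *(*domain_restore)(unsigned long persistent_id);
};

static const struct toy_ops toy_driver_ops = {
	.domain_restore = toy_domain_restore,
};
```

Returning NULL for an unknown ID leaves the decision about how to fail with
the caller, which matches a lookup that only warns when the ID is missing.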
---
 drivers/iommu/intel/iommu.c       | 12 ++++++++++++
 drivers/iommu/iommufd/serialise.c |  9 ++++++++-
 include/linux/iommu.h             |  5 +++++
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 8e0ed033b03f..000ddfe5b6de 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4690,6 +4690,17 @@ static int intel_iommu_read_and_clear_dirty(struct iommu_domain *domain,
 	return 0;
 }
 
+static struct iommu_domain *intel_domain_restore(struct device *dev,
+		unsigned long persistent_id)
+{
+	struct iommu_domain *domain;
+
+	domain = xa_load(&persistent_domains, persistent_id);
+	if (!domain)
+		pr_warn("No such persisted domain id %lu\n", persistent_id);
+	return domain;
+}
+
 static const struct iommu_dirty_ops intel_dirty_ops = {
 	.set_dirty_tracking = intel_iommu_set_dirty_tracking,
 	.read_and_clear_dirty = intel_iommu_read_and_clear_dirty,
@@ -4703,6 +4714,7 @@ const struct iommu_ops intel_iommu_ops = {
 	.domain_alloc		= intel_iommu_domain_alloc,
 	.domain_alloc_user	= intel_iommu_domain_alloc_user,
 	.domain_alloc_sva	= intel_svm_domain_alloc,
+	.domain_restore		= intel_domain_restore,
 	.probe_device		= intel_iommu_probe_device,
 	.release_device		= intel_iommu_release_device,
 	.get_resv_regions	= intel_iommu_get_resv_regions,
diff --git a/drivers/iommu/iommufd/serialise.c b/drivers/iommu/iommufd/serialise.c
index 9519969bd201..baac7d6150cb 100644
--- a/drivers/iommu/iommufd/serialise.c
+++ b/drivers/iommu/iommufd/serialise.c
@@ -139,7 +139,14 @@ static int rehydrate_iommufd(char *iommufd_name)
 			area->node.last = *iova_start + *iova_len - 1;
 			interval_tree_insert(&area->node, &ioas->iopt.area_itree);
 		}
-		/* TODO: restore link from ioas to hwpt. */
+		/*
+		 * Here we should do something to associate struct iommufd_device with the
+		 * ictx, then get the iommu_ops via dev_iommu_ops(), and call the new
+		 * .domain_restore callback to get the struct iommu_domain.
+		 * Something like:
+		 * hwpt->domain = ops->domain_restore(dev, persistent_id);
+		 * Hand wavy - the details elude me at the moment...
+		 */
 	}
 
 	return fd;
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a616e8702a1c..0dc97d494fd9 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -529,6 +529,8 @@ static inline int __iommu_copy_struct_from_user_array(
  * @domain_alloc_paging: Allocate an iommu_domain that can be used for
  *                       UNMANAGED, DMA, and DMA_FQ domain types.
  * @domain_alloc_sva: Allocate an iommu_domain for Shared Virtual Addressing.
+ * @domain_restore: After kexec, give the same persistent_id which was originally
+ *                  used to allocate the domain, and the domain will be restored.
  * @probe_device: Add device to iommu driver handling
  * @release_device: Remove device from iommu driver handling
  * @probe_finalize: Do final setup work after the device is added to an IOMMU
@@ -576,6 +578,9 @@ struct iommu_ops {
 	struct iommu_domain *(*domain_alloc_sva)(struct device *dev,
 						 struct mm_struct *mm);
 
+	struct iommu_domain *(*domain_restore)(struct device *dev,
+			unsigned long persistent_id);
+
 	struct iommu_device *(*probe_device)(struct device *dev);
 	void (*release_device)(struct device *dev);
 	void (*probe_finalize)(struct device *dev);
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
From: James Gowans
To:
CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , Krzysztof Wilczyński , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 12/13] iommufd, guestmemfs: Ensure persistent file used for persistent DMA
Date: Mon, 16 Sep 2024 13:31:01 +0200
Message-ID: <20240916113102.710522-13-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-ClientProxiedBy: EX19D031UWA004.ant.amazon.com (10.13.139.19) To EX19D014EUC004.ant.amazon.com (10.252.51.182)
Content-Type: text/plain; charset="utf-8"

When IOASes and hardware page tables are made persistent, DMA will
continue accessing the memory which is mapped for DMA during and after
kexec. This is only legal if we are sure that the memory being accessed by
that DMA is also persistent. It would not be legal to map normal
buddy-list managed anonymous memory for persistent DMA.
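
The rule above (persistent DMA may only target persistent memory) can be
modelled in userspace C. This is a sketch under assumptions, not the patch's
check: `toy_fops`, `toy_file`, `toy_vma` and `toy_range_is_persistent()` are
hypothetical stand-ins, and the sketch deliberately requires the whole
address range to be covered by persistent-backed mappings, which is a
stricter variant than inspecting a single mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for file_operations, file and VMA. */
struct toy_fops { const char *name; };
struct toy_file { const struct toy_fops *f_op; };
struct toy_vma {
	uintptr_t start, end;        /* [start, end) */
	const struct toy_file *file; /* NULL for anonymous memory */
};

static const struct toy_fops toy_persistent_fops = { "persistent" };

/* A file is "persistent" iff it uses the persistent provider's ops table. */
static bool toy_is_persistent_file(const struct toy_file *file)
{
	return file && file->f_op == &toy_persistent_fops;
}

/*
 * Return true only if every byte of [va, va + len) lies in a VMA backed by
 * a persistent file. `vmas` must be sorted by address and non-overlapping.
 */
static bool toy_range_is_persistent(const struct toy_vma *vmas, size_t n,
				    uintptr_t va, size_t len)
{
	uintptr_t cur = va, end = va + len;

	assert(vmas != NULL || n == 0);
	if (len == 0 || end < va)
		return false; /* empty or wrapping range */

	for (size_t i = 0; i < n && cur < end; i++) {
		if (vmas[i].end <= cur)
			continue;     /* entirely below the range */
		if (vmas[i].start > cur)
			return false; /* hole in the mapping */
		if (!toy_is_persistent_file(vmas[i].file))
			return false; /* anonymous memory or another file type */
		cur = vmas[i].end;
	}
	return cur >= end;
}
```

Identifying the provider by comparing the ops-table pointer means further
persistent providers could be supported by extending
`toy_is_persistent_file()` alone.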

Currently there is one provider of persistent memory, guestmemfs:
https://lore.kernel.org/all/20240805093245.889357-1-jgowans@amazon.com/

This commit ensures that only guestmemfs memory can be mapped into
persistent iommufds. This is almost certainly the wrong way and place
to do it, but something similar is needed. Perhaps in page.c? As more
persistent memory providers become available, they can be added to the
list of providers to check for.
---
 drivers/iommu/iommufd/ioas.c | 22 ++++++++++++++++++++++
 fs/guestmemfs/file.c         |  5 +++++
 include/linux/guestmemfs.h   |  7 +++++++
 3 files changed, 34 insertions(+)

diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index 742248276548..ce76b41d2d72 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -2,9 +2,11 @@
 /*
  * Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES
  */
+#include
 #include
 #include
 #include
+#include
 #include
 
 #include "io_pagetable.h"
@@ -217,6 +219,26 @@ int iommufd_ioas_map(struct iommufd_ucmd *ucmd)
 	if (IS_ERR(ioas))
 		return PTR_ERR(ioas);
 
+	pr_info("iommufd_ioas_map persistent id %lu\n",
+		ucmd->ictx->persistent_id);
+	if (ucmd->ictx->persistent_id) {
+#ifdef CONFIG_GUESTMEMFS_FS
+		struct vm_area_struct *vma;
+		struct mm_struct *mm = current->mm;
+
+		mmap_read_lock(mm);
+		vma = find_vma_intersection(current->mm,
+				 cmd->user_va, cmd->user_va + cmd->length);
+		if (!vma || !is_guestmemfs_file(vma->vm_file)) {
+			mmap_read_unlock(mm);
+			return -EFAULT;
+		}
+		mmap_read_unlock(mm);
+#else
+		return -EFAULT;
+#endif /* CONFIG_GUESTMEMFS_FS */
+	}
+
 	if (!(cmd->flags & IOMMU_IOAS_MAP_FIXED_IOVA))
 		flags = IOPT_ALLOC_IOVA;
 	rc = iopt_map_user_pages(ucmd->ictx, &ioas->iopt, &iova,
diff --git a/fs/guestmemfs/file.c b/fs/guestmemfs/file.c
index 8707a9d3ad90..ecacaf200a31 100644
--- a/fs/guestmemfs/file.c
+++ b/fs/guestmemfs/file.c
@@ -104,3 +104,8 @@ const struct file_operations guestmemfs_file_fops = {
 	.owner = THIS_MODULE,
 	.mmap = mmap,
 };
+
+bool is_guestmemfs_file(struct file const *file)
+{
+	return file && file->f_op == &guestmemfs_file_fops;
+}
diff --git a/include/linux/guestmemfs.h b/include/linux/guestmemfs.h
index 60e769c8e533..c5cd7b6a5630 100644
--- a/include/linux/guestmemfs.h
+++ b/include/linux/guestmemfs.h
@@ -3,14 +3,21 @@
 #ifndef _LINUX_GUESTMEMFS_H
 #define _LINUX_GUESTMEMFS_H
 
+#include
+
 /*
  * Carves out chunks of memory from memblocks for guestmemfs.
  * Must be called in early boot before memblocks are freed.
  */
 # ifdef CONFIG_GUESTMEMFS_FS
 void guestmemfs_reserve_mem(void);
+bool is_guestmemfs_file(struct file const *filp);
 #else
 void guestmemfs_reserve_mem(void) { }
+inline bool is_guestmemfs_file(struct file const *filp)
+{
+	return 0;
+}
 #endif
 
 #endif
-- 
2.34.1

From nobody Fri Nov 29 18:27:50 2024
Received: from smtp-fw-6001.amazon.com (smtp-fw-6001.amazon.com [52.95.48.154]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ACA07153824; Mon, 16 Sep 2024 11:36:01 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=52.95.48.154
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486563; cv=none; b=BmM3mkinPyqKdzbVb5fPohXHNh7xkxPS8mf40oyzb7TQo/tfD1VSr4cZf1nNZDmWgwXhQm76XTid3UejBZNsdSnO7267viSs5e1oVDZN3+ZtzDR/F5DIb7KexK8fE3NsbexBLEXQaSDK76dBy87Ukfh40on/XBAlLl5P246ucOY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1726486563; c=relaxed/simple; bh=z++L0nDgK3j/I2EneGkWSR3NrEn1uaVrc2KHU+a8SjE=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ux2igA/LJ2tRAkwUR8eURPRzru23ffgea1prj/3ji7ZZjyTqRUAfmX6r18gsqUuf8QvNcqJmyQ6/nPkyOUk7xSrHGs2gen4CST+RDGP21MLBMVzxMsFEZSmLrj/D3K6d1Um6wdnHix8+1ztJt/TcI3hQsIlYbRTO8y9ltI0uQJI=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com; spf=pass
smtp.mailfrom=amazon.com; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b=bhcvRyiv; arc=none smtp.client-ip=52.95.48.154
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=amazon.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=amazon.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=amazon.com header.i=@amazon.com header.b="bhcvRyiv"
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1726486562; x=1758022562; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rmNCEgTiIn0/HSF3qLDtBAVvZlJga9Hix4MGZ1qplug=; b=bhcvRyiva1xKjfbilWeVGQ1oQWNQTk2uj0h+xIozd1mGvUGsTSdokN2q H9H9F3KYZD3+cd57mQmD0kEoVMdvfcCM45uleNG7mnHsyQwC4ZFlnWz9S s+c8gLKdFlsgZ/0yTNapUiKeNr/jcfEalDWflVOCEsZiy1l57Tc6/7ER0 o=;
X-IronPort-AV: E=Sophos;i="6.10,233,1719878400"; d="scan'208";a="424150549"
Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO smtpout.prod.us-east-1.prod.farcaster.email.amazon.dev) ([10.43.8.2]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 16 Sep 2024 11:36:00 +0000
Received: from EX19MTAEUB001.ant.amazon.com [10.0.17.79:6208] by smtpin.naws.eu-west-1.prod.farcaster.email.amazon.dev [10.0.35.229:2525] with esmtp (Farcaster) id 64b5d967-c00c-45a6-8975-b19cfe456119; Mon, 16 Sep 2024 11:35:58 +0000 (UTC)
X-Farcaster-Flow-ID: 64b5d967-c00c-45a6-8975-b19cfe456119
Received: from EX19D014EUC004.ant.amazon.com (10.252.51.182) by EX19MTAEUB001.ant.amazon.com (10.252.51.26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id 15.2.1258.34; Mon, 16 Sep 2024 11:35:57 +0000
Received: from u5d18b891348c5b.ant.amazon.com (10.146.13.221) by EX19D014EUC004.ant.amazon.com (10.252.51.182) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA) id
15.2.1258.34; Mon, 16 Sep 2024 11:35:47 +0000
From: James Gowans
To:
CC: Jason Gunthorpe , Kevin Tian , "Joerg Roedel" , Krzysztof Wilczyński , Will Deacon , Robin Murphy , Mike Rapoport , "Madhavan T. Venkataraman" , , "Sean Christopherson" , Paolo Bonzini , , David Woodhouse , Lu Baolu , Alexander Graf , , , , "Saenz Julienne, Nicolas"
Subject: [RFC PATCH 13/13] iommufd, guestmemfs: Pin files when mapped for persistent DMA
Date: Mon, 16 Sep 2024 13:31:02 +0200
Message-ID: <20240916113102.710522-14-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240916113102.710522-1-jgowans@amazon.com>
References: <20240916113102.710522-1-jgowans@amazon.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-ClientProxiedBy: EX19D031UWA004.ant.amazon.com (10.13.139.19) To EX19D014EUC004.ant.amazon.com (10.252.51.182)
Content-Type: text/plain; charset="utf-8"

Ordinarily, after kexec the new kernel would have no idea that some
files are still in use as DMA targets. This could allow those files to
be deleted while still in use behind the scenes, which would in turn
allow use-after-frees of the persistent memory.

To prevent this, add the ability to do long-term (across kexec) pinning
of files in guestmemfs. Iommufd is updated to use this when mapping a
file into a persistent domain. As long as the file has pins it cannot
be deleted.

A hand-wavy alternative would be to use something like iommufd's
storage domain and actually do this at the PFN level.
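
The pinning scheme described above can be sketched in plain userspace C.
This is an illustration under stated assumptions rather than the code in
this patch: the demo_* names and the fixed-size inode table are
hypothetical, and a plain int stands in for the kernel's atomic_t. What
it mirrors from the patch is that the pin handle is the inode number
(so it can be serialised and reused after kexec), that 0 means "no pin",
and that unlink is refused with -EBUSY while pins remain.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a persisted guestmemfs inode. */
struct demo_inode {
	int linked;		/* non-zero while the file exists */
	int long_term_pins;	/* the kernel patch uses atomic_t here */
};

#define DEMO_NR_INODES 4

/* Inode numbers 1 and 2 start out as existing files; 0 is reserved. */
struct demo_inode demo_store[DEMO_NR_INODES] = {
	[1] = { .linked = 1 },
	[2] = { .linked = 1 },
};

struct demo_inode *demo_get_inode(unsigned long ino)
{
	return (ino > 0 && ino < DEMO_NR_INODES) ? &demo_store[ino] : NULL;
}

/* Take a long-term pin. Returns the handle (inode number), or 0 on failure. */
unsigned long demo_pin_file(unsigned long ino)
{
	struct demo_inode *inode = demo_get_inode(ino);

	if (!inode || !inode->linked)
		return 0;
	inode->long_term_pins++;
	return ino;
}

/* Drop a pin previously taken with demo_pin_file(). */
void demo_unpin_file(unsigned long pin_handle)
{
	struct demo_inode *inode = demo_get_inode(pin_handle);

	if (inode && inode->long_term_pins > 0)
		inode->long_term_pins--;
}

/* Delete a file unless it is still pinned for DMA. */
int demo_unlink(unsigned long ino)
{
	struct demo_inode *inode = demo_get_inode(ino);

	if (!inode || !inode->linked)
		return -ENOENT;
	if (inode->long_term_pins)
		return -EBUSY;
	inode->linked = 0;
	return 0;
}
```

With this sketch, pinning inode 1 makes demo_unlink(1) fail with -EBUSY
until the matching demo_unpin_file() call, after which the unlink
succeeds.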
---
 drivers/iommu/iommufd/ioas.c            |  4 ++++
 drivers/iommu/iommufd/iommufd_private.h |  5 +++++
 drivers/iommu/iommufd/serialise.c       |  9 ++++++++-
 fs/guestmemfs/file.c                    | 20 ++++++++++++++++++++
 fs/guestmemfs/guestmemfs.h              |  1 +
 fs/guestmemfs/inode.c                   |  4 ++++
 include/linux/guestmemfs.h              |  8 ++++++++
 7 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/ioas.c b/drivers/iommu/iommufd/ioas.c
index ce76b41d2d72..8b7fa3d17e8a 100644
--- a/drivers/iommu/iommufd/ioas.c
+++ b/drivers/iommu/iommufd/ioas.c
@@ -233,6 +233,7 @@ int iommufd_ioas_map(struct iommufd_ucmd *ucmd)
 			mmap_read_unlock(mm);
 			return -EFAULT;
 		}
+		ioas->pinned_file_handle = guestmemfs_pin_file(vma->vm_file);
 		mmap_read_unlock(mm);
 #else
 		return -EFAULT;
@@ -331,6 +332,9 @@ int iommufd_ioas_unmap(struct iommufd_ucmd *ucmd)
 				     &unmapped);
 		if (rc)
 			goto out_put;
+
+		if (ioas->pinned_file_handle)
+			guestmemfs_unpin_file(ioas->pinned_file_handle);
 	}
 
 	cmd->length = unmapped;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 94612cec2814..597a54a1adf3 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -260,12 +260,17 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
  * An iommu_domain & iommfd_hw_pagetable will be automatically selected
  * for a device based on the hwpt_list. If no suitable iommu_domain
  * is found a new iommu_domain will be created.
+ *
+ * If this IOAS is pinning a file for persistent DMA, pinned_file_handle will
+ * be set to a non-zero value. When unmapping this IOAS the file will be
+ * unpinned.
  */
 struct iommufd_ioas {
 	struct iommufd_object obj;
 	struct io_pagetable iopt;
 	struct mutex mutex;
 	struct list_head hwpt_list;
+	unsigned long pinned_file_handle;
 };
 
 static inline struct iommufd_ioas *iommufd_get_ioas(struct iommufd_ctx *ictx,
diff --git a/drivers/iommu/iommufd/serialise.c b/drivers/iommu/iommufd/serialise.c
index baac7d6150cb..d95e150c3dd9 100644
--- a/drivers/iommu/iommufd/serialise.c
+++ b/drivers/iommu/iommufd/serialise.c
@@ -16,6 +16,7 @@
  *   account_mode = u8
  *   ioases = [
  *     {
+ *       pinned_file_handle = u64
  *       areas = [
  *       ]
  *     }
@@ -48,6 +49,9 @@ static int serialise_iommufd(void *fdt, struct iommufd_ctx *ictx)
 		snprintf(name, sizeof(name), "%lu", obj_idx);
 		err |= fdt_begin_node(fdt, name);
 
+		err |= fdt_property(fdt, "pinned-file-handle",
+				&ioas->pinned_file_handle, sizeof(ioas->pinned_file_handle));
+
 		for (area = iopt_area_iter_first(&ioas->iopt, 0, ULONG_MAX); area;
 				area = iopt_area_iter_next(area, 0, ULONG_MAX)) {
 			unsigned long iova_start, iova_len;
@@ -119,15 +123,18 @@ static int rehydrate_iommufd(char *iommufd_name)
 	snprintf(kho_path, sizeof(kho_path), "/iommufd/iommufds/%s/ioases", iommufd_name);
 	fdt_for_each_subnode(off, fdt, fdt_path_offset(fdt, kho_path)) {
 		struct iommufd_ioas *ioas;
+		int len;
 		int range_off;
+		const unsigned long *pinned_file_handle;
 
 		ioas = iommufd_ioas_alloc(ictx);
+		pinned_file_handle = fdt_getprop(fdt, off, "pinned-file-handle", &len);
+		ioas->pinned_file_handle = *pinned_file_handle;
 		iommufd_object_finalize(ictx, &ioas->obj);
 
 		fdt_for_each_subnode(range_off, fdt, off) {
 			const unsigned long *iova_start, *iova_len;
 			const int *iommu_prot;
-			int len;
 			struct iopt_area *area = iopt_area_alloc();
 
 			iova_start = fdt_getprop(fdt, range_off, "iova-start", &len);
diff --git a/fs/guestmemfs/file.c b/fs/guestmemfs/file.c
index ecacaf200a31..d7840831df03 100644
--- a/fs/guestmemfs/file.c
+++ b/fs/guestmemfs/file.c
@@ -109,3 +109,23 @@ bool is_guestmemfs_file(struct file const *file)
 {
 	return file && file->f_op == &guestmemfs_file_fops;
 }
+
+unsigned long guestmemfs_pin_file(struct file *file)
+{
+	struct guestmemfs_inode *inode =
+		guestmemfs_get_persisted_inode(file->f_inode->i_sb,
+				file->f_inode->i_ino);
+
+	atomic_inc(&inode->long_term_pins);
+	return file->f_inode->i_ino;
+}
+
+void guestmemfs_unpin_file(unsigned long pin_handle)
+{
+	struct guestmemfs_inode *inode =
+		guestmemfs_get_persisted_inode(guestmemfs_sb, pin_handle);
+	int new;
+
+	new = atomic_dec_return(&inode->long_term_pins);
+	WARN_ON(new < 0);
+}
diff --git a/fs/guestmemfs/guestmemfs.h b/fs/guestmemfs/guestmemfs.h
index 91cc06ae45a5..d107ad0e3323 100644
--- a/fs/guestmemfs/guestmemfs.h
+++ b/fs/guestmemfs/guestmemfs.h
@@ -42,6 +42,7 @@ struct guestmemfs_inode {
 	char filename[GUESTMEMFS_FILENAME_LEN];
 	void *mappings;
 	int num_mappings;
+	atomic_t long_term_pins;
 };
 
 void guestmemfs_initialise_inode_store(struct super_block *sb);
diff --git a/fs/guestmemfs/inode.c b/fs/guestmemfs/inode.c
index d521b35d4992..6bc0abbde8d1 100644
--- a/fs/guestmemfs/inode.c
+++ b/fs/guestmemfs/inode.c
@@ -151,6 +151,10 @@ static int guestmemfs_unlink(struct inode *dir, struct dentry *dentry)
 
 	ino = guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino)->child_ino;
 
+	inode = guestmemfs_get_persisted_inode(dir->i_sb, dentry->d_inode->i_ino);
+	if (atomic_read(&inode->long_term_pins))
+		return -EBUSY;
+
 	/* Special case for first file in dir */
 	if (ino == dentry->d_inode->i_ino) {
 		guestmemfs_get_persisted_inode(dir->i_sb, dir->i_ino)->child_ino =
diff --git a/include/linux/guestmemfs.h b/include/linux/guestmemfs.h
index c5cd7b6a5630..c2018b4f38fd 100644
--- a/include/linux/guestmemfs.h
+++ b/include/linux/guestmemfs.h
@@ -20,4 +20,12 @@ inline bool is_guestmemfs_file(struct file const *filp)
 }
 #endif
 
+/*
+ * Ensure that the file cannot be deleted or have its memory changed
+ * until it is unpinned. The returned value is a handle which can be
+ * used to un-pin the file.
+ */
+unsigned long guestmemfs_pin_file(struct file *file);
+void guestmemfs_unpin_file(unsigned long pin_handle);
+
 #endif
-- 
2.34.1