From nobody Thu Dec 25 01:28:30 2025
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy,
 Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan, Joel Granados
Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v3 1/8] iommu: Add iopf domain attach/detach/replace interface
Date: Mon, 22 Jan 2024 15:38:56 +0800
Message-Id: <20240122073903.24406-2-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com>
References: <20240122073903.24406-1-baolu.lu@linux.intel.com>

There is a slight difference between iopf domains and non-iopf domains.
In the latter, references to domains occur between attach and detach;
in the former, due to the existence of asynchronous iopf handling paths,
references to the domain may occur after detach, which leads to potential
use-after-free (UAF) issues.

Introduce an iopf-specific domain attach/detach/replace interface where
the caller provides an attach cookie. This cookie can only be freed after
all outstanding iopf groups are handled and the domain is detached from
the RID or PASID. The presence of this attach cookie indicates that a
domain has been attached to the RID or PASID and won't be released until
all outstanding iopf groups are handled.

The cookie data structure also includes a private field for storing a
caller-specific pointer that will be passed back to its page fault
handler. This field provides flexibility for various uses. For example,
the IOMMUFD could use it to store the iommufd_device pointer, so that it
could easily retrieve the dev_id of the device that triggered the fault.

Signed-off-by: Lu Baolu
---
 include/linux/iommu.h      |  36 +++++++++
 drivers/iommu/io-pgfault.c | 158 +++++++++++++++++++++++++++++++++++++
 2 files changed, 194 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 1ccad10e8164..6d85be23952a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -120,6 +120,16 @@ struct iommu_page_response {
 	u32 code;
 };
 
+struct iopf_attach_cookie {
+	struct iommu_domain *domain;
+	struct device *dev;
+	unsigned int pasid;
+	refcount_t users;
+
+	void *private;
+	void (*release)(struct iopf_attach_cookie *cookie);
+};
+
 struct iopf_fault {
 	struct iommu_fault fault;
 	/* node for pending lists */
@@ -699,6 +709,7 @@ struct iommu_fault_param {
 	struct device *dev;
 	struct iopf_queue *queue;
 	struct list_head queue_list;
+	struct xarray pasid_cookie;
 
 	struct list_head partial;
 	struct list_head faults;
@@ -1552,6 +1563,12 @@ void iopf_free_group(struct iopf_group *group);
 void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt);
 void iopf_group_response(struct iopf_group *group,
 			 enum iommu_page_response_code status);
+int iopf_domain_attach(struct iommu_domain *domain, struct device *dev,
+		       ioasid_t pasid, struct iopf_attach_cookie *cookie);
+void iopf_domain_detach(struct iommu_domain *domain, struct device *dev,
+			ioasid_t pasid);
+int iopf_domain_replace(struct iommu_domain *domain, struct device *dev,
+			ioasid_t pasid, struct iopf_attach_cookie *cookie);
 #else
 static inline int
 iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
@@ -1596,5 +1613,24 @@ static inline void iopf_group_response(struct iopf_group *group,
 				       enum iommu_page_response_code status)
 {
 }
+
+static inline int iopf_domain_attach(struct iommu_domain *domain,
+				     struct device *dev, ioasid_t pasid,
+				     struct iopf_attach_cookie *cookie)
+{
+	return -ENODEV;
+}
+
+static inline void iopf_domain_detach(struct iommu_domain *domain,
+				      struct device *dev, ioasid_t pasid)
+{
+}
+
+static inline int iopf_domain_replace(struct iommu_domain *domain,
+				      struct device *dev, ioasid_t pasid,
+				      struct iopf_attach_cookie *cookie)
+{
+	return -ENODEV;
+}
 #endif /* CONFIG_IOMMU_IOPF */
 #endif /* __LINUX_IOMMU_H */
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index b64229dab976..f7ce41573799 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -39,6 +39,103 @@ static void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param)
 		kfree_rcu(fault_param, rcu);
 }
 
+/* Get the domain attachment cookie for pasid of a device. */
+static struct iopf_attach_cookie __maybe_unused *
+iopf_pasid_cookie_get(struct device *dev, ioasid_t pasid)
+{
+	struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev);
+	struct iopf_attach_cookie *curr;
+
+	if (!iopf_param)
+		return ERR_PTR(-ENODEV);
+
+	xa_lock(&iopf_param->pasid_cookie);
+	curr = xa_load(&iopf_param->pasid_cookie, pasid);
+	if (curr && !refcount_inc_not_zero(&curr->users))
+		curr = ERR_PTR(-EINVAL);
+	xa_unlock(&iopf_param->pasid_cookie);
+
+	iopf_put_dev_fault_param(iopf_param);
+
+	return curr;
+}
+
+/* Put the domain attachment cookie. */
+static void iopf_pasid_cookie_put(struct iopf_attach_cookie *cookie)
+{
+	if (cookie && refcount_dec_and_test(&cookie->users))
+		cookie->release(cookie);
+}
+
+/*
+ * Set the domain attachment cookie for pasid of a device. Return 0 on
+ * success, or error number on failure.
+ */
+static int iopf_pasid_cookie_set(struct iommu_domain *domain, struct device *dev,
+				 ioasid_t pasid, struct iopf_attach_cookie *cookie)
+{
+	struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev);
+	struct iopf_attach_cookie *curr;
+
+	if (!iopf_param)
+		return -ENODEV;
+
+	refcount_set(&cookie->users, 1);
+	cookie->dev = dev;
+	cookie->pasid = pasid;
+	cookie->domain = domain;
+
+	curr = xa_cmpxchg(&iopf_param->pasid_cookie, pasid, NULL, cookie, GFP_KERNEL);
+	iopf_put_dev_fault_param(iopf_param);
+
+	return curr ? xa_err(curr) : 0;
+}
+
+/* Clear the domain attachment cookie for pasid of a device. */
+static void iopf_pasid_cookie_clear(struct device *dev, ioasid_t pasid)
+{
+	struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev);
+	struct iopf_attach_cookie *curr;
+
+	if (WARN_ON(!iopf_param))
+		return;
+
+	curr = xa_erase(&iopf_param->pasid_cookie, pasid);
+	/* paired with iopf_pasid_cookie_set/replace() */
+	iopf_pasid_cookie_put(curr);
+
+	iopf_put_dev_fault_param(iopf_param);
+}
+
+/* Replace the domain attachment cookie for pasid of a device. */
+static int iopf_pasid_cookie_replace(struct iommu_domain *domain, struct device *dev,
+				     ioasid_t pasid, struct iopf_attach_cookie *cookie)
+{
+	struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev);
+	struct iopf_attach_cookie *curr;
+
+	if (!iopf_param)
+		return -ENODEV;
+
+	if (cookie) {
+		refcount_set(&cookie->users, 1);
+		cookie->dev = dev;
+		cookie->pasid = pasid;
+		cookie->domain = domain;
+	}
+
+	curr = xa_store(&iopf_param->pasid_cookie, pasid, cookie, GFP_KERNEL);
+	if (xa_err(curr))
+		return xa_err(curr);
+
+	/* paired with iopf_pasid_cookie_set/replace() */
+	iopf_pasid_cookie_put(curr);
+
+	iopf_put_dev_fault_param(iopf_param);
+
+	return 0;
+}
+
 static void __iopf_free_group(struct iopf_group *group)
 {
 	struct iopf_fault *iopf, *next;
@@ -362,6 +459,7 @@ int iopf_queue_add_device(struct iopf_queue *queue, struct device *dev)
 	mutex_init(&fault_param->lock);
 	INIT_LIST_HEAD(&fault_param->faults);
 	INIT_LIST_HEAD(&fault_param->partial);
+	xa_init(&fault_param->pasid_cookie);
 	fault_param->dev = dev;
 	refcount_set(&fault_param->users, 1);
 	list_add(&fault_param->queue_list, &queue->devices);
@@ -502,3 +600,63 @@ void iopf_queue_free(struct iopf_queue *queue)
 	kfree(queue);
 }
 EXPORT_SYMBOL_GPL(iopf_queue_free);
+
+int iopf_domain_attach(struct iommu_domain *domain, struct device *dev,
+		       ioasid_t pasid, struct iopf_attach_cookie *cookie)
+{
+	int ret;
+
+	if (!domain->iopf_handler)
+		return -EINVAL;
+
+	if (pasid == IOMMU_NO_PASID)
+		ret = iommu_attach_group(domain, dev->iommu_group);
+	else
+		ret = iommu_attach_device_pasid(domain, dev, pasid);
+	if (ret)
+		return ret;
+
+	ret = iopf_pasid_cookie_set(domain, dev, pasid, cookie);
+	if (ret) {
+		if (pasid == IOMMU_NO_PASID)
+			iommu_detach_group(domain, dev->iommu_group);
+		else
+			iommu_detach_device_pasid(domain, dev, pasid);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iopf_domain_attach);
+
+void iopf_domain_detach(struct iommu_domain *domain, struct device *dev, ioasid_t pasid)
+{
+	iopf_pasid_cookie_clear(dev, pasid);
+
+	if (pasid == IOMMU_NO_PASID)
+		iommu_detach_group(domain, dev->iommu_group);
+	else
+		iommu_detach_device_pasid(domain, dev, pasid);
+}
+EXPORT_SYMBOL_GPL(iopf_domain_detach);
+
+int iopf_domain_replace(struct iommu_domain *domain, struct device *dev,
+			ioasid_t pasid, struct iopf_attach_cookie *cookie)
+{
+	struct iommu_domain *old_domain = iommu_get_domain_for_dev(dev);
+	int ret;
+
+	if (!old_domain || pasid != IOMMU_NO_PASID ||
+	    (!old_domain->iopf_handler && !domain->iopf_handler))
+		return -EINVAL;
+
+	ret = iommu_group_replace_domain(dev->iommu_group, domain);
+	if (ret)
+		return ret;
+
+	ret = iopf_pasid_cookie_replace(domain, dev, pasid, cookie);
+	if (ret)
+		iommu_group_replace_domain(dev->iommu_group, old_domain);
+
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(iopf_domain_replace, IOMMUFD_INTERNAL);
-- 
2.34.1

From nobody Thu Dec 25 01:28:30 2025
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy,
 Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan, Joel Granados
Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v3 2/8] iommu/sva: Use iopf domain attach/detach interface
Date: Mon, 22 Jan 2024 15:38:57 +0800
Message-Id: <20240122073903.24406-3-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com>
References: <20240122073903.24406-1-baolu.lu@linux.intel.com>

The iommu sva implementation relies on iopf handling. Allocate an
attachment cookie and use the iopf domain attach/detach interface. The
SVA domain is guaranteed to be released after all outstanding page faults
are handled.

In the fault delivering path, the attachment cookie is retrieved, instead
of the domain. This ensures that the page fault is forwarded only if an
iopf-capable domain is attached, and the domain will only be released
after all outstanding faults are handled.

Signed-off-by: Lu Baolu
---
 include/linux/iommu.h      |  2 +-
 drivers/iommu/io-pgfault.c | 59 +++++++++++++++++++-------------------
 drivers/iommu/iommu-sva.c  | 48 ++++++++++++++++++++++++-------
 3 files changed, 68 insertions(+), 41 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 6d85be23952a..511dc7b4bdb2 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -142,9 +142,9 @@ struct iopf_group {
 	/* list node for iommu_fault_param::faults */
 	struct list_head pending_node;
 	struct work_struct work;
-	struct iommu_domain *domain;
 	/* The device's fault data parameter. */
 	struct iommu_fault_param *fault_param;
+	struct iopf_attach_cookie *cookie;
 };
 
 /**
diff --git a/drivers/iommu/io-pgfault.c b/drivers/iommu/io-pgfault.c
index f7ce41573799..2567d8c04e46 100644
--- a/drivers/iommu/io-pgfault.c
+++ b/drivers/iommu/io-pgfault.c
@@ -40,7 +40,7 @@ static void iopf_put_dev_fault_param(struct iommu_fault_param *fault_param)
 }
 
 /* Get the domain attachment cookie for pasid of a device. */
-static struct iopf_attach_cookie __maybe_unused *
+static struct iopf_attach_cookie *
 iopf_pasid_cookie_get(struct device *dev, ioasid_t pasid)
 {
 	struct iommu_fault_param *iopf_param = iopf_get_dev_fault_param(dev);
@@ -147,6 +147,7 @@ static void __iopf_free_group(struct iopf_group *group)
 
 	/* Pair with iommu_report_device_fault(). */
 	iopf_put_dev_fault_param(group->fault_param);
+	iopf_pasid_cookie_put(group->cookie);
 }
 
 void iopf_free_group(struct iopf_group *group)
@@ -156,30 +157,6 @@ void iopf_free_group(struct iopf_group *group)
 }
 EXPORT_SYMBOL_GPL(iopf_free_group);
 
-static struct iommu_domain *get_domain_for_iopf(struct device *dev,
-						 struct iommu_fault *fault)
-{
-	struct iommu_domain *domain;
-
-	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID) {
-		domain = iommu_get_domain_for_dev_pasid(dev, fault->prm.pasid, 0);
-		if (IS_ERR(domain))
-			domain = NULL;
-	} else {
-		domain = iommu_get_domain_for_dev(dev);
-	}
-
-	if (!domain || !domain->iopf_handler) {
-		dev_warn_ratelimited(dev,
-			"iopf (pasid %d) without domain attached or handler installed\n",
-			fault->prm.pasid);
-
-		return NULL;
-	}
-
-	return domain;
-}
-
 /* Non-last request of a group. Postpone until the last one. */
 static int report_partial_fault(struct iommu_fault_param *fault_param,
 				struct iommu_fault *fault)
@@ -199,10 +176,20 @@ static int report_partial_fault(struct iommu_fault_param *fault_param,
 	return 0;
 }
 
+static ioasid_t fault_to_pasid(struct iommu_fault *fault)
+{
+	if (fault->prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID_VALID)
+		return fault->prm.pasid;
+
+	return IOMMU_NO_PASID;
+}
+
 static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param,
 					   struct iopf_fault *evt,
 					   struct iopf_group *abort_group)
 {
+	ioasid_t pasid = fault_to_pasid(&evt->fault);
+	struct iopf_attach_cookie *cookie;
 	struct iopf_fault *iopf, *next;
 	struct iopf_group *group;
 
@@ -215,7 +202,23 @@ static struct iopf_group *iopf_group_alloc(struct iommu_fault_param *iopf_param,
 		group = abort_group;
 	}
 
+	cookie = iopf_pasid_cookie_get(iopf_param->dev, pasid);
+	if (!cookie && pasid != IOMMU_NO_PASID)
+		cookie = iopf_pasid_cookie_get(iopf_param->dev, IOMMU_NO_PASID);
+	if (IS_ERR(cookie) || !cookie) {
+		/*
+		 * The PASID of this device was not attached by an I/O-capable
+		 * domain. Ask the caller to abort handling of this fault.
+		 * Otherwise, the reference count will be switched to the new
+		 * iopf group and will be released in iopf_free_group().
+		 */
+		kfree(group);
+		group = abort_group;
+		cookie = NULL;
+	}
+
 	group->fault_param = iopf_param;
+	group->cookie = cookie;
 	group->last_fault.fault = evt->fault;
 	INIT_LIST_HEAD(&group->faults);
 	INIT_LIST_HEAD(&group->pending_node);
@@ -305,15 +308,11 @@ void iommu_report_device_fault(struct device *dev, struct iopf_fault *evt)
 	if (group == &abort_group)
 		goto err_abort;
 
-	group->domain = get_domain_for_iopf(dev, fault);
-	if (!group->domain)
-		goto err_abort;
-
 	/*
 	 * On success iopf_handler must call iopf_group_response() and
 	 * iopf_free_group()
 	 */
-	if (group->domain->iopf_handler(group))
+	if (group->cookie->domain->iopf_handler(group))
 		goto err_abort;
 
 	return;
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index b51995b4fe90..fff3ee1ee9ce 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -50,6 +50,39 @@ static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct de
 	return iommu_mm;
 }
 
+static void release_attach_cookie(struct iopf_attach_cookie *cookie)
+{
+	struct iommu_domain *domain = cookie->domain;
+
+	mutex_lock(&iommu_sva_lock);
+	if (--domain->users == 0) {
+		list_del(&domain->next);
+		iommu_domain_free(domain);
+	}
+	mutex_unlock(&iommu_sva_lock);
+
+	kfree(cookie);
+}
+
+static int sva_attach_device_pasid(struct iommu_domain *domain,
+				   struct device *dev, ioasid_t pasid)
+{
+	struct iopf_attach_cookie *cookie;
+	int ret;
+
+	cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+	if (!cookie)
+		return -ENOMEM;
+
+	cookie->release = release_attach_cookie;
+
+	ret = iopf_domain_attach(domain, dev, pasid, cookie);
+	if (ret)
+		kfree(cookie);
+
+	return ret;
+}
+
 /**
  * iommu_sva_bind_device() - Bind a process address space to a device
  * @dev: the device
@@ -90,7 +123,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
 
 	/* Search for an existing domain. */
 	list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) {
-		ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid);
+		ret = sva_attach_device_pasid(domain, dev, iommu_mm->pasid);
 		if (!ret) {
 			domain->users++;
 			goto out;
@@ -104,7 +137,7 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
 		goto out_free_handle;
 	}
 
-	ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid);
+	ret = sva_attach_device_pasid(domain, dev, iommu_mm->pasid);
 	if (ret)
 		goto out_free_domain;
 	domain->users = 1;
@@ -140,13 +173,7 @@ void iommu_sva_unbind_device(struct iommu_sva *handle)
 	struct iommu_mm_data *iommu_mm = domain->mm->iommu_mm;
 	struct device *dev = handle->dev;
 
-	mutex_lock(&iommu_sva_lock);
-	iommu_detach_device_pasid(domain, dev, iommu_mm->pasid);
-	if (--domain->users == 0) {
-		list_del(&domain->next);
-		iommu_domain_free(domain);
-	}
-	mutex_unlock(&iommu_sva_lock);
+	iopf_domain_detach(domain, dev, iommu_mm->pasid);
 	kfree(handle);
 }
 EXPORT_SYMBOL_GPL(iommu_sva_unbind_device);
@@ -242,7 +269,8 @@ static void iommu_sva_handle_iopf(struct work_struct *work)
 		if (status != IOMMU_PAGE_RESP_SUCCESS)
 			break;
 
-		status = iommu_sva_handle_mm(&iopf->fault, group->domain->mm);
+		status = iommu_sva_handle_mm(&iopf->fault,
+					     group->cookie->domain->mm);
 	}
 
 	iopf_group_response(group, status);
-- 
2.34.1

From nobody Thu Dec 25 01:28:30 2025
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy,
 Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan, Joel Granados
Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v3 3/8] iommufd: Add fault and response message definitions
Date: Mon, 22 Jan 2024 15:38:58 +0800
Message-Id: <20240122073903.24406-4-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com>
References: <20240122073903.24406-1-baolu.lu@linux.intel.com>

iommu_hwpt_pgfaults represent fault messages that the userspace can
retrieve. Multiple iommu_hwpt_pgfaults might be put in an iopf group,
with the IOMMU_PGFAULT_FLAGS_LAST_PAGE flag set only for the last
iommu_hwpt_pgfault.

An iommu_hwpt_page_response is a response message that the userspace
should send to the kernel after finishing handling a group of fault
messages. The @dev_id, @pasid, and @grpid fields in the message identify
an outstanding iopf group for a device. The @addr field, which matches
the fault address of the last fault in the group, will be used by the
kernel for a sanity check.

Signed-off-by: Lu Baolu
---
 include/uapi/linux/iommufd.h | 67 ++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 1dfeaa2e649e..d59e839ae49e 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -692,4 +692,71 @@ struct iommu_hwpt_invalidate {
 	__u32 __reserved;
 };
 #define IOMMU_HWPT_INVALIDATE _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_INVALIDATE)
+
+/**
+ * enum iommu_hwpt_pgfault_flags - flags for struct iommu_hwpt_pgfault
+ * @IOMMU_PGFAULT_FLAGS_PASID_VALID: The pasid field of the fault data is
+ *                                   valid.
+ * @IOMMU_PGFAULT_FLAGS_LAST_PAGE: It's the last fault of a fault group.
+ */
+enum iommu_hwpt_pgfault_flags {
+	IOMMU_PGFAULT_FLAGS_PASID_VALID		= (1 << 0),
+	IOMMU_PGFAULT_FLAGS_LAST_PAGE		= (1 << 1),
+};
+
+/**
+ * enum iommu_hwpt_pgfault_perm - perm bits for struct iommu_hwpt_pgfault
+ * @IOMMU_PGFAULT_PERM_READ: request for read permission
+ * @IOMMU_PGFAULT_PERM_WRITE: request for write permission
+ * @IOMMU_PGFAULT_PERM_EXEC: request for execute permission
+ * @IOMMU_PGFAULT_PERM_PRIV: request for privileged permission
+ */
+enum iommu_hwpt_pgfault_perm {
+	IOMMU_PGFAULT_PERM_READ			= (1 << 0),
+	IOMMU_PGFAULT_PERM_WRITE		= (1 << 1),
+	IOMMU_PGFAULT_PERM_EXEC			= (1 << 2),
+	IOMMU_PGFAULT_PERM_PRIV			= (1 << 3),
+};
+
+/**
+ * struct iommu_hwpt_pgfault - iommu page fault data
+ * @size: sizeof(struct iommu_hwpt_pgfault)
+ * @flags: Combination of enum iommu_hwpt_pgfault_flags
+ * @dev_id: id of the originated device
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @perm: Combination of enum iommu_hwpt_pgfault_perm
+ * @addr: page address
+ */
+struct iommu_hwpt_pgfault {
+	__u32 size;
+	__u32 flags;
+	__u32 dev_id;
+	__u32 pasid;
+	__u32 grpid;
+	__u32 perm;
+	__u64 addr;
+};
+
+/**
+ * struct iommu_hwpt_page_response - IOMMU page fault response
+ * @size: sizeof(struct iommu_hwpt_page_response)
+ * @flags: Must be set to 0
+ * @dev_id: device ID of target device for the response
+ * @pasid: Process Address Space ID
+ * @grpid: Page Request Group Index
+ * @code: response code. The supported codes include:
+ *        0: Successful; 1: Response Failure; 2: Invalid Request.
+ * @addr: The fault address. Must match the addr field of the
+ *        last iommu_hwpt_pgfault of a reported iopf group.
+ */
+struct iommu_hwpt_page_response {
+	__u32 size;
+	__u32 flags;
+	__u32 dev_id;
+	__u32 pasid;
+	__u32 grpid;
+	__u32 code;
+	__u64 addr;
+};
 #endif
-- 
2.34.1

From nobody Thu Dec 25 01:28:30 2025
From: Lu Baolu
To: Jason Gunthorpe, Kevin Tian, Joerg Roedel, Will Deacon, Robin Murphy,
 Jean-Philippe Brucker, Nicolin Chen, Yi Liu, Jacob Pan, Joel Granados
Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, Lu Baolu
Subject: [PATCH v3 4/8] iommufd: Add iommufd fault object
Date: Mon, 22 Jan 2024 15:38:59 +0800
Message-Id: <20240122073903.24406-5-baolu.lu@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com>
References: <20240122073903.24406-1-baolu.lu@linux.intel.com>

An iommufd fault object provides an interface for delivering I/O page
faults to user space. These objects are created and destroyed by user
space, and they can be associated with or dissociated from hardware page
table objects during page table allocation or destruction.

User space interacts with the fault object through a file interface. This
interface offers a straightforward and efficient way for user space to
handle page faults. It allows user space to read fault messages
sequentially and respond to them by writing to the same file. The file
interface supports reading messages in poll mode, so it's recommended that
user space applications use io_uring to enhance read and write efficiency.

A fault object can be associated with any iopf-capable iommufd_hw_pgtable
during the pgtable's allocation. All I/O page faults triggered by devices
when accessing the I/O addresses of an iommufd_hw_pgtable are routed
through the fault object to user space. Similarly, user space's responses
to these page faults are routed back to the iommu device driver through
the same fault object.

Signed-off-by: Lu Baolu
---
 include/linux/iommu.h                   |   2 +
 drivers/iommu/iommufd/iommufd_private.h |  23 +++
 include/uapi/linux/iommufd.h            |  18 ++
 drivers/iommu/iommufd/device.c          |   1 +
 drivers/iommu/iommufd/fault.c           | 255 ++++++++++++++++++++++++
 drivers/iommu/iommufd/main.c            |   6 +
 drivers/iommu/iommufd/Makefile          |   1 +
 7 files changed, 306 insertions(+)
 create mode 100644 drivers/iommu/iommufd/fault.c

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 511dc7b4bdb2..4372648ac22e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -145,6 +145,8 @@ struct iopf_group {
 	/* The device's fault data parameter. */
 	struct iommu_fault_param *fault_param;
 	struct iopf_attach_cookie *cookie;
+	/* Used by handler provider to hook the group on its own lists. */
+	struct list_head node;
 };
 
 /**
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 991f864d1f9b..52d83e888bd0 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -128,6 +128,7 @@ enum iommufd_object_type {
 	IOMMUFD_OBJ_HWPT_NESTED,
 	IOMMUFD_OBJ_IOAS,
 	IOMMUFD_OBJ_ACCESS,
+	IOMMUFD_OBJ_FAULT,
 #ifdef CONFIG_IOMMUFD_TEST
 	IOMMUFD_OBJ_SELFTEST,
 #endif
@@ -395,6 +396,8 @@ struct iommufd_device {
 	/* always the physical device */
 	struct device *dev;
 	bool enforce_cache_coherency;
+	/* outstanding faults awaiting response indexed by fault group id */
+	struct xarray faults;
 };
 
 static inline struct iommufd_device *
@@ -426,6 +429,26 @@ void iopt_remove_access(struct io_pagetable *iopt,
 			u32 iopt_access_list_id);
 void iommufd_access_destroy_object(struct iommufd_object *obj);
 
+/*
+ * An iommufd_fault object represents an interface to deliver I/O page faults
+ * to the user space. These objects are created/destroyed by the user space and
+ * associated with hardware page table objects during page-table allocation.
+ */
+struct iommufd_fault {
+	struct iommufd_object obj;
+	struct iommufd_ctx *ictx;
+
+	/* The lists of outstanding faults protected by below mutex.
*/ + struct mutex mutex; + struct list_head deliver; + struct list_head response; + + struct wait_queue_head wait_queue; +}; + +int iommufd_fault_alloc(struct iommufd_ucmd *ucmd); +void iommufd_fault_destroy(struct iommufd_object *obj); + #ifdef CONFIG_IOMMUFD_TEST int iommufd_test(struct iommufd_ucmd *ucmd); void iommufd_selftest_destroy(struct iommufd_object *obj); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index d59e839ae49e..c32d62b02306 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -50,6 +50,7 @@ enum { IOMMUFD_CMD_HWPT_SET_DIRTY_TRACKING, IOMMUFD_CMD_HWPT_GET_DIRTY_BITMAP, IOMMUFD_CMD_HWPT_INVALIDATE, + IOMMUFD_CMD_FAULT_ALLOC, }; =20 /** @@ -759,4 +760,21 @@ struct iommu_hwpt_page_response { __u32 code; __u64 addr; }; + +/** + * struct iommu_fault_alloc - ioctl(IOMMU_FAULT_ALLOC) + * @size: sizeof(struct iommu_fault_alloc) + * @flags: Must be 0 + * @out_fault_id: The ID of the new FAULT + * @out_fault_fd: The fd of the new FAULT + * + * Explicitly allocate a fault handling object. 
+ */ +struct iommu_fault_alloc { + __u32 size; + __u32 flags; + __u32 out_fault_id; + __u32 out_fault_fd; +}; +#define IOMMU_FAULT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_FAULT_ALLOC) #endif diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 873630c111c1..d70913ee8fdf 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -215,6 +215,7 @@ struct iommufd_device *iommufd_device_bind(struct iommu= fd_ctx *ictx, refcount_inc(&idev->obj.users); /* igroup refcount moves into iommufd_device */ idev->igroup =3D igroup; + xa_init(&idev->faults); =20 /* * If the caller fails after this success it must call diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c new file mode 100644 index 000000000000..9844a85feeb2 --- /dev/null +++ b/drivers/iommu/iommufd/fault.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (C) 2024 Intel Corporation + */ +#define pr_fmt(fmt) "iommufd: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "iommufd_private.h" + +static int device_add_fault(struct iopf_group *group) +{ + struct iommufd_device *idev =3D group->cookie->private; + void *curr; + + curr =3D xa_cmpxchg(&idev->faults, group->last_fault.fault.prm.grpid, + NULL, group, GFP_KERNEL); + + return curr ? 
xa_err(curr) : 0; +} + +static void device_remove_fault(struct iopf_group *group) +{ + struct iommufd_device *idev =3D group->cookie->private; + + xa_store(&idev->faults, group->last_fault.fault.prm.grpid, + NULL, GFP_KERNEL); +} + +static struct iopf_group *device_get_fault(struct iommufd_device *idev, + unsigned long grpid) +{ + return xa_load(&idev->faults, grpid); +} + +void iommufd_fault_destroy(struct iommufd_object *obj) +{ + struct iommufd_fault *fault =3D container_of(obj, struct iommufd_fault, o= bj); + struct iopf_group *group, *next; + + mutex_lock(&fault->mutex); + list_for_each_entry_safe(group, next, &fault->deliver, node) { + list_del(&group->node); + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); + iopf_free_group(group); + } + list_for_each_entry_safe(group, next, &fault->response, node) { + list_del(&group->node); + device_remove_fault(group); + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); + iopf_free_group(group); + } + mutex_unlock(&fault->mutex); + + mutex_destroy(&fault->mutex); +} + +static void iommufd_compose_fault_message(struct iommu_fault *fault, + struct iommu_hwpt_pgfault *hwpt_fault, + struct iommufd_device *idev) +{ + hwpt_fault->size =3D sizeof(*hwpt_fault); + hwpt_fault->flags =3D fault->prm.flags; + hwpt_fault->dev_id =3D idev->obj.id; + hwpt_fault->pasid =3D fault->prm.pasid; + hwpt_fault->grpid =3D fault->prm.grpid; + hwpt_fault->perm =3D fault->prm.perm; + hwpt_fault->addr =3D fault->prm.addr; +} + +static ssize_t iommufd_fault_fops_read(struct file *filep, char __user *bu= f, + size_t count, loff_t *ppos) +{ + size_t fault_size =3D sizeof(struct iommu_hwpt_pgfault); + struct iommufd_fault *fault =3D filep->private_data; + struct iommu_hwpt_pgfault data; + struct iommufd_device *idev; + struct iopf_group *group; + struct iopf_fault *iopf; + size_t done =3D 0; + int rc; + + if (*ppos || count % fault_size) + return -ESPIPE; + + mutex_lock(&fault->mutex); + while (!list_empty(&fault->deliver) && count > done) { + 
group =3D list_first_entry(&fault->deliver, + struct iopf_group, node); + + if (list_count_nodes(&group->faults) * fault_size > count - done) + break; + + idev =3D (struct iommufd_device *)group->cookie->private; + list_for_each_entry(iopf, &group->faults, list) { + iommufd_compose_fault_message(&iopf->fault, &data, idev); + rc =3D copy_to_user(buf + done, &data, fault_size); + if (rc) + goto err_unlock; + done +=3D fault_size; + } + + rc =3D device_add_fault(group); + if (rc) + goto err_unlock; + + list_move_tail(&group->node, &fault->response); + } + mutex_unlock(&fault->mutex); + + return done; +err_unlock: + mutex_unlock(&fault->mutex); + return rc; +} + +static ssize_t iommufd_fault_fops_write(struct file *filep, const char __u= ser *buf, + size_t count, loff_t *ppos) +{ + size_t response_size =3D sizeof(struct iommu_hwpt_page_response); + struct iommufd_fault *fault =3D filep->private_data; + struct iommu_hwpt_page_response response; + struct iommufd_device *idev; + struct iopf_group *group; + size_t done =3D 0; + int rc; + + if (*ppos || count % response_size) + return -ESPIPE; + + while (!list_empty(&fault->response) && count > done) { + rc =3D copy_from_user(&response, buf + done, response_size); + if (rc) + break; + + idev =3D container_of(iommufd_get_object(fault->ictx, + response.dev_id, + IOMMUFD_OBJ_DEVICE), + struct iommufd_device, obj); + if (IS_ERR(idev)) + break; + + group =3D device_get_fault(idev, response.grpid); + if (!group || + response.addr !=3D group->last_fault.fault.prm.addr || + ((group->last_fault.fault.prm.flags & IOMMU_FAULT_PAGE_REQUEST_PASID= _VALID) && + response.pasid !=3D group->last_fault.fault.prm.pasid)) { + iommufd_put_object(fault->ictx, &idev->obj); + break; + } + + iopf_group_response(group, response.code); + + mutex_lock(&fault->mutex); + list_del(&group->node); + mutex_unlock(&fault->mutex); + + device_remove_fault(group); + iopf_free_group(group); + done +=3D response_size; + + iommufd_put_object(fault->ictx, 
&idev->obj); + } + + return done; +} + +static __poll_t iommufd_fault_fops_poll(struct file *filep, + struct poll_table_struct *wait) +{ + struct iommufd_fault *fault =3D filep->private_data; + __poll_t pollflags =3D 0; + + poll_wait(filep, &fault->wait_queue, wait); + mutex_lock(&fault->mutex); + if (!list_empty(&fault->deliver)) + pollflags =3D EPOLLIN | EPOLLRDNORM; + mutex_unlock(&fault->mutex); + + return pollflags; +} + +static const struct file_operations iommufd_fault_fops =3D { + .owner =3D THIS_MODULE, + .open =3D nonseekable_open, + .read =3D iommufd_fault_fops_read, + .write =3D iommufd_fault_fops_write, + .poll =3D iommufd_fault_fops_poll, + .llseek =3D no_llseek, +}; + +static int get_fault_fd(struct iommufd_fault *fault) +{ + struct file *filep; + int fdno; + + fdno =3D get_unused_fd_flags(O_CLOEXEC); + if (fdno < 0) + return fdno; + + filep =3D anon_inode_getfile("[iommufd-pgfault]", &iommufd_fault_fops, + fault, O_RDWR); + if (IS_ERR(filep)) { + put_unused_fd(fdno); + return PTR_ERR(filep); + } + + fd_install(fdno, filep); + + return fdno; +} + +int iommufd_fault_alloc(struct iommufd_ucmd *ucmd) +{ + struct iommu_fault_alloc *cmd =3D ucmd->cmd; + struct iommufd_fault *fault; + int rc; + + if (cmd->flags) + return -EOPNOTSUPP; + + fault =3D iommufd_object_alloc(ucmd->ictx, fault, IOMMUFD_OBJ_FAULT); + if (IS_ERR(fault)) + return PTR_ERR(fault); + + fault->ictx =3D ucmd->ictx; + INIT_LIST_HEAD(&fault->deliver); + INIT_LIST_HEAD(&fault->response); + mutex_init(&fault->mutex); + init_waitqueue_head(&fault->wait_queue); + + rc =3D get_fault_fd(fault); + if (rc < 0) + goto out_abort; + + cmd->out_fault_id =3D fault->obj.id; + cmd->out_fault_fd =3D rc; + + rc =3D iommufd_ucmd_respond(ucmd, sizeof(*cmd)); + if (rc) + goto out_abort; + iommufd_object_finalize(ucmd->ictx, &fault->obj); + + return 0; +out_abort: + iommufd_object_abort_and_destroy(ucmd->ictx, &fault->obj); + + return rc; +} diff --git a/drivers/iommu/iommufd/main.c 
b/drivers/iommu/iommufd/main.c index 39b32932c61e..792db077d33e 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -332,6 +332,7 @@ union ucmd_buffer { struct iommu_ioas_unmap unmap; struct iommu_option option; struct iommu_vfio_ioas vfio_ioas; + struct iommu_fault_alloc fault; #ifdef CONFIG_IOMMUFD_TEST struct iommu_test_cmd test; #endif @@ -381,6 +382,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[= ] =3D { val64), IOCTL_OP(IOMMU_VFIO_IOAS, iommufd_vfio_ioas, struct iommu_vfio_ioas, __reserved), + IOCTL_OP(IOMMU_FAULT_ALLOC, iommufd_fault_alloc, struct iommu_fault_alloc, + out_fault_fd), #ifdef CONFIG_IOMMUFD_TEST IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last), #endif @@ -513,6 +516,9 @@ static const struct iommufd_object_ops iommufd_object_o= ps[] =3D { .destroy =3D iommufd_hwpt_nested_destroy, .abort =3D iommufd_hwpt_nested_abort, }, + [IOMMUFD_OBJ_FAULT] =3D { + .destroy =3D iommufd_fault_destroy, + }, #ifdef CONFIG_IOMMUFD_TEST [IOMMUFD_OBJ_SELFTEST] =3D { .destroy =3D iommufd_selftest_destroy, diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile index 34b446146961..b94a74366eed 100644 --- a/drivers/iommu/iommufd/Makefile +++ b/drivers/iommu/iommufd/Makefile @@ -6,6 +6,7 @@ iommufd-y :=3D \ ioas.o \ main.o \ pages.o \ + fault.o \ vfio_compat.o =20 iommufd-$(CONFIG_IOMMUFD_TEST) +=3D selftest.o --=20 2.34.1
From nobody Thu Dec 25 01:28:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9395B36134 for ; Mon, 22 Jan 2024 07:44:53 +0000 (UTC) Received: from orsmga007.jf.intel.com
([10.7.209.58]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Jan 2024 23:44:53 -0800 Received: from allen-box.sh.intel.com ([10.239.159.127]) by orsmga007.jf.intel.com with ESMTP; 21 Jan 2024 23:44:49 -0800 From: Lu Baolu To: Jason Gunthorpe , Kevin Tian , Joerg Roedel , Will Deacon , Robin Murphy , Jean-Philippe Brucker , Nicolin Chen , Yi Liu , Jacob Pan , Joel Granados Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v3 5/8] iommufd: Associate fault object with iommufd_hw_pgtable Date: Mon, 22 Jan 2024 15:39:00 +0800 Message-Id: <20240122073903.24406-6-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com> References: <20240122073903.24406-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
When allocating a user iommufd_hw_pagetable, user space can associate a fault object with the hw_pagetable by specifying the fault object ID in the page table allocation data and setting the IOMMU_HWPT_FAULT_ID_VALID flag bit. After the hwpt allocation returns successfully, the user can retrieve and respond to page faults by reading and writing the file interface of the fault object. Once a fault object has been associated with a hwpt, the hwpt is iopf-capable, which is indicated by fault_capable being set to true. Attaching an iopf-capable hwpt to an RID or PASID, detaching it, or replacing it differs from the same operations on a hwpt that is not iopf-capable. These operations are implemented in the next patch.
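For illustration only, here is a minimal user-space sketch of how a response for one fault group might be composed before it is written back to the fault fd. The two structs are hypothetical local mirrors of the fields that iommu_hwpt_pgfault and iommu_hwpt_page_response carry in this series (real code would take them from include/uapi/linux/iommufd.h of a patched tree), and the names pgfault_msg, page_response_msg, RESP_* and build_group_response() are assumptions made for this example. It also assumes the caller has collected the messages of one group in read order, so that the last element is the last fault of the group.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the iommu_hwpt_pgfault fields used in this series. */
struct pgfault_msg {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
};

/* Hypothetical mirror of struct iommu_hwpt_page_response. */
struct page_response_msg {
	uint32_t size;
	uint32_t flags;
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t code;
	uint64_t addr;
};

/* Response codes as documented for @code; the enum names are made up here. */
enum {
	RESP_SUCCESS = 0,	/* Successful */
	RESP_FAILURE = 1,	/* Response Failure */
	RESP_INVALID = 2,	/* Invalid Request */
};

/*
 * Fill *out with a response for one fault group. The response echoes
 * dev_id, pasid, grpid and addr of the last fault in the group, because
 * the write handler in this series compares addr (and pasid, when the
 * fault carries a valid PASID) against the group's last fault.
 * Returns 0 on success, -1 if the group is empty.
 */
static int build_group_response(const struct pgfault_msg *group, size_t nr,
				uint32_t code, struct page_response_msg *out)
{
	const struct pgfault_msg *last;

	assert(out != NULL);
	if (group == NULL || nr == 0)
		return -1;

	last = &group[nr - 1];
	memset(out, 0, sizeof(*out));
	out->size = sizeof(*out);
	out->flags = 0;			/* must be 0 per the uAPI comment */
	out->dev_id = last->dev_id;
	out->pasid = last->pasid;
	out->grpid = last->grpid;
	out->code = code;
	out->addr = last->addr;

	return 0;
}
```

A caller would then hand the filled message to write() on the fault fd, for example write(fault_fd, &resp, sizeof(resp)); the write handler in this series only accepts byte counts that are a multiple of the response size.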
Signed-off-by: Lu Baolu --- drivers/iommu/iommufd/iommufd_private.h | 11 ++++++++ include/uapi/linux/iommufd.h | 6 +++++ drivers/iommu/iommufd/fault.c | 14 ++++++++++ drivers/iommu/iommufd/hw_pagetable.c | 36 +++++++++++++++++++------ 4 files changed, 59 insertions(+), 8 deletions(-) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommuf= d/iommufd_private.h index 52d83e888bd0..2780bed0c6b1 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -293,6 +293,8 @@ int iommufd_check_iova_range(struct io_pagetable *iopt, struct iommufd_hw_pagetable { struct iommufd_object obj; struct iommu_domain *domain; + struct iommufd_fault *fault; + bool fault_capable : 1; }; =20 struct iommufd_hwpt_paging { @@ -446,8 +448,17 @@ struct iommufd_fault { struct wait_queue_head wait_queue; }; =20 +static inline struct iommufd_fault * +iommufd_get_fault(struct iommufd_ucmd *ucmd, u32 id) +{ + return container_of(iommufd_get_object(ucmd->ictx, id, + IOMMUFD_OBJ_FAULT), + struct iommufd_fault, obj); +} + int iommufd_fault_alloc(struct iommufd_ucmd *ucmd); void iommufd_fault_destroy(struct iommufd_object *obj); +int iommufd_fault_iopf_handler(struct iopf_group *group); =20 #ifdef CONFIG_IOMMUFD_TEST int iommufd_test(struct iommufd_ucmd *ucmd); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index c32d62b02306..7481cdd57027 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -357,10 +357,13 @@ struct iommu_vfio_ioas { * the parent HWPT in a nesting configurati= on. * @IOMMU_HWPT_ALLOC_DIRTY_TRACKING: Dirty tracking support for device IOM= MU is * enforced on device attachment + * @IOMMU_HWPT_FAULT_ID_VALID: The fault_id field of hwpt allocation data = is + * valid. 
*/ enum iommufd_hwpt_alloc_flags { IOMMU_HWPT_ALLOC_NEST_PARENT =3D 1 << 0, IOMMU_HWPT_ALLOC_DIRTY_TRACKING =3D 1 << 1, + IOMMU_HWPT_FAULT_ID_VALID =3D 1 << 2, }; =20 /** @@ -411,6 +414,8 @@ enum iommu_hwpt_data_type { * @__reserved: Must be 0 * @data_type: One of enum iommu_hwpt_data_type * @data_len: Length of the type specific data + * @fault_id: The ID of IOMMUFD_FAULT object. Valid only if flags field of + * IOMMU_HWPT_FAULT_ID_VALID is set. * @data_uptr: User pointer to the type specific data * * Explicitly allocate a hardware page table object. This is the same obje= ct @@ -441,6 +446,7 @@ struct iommu_hwpt_alloc { __u32 __reserved; __u32 data_type; __u32 data_len; + __u32 fault_id; __aligned_u64 data_uptr; }; #define IOMMU_HWPT_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HWPT_ALLOC) diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c index 9844a85feeb2..e752d1c49dde 100644 --- a/drivers/iommu/iommufd/fault.c +++ b/drivers/iommu/iommufd/fault.c @@ -253,3 +253,17 @@ int iommufd_fault_alloc(struct iommufd_ucmd *ucmd) =20 return rc; } + +int iommufd_fault_iopf_handler(struct iopf_group *group) +{ + struct iommufd_hw_pagetable *hwpt =3D group->cookie->domain->fault_data; + struct iommufd_fault *fault =3D hwpt->fault; + + mutex_lock(&fault->mutex); + list_add_tail(&group->node, &fault->deliver); + mutex_unlock(&fault->mutex); + + wake_up_interruptible(&fault->wait_queue); + + return 0; +} diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/h= w_pagetable.c index 3f3f1fa1a0a9..2703d5aea4f5 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -8,6 +8,15 @@ #include "../iommu-priv.h" #include "iommufd_private.h" =20 +static void __iommufd_hwpt_destroy(struct iommufd_hw_pagetable *hwpt) +{ + if (hwpt->domain) + iommu_domain_free(hwpt->domain); + + if (hwpt->fault) + iommufd_put_object(hwpt->fault->ictx, &hwpt->fault->obj); +} + void iommufd_hwpt_paging_destroy(struct iommufd_object *obj) 
{ struct iommufd_hwpt_paging *hwpt_paging =3D @@ -22,9 +31,7 @@ void iommufd_hwpt_paging_destroy(struct iommufd_object *o= bj) hwpt_paging->common.domain); } =20 - if (hwpt_paging->common.domain) - iommu_domain_free(hwpt_paging->common.domain); - + __iommufd_hwpt_destroy(&hwpt_paging->common); refcount_dec(&hwpt_paging->ioas->obj.users); } =20 @@ -49,9 +56,7 @@ void iommufd_hwpt_nested_destroy(struct iommufd_object *o= bj) struct iommufd_hwpt_nested *hwpt_nested =3D container_of(obj, struct iommufd_hwpt_nested, common.obj); =20 - if (hwpt_nested->common.domain) - iommu_domain_free(hwpt_nested->common.domain); - + __iommufd_hwpt_destroy(&hwpt_nested->common); refcount_dec(&hwpt_nested->parent->common.obj.users); } =20 @@ -213,7 +218,8 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, struct iommufd_hw_pagetable *hwpt; int rc; =20 - if (flags || !user_data->len || !ops->domain_alloc_user) + if ((flags & ~IOMMU_HWPT_FAULT_ID_VALID) || + !user_data->len || !ops->domain_alloc_user) return ERR_PTR(-EOPNOTSUPP); if (parent->auto_domain || !parent->nest_parent) return ERR_PTR(-EINVAL); @@ -227,7 +233,7 @@ iommufd_hwpt_nested_alloc(struct iommufd_ctx *ictx, refcount_inc(&parent->common.obj.users); hwpt_nested->parent =3D parent; =20 - hwpt->domain =3D ops->domain_alloc_user(idev->dev, flags, + hwpt->domain =3D ops->domain_alloc_user(idev->dev, 0, parent->common.domain, user_data); if (IS_ERR(hwpt->domain)) { rc =3D PTR_ERR(hwpt->domain); @@ -307,6 +313,20 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) goto out_put_pt; } =20 + if (cmd->flags & IOMMU_HWPT_FAULT_ID_VALID) { + struct iommufd_fault *fault; + + fault =3D iommufd_get_fault(ucmd, cmd->fault_id); + if (IS_ERR(fault)) { + rc =3D PTR_ERR(fault); + goto out_hwpt; + } + hwpt->fault =3D fault; + hwpt->domain->iopf_handler =3D iommufd_fault_iopf_handler; + hwpt->domain->fault_data =3D hwpt; + hwpt->fault_capable =3D true; + } + cmd->out_hwpt_id =3D hwpt->obj.id; rc =3D iommufd_ucmd_respond(ucmd, sizeof(*cmd)); 
if (rc) --=20 2.34.1
From nobody Thu Dec 25 01:28:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0FC82374E5 for ; Mon, 22 Jan 2024 07:44:57 +0000 (UTC) Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Jan 2024 23:44:57 -0800 Received: from allen-box.sh.intel.com ([10.239.159.127]) by orsmga007.jf.intel.com with ESMTP; 21 Jan 2024 23:44:53 -0800 From: Lu Baolu To: Jason Gunthorpe , Kevin Tian , Joerg Roedel , Will Deacon , Robin Murphy , Jean-Philippe Brucker , Nicolin Chen , Yi Liu , Jacob Pan , Joel Granados Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v3 6/8] iommufd: IOPF-capable hw page table attach/detach/replace Date: Mon, 22 Jan 2024 15:39:01 +0800 Message-Id: <20240122073903.24406-7-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com> References: <20240122073903.24406-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
The iopf-capable hw page table attach/detach/replace should use the iommu iopf-specific interfaces. The pointer to iommufd_device is stored in the private field of the attachment cookie, so that it can be easily retrieved in the fault handling paths.
The references to iommufd_device and iommufd_hw_pagetable objects are held until the cookie is released, which happens after the hw_pagetable is detached from the device and all outstanding IOPFs have been responded to. This guarantees that both the device and the hw_pagetable remain valid until the domain is detached and all outstanding faults are handled. The iopf-capable hw page tables can only be attached to devices that support the IOMMU_DEV_FEAT_IOPF feature. On the first attachment of an iopf-capable hw_pagetable to the device, the IOPF feature is enabled on the device. Similarly, after the last iopf-capable hwpt is detached from the device, the IOPF feature is disabled on the device. The current implementation allows a replacement between iopf-capable and non-iopf-capable hw page tables. This matches the nested translation use case, where a parent domain is attached by default and can then be replaced with a nested user domain with iopf support.
Signed-off-by: Lu Baolu --- drivers/iommu/iommufd/iommufd_private.h | 7 ++ drivers/iommu/iommufd/device.c | 15 ++- drivers/iommu/iommufd/fault.c | 122 ++++++++++++++++++++++++ 3 files changed, 141 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommuf= d/iommufd_private.h index 2780bed0c6b1..9844a1289c01 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -398,6 +398,7 @@ struct iommufd_device { /* always the physical device */ struct device *dev; bool enforce_cache_coherency; + bool iopf_enabled; /* outstanding faults awaiting response indexed by fault group id */ struct xarray faults; }; @@ -459,6 +460,12 @@ iommufd_get_fault(struct iommufd_ucmd *ucmd, u32 id) int iommufd_fault_alloc(struct iommufd_ucmd *ucmd); void iommufd_fault_destroy(struct iommufd_object *obj); int iommufd_fault_iopf_handler(struct iopf_group *group); +int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev); +void
iommufd_fault_domain_detach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev); +int iommufd_fault_domain_replace_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev); =20 #ifdef CONFIG_IOMMUFD_TEST int iommufd_test(struct iommufd_ucmd *ucmd); diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index d70913ee8fdf..c4737e876ebc 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -377,7 +377,10 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_page= table *hwpt, * attachment. */ if (list_empty(&idev->igroup->device_list)) { - rc =3D iommu_attach_group(hwpt->domain, idev->igroup->group); + if (hwpt->fault_capable) + rc =3D iommufd_fault_domain_attach_dev(hwpt, idev); + else + rc =3D iommu_attach_group(hwpt->domain, idev->igroup->group); if (rc) goto err_unresv; idev->igroup->hwpt =3D hwpt; @@ -403,7 +406,10 @@ iommufd_hw_pagetable_detach(struct iommufd_device *ide= v) mutex_lock(&idev->igroup->lock); list_del(&idev->group_item); if (list_empty(&idev->igroup->device_list)) { - iommu_detach_group(hwpt->domain, idev->igroup->group); + if (hwpt->fault_capable) + iommufd_fault_domain_detach_dev(hwpt, idev); + else + iommu_detach_group(hwpt->domain, idev->igroup->group); idev->igroup->hwpt =3D NULL; } if (hwpt_is_paging(hwpt)) @@ -498,7 +504,10 @@ iommufd_device_do_replace(struct iommufd_device *idev, goto err_unlock; } =20 - rc =3D iommu_group_replace_domain(igroup->group, hwpt->domain); + if (old_hwpt->fault_capable || hwpt->fault_capable) + rc =3D iommufd_fault_domain_replace_dev(hwpt, idev); + else + rc =3D iommu_group_replace_domain(igroup->group, hwpt->domain); if (rc) goto err_unresv; =20 diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c index e752d1c49dde..a4a49f3cd4c2 100644 --- a/drivers/iommu/iommufd/fault.c +++ b/drivers/iommu/iommufd/fault.c @@ -267,3 +267,125 @@ int iommufd_fault_iopf_handler(struct iopf_group *gro= up) =20 return 0; } + 
+static void release_attach_cookie(struct iopf_attach_cookie *cookie) +{ + struct iommufd_hw_pagetable *hwpt =3D cookie->domain->fault_data; + struct iommufd_device *idev =3D cookie->private; + + refcount_dec(&idev->obj.users); + refcount_dec(&hwpt->obj.users); + kfree(cookie); +} + +static int iommufd_fault_iopf_enable(struct iommufd_device *idev) +{ + int ret; + + if (idev->iopf_enabled) + return 0; + + ret =3D iommu_dev_enable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF); + if (ret) + return ret; + + idev->iopf_enabled =3D true; + + return 0; +} + +static void iommufd_fault_iopf_disable(struct iommufd_device *idev) +{ + if (!idev->iopf_enabled) + return; + + iommu_dev_disable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF); + idev->iopf_enabled =3D false; +} + +int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + struct iopf_attach_cookie *cookie; + int ret; + + cookie =3D kzalloc(sizeof(*cookie), GFP_KERNEL); + if (!cookie) + return -ENOMEM; + + refcount_inc(&hwpt->obj.users); + refcount_inc(&idev->obj.users); + cookie->release =3D release_attach_cookie; + cookie->private =3D idev; + + if (!idev->iopf_enabled) { + ret =3D iommufd_fault_iopf_enable(idev); + if (ret) + goto out_put_cookie; + } + + ret =3D iopf_domain_attach(hwpt->domain, idev->dev, IOMMU_NO_PASID, cooki= e); + if (ret) + goto out_disable_iopf; + + return 0; +out_disable_iopf: + iommufd_fault_iopf_disable(idev); +out_put_cookie: + release_attach_cookie(cookie); + + return ret; +} + +void iommufd_fault_domain_detach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + iopf_domain_detach(hwpt->domain, idev->dev, IOMMU_NO_PASID); + iommufd_fault_iopf_disable(idev); +} + +int iommufd_fault_domain_replace_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + bool iopf_enabled_originally =3D idev->iopf_enabled; + struct iopf_attach_cookie *cookie =3D NULL; + int ret; + + if (hwpt->fault_capable) { + cookie =3D 
kzalloc(sizeof(*cookie), GFP_KERNEL); + if (!cookie) + return -ENOMEM; + + refcount_inc(&hwpt->obj.users); + refcount_inc(&idev->obj.users); + cookie->release =3D release_attach_cookie; + cookie->private =3D idev; + + if (!idev->iopf_enabled) { + ret =3D iommufd_fault_iopf_enable(idev); + if (ret) { + release_attach_cookie(cookie); + return ret; + } + } + } + + ret =3D iopf_domain_replace(hwpt->domain, idev->dev, IOMMU_NO_PASID, cook= ie); + if (ret) { + goto out_put_cookie; + } + + if (iopf_enabled_originally && !hwpt->fault_capable) + iommufd_fault_iopf_disable(idev); + + return 0; +out_put_cookie: + if (hwpt->fault_capable) + release_attach_cookie(cookie); + if (iopf_enabled_originally && !idev->iopf_enabled) + iommufd_fault_iopf_enable(idev); + else if (!iopf_enabled_originally && idev->iopf_enabled) + iommufd_fault_iopf_disable(idev); + + return ret; +} --=20 2.34.1 From nobody Thu Dec 25 01:28:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 27F713838E for ; Mon, 22 Jan 2024 07:45:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705909503; cv=none; b=sARvoc2eCLbaIfv+98KtTqqgsdpG+N2ECiIMz6MX59aalJ3FD6SuIfzlOYFyElAIJWQY/AoMFU8KOF0bShmiJUQmrmlb2h0UTc5PEcrl4q4nmlpodPgOTN1Q9cazvgo2iEzTgc5BZ/dzK3lVeX6kTQ0+9JE46PYNLtoVYEk9+1w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705909503; c=relaxed/simple; bh=DwSOH+5N741uvq3khDnkaUrB3woK0aLhmUoTQu04tCc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Pwz3EO6aPI5J5V2BntFci1KKjQ8251rpTS7mFp7fJ/xmHwSxEB9d6wyaNKX3LbETtaSLcWSQy/+JTmUpJrTuGV0pBM8c5EFAAlB+pTfKqWzEHZq97iC2atqCtgdh2gdNmcgbzdHeY2vqWOBl3aUZ/3LX+qewDMv+PqDdVXacSPE= ARC-Authentication-Results: 
i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=i3OB1Vui; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="i3OB1Vui" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1705909502; x=1737445502; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DwSOH+5N741uvq3khDnkaUrB3woK0aLhmUoTQu04tCc=; b=i3OB1VuiZl/yRxSNnez0A1nZrEyX918Gw52m8yw77P6PiM7ImaW3NUu1 +DQDcm6+wypPZ/C1TvKlc1dlMq1N7mZbvDizPVsRTr122ikzxT/+uOu8c A4cyRS64ojrV9OcgxPe29FCAd63LySgxKtZTsYLyzcLK23F1NoKkG1d16 HunfNU7nRCvcgQRnnctXizL3g4d9iMXmTvs4S2katbZwgJGumtdQOW+XN zRD9eZoMA3i5MBLXTmjfM6cV7Mh3FJ0ytjq9d2a+lZKXmEqPge3e3aWMk Z4bk+8rc38H3oYwBh3qbiR85YqeoXhAkYlM5Ew1v8FFhi6H10RFCQJQ/1 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10960"; a="22611660" X-IronPort-AV: E=Sophos;i="6.05,211,1701158400"; d="scan'208";a="22611660" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Jan 2024 23:45:01 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10960"; a="778505044" X-IronPort-AV: E=Sophos;i="6.05,211,1701158400"; d="scan'208";a="778505044" Received: from allen-box.sh.intel.com ([10.239.159.127]) by orsmga007.jf.intel.com with ESMTP; 21 Jan 2024 23:44:57 -0800 From: Lu Baolu To: Jason Gunthorpe , Kevin Tian , Joerg Roedel , Will Deacon , Robin Murphy , Jean-Philippe Brucker , Nicolin Chen , Yi Liu , Jacob Pan , Joel Granados Cc: iommu@lists.linux.dev, 
virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v3 7/8] iommufd/selftest: Add IOPF support for mock device Date: Mon, 22 Jan 2024 15:39:02 +0800 Message-Id: <20240122073903.24406-8-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com> References: <20240122073903.24406-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Extend the selftest mock device to support generating and responding to an IOPF. Also add an ioctl interface to userspace applications to trigger the IOPF on the mock device. This would allow userspace applications to test the IOMMUFD's handling of IOPFs without having to rely on any real hardware. Signed-off-by: Lu Baolu --- drivers/iommu/iommufd/iommufd_test.h | 8 ++++ drivers/iommu/iommufd/selftest.c | 63 ++++++++++++++++++++++++++++ 2 files changed, 71 insertions(+) diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/i= ommufd_test.h index 482d4059f5db..ff9dcd812618 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -22,6 +22,7 @@ enum { IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS, IOMMU_TEST_OP_DIRTY, IOMMU_TEST_OP_MD_CHECK_IOTLB, + IOMMU_TEST_OP_TRIGGER_IOPF, }; =20 enum { @@ -126,6 +127,13 @@ struct iommu_test_cmd { __u32 id; __u32 iotlb; } check_iotlb; + struct { + __u32 dev_id; + __u32 pasid; + __u32 grpid; + __u32 perm; + __u64 addr; + } trigger_iopf; }; __u32 last; }; diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selft= est.c index 2fb2597e069f..2ca226f88856 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -445,6 +445,8 @@ static bool mock_domain_capable(struct device *dev, enu= m iommu_cap cap) return false; } =20 +static 
struct iopf_queue *mock_iommu_iopf_queue; + static struct iommu_device mock_iommu_device =3D { }; =20 @@ -455,6 +457,29 @@ static struct iommu_device *mock_probe_device(struct d= evice *dev) return &mock_iommu_device; } =20 +static void mock_domain_page_response(struct device *dev, struct iopf_faul= t *evt, + struct iommu_page_response *msg) +{ +} + +static int mock_dev_enable_feat(struct device *dev, enum iommu_dev_feature= s feat) +{ + if (feat !=3D IOMMU_DEV_FEAT_IOPF || !mock_iommu_iopf_queue) + return -ENODEV; + + return iopf_queue_add_device(mock_iommu_iopf_queue, dev); +} + +static int mock_dev_disable_feat(struct device *dev, enum iommu_dev_featur= es feat) +{ + if (feat !=3D IOMMU_DEV_FEAT_IOPF || !mock_iommu_iopf_queue) + return -ENODEV; + + iopf_queue_remove_device(mock_iommu_iopf_queue, dev); + + return 0; +} + static const struct iommu_ops mock_ops =3D { /* * IOMMU_DOMAIN_BLOCKED cannot be returned from def_domain_type() @@ -470,6 +495,9 @@ static const struct iommu_ops mock_ops =3D { .capable =3D mock_domain_capable, .device_group =3D generic_device_group, .probe_device =3D mock_probe_device, + .page_response =3D mock_domain_page_response, + .dev_enable_feat =3D mock_dev_enable_feat, + .dev_disable_feat =3D mock_dev_disable_feat, .default_domain_ops =3D &(struct iommu_domain_ops){ .free =3D mock_domain_free, @@ -1333,6 +1361,31 @@ static int iommufd_test_dirty(struct iommufd_ucmd *u= cmd, unsigned int mockpt_id, return rc; } =20 +static int iommufd_test_trigger_iopf(struct iommufd_ucmd *ucmd, + struct iommu_test_cmd *cmd) +{ + struct iopf_fault event =3D { }; + struct iommufd_device *idev; + + idev =3D iommufd_get_device(ucmd, cmd->trigger_iopf.dev_id); + if (IS_ERR(idev)) + return PTR_ERR(idev); + + event.fault.prm.flags =3D IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE; + if (cmd->trigger_iopf.pasid !=3D IOMMU_NO_PASID) + event.fault.prm.flags |=3D IOMMU_FAULT_PAGE_REQUEST_PASID_VALID; + event.fault.type =3D IOMMU_FAULT_PAGE_REQ; + event.fault.prm.addr =3D 
cmd->trigger_iopf.addr; + event.fault.prm.pasid =3D cmd->trigger_iopf.pasid; + event.fault.prm.grpid =3D cmd->trigger_iopf.grpid; + event.fault.prm.perm =3D cmd->trigger_iopf.perm; + + iommu_report_device_fault(idev->dev, &event); + iommufd_put_object(ucmd->ictx, &idev->obj); + + return 0; +} + void iommufd_selftest_destroy(struct iommufd_object *obj) { struct selftest_obj *sobj =3D container_of(obj, struct selftest_obj, obj); @@ -1408,6 +1461,8 @@ int iommufd_test(struct iommufd_ucmd *ucmd) cmd->dirty.page_size, u64_to_user_ptr(cmd->dirty.uptr), cmd->dirty.flags); + case IOMMU_TEST_OP_TRIGGER_IOPF: + return iommufd_test_trigger_iopf(ucmd, cmd); default: return -EOPNOTSUPP; } @@ -1449,6 +1504,9 @@ int __init iommufd_test_init(void) &iommufd_mock_bus_type.nb); if (rc) goto err_sysfs; + + mock_iommu_iopf_queue =3D iopf_queue_alloc("mock-iopfq"); + return 0; =20 err_sysfs: @@ -1464,6 +1522,11 @@ int __init iommufd_test_init(void) =20 void iommufd_test_exit(void) { + if (mock_iommu_iopf_queue) { + iopf_queue_free(mock_iommu_iopf_queue); + mock_iommu_iopf_queue =3D NULL; + } + iommu_device_sysfs_remove(&mock_iommu_device); iommu_device_unregister_bus(&mock_iommu_device, &iommufd_mock_bus_type.bus, --=20 2.34.1 From nobody Thu Dec 25 01:28:30 2025 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AD53738FB2 for ; Mon, 22 Jan 2024 07:45:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.7 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705909508; cv=none; b=FAx3QpWwGHDd8vr4J8yHH8nSPE58i7kP2K4PBkhF4ulUvR47NHfk8yKKlRZ606IzF94WZI6Fasp6WRVIP0l74P83aG9In5R6zPKOJIiUsuBfQ6bPIbD9tBSUnS0CJE0KA9V/i7E23ezOV+Bt9aWT2XUdxkf2MVELKB9XUCBwMTg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705909508; 
c=relaxed/simple; bh=KlmWnPAZToFQvPsiUfP6iSvCOZ56NoSZwovTmOBtG9g=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=NNbh97RiZvWHkCYP2pWWMH8+OsedA639G/aPQytwPZ4FrwSL+Pu1CzdPB2MyPWYJmZEe0JVKnXKLGACHPaJcYZEVSgyCHYb2OUpip4/rqy7jF7DPAjASHiPYQFJqrGf2cB0yWE9QvMdzw8vGziJXCufbtWG5oDr8IOG57B45UnI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=QQ8PUQLY; arc=none smtp.client-ip=192.198.163.7 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="QQ8PUQLY" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1705909506; x=1737445506; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KlmWnPAZToFQvPsiUfP6iSvCOZ56NoSZwovTmOBtG9g=; b=QQ8PUQLYLAsxCG1uQbS4OyaQhCIUQVwmEBVP8rx+Cm5z6vv12amsmsZz uYuc1/bVsh3cGa7DfIu0B3VTvD2wJRu5uEFc0RXPL9hY2sN2eKXjZedc4 m0UbG8wS1OBQtWKWNMdnVWspOBqW7s5B91XYe/omSwz0f4Rwr0Nwr/UTf jfgS2AqYnanB7sCQ0fE2qOOd+A0BgYp+kVmuvZvjXtrGQcejSAPcQFQaX YPa05mkacqkJlgYw/5Cc3OLD4y+5+3SePMrO2Vru49Z/ChQCbSOmquuDP dexq7Tv5od7Ootfmo7lYEU4NagPzeibYGWNfqVeKNhHl+vHdoV2N5LBc3 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10960"; a="22611700" X-IronPort-AV: E=Sophos;i="6.05,211,1701158400"; d="scan'208";a="22611700" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmvoesa101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Jan 2024 23:45:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10960"; a="778505102" X-IronPort-AV: E=Sophos;i="6.05,211,1701158400"; d="scan'208";a="778505102" 
Received: from allen-box.sh.intel.com ([10.239.159.127]) by orsmga007.jf.intel.com with ESMTP; 21 Jan 2024 23:45:01 -0800 From: Lu Baolu To: Jason Gunthorpe , Kevin Tian , Joerg Roedel , Will Deacon , Robin Murphy , Jean-Philippe Brucker , Nicolin Chen , Yi Liu , Jacob Pan , Joel Granados Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v3 8/8] iommufd/selftest: Add coverage for IOPF test Date: Mon, 22 Jan 2024 15:39:03 +0800 Message-Id: <20240122073903.24406-9-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240122073903.24406-1-baolu.lu@linux.intel.com> References: <20240122073903.24406-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Extend the selftest tool to cover IOPF handling. The new tests include:

- Allocating and destroying an iommufd fault object.
- Allocating and destroying an IOPF-capable HWPT.
- Attaching/detaching/replacing an IOPF-capable HWPT on a device.
- Triggering an IOPF on the mock device.
- Retrieving and responding to the IOPF through the file interface.
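
The last two items in the list above (trigger a fault, then read it and
write a response on the fault file descriptor) can be modelled in plain
userspace C. This is only a rough sketch under stated assumptions: the
structs model_fault and model_response and both functions are
hypothetical stand-ins that reuse the field names of the trigger_iopf
command and of the response filled in by the selftest, a socketpair()
plays the role of the fault fd, and nothing here uses the real iommufd
uAPI or the mock device.

```c
/* Illustrative model only; none of these types are the kernel uAPI. */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical fault record, mirroring the trigger_iopf fields. */
struct model_fault {
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t perm;
	uint64_t addr;
};

/* Hypothetical response record, mirroring the fields the test fills in. */
struct model_response {
	uint32_t dev_id;
	uint32_t pasid;
	uint32_t grpid;
	uint32_t code;
	uint64_t addr;
};

/* Consumer side: read one fault from fault_fd and write back a response. */
static int model_handle_one_fault(int fault_fd)
{
	struct model_fault fault;
	struct model_response resp;

	assert(fault_fd >= 0); /* caller must pass an open descriptor */

	if (read(fault_fd, &fault, sizeof(fault)) != (ssize_t)sizeof(fault))
		return -1;

	/* Echo the identifying fields back, as the selftest does. */
	memset(&resp, 0, sizeof(resp));
	resp.dev_id = fault.dev_id;
	resp.pasid = fault.pasid;
	resp.grpid = fault.grpid;
	resp.addr = fault.addr;
	resp.code = 0; /* 0 is the code the selftest writes */

	if (write(fault_fd, &resp, sizeof(resp)) != (ssize_t)sizeof(resp))
		return -1;
	return 0;
}

/* Producer side: queue a fault, let the consumer answer, check the echo. */
static int model_round_trip(uint32_t dev_id)
{
	struct model_fault fault = {
		.dev_id = dev_id,
		.pasid = 0x1,
		.grpid = 0x2,
		.perm = 0x3, /* placeholder bits, not the uAPI constants */
		.addr = 0xdeadbeaf,
	};
	struct model_response resp;
	int sv[2];
	int ret = -1;

	/* sv[0] stands in for the device side, sv[1] for the fault fd. */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return -1;

	if (write(sv[0], &fault, sizeof(fault)) != (ssize_t)sizeof(fault))
		goto out;
	if (model_handle_one_fault(sv[1]))
		goto out;
	if (read(sv[0], &resp, sizeof(resp)) != (ssize_t)sizeof(resp))
		goto out;

	/* The response must carry the same dev_id, pasid and grpid. */
	if (resp.dev_id == fault.dev_id && resp.pasid == fault.pasid &&
	    resp.grpid == fault.grpid && resp.code == 0)
		ret = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Calling model_round_trip() with any device ID returns 0 when the echoed
response matches the queued fault. The real selftest differs in that the
fault is produced by the IOMMU_TEST_OP_TRIGGER_IOPF ioctl and delivered
by the kernel, not written by the test itself.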
Signed-off-by: Lu Baolu --- tools/testing/selftests/iommu/iommufd_utils.h | 83 +++++++++++++++++-- tools/testing/selftests/iommu/iommufd.c | 17 ++++ .../selftests/iommu/iommufd_fail_nth.c | 2 +- 3 files changed, 96 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/= selftests/iommu/iommufd_utils.h index c646264aa41f..bf6027f2a16d 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -153,7 +153,7 @@ static int _test_cmd_mock_domain_replace(int fd, __u32 = stdev_id, __u32 pt_id, EXPECT_ERRNO(_errno, _test_cmd_mock_domain_replace(self->fd, stdev_id, \ pt_id, NULL)) =20 -static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, +static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, __u3= 2 ft_id, __u32 flags, __u32 *hwpt_id, __u32 data_type, void *data, size_t data_len) { @@ -165,6 +165,7 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id= , __u32 pt_id, .data_type =3D data_type, .data_len =3D data_len, .data_uptr =3D (uint64_t)data, + .fault_id =3D ft_id, }; int ret; =20 @@ -177,24 +178,30 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_= id, __u32 pt_id, } =20 #define test_cmd_hwpt_alloc(device_id, pt_id, flags, hwpt_id) = \ - ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, 0, flags, = \ hwpt_id, IOMMU_HWPT_DATA_NONE, NULL, \ 0)) #define test_err_hwpt_alloc(_errno, device_id, pt_id, flags, hwpt_id) \ EXPECT_ERRNO(_errno, _test_cmd_hwpt_alloc( \ - self->fd, device_id, pt_id, flags, \ + self->fd, device_id, pt_id, 0, flags, \ hwpt_id, IOMMU_HWPT_DATA_NONE, NULL, 0)) =20 #define test_cmd_hwpt_alloc_nested(device_id, pt_id, flags, hwpt_id, = \ data_type, data, data_len) \ - ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, 0, flags, \ hwpt_id, 
data_type, data, data_len)) #define test_err_hwpt_alloc_nested(_errno, device_id, pt_id, flags, hwpt_i= d, \ data_type, data, data_len) \ EXPECT_ERRNO(_errno, \ - _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, flags, \ + _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, 0, flags, \ hwpt_id, data_type, data, data_len)) =20 +#define test_cmd_hwpt_alloc_iopf(device_id, pt_id, fault_id, flags, hwpt_i= d, \ + data_type, data, data_len) \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, fault_id, \ + flags, hwpt_id, data_type, data, \ + data_len)) + #define test_cmd_hwpt_check_iotlb(hwpt_id, iotlb_id, expected) = \ ({ \ struct iommu_test_cmd test_cmd =3D { \ @@ -673,3 +680,69 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_= id, void *data, =20 #define test_cmd_get_hw_capabilities(device_id, caps, mask) \ ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, 0, &caps)) + +static int _test_ioctl_fault_alloc(int fd, __u32 *fault_id, __u32 *fault_f= d) +{ + struct iommu_fault_alloc cmd =3D { + .size =3D sizeof(cmd), + }; + int ret; + + ret =3D ioctl(fd, IOMMU_FAULT_ALLOC, &cmd); + if (ret) + return ret; + *fault_id =3D cmd.out_fault_id; + *fault_fd =3D cmd.out_fault_fd; + return 0; +} + +#define test_ioctl_fault_alloc(fault_id, fault_fd) \ + ({ \ + ASSERT_EQ(0, _test_ioctl_fault_alloc(self->fd, fault_id, \ + fault_fd)); \ + ASSERT_NE(0, *(fault_id)); \ + ASSERT_NE(0, *(fault_fd)); \ + }) + +static int _test_cmd_trigger_iopf(int fd, __u32 device_id, __u32 fault_fd) +{ + struct iommu_test_cmd trigger_iopf_cmd =3D { + .size =3D sizeof(trigger_iopf_cmd), + .op =3D IOMMU_TEST_OP_TRIGGER_IOPF, + .trigger_iopf =3D { + .dev_id =3D device_id, + .pasid =3D 0x1, + .grpid =3D 0x2, + .perm =3D IOMMU_PGFAULT_PERM_READ | IOMMU_PGFAULT_PERM_WRITE, + .addr =3D 0xdeadbeaf, + }, + }; + struct iommu_hwpt_page_response response =3D { + .size =3D sizeof(struct iommu_hwpt_page_response), + .dev_id =3D device_id, + .pasid =3D 0x1, + .grpid =3D 0x2, + .code =3D 0, + 
.addr =3D 0xdeadbeaf, + }; + struct iommu_hwpt_pgfault fault =3D {}; + ssize_t bytes; + int ret; + + ret =3D ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_TRIGGER_IOPF), &trigger_i= opf_cmd); + if (ret) + return ret; + + bytes =3D read(fault_fd, &fault, sizeof(fault)); + if (bytes < 0) + return bytes; + + bytes =3D write(fault_fd, &response, sizeof(response)); + if (bytes < 0) + return bytes; + + return 0; +} + +#define test_cmd_trigger_iopf(device_id, fault_fd) \ + ASSERT_EQ(0, _test_cmd_trigger_iopf(self->fd, device_id, fault_fd)) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selfte= sts/iommu/iommufd.c index 1a881e7a21d1..d7049df62ed2 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -278,6 +278,9 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) uint32_t parent_hwpt_id =3D 0; uint32_t parent_hwpt_id_not_work =3D 0; uint32_t test_hwpt_id =3D 0; + uint32_t iopf_hwpt_id; + uint32_t fault_id; + uint32_t fault_fd; =20 if (self->device_id) { /* Negative tests */ @@ -325,6 +328,7 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) sizeof(data)); =20 /* Allocate two nested hwpts sharing one common parent hwpt */ + test_ioctl_fault_alloc(&fault_id, &fault_fd); test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 0, &nested_hwpt_id[0], IOMMU_HWPT_DATA_SELFTEST, &data, @@ -333,6 +337,10 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) &nested_hwpt_id[1], IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data)); + test_cmd_hwpt_alloc_iopf(self->device_id, parent_hwpt_id, fault_id, + IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id, + IOMMU_HWPT_DATA_SELFTEST, &data, + sizeof(data)); test_cmd_hwpt_check_iotlb_all(nested_hwpt_id[0], IOMMU_TEST_IOTLB_DEFAULT); test_cmd_hwpt_check_iotlb_all(nested_hwpt_id[1], @@ -503,14 +511,23 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) _test_ioctl_destroy(self->fd, nested_hwpt_id[1])); test_ioctl_destroy(nested_hwpt_id[0]); =20 + /* Switch from nested_hwpt_id[1] to iopf_hwpt_id */ + 
test_cmd_mock_domain_replace(self->stdev_id, iopf_hwpt_id); + EXPECT_ERRNO(EBUSY, + _test_ioctl_destroy(self->fd, iopf_hwpt_id)); + /* Trigger an IOPF on the device */ + test_cmd_trigger_iopf(self->device_id, fault_fd); + /* Detach from nested_hwpt_id[1] and destroy it */ test_cmd_mock_domain_replace(self->stdev_id, parent_hwpt_id); test_ioctl_destroy(nested_hwpt_id[1]); + test_ioctl_destroy(iopf_hwpt_id); =20 /* Detach from the parent hw_pagetable and destroy it */ test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id); test_ioctl_destroy(parent_hwpt_id); test_ioctl_destroy(parent_hwpt_id_not_work); + test_ioctl_destroy(fault_id); } else { test_err_hwpt_alloc(ENOENT, self->device_id, self->ioas_id, 0, &parent_hwpt_id); diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testi= ng/selftests/iommu/iommufd_fail_nth.c index f590417cd67a..c5d5e69452b0 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -615,7 +615,7 @@ TEST_FAIL_NTH(basic_fail_nth, device) if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info), NULL)) return -1; =20 - if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id, + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, 0, &hwpt_id, IOMMU_HWPT_DATA_NONE, 0, 0)) return -1; =20 --=20 2.34.1