From nobody Sat Apr 11 00:49:39 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 9F2A2C32772
	for ; Wed, 17 Aug 2022 10:16:21 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S232830AbiHQKQT (ORCPT ); Wed, 17 Aug 2022 06:16:19 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50284 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S230359AbiHQKQN (ORCPT );
	Wed, 17 Aug 2022 06:16:13 -0400
Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1478554CB2
	for ; Wed, 17 Aug 2022 03:16:13 -0700 (PDT)
Received: from fraeml745-chm.china.huawei.com (unknown [172.18.147.226])
	by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4M73h560fhz67Prk;
	Wed, 17 Aug 2022 18:11:17 +0800 (CST)
Received: from lhrpeml500003.china.huawei.com (7.191.162.67) by
	fraeml745-chm.china.huawei.com (10.206.15.226) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2375.24; Wed, 17 Aug 2022 12:16:11 +0200
Received: from localhost.localdomain (10.69.192.58) by
	lhrpeml500003.china.huawei.com (7.191.162.67) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2375.24; Wed, 17 Aug 2022 11:16:09 +0100
From: John Garry
To: , ,
CC: , , John Garry
Subject: [PATCH 1/3] iova: Remove some magazine pointer NULL checks
Date: Wed, 17 Aug 2022 18:09:42 +0800
Message-ID: <1660730984-30333-2-git-send-email-john.garry@huawei.com>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1660730984-30333-1-git-send-email-john.garry@huawei.com>
References: <1660730984-30333-1-git-send-email-john.garry@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.69.192.58]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	lhrpeml500003.china.huawei.com (7.191.162.67)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Since commit 32e92d9f6f87 ("iommu/iova: Separate out rcache init") it has
not been possible to have NULL CPU rcache "loaded" or "prev" magazine
pointers. As such, the checks in iova_magazine_full(),
iova_magazine_empty(), and iova_magazine_free_pfns() may be dropped.

Signed-off-by: John Garry
Reviewed-by: Robin Murphy
---
 drivers/iommu/iova.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 47d1983dfa2a..580fdf669922 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -661,9 +661,6 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad)
 	unsigned long flags;
 	int i;
 
-	if (!mag)
-		return;
-
 	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
 
 	for (i = 0 ; i < mag->size; ++i) {
@@ -683,12 +680,12 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad)
 
 static bool iova_magazine_full(struct iova_magazine *mag)
 {
-	return (mag && mag->size == IOVA_MAG_SIZE);
+	return mag->size == IOVA_MAG_SIZE;
 }
 
 static bool iova_magazine_empty(struct iova_magazine *mag)
 {
-	return (!mag || mag->size == 0);
+	return mag->size == 0;
 }
 
 static unsigned long iova_magazine_pop(struct iova_magazine *mag,
-- 
2.35.3

From nobody Sat Apr 11 00:49:39 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 2FC8DC25B08
	for ; Wed, 17 Aug 2022 10:16:24 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235430AbiHQKQW (ORCPT ); Wed, 17 Aug 2022 06:16:22 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50300 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S229627AbiHQKQP (ORCPT );
	Wed, 17 Aug 2022 06:16:15 -0400
Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 43B7E54CB2
	for ; Wed, 17 Aug 2022 03:16:15 -0700 (PDT)
Received: from fraeml744-chm.china.huawei.com (unknown [172.18.147.200])
	by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4M73nW64MGz67NYP;
	Wed, 17 Aug 2022 18:15:59 +0800 (CST)
Received: from lhrpeml500003.china.huawei.com (7.191.162.67) by
	fraeml744-chm.china.huawei.com (10.206.15.225) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2375.24; Wed, 17 Aug 2022 12:16:13 +0200
Received: from localhost.localdomain (10.69.192.58) by
	lhrpeml500003.china.huawei.com (7.191.162.67) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2375.24; Wed, 17 Aug 2022 11:16:11 +0100
From: John Garry
To: , ,
CC: , , John Garry
Subject: [PATCH 2/3] iova: Remove magazine BUG_ON() checks
Date: Wed, 17 Aug 2022 18:09:43 +0800
Message-ID: <1660730984-30333-3-git-send-email-john.garry@huawei.com>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1660730984-30333-1-git-send-email-john.garry@huawei.com>
References: <1660730984-30333-1-git-send-email-john.garry@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.69.192.58]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	lhrpeml500003.china.huawei.com (7.191.162.67)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Two of the magazine helpers have BUG_ON() checks, as follows:
- iova_magazine_pop() - here we ensure that the mag is not empty. However we
  already ensure that in the only caller, __iova_rcache_get().
- iova_magazine_push() - here we ensure that the mag is not full. However we
  already ensure that in the only caller, __iova_rcache_insert().

As described, the two bug checks are pointless so drop them.

Signed-off-by: John Garry
Acked-by: Robin Murphy
---
 drivers/iommu/iova.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 580fdf669922..8aece052ce72 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -694,8 +694,6 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
 	int i;
 	unsigned long pfn;
 
-	BUG_ON(iova_magazine_empty(mag));
-
 	/* Only fall back to the rbtree if we have no suitable pfns at all */
 	for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
 		if (i == 0)
@@ -710,8 +708,6 @@ static unsigned long iova_magazine_pop(struct iova_magazine *mag,
 
 static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
 {
-	BUG_ON(iova_magazine_full(mag));
-
 	mag->pfns[mag->size++] = pfn;
 }
 
-- 
2.35.3

From nobody Sat Apr 11 00:49:39 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id AE34FC25B08
	for ; Wed, 17 Aug 2022 10:16:29 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S235048AbiHQKQ2 (ORCPT ); Wed, 17 Aug 2022 06:16:28 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50318 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234696AbiHQKQU (ORCPT );
	Wed, 17 Aug 2022 06:16:20 -0400
Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 312332DF4
	for ; Wed, 17 Aug 2022 03:16:17 -0700 (PDT)
Received: from fraeml743-chm.china.huawei.com (unknown [172.18.147.206])
	by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4M73nY6S3Tz682vm;
	Wed, 17 Aug 2022 18:16:01 +0800 (CST)
Received: from lhrpeml500003.china.huawei.com (7.191.162.67) by
	fraeml743-chm.china.huawei.com (10.206.15.224) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2375.24; Wed, 17 Aug 2022 12:16:15 +0200
Received: from localhost.localdomain (10.69.192.58) by
	lhrpeml500003.china.huawei.com (7.191.162.67) with Microsoft SMTP Server
	(version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
	15.1.2375.24; Wed, 17 Aug 2022 11:16:13 +0100
From: John Garry
To: , ,
CC: , , John Garry
Subject: [PATCH 3/3] iova: Re-order code to remove forward declarations
Date: Wed, 17 Aug 2022 18:09:44 +0800
Message-ID: <1660730984-30333-4-git-send-email-john.garry@huawei.com>
X-Mailer: git-send-email 2.8.1
In-Reply-To: <1660730984-30333-1-git-send-email-john.garry@huawei.com>
References: <1660730984-30333-1-git-send-email-john.garry@huawei.com>
MIME-Version: 1.0
X-Originating-IP: [10.69.192.58]
X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To
	lhrpeml500003.china.huawei.com (7.191.162.67)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Now that the FQ code has been moved to dma-iommu.c and also the
rcache-related structures have been brought into iova.c, let's re-order the
code to remove all the forward declarations.

The general order will be as follows:
- RB tree code
- iova management
- magazine helpers
- rcache code and "fast" APIs
- iova domain public APIs

Re-order prototypes in iova.h to follow the same general group ordering.

A couple of pre-existing checkpatch warnings are also remedied:

A suspect indentation is also corrected:
WARNING: suspect code indent for conditional statements (16, 32)
#374: FILE: drivers/iommu/iova.c:194:
+		} else if (overlap)
+				break;

WARNING: Block comments should align the * on each line
#1038: FILE: drivers/iommu/iova.c:787:
+ * fails too and the flush_rcache flag is set then the rcache will be flushed.
+*/

No functional change intended.

Signed-off-by: John Garry
---
 drivers/iommu/iova.c | 1069 +++++++++++++++++++++---------------------
 include/linux/iova.h |   60 +--
 2 files changed, 560 insertions(+), 569 deletions(-)

diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index 8aece052ce72..e55fd95866a2 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -17,80 +17,45 @@
 
 #define IOVA_RANGE_CACHE_MAX_SIZE 6	/* log of max cached IOVA range size (in pages) */
 
-static bool iova_rcache_insert(struct iova_domain *iovad,
-			       unsigned long pfn,
-			       unsigned long size);
-static unsigned long iova_rcache_get(struct iova_domain *iovad,
-				     unsigned long size,
-				     unsigned long limit_pfn);
-static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad);
-static void free_iova_rcaches(struct iova_domain *iovad);
-
-unsigned long iova_rcache_range(void)
-{
-	return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1);
-}
+/*
+ * Magazine caches for IOVA ranges. For an introduction to magazines,
+ * see the USENIX 2001 paper "Magazines and Vmem: Extending the Slab
+ * Allocator to Many CPUs and Arbitrary Resources" by Bonwick and Adams.
+ * For simplicity, we use a static magazine size and don't implement the
+ * dynamic size tuning described in the paper.
+ */
 
-static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node)
-{
-	struct iova_domain *iovad;
+/*
+ * As kmalloc's buffer size is fixed to power of 2, 127 is chosen to
+ * assure size of 'iova_magazine' to be 1024 bytes, so that no memory
+ * will be wasted.
+ */
+#define IOVA_MAG_SIZE 127
+#define MAX_GLOBAL_MAGS 32	/* magazines per bin */
 
-	iovad = hlist_entry_safe(node, struct iova_domain, cpuhp_dead);
+struct iova_magazine {
+	unsigned long size;
+	unsigned long pfns[IOVA_MAG_SIZE];
+};
 
-	free_cpu_cached_iovas(cpu, iovad);
-	return 0;
-}
+struct iova_cpu_rcache {
+	spinlock_t lock;
+	struct iova_magazine *loaded;
+	struct iova_magazine *prev;
+};
 
-static void free_global_cached_iovas(struct iova_domain *iovad);
+struct iova_rcache {
+	spinlock_t lock;
+	unsigned long depot_size;
+	struct iova_magazine *depot[MAX_GLOBAL_MAGS];
+	struct iova_cpu_rcache __percpu *cpu_rcaches;
+};
 
 static struct iova *to_iova(struct rb_node *node)
 {
 	return rb_entry(node, struct iova, node);
 }
 
-void
-init_iova_domain(struct iova_domain *iovad, unsigned long granule,
-	unsigned long start_pfn)
-{
-	/*
-	 * IOVA granularity will normally be equal to the smallest
-	 * supported IOMMU page size; both *must* be capable of
-	 * representing individual CPU pages exactly.
-	 */
-	BUG_ON((granule > PAGE_SIZE) || !is_power_of_2(granule));
-
-	spin_lock_init(&iovad->iova_rbtree_lock);
-	iovad->rbroot = RB_ROOT;
-	iovad->cached_node = &iovad->anchor.node;
-	iovad->cached32_node = &iovad->anchor.node;
-	iovad->granule = granule;
-	iovad->start_pfn = start_pfn;
-	iovad->dma_32bit_pfn = 1UL << (32 - iova_shift(iovad));
-	iovad->max32_alloc_size = iovad->dma_32bit_pfn;
-	iovad->anchor.pfn_lo = iovad->anchor.pfn_hi = IOVA_ANCHOR;
-	rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node);
-	rb_insert_color(&iovad->anchor.node, &iovad->rbroot);
-}
-EXPORT_SYMBOL_GPL(init_iova_domain);
-
-static struct rb_node *
-__get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn)
-{
-	if (limit_pfn <= iovad->dma_32bit_pfn)
-		return iovad->cached32_node;
-
-	return iovad->cached_node;
-}
-
-static void
-__cached_rbnode_insert_update(struct iova_domain *iovad, struct iova *new)
-{
-	if (new->pfn_hi < iovad->dma_32bit_pfn)
-		iovad->cached32_node = &new->node;
-	else
-		iovad->cached_node = &new->node;
-}
-
 static void
 __cached_rbnode_delete_update(struct iova_domain *iovad, struct iova *free)
 {
@@ -110,43 +75,6 @@ __cached_rbnode_delete_update(struct iova_domain *iovad, struct iova *free)
 		iovad->cached_node = rb_next(&free->node);
 }
 
-static struct rb_node *iova_find_limit(struct iova_domain *iovad, unsigned long limit_pfn)
-{
-	struct rb_node *node, *next;
-	/*
-	 * Ideally what we'd like to judge here is whether limit_pfn is close
-	 * enough to the highest-allocated IOVA that starting the allocation
-	 * walk from the anchor node will be quicker than this initial work to
-	 * find an exact starting point (especially if that ends up being the
-	 * anchor node anyway). This is an incredibly crude approximation which
-	 * only really helps the most likely case, but is at least trivially easy.
-	 */
-	if (limit_pfn > iovad->dma_32bit_pfn)
-		return &iovad->anchor.node;
-
-	node = iovad->rbroot.rb_node;
-	while (to_iova(node)->pfn_hi < limit_pfn)
-		node = node->rb_right;
-
-search_left:
-	while (node->rb_left && to_iova(node->rb_left)->pfn_lo >= limit_pfn)
-		node = node->rb_left;
-
-	if (!node->rb_left)
-		return node;
-
-	next = node->rb_left;
-	while (next->rb_right) {
-		next = next->rb_right;
-		if (to_iova(next)->pfn_lo >= limit_pfn) {
-			node = next;
-			goto search_left;
-		}
-	}
-
-	return node;
-}
-
 /* Insert the iova into domain rbtree by holding writer lock */
 static void
 iova_insert_rbtree(struct rb_root *root, struct iova *iova,
@@ -175,65 +103,15 @@ iova_insert_rbtree(struct rb_root *root, struct iova *iova,
 	rb_insert_color(&iova->node, root);
 }
 
-static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
-		unsigned long size, unsigned long limit_pfn,
-			struct iova *new, bool size_aligned)
+static int
+__is_range_overlap(struct rb_node *node,
+	unsigned long pfn_lo, unsigned long pfn_hi)
 {
-	struct rb_node *curr, *prev;
-	struct iova *curr_iova;
-	unsigned long flags;
-	unsigned long new_pfn, retry_pfn;
-	unsigned long align_mask = ~0UL;
-	unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
-
-	if (size_aligned)
-		align_mask <<= fls_long(size - 1);
-
-	/* Walk the tree backwards */
-	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
-	if (limit_pfn <= iovad->dma_32bit_pfn &&
-			size >= iovad->max32_alloc_size)
-		goto iova32_full;
-
-	curr = __get_cached_rbnode(iovad, limit_pfn);
-	curr_iova = to_iova(curr);
-	retry_pfn = curr_iova->pfn_hi + 1;
-
-retry:
-	do {
-		high_pfn = min(high_pfn, curr_iova->pfn_lo);
-		new_pfn = (high_pfn - size) & align_mask;
-		prev = curr;
-		curr = rb_prev(curr);
-		curr_iova = to_iova(curr);
-	} while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
-
-	if (high_pfn < size || new_pfn < low_pfn) {
-		if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
-			high_pfn = limit_pfn;
-			low_pfn = retry_pfn;
-			curr = iova_find_limit(iovad, limit_pfn);
-			curr_iova = to_iova(curr);
-			goto retry;
-		}
-		iovad->max32_alloc_size = size;
-		goto iova32_full;
-	}
-
-	/* pfn_lo will point to size aligned address if size_aligned is set */
-	new->pfn_lo = new_pfn;
-	new->pfn_hi = new->pfn_lo + size - 1;
-
-	/* If we have 'prev', it's a valid place to start the insertion. */
-	iova_insert_rbtree(&iovad->rbroot, new, prev);
-	__cached_rbnode_insert_update(iovad, new);
+	struct iova *iova = to_iova(node);
 
-	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+	if ((pfn_lo <= iova->pfn_hi) && (pfn_hi >= iova->pfn_lo))
+		return 1;
 	return 0;
-
-iova32_full:
-	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
-	return -ENOMEM;
 }
 
 static struct kmem_cache *iova_cache;
@@ -251,88 +129,87 @@ static void free_iova_mem(struct iova *iova)
 		kmem_cache_free(iova_cache, iova);
 }
 
-int iova_cache_get(void)
+static inline struct iova *
+alloc_and_init_iova(unsigned long pfn_lo, unsigned long pfn_hi)
 {
-	mutex_lock(&iova_cache_mutex);
-	if (!iova_cache_users) {
-		int ret;
-
-		ret = cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead", NULL,
-					iova_cpuhp_dead);
-		if (ret) {
-			mutex_unlock(&iova_cache_mutex);
-			pr_err("Couldn't register cpuhp handler\n");
-			return ret;
-		}
+	struct iova *iova;
 
-		iova_cache = kmem_cache_create(
-			"iommu_iova", sizeof(struct iova), 0,
-			SLAB_HWCACHE_ALIGN, NULL);
-		if (!iova_cache) {
-			cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD);
-			mutex_unlock(&iova_cache_mutex);
-			pr_err("Couldn't create iova cache\n");
-			return -ENOMEM;
-		}
+	iova = alloc_iova_mem();
+	if (iova) {
+		iova->pfn_lo = pfn_lo;
+		iova->pfn_hi = pfn_hi;
 	}
 
-	iova_cache_users++;
-	mutex_unlock(&iova_cache_mutex);
+	return iova;
+}
 
-	return 0;
+static struct iova *
+__insert_new_range(struct iova_domain *iovad,
+	unsigned long pfn_lo, unsigned long pfn_hi)
+{
+	struct iova *iova;
+
+	iova = alloc_and_init_iova(pfn_lo, pfn_hi);
+	if (iova)
+		iova_insert_rbtree(&iovad->rbroot, iova, NULL);
+
+	return iova;
 }
-EXPORT_SYMBOL_GPL(iova_cache_get);
 
-void iova_cache_put(void)
+static void
+__adjust_overlap_range(struct iova *iova,
+	unsigned long *pfn_lo, unsigned long *pfn_hi)
 {
-	mutex_lock(&iova_cache_mutex);
-	if (WARN_ON(!iova_cache_users)) {
-		mutex_unlock(&iova_cache_mutex);
-		return;
-	}
-	iova_cache_users--;
-	if (!iova_cache_users) {
-		cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD);
-		kmem_cache_destroy(iova_cache);
-	}
-	mutex_unlock(&iova_cache_mutex);
+	if (*pfn_lo < iova->pfn_lo)
+		iova->pfn_lo = *pfn_lo;
+	if (*pfn_hi > iova->pfn_hi)
+		*pfn_lo = iova->pfn_hi + 1;
 }
-EXPORT_SYMBOL_GPL(iova_cache_put);
 
 /**
- * alloc_iova - allocates an iova
- * @iovad: - iova domain in question
- * @size: - size of page frames to allocate
- * @limit_pfn: - max limit address
- * @size_aligned: - set if size_aligned address range is required
- * This function allocates an iova in the range iovad->start_pfn to limit_pfn,
- * searching top-down from limit_pfn to iovad->start_pfn. If the size_aligned
- * flag is set then the allocated address iova->pfn_lo will be naturally
- * aligned on roundup_power_of_two(size).
+ * reserve_iova - reserves an iova in the given range
+ * @iovad: - iova domain pointer
+ * @pfn_lo: - lower page frame address
+ * @pfn_hi:- higher pfn adderss
+ * This function allocates reserves the address range from pfn_lo to pfn_hi so
+ * that this address is not dished out as part of alloc_iova.
  */
 struct iova *
-alloc_iova(struct iova_domain *iovad, unsigned long size,
-	unsigned long limit_pfn,
-	bool size_aligned)
+reserve_iova(struct iova_domain *iovad,
+	unsigned long pfn_lo, unsigned long pfn_hi)
 {
-	struct iova *new_iova;
-	int ret;
+	struct rb_node *node;
+	unsigned long flags;
+	struct iova *iova;
+	unsigned int overlap = 0;
 
-	new_iova = alloc_iova_mem();
-	if (!new_iova)
+	/* Don't allow nonsensical pfns */
+	if (WARN_ON((pfn_hi | pfn_lo) > (ULLONG_MAX >> iova_shift(iovad))))
 		return NULL;
 
-	ret = __alloc_and_insert_iova_range(iovad, size, limit_pfn + 1,
-			new_iova, size_aligned);
-
-	if (ret) {
-		free_iova_mem(new_iova);
-		return NULL;
+	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
+	for (node = rb_first(&iovad->rbroot); node; node = rb_next(node)) {
+		if (__is_range_overlap(node, pfn_lo, pfn_hi)) {
+			iova = to_iova(node);
+			__adjust_overlap_range(iova, &pfn_lo, &pfn_hi);
+			if ((pfn_lo >= iova->pfn_lo) &&
+				(pfn_hi <= iova->pfn_hi))
+				goto finish;
+			overlap = 1;
+		} else if (overlap)
+			break;
 	}
 
-	return new_iova;
+	/* We are here either because this is the first reserver node
+	 * or need to insert remaining non overlap addr range
+	 */
+	iova = __insert_new_range(iovad, pfn_lo, pfn_hi);
+finish:
+
+	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+	return iova;
 }
-EXPORT_SYMBOL_GPL(alloc_iova);
+EXPORT_SYMBOL_GPL(reserve_iova);
 
 static struct iova *
 private_find_iova(struct iova_domain *iovad, unsigned long pfn)
@@ -362,6 +239,122 @@ static void remove_iova(struct iova_domain *iovad, struct iova *iova)
 	rb_erase(&iova->node, &iovad->rbroot);
 }
 
+static struct rb_node *
+__get_cached_rbnode(struct iova_domain *iovad, unsigned long limit_pfn)
+{
+	if (limit_pfn <= iovad->dma_32bit_pfn)
+		return iovad->cached32_node;
+
+	return iovad->cached_node;
+}
+
+static void
+__cached_rbnode_insert_update(struct iova_domain *iovad, struct iova *new)
+{
+	if (new->pfn_hi < iovad->dma_32bit_pfn)
+		iovad->cached32_node = &new->node;
+	else
+		iovad->cached_node = &new->node;
+}
+
+static struct rb_node *iova_find_limit(struct iova_domain *iovad, unsigned long limit_pfn)
+{
+	struct rb_node *node, *next;
+	/*
+	 * Ideally what we'd like to judge here is whether limit_pfn is close
+	 * enough to the highest-allocated IOVA that starting the allocation
+	 * walk from the anchor node will be quicker than this initial work to
+	 * find an exact starting point (especially if that ends up being the
+	 * anchor node anyway). This is an incredibly crude approximation which
+	 * only really helps the most likely case, but is at least trivially easy.
+	 */
+	if (limit_pfn > iovad->dma_32bit_pfn)
+		return &iovad->anchor.node;
+
+	node = iovad->rbroot.rb_node;
+	while (to_iova(node)->pfn_hi < limit_pfn)
+		node = node->rb_right;
+
+search_left:
+	while (node->rb_left && to_iova(node->rb_left)->pfn_lo >= limit_pfn)
+		node = node->rb_left;
+
+	if (!node->rb_left)
+		return node;
+
+	next = node->rb_left;
+	while (next->rb_right) {
+		next = next->rb_right;
+		if (to_iova(next)->pfn_lo >= limit_pfn) {
+			node = next;
+			goto search_left;
+		}
+	}
+
+	return node;
+}
+
+static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
+		unsigned long size, unsigned long limit_pfn,
+			struct iova *new, bool size_aligned)
+{
+	struct rb_node *curr, *prev;
+	struct iova *curr_iova;
+	unsigned long flags;
+	unsigned long new_pfn, retry_pfn;
+	unsigned long align_mask = ~0UL;
+	unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
+
+	if (size_aligned)
+		align_mask <<= fls_long(size - 1);
+
+	/* Walk the tree backwards */
+	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
+	if (limit_pfn <= iovad->dma_32bit_pfn &&
+			size >= iovad->max32_alloc_size)
+		goto iova32_full;
+
+	curr = __get_cached_rbnode(iovad, limit_pfn);
+	curr_iova = to_iova(curr);
+	retry_pfn = curr_iova->pfn_hi + 1;
+
+retry:
+	do {
+		high_pfn = min(high_pfn, curr_iova->pfn_lo);
+		new_pfn = (high_pfn - size) & align_mask;
+		prev = curr;
+		curr = rb_prev(curr);
+		curr_iova = to_iova(curr);
+	} while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
+
+	if (high_pfn < size || new_pfn < low_pfn) {
+		if (low_pfn == iovad->start_pfn && retry_pfn < limit_pfn) {
+			high_pfn = limit_pfn;
+			low_pfn = retry_pfn;
+			curr = iova_find_limit(iovad, limit_pfn);
+			curr_iova = to_iova(curr);
+			goto retry;
+		}
+		iovad->max32_alloc_size = size;
+		goto iova32_full;
+	}
+
+	/* pfn_lo will point to size aligned address if size_aligned is set */
+	new->pfn_lo = new_pfn;
+	new->pfn_hi = new->pfn_lo + size - 1;
+
+	/* If we have 'prev', it's a valid place to start the insertion. */
+	iova_insert_rbtree(&iovad->rbroot, new, prev);
+	__cached_rbnode_insert_update(iovad, new);
+
+	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+	return 0;
+
+iova32_full:
+	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
+	return -ENOMEM;
+}
+
 /**
  * find_iova - finds an iova for a given pfn
  * @iovad: - iova domain in question.
@@ -382,6 +375,41 @@ struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(find_iova);
 
+/**
+ * alloc_iova - allocates an iova
+ * @iovad: - iova domain in question
+ * @size: - size of page frames to allocate
+ * @limit_pfn: - max limit address
+ * @size_aligned: - set if size_aligned address range is required
+ * This function allocates an iova in the range iovad->start_pfn to limit_pfn,
+ * searching top-down from limit_pfn to iovad->start_pfn. If the size_aligned
+ * flag is set then the allocated address iova->pfn_lo will be naturally
+ * aligned on roundup_power_of_two(size).
+ */
+struct iova *
+alloc_iova(struct iova_domain *iovad, unsigned long size,
+	unsigned long limit_pfn,
+	bool size_aligned)
+{
+	struct iova *new_iova;
+	int ret;
+
+	new_iova = alloc_iova_mem();
+	if (!new_iova)
+		return NULL;
+
+	ret = __alloc_and_insert_iova_range(iovad, size, limit_pfn + 1,
+			new_iova, size_aligned);
+
+	if (ret) {
+		free_iova_mem(new_iova);
+		return NULL;
+	}
+
+	return new_iova;
+}
+EXPORT_SYMBOL_GPL(alloc_iova);
+
 /**
  * __free_iova - frees the given iova
  * @iovad: iova domain in question.
@@ -425,234 +453,47 @@ free_iova(struct iova_domain *iovad, unsigned long pfn)
 }
 EXPORT_SYMBOL_GPL(free_iova);
 
-/**
- * alloc_iova_fast - allocates an iova from rcache
- * @iovad: - iova domain in question
- * @size: - size of page frames to allocate
- * @limit_pfn: - max limit address
- * @flush_rcache: - set to flush rcache on regular allocation failure
- * This function tries to satisfy an iova allocation from the rcache,
- * and falls back to regular allocation on failure. If regular allocation
- * fails too and the flush_rcache flag is set then the rcache will be flushed.
-*/
-unsigned long
-alloc_iova_fast(struct iova_domain *iovad, unsigned long size,
-		unsigned long limit_pfn, bool flush_rcache)
+static bool iova_magazine_full(struct iova_magazine *mag)
 {
-	unsigned long iova_pfn;
-	struct iova *new_iova;
+	return mag->size == IOVA_MAG_SIZE;
+}
 
-	/*
-	 * Freeing non-power-of-two-sized allocations back into the IOVA caches
-	 * will come back to bite us badly, so we have to waste a bit of space
-	 * rounding up anything cacheable to make sure that can't happen. The
-	 * order of the unadjusted size will still match upon freeing.
-	 */
-	if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
-		size = roundup_pow_of_two(size);
+static void iova_magazine_push(struct iova_magazine *mag, unsigned long pfn)
+{
+	mag->pfns[mag->size++] = pfn;
+}
 
-	iova_pfn = iova_rcache_get(iovad, size, limit_pfn + 1);
-	if (iova_pfn)
-		return iova_pfn;
+static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
+{
+	return kzalloc(sizeof(struct iova_magazine), flags);
+}
 
-retry:
-	new_iova = alloc_iova(iovad, size, limit_pfn, true);
-	if (!new_iova) {
-		unsigned int cpu;
+static void iova_magazine_free(struct iova_magazine *mag)
+{
+	kfree(mag);
+}
 
-		if (!flush_rcache)
-			return 0;
+static bool iova_magazine_empty(struct iova_magazine *mag)
+{
+	return mag->size == 0;
+}
 
-		/* Try replenishing IOVAs by flushing rcache. */
-		flush_rcache = false;
-		for_each_online_cpu(cpu)
-			free_cpu_cached_iovas(cpu, iovad);
-		free_global_cached_iovas(iovad);
-		goto retry;
-	}
-
-	return new_iova->pfn_lo;
-}
-EXPORT_SYMBOL_GPL(alloc_iova_fast);
-
-/**
- * free_iova_fast - free iova pfn range into rcache
- * @iovad: - iova domain in question.
- * @pfn: - pfn that is allocated previously
- * @size: - # of pages in range
- * This functions frees an iova range by trying to put it into the rcache,
- * falling back to regular iova deallocation via free_iova() if this fails.
- */
-void
-free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size)
-{
-	if (iova_rcache_insert(iovad, pfn, size))
-		return;
-
-	free_iova(iovad, pfn);
-}
-EXPORT_SYMBOL_GPL(free_iova_fast);
-
-static void iova_domain_free_rcaches(struct iova_domain *iovad)
-{
-	cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD,
-					    &iovad->cpuhp_dead);
-	free_iova_rcaches(iovad);
-}
-
-/**
- * put_iova_domain - destroys the iova domain
- * @iovad: - iova domain in question.
- * All the iova's in that domain are destroyed.
- */
-void put_iova_domain(struct iova_domain *iovad)
-{
-	struct iova *iova, *tmp;
-
-	if (iovad->rcaches)
-		iova_domain_free_rcaches(iovad);
-
-	rbtree_postorder_for_each_entry_safe(iova, tmp, &iovad->rbroot, node)
-		free_iova_mem(iova);
-}
-EXPORT_SYMBOL_GPL(put_iova_domain);
-
-static int
-__is_range_overlap(struct rb_node *node,
-	unsigned long pfn_lo, unsigned long pfn_hi)
-{
-	struct iova *iova = to_iova(node);
-
-	if ((pfn_lo <= iova->pfn_hi) && (pfn_hi >= iova->pfn_lo))
-		return 1;
-	return 0;
-}
-
-static inline struct iova *
-alloc_and_init_iova(unsigned long pfn_lo, unsigned long pfn_hi)
-{
-	struct iova *iova;
-
-	iova = alloc_iova_mem();
-	if (iova) {
-		iova->pfn_lo = pfn_lo;
-		iova->pfn_hi = pfn_hi;
-	}
-
-	return iova;
-}
-
-static struct iova *
-__insert_new_range(struct iova_domain *iovad,
-	unsigned long pfn_lo, unsigned long pfn_hi)
-{
-	struct iova *iova;
-
-	iova = alloc_and_init_iova(pfn_lo, pfn_hi);
-	if (iova)
-		iova_insert_rbtree(&iovad->rbroot, iova, NULL);
-
-	return iova;
-}
-
-static void
-__adjust_overlap_range(struct iova *iova,
-	unsigned long *pfn_lo, unsigned long *pfn_hi)
-{
-	if (*pfn_lo < iova->pfn_lo)
-		iova->pfn_lo = *pfn_lo;
-	if (*pfn_hi > iova->pfn_hi)
-		*pfn_lo = iova->pfn_hi + 1;
-}
-
-/**
- * reserve_iova - reserves an iova in the given range
- * @iovad: - iova domain pointer
- * @pfn_lo: - lower page frame address
- * @pfn_hi:- higher pfn adderss
- * This function allocates reserves the address range from pfn_lo to pfn_hi so
- * that this address is not dished out as part of alloc_iova.
- */
-struct iova *
-reserve_iova(struct iova_domain *iovad,
-	unsigned long pfn_lo, unsigned long pfn_hi)
+static unsigned long iova_magazine_pop(struct iova_magazine *mag,
+				       unsigned long limit_pfn)
 {
-	struct rb_node *node;
-	unsigned long flags;
-	struct iova *iova;
-	unsigned int overlap = 0;
-
-	/* Don't allow nonsensical pfns */
-	if (WARN_ON((pfn_hi | pfn_lo) > (ULLONG_MAX >> iova_shift(iovad))))
-		return NULL;
-
-	spin_lock_irqsave(&iovad->iova_rbtree_lock, flags);
-	for (node = rb_first(&iovad->rbroot); node; node = rb_next(node)) {
-		if (__is_range_overlap(node, pfn_lo, pfn_hi)) {
-			iova = to_iova(node);
-			__adjust_overlap_range(iova, &pfn_lo, &pfn_hi);
-			if ((pfn_lo >= iova->pfn_lo) &&
-				(pfn_hi <= iova->pfn_hi))
-				goto finish;
-			overlap = 1;
-
-		} else if (overlap)
-				break;
-	}
-
-	/* We are here either because this is the first reserver node
-	 * or need to insert remaining non overlap addr range
-	 */
-	iova = __insert_new_range(iovad, pfn_lo, pfn_hi);
-finish:
-
-	spin_unlock_irqrestore(&iovad->iova_rbtree_lock, flags);
-	return iova;
-}
-EXPORT_SYMBOL_GPL(reserve_iova);
-
-/*
- * Magazine caches for IOVA ranges. For an introduction to magazines,
- * see the USENIX 2001 paper "Magazines and Vmem: Extending the Slab
- * Allocator to Many CPUs and Arbitrary Resources" by Bonwick and Adams.
- * For simplicity, we use a static magazine size and don't implement the
- * dynamic size tuning described in the paper.
- */
-
-/*
- * As kmalloc's buffer size is fixed to power of 2, 127 is chosen to
- * assure size of 'iova_magazine' to be 1024 bytes, so that no memory
- * will be wasted.
- */
-#define IOVA_MAG_SIZE 127
-#define MAX_GLOBAL_MAGS 32	/* magazines per bin */
-
-struct iova_magazine {
-	unsigned long size;
-	unsigned long pfns[IOVA_MAG_SIZE];
-};
-
-struct iova_cpu_rcache {
-	spinlock_t lock;
-	struct iova_magazine *loaded;
-	struct iova_magazine *prev;
-};
+	int i;
+	unsigned long pfn;
 
-struct iova_rcache {
-	spinlock_t lock;
-	unsigned long depot_size;
-	struct iova_magazine *depot[MAX_GLOBAL_MAGS];
-	struct iova_cpu_rcache __percpu *cpu_rcaches;
-};
+	/* Only fall back to the rbtree if we have no suitable pfns at all */
+	for (i = mag->size - 1; mag->pfns[i] > limit_pfn; i--)
+		if (i == 0)
+			return 0;
 
-static struct iova_magazine *iova_magazine_alloc(gfp_t flags)
-{
-	return kzalloc(sizeof(struct iova_magazine), flags);
-}
+	/* Swap it to pop it */
+	pfn = mag->pfns[i];
+	mag->pfns[i] = mag->pfns[--mag->size];
 
-static void iova_magazine_free(struct iova_magazine *mag)
-{
-	kfree(mag);
+	return pfn;
 }
 
 static void
@@ -678,87 +519,91 @@ iova_magazine_free_pfns(struct iova_magazine *mag, struct iova_domain *iovad)
 	mag->size = 0;
 }
 
-static bool iova_magazine_full(struct iova_magazine *mag)
+/*
+ * free all the IOVA ranges cached by a cpu (used when cpu is unplugged)
+ */
+static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *iovad)
 {
-	return mag->size == IOVA_MAG_SIZE;
-}
+	struct iova_cpu_rcache *cpu_rcache;
+	struct iova_rcache *rcache;
+	unsigned long flags;
+	int i;
 
-static bool iova_magazine_empty(struct iova_magazine *mag)
-{
-	return mag->size == 0;
+	for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+		rcache = &iovad->rcaches[i];
+		cpu_rcache = per_cpu_ptr(rcache->cpu_rcaches, cpu);
+		spin_lock_irqsave(&cpu_rcache->lock, flags);
+		iova_magazine_free_pfns(cpu_rcache->loaded, iovad);
+		iova_magazine_free_pfns(cpu_rcache->prev, iovad);
+		spin_unlock_irqrestore(&cpu_rcache->lock, flags);
+	}
 }
 
-static unsigned long iova_magazine_pop(struct iova_magazine *mag,
- unsigned long limit_pfn) +static int iova_cpuhp_dead(unsigned int cpu, struct hlist_node *node) { - int i; - unsigned long pfn; - - /* Only fall back to the rbtree if we have no suitable pfns at all */ - for (i =3D mag->size - 1; mag->pfns[i] > limit_pfn; i--) - if (i =3D=3D 0) - return 0; + struct iova_domain *iovad; =20 - /* Swap it to pop it */ - pfn =3D mag->pfns[i]; - mag->pfns[i] =3D mag->pfns[--mag->size]; + iovad =3D hlist_entry_safe(node, struct iova_domain, cpuhp_dead); =20 - return pfn; + free_cpu_cached_iovas(cpu, iovad); + return 0; } =20 -static void iova_magazine_push(struct iova_magazine *mag, unsigned long pf= n) +/* + * free all the IOVA ranges of global cache + */ +static void free_global_cached_iovas(struct iova_domain *iovad) { - mag->pfns[mag->size++] =3D pfn; + struct iova_rcache *rcache; + unsigned long flags; + int i, j; + + for (i =3D 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) { + rcache =3D &iovad->rcaches[i]; + spin_lock_irqsave(&rcache->lock, flags); + for (j =3D 0; j < rcache->depot_size; ++j) { + iova_magazine_free_pfns(rcache->depot[j], iovad); + iova_magazine_free(rcache->depot[j]); + } + rcache->depot_size =3D 0; + spin_unlock_irqrestore(&rcache->lock, flags); + } } =20 -int iova_domain_init_rcaches(struct iova_domain *iovad) +/* + * free rcache data structures. 
+ */ +static void free_iova_rcaches(struct iova_domain *iovad) { + struct iova_rcache *rcache; + struct iova_cpu_rcache *cpu_rcache; unsigned int cpu; - int i, ret; - - iovad->rcaches =3D kcalloc(IOVA_RANGE_CACHE_MAX_SIZE, - sizeof(struct iova_rcache), - GFP_KERNEL); - if (!iovad->rcaches) - return -ENOMEM; + int i, j; =20 for (i =3D 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) { - struct iova_cpu_rcache *cpu_rcache; - struct iova_rcache *rcache; - rcache =3D &iovad->rcaches[i]; - spin_lock_init(&rcache->lock); - rcache->depot_size =3D 0; - rcache->cpu_rcaches =3D __alloc_percpu(sizeof(*cpu_rcache), - cache_line_size()); - if (!rcache->cpu_rcaches) { - ret =3D -ENOMEM; - goto out_err; - } - for_each_possible_cpu(cpu) { - cpu_rcache =3D per_cpu_ptr(rcache->cpu_rcaches, cpu); - - spin_lock_init(&cpu_rcache->lock); - cpu_rcache->loaded =3D iova_magazine_alloc(GFP_KERNEL); - cpu_rcache->prev =3D iova_magazine_alloc(GFP_KERNEL); - if (!cpu_rcache->loaded || !cpu_rcache->prev) { - ret =3D -ENOMEM; - goto out_err; - } + if (!rcache->cpu_rcaches) + break; + for_each_possible_cpu(cpu) { + cpu_rcache =3D per_cpu_ptr(rcache->cpu_rcaches, cpu); + iova_magazine_free(cpu_rcache->loaded); + iova_magazine_free(cpu_rcache->prev); } + free_percpu(rcache->cpu_rcaches); + for (j =3D 0; j < rcache->depot_size; ++j) + iova_magazine_free(rcache->depot[j]); } =20 - ret =3D cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, - &iovad->cpuhp_dead); - if (ret) - goto out_err; - return 0; + kfree(iovad->rcaches); + iovad->rcaches =3D NULL; +} =20 -out_err: +static void iova_domain_free_rcaches(struct iova_domain *iovad) +{ + cpuhp_state_remove_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, + &iovad->cpuhp_dead); free_iova_rcaches(iovad); - return ret; } -EXPORT_SYMBOL_GPL(iova_domain_init_rcaches); =20 /* * Try inserting IOVA range starting with 'iova_pfn' into 'rcache', and @@ -881,73 +726,217 @@ static unsigned long iova_rcache_get(struct iova_dom= ain *iovad, return 
__iova_rcache_get(&iovad->rcaches[log_size], limit_pfn - size); } =20 -/* - * free rcache data structures. +/** + * alloc_iova_fast - allocates an iova from rcache + * @iovad: - iova domain in question + * @size: - size of page frames to allocate + * @limit_pfn: - max limit address + * @flush_rcache: - set to flush rcache on regular allocation failure + * This function tries to satisfy an iova allocation from the rcache, + * and falls back to regular allocation on failure. If regular allocation + * fails too and the flush_rcache flag is set then the rcache will be flus= hed. */ -static void free_iova_rcaches(struct iova_domain *iovad) +unsigned long +alloc_iova_fast(struct iova_domain *iovad, unsigned long size, + unsigned long limit_pfn, bool flush_rcache) +{ + unsigned long iova_pfn; + struct iova *new_iova; + + /* + * Freeing non-power-of-two-sized allocations back into the IOVA caches + * will come back to bite us badly, so we have to waste a bit of space + * rounding up anything cacheable to make sure that can't happen. The + * order of the unadjusted size will still match upon freeing. + */ + if (size < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1))) + size =3D roundup_pow_of_two(size); + + iova_pfn =3D iova_rcache_get(iovad, size, limit_pfn + 1); + if (iova_pfn) + return iova_pfn; + +retry: + new_iova =3D alloc_iova(iovad, size, limit_pfn, true); + if (!new_iova) { + unsigned int cpu; + + if (!flush_rcache) + return 0; + + /* Try replenishing IOVAs by flushing rcache. */ + flush_rcache =3D false; + for_each_online_cpu(cpu) + free_cpu_cached_iovas(cpu, iovad); + free_global_cached_iovas(iovad); + goto retry; + } + + return new_iova->pfn_lo; +} +EXPORT_SYMBOL_GPL(alloc_iova_fast); + +/** + * free_iova_fast - free iova pfn range into rcache + * @iovad: - iova domain in question. 
+ * @pfn: - pfn that is allocated previously + * @size: - # of pages in range + * This functions frees an iova range by trying to put it into the rcache, + * falling back to regular iova deallocation via free_iova() if this fails. + */ +void +free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long= size) +{ + if (iova_rcache_insert(iovad, pfn, size)) + return; + + free_iova(iovad, pfn); +} +EXPORT_SYMBOL_GPL(free_iova_fast); + +unsigned long iova_rcache_range(void) +{ + return PAGE_SIZE << (IOVA_RANGE_CACHE_MAX_SIZE - 1); +} + +int iova_domain_init_rcaches(struct iova_domain *iovad) { - struct iova_rcache *rcache; - struct iova_cpu_rcache *cpu_rcache; unsigned int cpu; - int i, j; + int i, ret; + + iovad->rcaches =3D kcalloc(IOVA_RANGE_CACHE_MAX_SIZE, + sizeof(struct iova_rcache), + GFP_KERNEL); + if (!iovad->rcaches) + return -ENOMEM; =20 for (i =3D 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) { + struct iova_cpu_rcache *cpu_rcache; + struct iova_rcache *rcache; + rcache =3D &iovad->rcaches[i]; - if (!rcache->cpu_rcaches) - break; + spin_lock_init(&rcache->lock); + rcache->depot_size =3D 0; + rcache->cpu_rcaches =3D __alloc_percpu(sizeof(*cpu_rcache), + cache_line_size()); + if (!rcache->cpu_rcaches) { + ret =3D -ENOMEM; + goto out_err; + } for_each_possible_cpu(cpu) { cpu_rcache =3D per_cpu_ptr(rcache->cpu_rcaches, cpu); - iova_magazine_free(cpu_rcache->loaded); - iova_magazine_free(cpu_rcache->prev); + + spin_lock_init(&cpu_rcache->lock); + cpu_rcache->loaded =3D iova_magazine_alloc(GFP_KERNEL); + cpu_rcache->prev =3D iova_magazine_alloc(GFP_KERNEL); + if (!cpu_rcache->loaded || !cpu_rcache->prev) { + ret =3D -ENOMEM; + goto out_err; + } } - free_percpu(rcache->cpu_rcaches); - for (j =3D 0; j < rcache->depot_size; ++j) - iova_magazine_free(rcache->depot[j]); } =20 - kfree(iovad->rcaches); - iovad->rcaches =3D NULL; + ret =3D cpuhp_state_add_instance_nocalls(CPUHP_IOMMU_IOVA_DEAD, + &iovad->cpuhp_dead); + if (ret) + goto out_err; + return 0; + 
+out_err: + free_iova_rcaches(iovad); + return ret; } +EXPORT_SYMBOL_GPL(iova_domain_init_rcaches); =20 -/* - * free all the IOVA ranges cached by a cpu (used when cpu is unplugged) - */ -static void free_cpu_cached_iovas(unsigned int cpu, struct iova_domain *io= vad) +void +init_iova_domain(struct iova_domain *iovad, unsigned long granule, + unsigned long start_pfn) { - struct iova_cpu_rcache *cpu_rcache; - struct iova_rcache *rcache; - unsigned long flags; - int i; + /* + * IOVA granularity will normally be equal to the smallest + * supported IOMMU page size; both *must* be capable of + * representing individual CPU pages exactly. + */ + BUG_ON((granule > PAGE_SIZE) || !is_power_of_2(granule)); =20 - for (i =3D 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) { - rcache =3D &iovad->rcaches[i]; - cpu_rcache =3D per_cpu_ptr(rcache->cpu_rcaches, cpu); - spin_lock_irqsave(&cpu_rcache->lock, flags); - iova_magazine_free_pfns(cpu_rcache->loaded, iovad); - iova_magazine_free_pfns(cpu_rcache->prev, iovad); - spin_unlock_irqrestore(&cpu_rcache->lock, flags); - } + spin_lock_init(&iovad->iova_rbtree_lock); + iovad->rbroot =3D RB_ROOT; + iovad->cached_node =3D &iovad->anchor.node; + iovad->cached32_node =3D &iovad->anchor.node; + iovad->granule =3D granule; + iovad->start_pfn =3D start_pfn; + iovad->dma_32bit_pfn =3D 1UL << (32 - iova_shift(iovad)); + iovad->max32_alloc_size =3D iovad->dma_32bit_pfn; + iovad->anchor.pfn_lo =3D iovad->anchor.pfn_hi =3D IOVA_ANCHOR; + rb_link_node(&iovad->anchor.node, NULL, &iovad->rbroot.rb_node); + rb_insert_color(&iovad->anchor.node, &iovad->rbroot); } +EXPORT_SYMBOL_GPL(init_iova_domain); =20 -/* - * free all the IOVA ranges of global cache +/** + * put_iova_domain - destroys the iova domain + * @iovad: - iova domain in question. + * All the iova's in that domain are destroyed. 
*/ -static void free_global_cached_iovas(struct iova_domain *iovad) +void put_iova_domain(struct iova_domain *iovad) { - struct iova_rcache *rcache; - unsigned long flags; - int i, j; + struct iova *iova, *tmp; =20 - for (i =3D 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) { - rcache =3D &iovad->rcaches[i]; - spin_lock_irqsave(&rcache->lock, flags); - for (j =3D 0; j < rcache->depot_size; ++j) { - iova_magazine_free_pfns(rcache->depot[j], iovad); - iova_magazine_free(rcache->depot[j]); + if (iovad->rcaches) + iova_domain_free_rcaches(iovad); + + rbtree_postorder_for_each_entry_safe(iova, tmp, &iovad->rbroot, node) + free_iova_mem(iova); +} +EXPORT_SYMBOL_GPL(put_iova_domain); + +int iova_cache_get(void) +{ + mutex_lock(&iova_cache_mutex); + if (!iova_cache_users) { + int ret; + + ret =3D cpuhp_setup_state_multi(CPUHP_IOMMU_IOVA_DEAD, "iommu/iova:dead"= , NULL, + iova_cpuhp_dead); + if (ret) { + mutex_unlock(&iova_cache_mutex); + pr_err("Couldn't register cpuhp handler\n"); + return ret; } - rcache->depot_size =3D 0; - spin_unlock_irqrestore(&rcache->lock, flags); + + iova_cache =3D kmem_cache_create( + "iommu_iova", sizeof(struct iova), 0, + SLAB_HWCACHE_ALIGN, NULL); + if (!iova_cache) { + cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); + mutex_unlock(&iova_cache_mutex); + pr_err("Couldn't create iova cache\n"); + return -ENOMEM; + } + } + + iova_cache_users++; + mutex_unlock(&iova_cache_mutex); + + return 0; +} +EXPORT_SYMBOL_GPL(iova_cache_get); + +void iova_cache_put(void) +{ + mutex_lock(&iova_cache_mutex); + if (WARN_ON(!iova_cache_users)) { + mutex_unlock(&iova_cache_mutex); + return; } + iova_cache_users--; + if (!iova_cache_users) { + cpuhp_remove_multi_state(CPUHP_IOMMU_IOVA_DEAD); + kmem_cache_destroy(iova_cache); + } + mutex_unlock(&iova_cache_mutex); } +EXPORT_SYMBOL_GPL(iova_cache_put); + MODULE_AUTHOR("Anil S Keshavamurthy "); MODULE_LICENSE("GPL"); diff --git a/include/linux/iova.h b/include/linux/iova.h index c6ba6d95d79c..66f036f509e6 100644 --- 
a/include/linux/iova.h +++ b/include/linux/iova.h @@ -76,37 +76,28 @@ static inline unsigned long iova_pfn(struct iova_domain= *iovad, dma_addr_t iova) } =20 #if IS_ENABLED(CONFIG_IOMMU_IOVA) -int iova_cache_get(void); -void iova_cache_put(void); - -unsigned long iova_rcache_range(void); =20 void free_iova(struct iova_domain *iovad, unsigned long pfn); void __free_iova(struct iova_domain *iovad, struct iova *iova); struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size, unsigned long limit_pfn, bool size_aligned); -void free_iova_fast(struct iova_domain *iovad, unsigned long pfn, - unsigned long size); -unsigned long alloc_iova_fast(struct iova_domain *iovad, unsigned long siz= e, - unsigned long limit_pfn, bool flush_rcache); struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo, unsigned long pfn_hi); +struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn); +unsigned long alloc_iova_fast(struct iova_domain *iovad, unsigned long siz= e, + unsigned long limit_pfn, bool flush_rcache); +void free_iova_fast(struct iova_domain *iovad, unsigned long pfn, + unsigned long size); +unsigned long iova_rcache_range(void); +int iova_domain_init_rcaches(struct iova_domain *iovad); void init_iova_domain(struct iova_domain *iovad, unsigned long granule, unsigned long start_pfn); -int iova_domain_init_rcaches(struct iova_domain *iovad); -struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn); void put_iova_domain(struct iova_domain *iovad); -#else -static inline int iova_cache_get(void) -{ - return -ENOTSUPP; -} - -static inline void iova_cache_put(void) -{ -} +int iova_cache_get(void); +void iova_cache_put(void); =20 +#else static inline void free_iova(struct iova_domain *iovad, unsigned long pfn) { } @@ -123,6 +114,19 @@ static inline struct iova *alloc_iova(struct iova_doma= in *iovad, return NULL; } =20 +static inline struct iova *reserve_iova(struct iova_domain *iovad, + unsigned long pfn_lo, + unsigned long 
pfn_hi) +{ + return NULL; +} + +static inline struct iova *find_iova(struct iova_domain *iovad, + unsigned long pfn) +{ + return NULL; +} + static inline void free_iova_fast(struct iova_domain *iovad, unsigned long pfn, unsigned long size) @@ -137,12 +141,6 @@ static inline unsigned long alloc_iova_fast(struct iov= a_domain *iovad, return 0; } =20 -static inline struct iova *reserve_iova(struct iova_domain *iovad, - unsigned long pfn_lo, - unsigned long pfn_hi) -{ - return NULL; -} =20 static inline void init_iova_domain(struct iova_domain *iovad, unsigned long granule, @@ -150,13 +148,17 @@ static inline void init_iova_domain(struct iova_domai= n *iovad, { } =20 -static inline struct iova *find_iova(struct iova_domain *iovad, - unsigned long pfn) + +static inline void put_iova_domain(struct iova_domain *iovad) { - return NULL; } =20 -static inline void put_iova_domain(struct iova_domain *iovad) +static inline int iova_cache_get(void) +{ + return -ENOTSUPP; +} + +static inline void iova_cache_put(void) { } =20 --=20 2.35.3