From nobody Mon Feb 9 05:38:41 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4082AC0015E for ; Mon, 26 Jun 2023 13:02:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231339AbjFZNCP (ORCPT ); Mon, 26 Jun 2023 09:02:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52134 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231182AbjFZNCE (ORCPT ); Mon, 26 Jun 2023 09:02:04 -0400 Received: from frasgout11.his.huawei.com (unknown [14.137.139.23]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61902BD for ; Mon, 26 Jun 2023 06:02:00 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.228]) by frasgout11.his.huawei.com (SkyGuard) with ESMTP id 4QqSQ54S3Sz9xFQP for ; Mon, 26 Jun 2023 20:51:09 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.81.202.159]) by APP2 (Coremail) with SMTP id GxC2BwDXXWQljJlk1fGzAw--.36557S3; Mon, 26 Jun 2023 14:01:40 +0100 (CET) From: Petr Tesarik To: Christoph Hellwig , Marek Szyprowski , Robin Murphy , iommu@lists.linux.dev (open list:DMA MAPPING HELPERS), linux-kernel@vger.kernel.org (open list) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v1 1/2] swiotlb: Always set the number of areas before allocating the pool Date: Mon, 26 Jun 2023 15:01:03 +0200 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: GxC2BwDXXWQljJlk1fGzAw--.36557S3 X-Coremail-Antispam: 1UD129KBjvJXoWxGFyDXFWxuF1DJFykJFyrZwb_yoW5GrWxpr yfZa4jyFWSqF1xAFyUAayxJFy0k3WkCryjkryY9ryfJr45JF1fXrWDKFWjqr95tay7Zr4j qayrZr4YgrnxX3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBG14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r1I6r4UM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Jr0_JF4l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1l84 ACjcxK6I8E87Iv67AKxVW8JVWxJwA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7MxkF7I0Ew4C26cxK6c8Ij28Icw CF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j 6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64 vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_ Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0x vEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUnvtZUUUUU= X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Petr Tesarik The number of areas defaults to the number of possible CPUs. However, the total number of slots may have to be increased after adjusting the number of areas. Consequently, the number of areas must be determined before allocating the memory pool. This is even explained with a comment in swiotlb_init_remap(), but swiotlb_init_late() adjusts the number of areas after slots are already allocated. The areas may end up being smaller than IO_TLB_SEGSIZE, which breaks per-area locking. While fixing swiotlb_init_late(), move all relevant comments before the definition of swiotlb_adjust_nareas() and convert them to kernel-doc. Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock") Signed-off-by: Petr Tesarik Reviewed-by: Roberto Sassu --- kernel/dma/swiotlb.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index af2e304c672c..16f53d8c51bc 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -115,9 +115,16 @@ static bool round_up_default_nslabs(void) return true; } =20 +/** + * swiotlb_adjust_nareas() - adjust the number of areas and slots + * @nareas: Desired number of areas. Zero is treated as 1. + * + * Adjust the default number of areas in a memory pool. + * The default size of the memory pool may also change to meet minimum area + * size requirements. + */ static void swiotlb_adjust_nareas(unsigned int nareas) { - /* use a single area when non is specified */ if (!nareas) nareas =3D 1; else if (!is_power_of_2(nareas)) @@ -298,10 +305,6 @@ void __init swiotlb_init_remap(bool addressing_limit, = unsigned int flags, if (swiotlb_force_disable) return; =20 - /* - * default_nslabs maybe changed when adjust area number. - * So allocate bounce buffer after adjusting area number. - */ if (!default_nareas) swiotlb_adjust_nareas(num_possible_cpus()); =20 @@ -363,6 +366,9 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, if (swiotlb_force_disable) return 0; =20 + if (!default_nareas) + swiotlb_adjust_nareas(num_possible_cpus()); + retry: order =3D get_order(nslabs << IO_TLB_SHIFT); nslabs =3D SLABS_PER_PAGE << order; @@ -397,9 +403,6 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, (PAGE_SIZE << order) >> 20); } =20 - if (!default_nareas) - swiotlb_adjust_nareas(num_possible_cpus()); - area_order =3D get_order(array_size(sizeof(*mem->areas), default_nareas)); mem->areas =3D (struct io_tlb_area *) --=20 2.25.1 From nobody Mon Feb 9 05:38:41 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2BAD0EB64D7 for ; Mon, 26 Jun 2023 13:02:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229658AbjFZNCX (ORCPT ); Mon, 26 Jun 2023 09:02:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231310AbjFZNCK (ORCPT ); Mon, 26 Jun 2023 09:02:10 -0400 Received: from frasgout11.his.huawei.com (unknown [14.137.139.23]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7917FC2 for ; Mon, 26 Jun 2023 06:02:05 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.18.147.228]) by frasgout11.his.huawei.com (SkyGuard) with ESMTP id 4QqSQB5PXjz9xFQ7 for ; Mon, 26 Jun 2023 20:51:14 +0800 (CST) Received: from A2101119013HW2.china.huawei.com (unknown [10.81.202.159]) by APP2 (Coremail) with SMTP id GxC2BwDXXWQljJlk1fGzAw--.36557S4; Mon, 26 Jun 2023 14:01:46 +0100 (CET) From: Petr Tesarik To: Christoph Hellwig , Marek Szyprowski , Robin Murphy , iommu@lists.linux.dev (open list:DMA MAPPING HELPERS), linux-kernel@vger.kernel.org (open list) Cc: Roberto Sassu , Kefeng Wang , petr@tesarici.cz Subject: [PATCH v1 2/2] swiotlb: Reduce the number of areas to match actual memory pool size Date: Mon, 26 Jun 2023 15:01:04 +0200 Message-Id: <097dffeb406cef6596499c8dbdafb28d05f7501b.1687784289.git.petr.tesarik.ext@huawei.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: GxC2BwDXXWQljJlk1fGzAw--.36557S4 X-Coremail-Antispam: 1UD129KBjvJXoWxZw1xur4UKr1xCF18GFyrCrg_yoWrGFy3pr yfWa4UtFWFqFn7CFW2y3y8CFySka4vyrnFkF9xC3sxXw1UGF1SkrWqkFWUJFWrtF4kuF4f X3yrZF45uanxJ3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBG14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Jr0_JF4l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1l84 ACjcxK6I8E87Iv67AKxVW8JVWxJwA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7MxkF7I0Ew4C26cxK6c8Ij28Icw CF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j 6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64 vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Gr0_ Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMIIF0x vEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0pRLqXdUUUUU= X-CM-SenderInfo: hshw23xhvd2x3n6k3tpzhluzxrxghudrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Petr Tesarik Although the desired size of the SWIOTLB memory pool is increased in swiotlb_adjust_nareas() to match the number of areas, the actual allocation may be smaller, which may require reducing the number of areas. For example, Xen uses swiotlb_init_late(), which in turn uses the page allocator. On x86, page size is 4 KiB and MAX_ORDER is 10 (1024 pages), resulting in a maximum memory pool size of 4 MiB. This corresponds to 2048 slots of 2 KiB each. The minimum area size is 128 (IO_TLB_SEGSIZE), allowing at most 2048 / 128 =3D 16 areas. If num_possible_cpus() is greater than the maximum number of areas, areas are smaller than IO_TLB_SEGSIZE and contiguous groups of free slots will span multiple areas. When allocating and freeing slots, only one area will be properly locked, causing race conditions on the unlocked slots and ultimately data corruption, kernel hangs and crashes. Fixes: 20347fca71a3 ("swiotlb: split up the global swiotlb lock") Signed-off-by: Petr Tesarik Reviewed-by: Roberto Sassu --- kernel/dma/swiotlb.c | 27 ++++++++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 16f53d8c51bc..079df5ad38d0 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -138,6 +138,23 @@ static void swiotlb_adjust_nareas(unsigned int nareas) (default_nslabs << IO_TLB_SHIFT) >> 20); } =20 +/** + * limit_nareas() - get the maximum number of areas for a given memory poo= l size + * @nareas: Desired number of areas. + * @nslots: Total number of slots in the memory pool. + * + * Limit the number of areas to the maximum possible number of areas in + * a memory pool of the given size. + * + * Return: Maximum possible number of areas. + */ +static unsigned int limit_nareas(unsigned int nareas, unsigned long nslots) +{ + if (nslots < nareas * IO_TLB_SEGSIZE) + nareas =3D nslots / IO_TLB_SEGSIZE; + return nareas; +} + static int __init setup_io_tlb_npages(char *str) { @@ -297,6 +314,7 @@ void __init swiotlb_init_remap(bool addressing_limit, u= nsigned int flags, { struct io_tlb_mem *mem =3D &io_tlb_default_mem; unsigned long nslabs; + unsigned int nareas; size_t alloc_size; void *tlb; =20 @@ -309,10 +327,12 @@ void __init swiotlb_init_remap(bool addressing_limit,= unsigned int flags, swiotlb_adjust_nareas(num_possible_cpus()); =20 nslabs =3D default_nslabs; + nareas =3D limit_nareas(default_nareas, nslabs); while ((tlb =3D swiotlb_memblock_alloc(nslabs, flags, remap)) =3D=3D NULL= ) { if (nslabs <=3D IO_TLB_MIN_SLABS) return; nslabs =3D ALIGN(nslabs >> 1, IO_TLB_SEGSIZE); + nareas =3D limit_nareas(nareas, nslabs); } =20 if (default_nslabs !=3D nslabs) { @@ -358,6 +378,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, { struct io_tlb_mem *mem =3D &io_tlb_default_mem; unsigned long nslabs =3D ALIGN(size >> IO_TLB_SHIFT, IO_TLB_SEGSIZE); + unsigned int nareas; unsigned char *vstart =3D NULL; unsigned int order, area_order; bool retried =3D false; @@ -403,8 +424,8 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, (PAGE_SIZE << order) >> 20); } =20 - area_order =3D get_order(array_size(sizeof(*mem->areas), - default_nareas)); + nareas =3D limit_nareas(default_nareas, nslabs); + area_order =3D get_order(array_size(sizeof(*mem->areas), nareas)); mem->areas =3D (struct io_tlb_area *) __get_free_pages(GFP_KERNEL | __GFP_ZERO, area_order); if (!mem->areas) @@ -418,7 +439,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask, set_memory_decrypted((unsigned long)vstart, (nslabs << IO_TLB_SHIFT) >> PAGE_SHIFT); swiotlb_init_io_tlb_mem(mem, virt_to_phys(vstart), nslabs, 0, true, - default_nareas); + nareas); =20 swiotlb_print_info(); return 0; --=20 2.25.1