From: Feng Tang
To: Marek Szyprowski, Robin Murphy, Christoph Hellwig, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Cc: Feng Tang
Subject: [PATCH v1] dma-contiguous: try local node first for dma_alloc_contiguous()
Date: Tue, 14 Apr 2026 17:03:10 +0800
Message-Id: <20260414090310.92055-1-feng.tang@linux.alibaba.com>

A bug was reported on a multi-node NUMA ARM server: with the IOMMU
disabled, dma_alloc_coherent() always returns memory from node 0, even
for devices attached to other nodes. With the IOMMU enabled, the same
API returns node-local DMA memory.

The reason is that, with the IOMMU disabled, dma_alloc_coherent() takes
the direct mapping path and calls dma_alloc_contiguous(). The system has
no explicit CMA setting (such as per-NUMA CMA) and only the default 64MB
reserved CMA area, which sits on node 0 and is where the kernel tries to
allocate first.

Make the DMA allocation more locality-friendly by trying the NUMA-aware
alloc_pages_node() first, before falling back to the default reserved
CMA area.
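
For illustration only, the fallback order described above can be
modelled in user space roughly as follows. This is a sketch under
stated assumptions, not kernel code: every model_* / MODEL_* name is
hypothetical, each allocator is reduced to a boolean "can satisfy the
request", and the other checks in dma_alloc_contiguous() are left out.
The model also treats the page allocator as strictly node-local, while
the real alloc_pages_node() may fall back to other nodes.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NO_NODE   (-1)	/* stands in for NUMA_NO_NODE */
#define MODEL_MAX_NODES 4	/* arbitrary size for the model */

enum model_source {
	MODEL_SRC_NONE,		/* nothing could satisfy the request */
	MODEL_SRC_PERNUMA_CMA,	/* per-node CMA area of the device's node */
	MODEL_SRC_NODE_PAGES,	/* page allocator on the preferred node */
	MODEL_SRC_DEFAULT_CMA,	/* default reserved CMA area */
};

/* Hypothetical snapshot of what each allocator could hand out. */
struct model_system {
	bool pernuma_cma_ok[MODEL_MAX_NODES];
	bool node_pages_ok[MODEL_MAX_NODES];
	bool default_cma_ok;
	int current_node;	/* node used when the device has none */
};

/* Return which source would serve the request in the patched order. */
static enum model_source model_alloc_order(const struct model_system *sys,
					   int dev_nid, bool wants_dma_zone)
{
	int nid;

	assert(dev_nid == MODEL_NO_NODE ||
	       (dev_nid >= 0 && dev_nid < MODEL_MAX_NODES));
	assert(sys->current_node >= 0 && sys->current_node < MODEL_MAX_NODES);

	/* Step 1: per-node CMA, only for a known node and a request
	 * that is not restricted to a DMA zone. */
	if (dev_nid != MODEL_NO_NODE && !wants_dma_zone &&
	    sys->pernuma_cma_ok[dev_nid])
		return MODEL_SRC_PERNUMA_CMA;

	/* Step 2 (what the patch adds): ordinary pages on the device's
	 * node; a device without a node uses the current node. */
	nid = (dev_nid == MODEL_NO_NODE) ? sys->current_node : dev_nid;
	if (sys->node_pages_ok[nid])
		return MODEL_SRC_NODE_PAGES;

	/* Step 3: the default reserved CMA area, wherever it lives. */
	if (sys->default_cma_ok)
		return MODEL_SRC_DEFAULT_CMA;

	return MODEL_SRC_NONE;
}
```

In the reported configuration (no per-node CMA, default area on node
0), the model picks MODEL_SRC_NODE_PAGES for a device on node 1 as long
as node 1 has free pages, and only reaches MODEL_SRC_DEFAULT_CMA when
it does not.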

Another option is to check which node the reserved CMA area is on, and
only call alloc_pages_node() when that differs from the node the device
is attached to.

Signed-off-by: Feng Tang
---
 kernel/dma/contiguous.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index c56004d314dc..0180f40f094e 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -371,8 +371,9 @@ static struct page *cma_alloc_aligned(struct cma *cma, size_t size, gfp_t gfp)
  */
 struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
 {
-#ifdef CONFIG_DMA_NUMA_CMA
+#ifdef CONFIG_NUMA
 	int nid = dev_to_node(dev);
+	struct page *page;
 #endif
 
 	/* CMA can be used only in the context which permits sleeping */
@@ -386,7 +387,6 @@ struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
 #ifdef CONFIG_DMA_NUMA_CMA
 	if (nid != NUMA_NO_NODE && !(gfp & (GFP_DMA | GFP_DMA32))) {
 		struct cma *cma = dma_contiguous_pernuma_area[nid];
-		struct page *page;
 
 		if (cma) {
 			page = cma_alloc_aligned(cma, size, gfp);
@@ -402,6 +402,14 @@ struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
 		}
 	}
 #endif
+
+#ifdef CONFIG_NUMA
+	/* Try first to allocate memory on the same node as the device */
+	page = alloc_pages_node(nid, gfp, get_order(size));
+	if (page)
+		return page;
+#endif
+
 	if (!dma_contiguous_default_area)
 		return NULL;
 
-- 
2.39.5 (Apple Git-154)