From: Leon Romanovsky
To: Marek Szyprowski, Robin Murphy
Cc: iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCH] dma-debug: ensure mappings are created and released with matching attributes
Date: Mon, 23 Mar 2026 22:20:37 +0200
Message-ID: <20260323-dma-attrs-debug-v1-1-6275228ca300@nvidia.com>

The DMA API expects that callers use the same attributes when mapping and
unmapping. Add tracking to verify this and catch mismatches.

Signed-off-by: Leon Romanovsky
---
 kernel/dma/debug.c | 62 +++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 43 insertions(+), 19 deletions(-)
---
Marek,

This patch is based on f5ebf241c407 ("mm/hmm: Indicate that HMM requires DMA
coherency"), just to minimize merge conflicts, but it is definitely intended
for -next. If the patch is OK and the f5ebf241c407 commit won't be backmerged
into -next, I can resend it in the next cycle.
This debug aid helped me catch this case:
https://lore.kernel.org/all/20260323-umem-dma-attrs-v1-1-d6890f2e6a1e@nvidia.com/

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 0677918f06a80..6e5e69b8bc4d8 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -63,7 +63,7 @@ enum map_err_types {
  * @sg_mapped_ents: 'mapped_ents' from dma_map_sg
  * @paddr: physical start address of the mapping
  * @map_err_type: track whether dma_mapping_error() was checked
- * @is_cache_clean: driver promises not to write to buffer while mapped
+ * @attrs: DMA attributes the buffer was mapped with
  * @stack_len: number of backtrace entries in @stack_entries
  * @stack_entries: stack of backtrace history
  */
@@ -78,7 +78,7 @@ struct dma_debug_entry {
 	int			sg_mapped_ents;
 	phys_addr_t		paddr;
 	enum map_err_types	map_err_type;
-	bool			is_cache_clean;
+	unsigned long		attrs;
 #ifdef CONFIG_STACKTRACE
 	unsigned int		stack_len;
 	unsigned long		stack_entries[DMA_DEBUG_STACKTRACE_ENTRIES];
@@ -478,6 +478,9 @@ static int active_cacheline_insert(struct dma_debug_entry *entry,
 				   bool *overlap_cache_clean)
 {
 	phys_addr_t cln = to_cacheline_number(entry);
+	bool is_cache_clean = entry->attrs &
+			      (DMA_ATTR_DEBUGGING_IGNORE_CACHELINES |
+			       DMA_ATTR_REQUIRE_COHERENT);
 	unsigned long flags;
 	int rc;
 
@@ -495,12 +498,15 @@ static int active_cacheline_insert(struct dma_debug_entry *entry,
 	if (rc == -EEXIST) {
 		struct dma_debug_entry *existing;
 
-		active_cacheline_inc_overlap(cln, entry->is_cache_clean);
+		active_cacheline_inc_overlap(cln, is_cache_clean);
 		existing = radix_tree_lookup(&dma_active_cacheline, cln);
 		/* A lookup failure here after we got -EEXIST is unexpected.
 		 */
 		WARN_ON(!existing);
 		if (existing)
-			*overlap_cache_clean = existing->is_cache_clean;
+			*overlap_cache_clean =
+				existing->attrs &
+				(DMA_ATTR_DEBUGGING_IGNORE_CACHELINES |
+				 DMA_ATTR_REQUIRE_COHERENT);
 	}
 	spin_unlock_irqrestore(&radix_lock, flags);
 
@@ -544,12 +550,13 @@ void debug_dma_dump_mappings(struct device *dev)
 		if (!dev || dev == entry->dev) {
 			cln = to_cacheline_number(entry);
 			dev_info(entry->dev,
-				 "%s idx %d P=%pa D=%llx L=%llx cln=%pa %s %s\n",
+				 "%s idx %d P=%pa D=%llx L=%llx cln=%pa %s %s attrs=0x%lx\n",
 				 type2name[entry->type], idx,
 				 &entry->paddr, entry->dev_addr,
 				 entry->size, &cln,
 				 dir2name[entry->direction],
-				 maperr2str[entry->map_err_type]);
+				 maperr2str[entry->map_err_type],
+				 entry->attrs);
 		}
 	}
 	spin_unlock_irqrestore(&bucket->lock, flags);
@@ -575,14 +582,15 @@ static int dump_show(struct seq_file *seq, void *v)
 		list_for_each_entry(entry, &bucket->list, list) {
 			cln = to_cacheline_number(entry);
 			seq_printf(seq,
-				   "%s %s %s idx %d P=%pa D=%llx L=%llx cln=%pa %s %s\n",
+				   "%s %s %s idx %d P=%pa D=%llx L=%llx cln=%pa %s %s attrs=0x%lx\n",
 				   dev_driver_string(entry->dev),
 				   dev_name(entry->dev),
 				   type2name[entry->type], idx,
 				   &entry->paddr, entry->dev_addr,
 				   entry->size, &cln,
 				   dir2name[entry->direction],
-				   maperr2str[entry->map_err_type]);
+				   maperr2str[entry->map_err_type],
+				   entry->attrs);
 		}
 		spin_unlock_irqrestore(&bucket->lock, flags);
 	}
@@ -594,16 +602,14 @@ DEFINE_SHOW_ATTRIBUTE(dump);
  * Wrapper function for adding an entry to the hash.
  * This function takes care of locking itself.
  */
-static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
+static void add_dma_entry(struct dma_debug_entry *entry)
 {
+	unsigned long attrs = entry->attrs;
 	bool overlap_cache_clean;
 	struct hash_bucket *bucket;
 	unsigned long flags;
 	int rc;
 
-	entry->is_cache_clean = attrs & (DMA_ATTR_DEBUGGING_IGNORE_CACHELINES |
-					 DMA_ATTR_REQUIRE_COHERENT);
-
 	bucket = get_hash_bucket(entry, &flags);
 	hash_bucket_add(bucket, entry);
 	put_hash_bucket(bucket, flags);
@@ -612,9 +618,10 @@ static void add_dma_entry(struct dma_debug_entry *entry, unsigned long attrs)
 	if (rc == -ENOMEM) {
 		pr_err_once("cacheline tracking ENOMEM, dma-debug disabled\n");
 		global_disable = true;
-	} else if (rc == -EEXIST &&
-		   !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-		   !(entry->is_cache_clean && overlap_cache_clean) &&
+	} else if (rc == -EEXIST && !(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+		   !(attrs & (DMA_ATTR_DEBUGGING_IGNORE_CACHELINES |
+			      DMA_ATTR_REQUIRE_COHERENT) &&
+		     overlap_cache_clean) &&
 		   !(IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
 		     is_swiotlb_active(entry->dev))) {
 		err_printk(entry->dev, entry,
@@ -1066,6 +1073,19 @@ static void check_unmap(struct dma_debug_entry *ref)
 			   type2name[entry->type]);
 	}
 
+	/*
+	 * This may not be a real bug, but the DMA API still expects
+	 * an entry to be unmapped with the same attributes it was mapped with.
+	 */
+	if (ref->attrs != entry->attrs) {
+		err_printk(ref->dev, entry,
+			   "device driver frees "
+			   "DMA memory with different attributes "
+			   "[device address=0x%016llx] [size=%llu bytes] "
+			   "[mapped with 0x%lx] [unmapped with 0x%lx]\n",
+			   ref->dev_addr, ref->size, entry->attrs, ref->attrs);
+	}
+
 	hash_bucket_del(entry);
 	put_hash_bucket(bucket, flags);
 
@@ -1249,6 +1269,7 @@ void debug_dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
 	entry->size		= size;
 	entry->direction	= direction;
 	entry->map_err_type	= MAP_ERR_NOT_CHECKED;
+	entry->attrs		= attrs;
 
 	if (!(attrs & DMA_ATTR_MMIO)) {
 		check_for_stack(dev, phys);
@@ -1257,7 +1278,7 @@ void debug_dma_map_phys(struct device *dev, phys_addr_t phys, size_t size,
 			check_for_illegal_area(dev, phys_to_virt(phys), size);
 	}
 
-	add_dma_entry(entry, attrs);
+	add_dma_entry(entry);
 }
 
 void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
@@ -1344,10 +1365,11 @@ void debug_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		entry->direction	= direction;
 		entry->sg_call_ents	= nents;
 		entry->sg_mapped_ents	= mapped_ents;
+		entry->attrs		= attrs;
 
 		check_sg_segment(dev, s);
 
-		add_dma_entry(entry, attrs);
+		add_dma_entry(entry);
 	}
 }
 
@@ -1439,8 +1461,9 @@ void debug_dma_alloc_coherent(struct device *dev, size_t size,
 	entry->size		= size;
 	entry->dev_addr		= dma_addr;
 	entry->direction	= DMA_BIDIRECTIONAL;
+	entry->attrs		= attrs;
 
-	add_dma_entry(entry, attrs);
+	add_dma_entry(entry);
 }
 
 void debug_dma_free_coherent(struct device *dev, size_t size,
@@ -1584,8 +1607,9 @@ void debug_dma_alloc_pages(struct device *dev, struct page *page,
 	entry->size		= size;
 	entry->dev_addr		= dma_addr;
 	entry->direction	= direction;
+	entry->attrs		= attrs;
 
-	add_dma_entry(entry, attrs);
+	add_dma_entry(entry);
 }
 
 void debug_dma_free_pages(struct device *dev, struct page *page,

---
base-commit: f5ebf241c407dbf629fcf515015e139fcea2c2f0
change-id: 20260323-dma-attrs-debug-85e282d6f3bb

Best regards,
-- 
Leon Romanovsky