From: Sasha Levin
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Jacob Moroni, Leon Romanovsky, Sasha Levin, krzysztof.czurylo@intel.com, tatyana.e.nikolova@intel.com, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.19-6.6] RDMA/irdma: Fix double free related to rereg_user_mr
Date: Mon, 30 Mar 2026 08:38:36 -0400
Message-ID: <20260330123842.756154-23-sashal@kernel.org>
X-Mailer: git-send-email 2.53.0
In-Reply-To: <20260330123842.756154-1-sashal@kernel.org>
References: <20260330123842.756154-1-sashal@kernel.org>
X-stable: review
X-Patchwork-Hint: Ignore
X-stable-base: Linux 6.19.10

From: Jacob Moroni

[ Upstream commit 29a3edd7004bb635d299fb9bc6f0ea4ef13ed5a2 ]

If IB_MR_REREG_TRANS is set during rereg_user_mr, the umem will be
released and a new one will be allocated in irdma_rereg_mr_trans. If
any step of irdma_rereg_mr_trans fails after the new umem is allocated,
it releases the umem, but does not set iwmr->region to NULL. The problem
is that this failure is propagated to the user, who will then call
ibv_dereg_mr (as they should). Then, the dereg_mr path will see a
non-NULL umem and attempt to call ib_umem_release again.

Fix this by setting iwmr->region to NULL after ib_umem_release.

Fixed: 5ac388db27c4 ("RDMA/irdma: Add support to re-register a memory region")
Signed-off-by: Jacob Moroni
Link: https://patch.msgid.link/20260227152743.1183388-1-jmoroni@google.com
Signed-off-by: Leon Romanovsky
Signed-off-by: Sasha Levin
---

LLM Generated explanations, may be completely bogus:

Now I have all the facts.
Here is my complete analysis:

---

## PHASE 1: COMMIT MESSAGE FORENSICS

### Step 1.1: PARSE THE SUBJECT LINE
Record: [RDMA/irdma] [Fix] [double free related to rereg_user_mr: stale `iwmr->region` pointer after error in `irdma_rereg_mr_trans`]

### Step 1.2: PARSE ALL COMMIT MESSAGE TAGS
- **Fixed:** `5ac388db27c4` ("RDMA/irdma: Add support to re-register a memory region"). Note: uses `Fixed:` instead of the standard `Fixes:` tag, but semantically identical. This identifies the commit that introduced the bug.
- **Signed-off-by:** Jacob Moroni (author)
- **Link:** https://patch.msgid.link/20260227152743.1183388-1-jmoroni@google.com
- **Signed-off-by:** Leon Romanovsky (RDMA subsystem maintainer)
- No Reported-by, Tested-by, Reviewed-by, Acked-by, or Cc: stable tags present.

Record: Author from Google with prior RDMA double-free fixes (e.g., `40126bcbefa79 RDMA/umem: Fix double dma_buf_unpin in failure path`). Accepted through the RDMA maintainer tree (Leon Romanovsky).

### Step 1.3: ANALYZE THE COMMIT BODY TEXT
The commit message precisely describes the bug mechanism:
1. When `IB_MR_REREG_TRANS` is set, `irdma_rereg_user_mr()` releases the old umem and NULLs `iwmr->region`.
2. `irdma_rereg_mr_trans()` allocates a new umem and stores it in `iwmr->region` (line 3700).
3. If a later step fails (page_size check or `irdma_reg_user_mr_type_mem()`), the `err:` path calls `ib_umem_release(region)` but does NOT set `iwmr->region = NULL`.
4. Error propagates to userspace. User correctly calls `ibv_dereg_mr`.
5. `irdma_dereg_mr()` sees non-NULL `iwmr->region` at line 3932 and calls `ib_umem_release()` again, which is a double free.

Record: Bug = double free of ib_umem. Symptom = kernel crash, memory corruption, or potential security vulnerability. Root cause = stale pointer in `iwmr->region` after error-path free.

### Step 1.4: DETECT HIDDEN BUG FIXES
Record: Not hidden: explicitly labeled "Fix double free."
This is a direct, clear memory-safety bug fix.

---

## PHASE 2: DIFF ANALYSIS, LINE BY LINE

### Step 2.1: INVENTORY THE CHANGES
Record: **1 file**: `drivers/infiniband/hw/irdma/verbs.c`, **+1 line** added. Function modified: `irdma_rereg_mr_trans()`. Scope: single-file, single-line surgical fix.

### Step 2.2: UNDERSTAND THE CODE FLOW CHANGE
Before the fix, the `err:` label at line 3721 runs:
```c
/* drivers/infiniband/hw/irdma/verbs.c, lines 3721-3723 */
err:
	ib_umem_release(region);
	return err;
```
After the fix, it becomes:
```c
err:
	ib_umem_release(region);
	iwmr->region = NULL;
	return err;
```

Record: Before = freed memory, left dangling pointer in `iwmr->region`. After = freed memory, set `iwmr->region = NULL` to prevent double-free in `irdma_dereg_mr()`.

### Step 2.3: IDENTIFY THE BUG MECHANISM
Category: **Double-free / memory safety**.

The confirmed double-free path:
1. Line 3700: `iwmr->region = region;` stores the new umem pointer
2. Lines 3706-3717: possible failure paths (`goto err`)
3. Line 3722: `ib_umem_release(region);` frees the umem
4. Line 3723: returns error (but `iwmr->region` still points to freed memory)
5. Later, in `irdma_dereg_mr()`:
```c
/* drivers/infiniband/hw/irdma/verbs.c, lines 3932-3933 */
if (iwmr->region)
	ib_umem_release(iwmr->region);
```
This calls `ib_umem_release()` on already-freed memory. Verified that `ib_umem_release()` dereferences the `umem` object, unpins pages, and calls `kfree(umem)` (confirmed in `drivers/infiniband/core/umem.c` lines 284-298).

Record: Double-free of `ib_umem` object. The second `ib_umem_release()` dereferences freed memory and kfree's it again.

### Step 2.4: ASSESS THE FIX QUALITY
The fix is obviously correct: it sets `iwmr->region = NULL` after freeing the object, which is the standard kernel pattern for preventing double-frees.
The fix **exactly matches** the existing pattern in the same file: `irdma_rereg_user_mr()` already NULLs `iwmr->region` after releasing it at lines 3775-3777:
```c
/* drivers/infiniband/hw/irdma/verbs.c, lines 3775-3778 */
if (iwmr->region) {
	ib_umem_release(iwmr->region);
	iwmr->region = NULL;
}
```

Record: Fix is obviously correct, minimal, follows existing code conventions, zero regression risk.

---

## PHASE 3: GIT HISTORY INVESTIGATION

### Step 3.1: BLAME THE CHANGED LINES
Git blame confirms every line in `irdma_rereg_mr_trans()` (lines 3696-3723) was introduced by commit `5ac388db27c4` (Sindhu Devale, 2023-10-04). The bug has existed since the function was first written.

Record: Buggy code introduced in `5ac388db27c4`, first appeared in v6.7-rc1.

### Step 3.2: FOLLOW THE FIXES: TAG
`git show 5ac388db27c4` confirms it added the entire MR re-registration support to irdma, including `irdma_rereg_mr_trans()`, `irdma_rereg_user_mr()`, and `.rereg_user_mr` in the device ops table.

- `git describe --contains 5ac388db27c4` → `v6.7-rc1~96^2~21` (first released in v6.7).
- `git merge-base --is-ancestor 5ac388db27c4 v6.6` → exit 1 (NOT in v6.6).
- `git merge-base --is-ancestor 5ac388db27c4 v6.7` → exit 0 (IS in v6.7).
- `git grep irdma_rereg_mr_trans v6.6` → no match.
- `git grep irdma_rereg_mr_trans v6.12` → match (3 references).

Record: Original buggy commit exists in v6.7+ stable trees. NOT in v6.6.y or older. Bug present since inception of the function.

### Step 3.3: CHECK FILE HISTORY FOR RELATED CHANGES
Recent file history shows active irdma maintenance but no related fix for this specific double-free. No prerequisite commits needed.

Record: Standalone fix, no dependencies.

### Step 3.4: CHECK THE AUTHOR'S OTHER COMMITS
Jacob Moroni has multiple RDMA commits including `40126bcbefa79 RDMA/umem: Fix double dma_buf_unpin in failure path`, another double-free fix in RDMA umem handling. This demonstrates relevant domain expertise.

Record: Author is an active RDMA contributor with prior double-free fixes. Patch accepted by RDMA subsystem maintainer Leon Romanovsky.

### Step 3.5: CHECK FOR DEPENDENT/PREREQUISITE COMMITS
The fix is a single-line NULL assignment in an existing error path. No dependencies.

Record: Fully standalone, clean apply expected.

---

## PHASE 4: MAILING LIST AND EXTERNAL RESEARCH

### Steps 4.1-4.4
Lore.kernel.org access was blocked by Anubis proof-of-work page. The patch.msgid.link URL was also inaccessible. Web searches did not locate a mirror of the exact patch discussion. However, the patch was accepted through the standard RDMA maintainer tree (Signed-off-by: Leon Romanovsky), indicating it passed normal review.

Record: UNVERIFIED: Could not access mailing list discussion due to anti-bot measures. No additional context about reviewer feedback or stable nominations.

---

## PHASE 5: CODE SEMANTIC ANALYSIS

### Step 5.1: IDENTIFY KEY FUNCTIONS
- `irdma_rereg_mr_trans()`: modified (error path)
- `irdma_rereg_user_mr()`: caller, wired as `.rereg_user_mr` in device ops
- `irdma_dereg_mr()`: site of the second (double) free, wired as `.dereg_mr`

### Step 5.2: TRACE CALLERS
- `irdma_rereg_user_mr()` is registered as `.rereg_user_mr` in the irdma device ops
- RDMA core's `ib_uverbs_rereg_mr()` (in `drivers/infiniband/core/uverbs_cmd.c`) calls `ib_dev->ops.rereg_user_mr()`
- This is reachable from userspace via RDMA uverbs

The double-free path:
- Userspace `ibv_dereg_mr` → RDMA core MR destroy → `ib_dereg_mr_user()` → `irdma_dereg_mr()` → `ib_umem_release(iwmr->region)` on dangling pointer

Record: Both entry points (rereg and dereg) are userspace-reachable through RDMA uverbs.

### Step 5.3: TRACE CALLEES
`ib_umem_release()` (confirmed at `drivers/infiniband/core/umem.c:284`) dereferences the `umem` object, unpins pages via `__ib_umem_release()`, decrements `pinned_vm`, calls `mmdrop()`, and finally `kfree(umem)`.
A second call on the same freed pointer is a genuine double-free with memory corruption.

### Step 5.4: CALL CHAIN REACHABILITY
Userspace → `ibv_rereg_mr` → `ib_uverbs_rereg_mr()` → `irdma_rereg_user_mr()` → `irdma_rereg_mr_trans()` (fails) → returns error to user → user calls `ibv_dereg_mr` → `irdma_dereg_mr()` → double free.

Record: Fully userspace-reachable path on systems with irdma hardware.

### Step 5.5: SIMILAR PATTERNS
The caller `irdma_rereg_user_mr()` already correctly does `ib_umem_release(iwmr->region); iwmr->region = NULL;` at lines 3775-3777, establishing the pattern. The omission in `irdma_rereg_mr_trans()` is an inconsistency with the file's own conventions.

---

## PHASE 6: CROSS-REFERENCING AND STABLE TREE ANALYSIS

### Step 6.1: DOES THE BUGGY CODE EXIST IN STABLE TREES?
- **v6.6.y**: NOT present (confirmed via `git grep`: no match for `irdma_rereg_mr_trans`)
- **v6.12.y**: Present (confirmed via `git grep`: 3 references found)
- **v6.7+**: All trees contain the buggy code

Record: Affects stable trees v6.7 and newer (including 6.12.y). NOT applicable to v6.6.y or older.

### Step 6.2: BACKPORT COMPLICATIONS
The one-line fix in an unchanged error path should apply cleanly to all trees containing the function.

Record: Expected clean apply, no conflicts.

### Step 6.3: CHECK IF RELATED FIXES ARE ALREADY IN STABLE
`git log --grep="double free" --grep="rereg_user_mr"` returned no results. The fix is not yet in any tree.

Record: No related fix already applied.

---

## PHASE 7: SUBSYSTEM AND MAINTAINER CONTEXT

### Step 7.1: SUBSYSTEM CRITICALITY
- Subsystem: RDMA / irdma driver (`drivers/infiniband/hw/irdma/`)
- Criticality: IMPORTANT for RDMA deployments. Intel iWARP hardware is used in data centers, HPC, and cloud infrastructure. While not core-mm universal, a kernel memory-safety bug on a userspace-reachable path has security implications.

### Step 7.2: SUBSYSTEM ACTIVITY
Actively maintained: 15+ recent commits show ongoing bug fixes and development.

---

## PHASE 8: IMPACT AND RISK ASSESSMENT

### Step 8.1: WHO IS AFFECTED
Users of Intel RDMA (irdma) hardware who use MR re-registration with the `IB_MR_REREG_TRANS` flag.

Record: Driver-specific (irdma users), but these are often production data center systems.

### Step 8.2: TRIGGER CONDITIONS
1. Userspace calls `ibv_rereg_mr` with `IB_MR_REREG_TRANS`
2. `irdma_rereg_mr_trans()` fails after allocating the new umem (either `ib_umem_find_best_pgsz()` returns 0 or `irdma_reg_user_mr_type_mem()` fails)
3. User then correctly calls `ibv_dereg_mr` to clean up

This is a deterministic error-path trigger, not a timing race. Any application hitting a registration failure and cleaning up properly will trigger it.

Record: Deterministic trigger on error path. Userspace-reachable.

### Step 8.3: FAILURE MODE SEVERITY
Double-free of an `ib_umem` structure. `ib_umem_release()` dereferences multiple fields and calls `kfree()`.
A second call on freed memory causes:
- Heap corruption (SLAB allocator corruption)
- Kernel crash / oops
- Potential security vulnerability (exploitable heap corruption)

Record: Severity: **CRITICAL** (double-free of kernel heap object on userspace-triggerable path)

### Step 8.4: RISK-BENEFIT RATIO
- **Benefit:** HIGH. Prevents a double-free / memory corruption on a userspace-reachable error path
- **Risk:** VERY LOW. Single line `iwmr->region = NULL;` after free, follows existing code pattern, obviously correct
- **Ratio:** Extremely favorable for backporting

---

## PHASE 9: FINAL SYNTHESIS

### Step 9.1: COMPILE THE EVIDENCE

**FOR backporting:**
- Verified double-free of `ib_umem` object on a userspace-reachable error path
- `ib_umem_release()` confirmed to dereference and `kfree()` the object
- Single-line, obviously correct fix (NULL-after-free)
- Matches existing code pattern in the same file (lines 3775-3777)
- Bug present since function was introduced in v6.7-rc1 (`5ac388db27c4`)
- Zero regression risk
- Author has prior RDMA double-free fix expertise
- Accepted by RDMA subsystem maintainer

**AGAINST backporting:**
- None.

**UNRESOLVED:**
- Could not access lore.kernel.org mailing list discussion (Anubis block)
- No independent crash report verified (no Reported-by tag)

### Step 9.2: STABLE RULES CHECKLIST
1. Obviously correct and tested? **YES**: single-line NULL assignment after free, matches existing pattern
2. Fixes a real bug? **YES**: verified double-free via code path analysis
3. Important issue? **YES**: double-free = crash/corruption/potential security vulnerability (CRITICAL)
4. Small and contained? **YES**: 1 line, 1 file, 1 function error path
5. No new features or APIs? **YES**: pure bug fix
6. Can apply to stable trees? **YES**: clean apply expected on v6.7+ trees

### Step 9.3: EXCEPTION CATEGORIES
N/A: this is a standard bug fix meeting all criteria directly.

### Step 9.4: DECISION
This is a textbook stable backport candidate: a one-line fix for a verified double-free on a userspace-reachable path, with zero regression risk, applicable to stable trees v6.7 and newer (not v6.6.y or older, where the code does not exist).

---

## Verification

- [Phase 1] Parsed all tags from commit message: `Fixed: 5ac388db27c4`, `Link:`, two `Signed-off-by:` (author + maintainer). No Reported-by/Tested-by/Cc:stable.
- [Phase 2] Read `verbs.c` lines 3690-3724: confirmed `iwmr->region = region` at line 3700, `err:` path at lines 3721-3723 releases region without NULLing `iwmr->region`.
- [Phase 2] Read `verbs.c` lines 3905-3938: confirmed `irdma_dereg_mr()` checks `if (iwmr->region)` at line 3932 and calls `ib_umem_release(iwmr->region)`; this is the second free.
- [Phase 2] Read `drivers/infiniband/core/umem.c` lines 284-298: confirmed `ib_umem_release()` dereferences umem fields and calls `kfree(umem)`.
- [Phase 2] Verified existing NULL-after-free pattern at lines 3775-3777 in `irdma_rereg_user_mr()`.
- [Phase 3] `git blame -L 3696,3723`: all lines from `5ac388db27c4` (Sindhu Devale, 2023-10-04).
- [Phase 3] `git show 5ac388db27c4`: confirmed it adds MR re-registration support including the buggy function.
- [Phase 3] `git describe --contains 5ac388db27c4` → `v6.7-rc1~96^2~21`.
- [Phase 3] `git merge-base --is-ancestor 5ac388db27c4 v6.6` → exit 1 (NOT in v6.6).
- [Phase 3] `git merge-base --is-ancestor 5ac388db27c4 v6.7` → exit 0 (IS in v6.7).
- [Phase 3] `git log --author="Jacob Moroni"`: found 10 RDMA commits including `40126bcbefa79 RDMA/umem: Fix double dma_buf_unpin in failure path`.
- [Phase 4] UNVERIFIED: Lore/patch discussion blocked by Anubis anti-bot page.
- [Phase 5] Grep for `if (iwmr->region)` in verbs.c: found at lines 3775, 3912, and 3932; all three sites depend on `iwmr->region` accurately reflecting ownership.
- [Phase 6] `git grep irdma_rereg_mr_trans v6.6` → no match (code absent from v6.6).
- [Phase 6] `git grep irdma_rereg_mr_trans v6.12` → 3 matches (code present in v6.12).
- [Phase 6] `git log --grep="double free" --grep="rereg_user_mr"` → no results (fix not yet applied anywhere).
- [Phase 8] Failure mode: double-free of `ib_umem` → heap corruption, crash, potential security exploit. Severity CRITICAL.
- UNVERIFIED: Mailing list reviewer feedback and stable nominations (lore blocked).
- UNVERIFIED: Whether an independent crash report exists beyond the author's finding.

**YES**

 drivers/infiniband/hw/irdma/verbs.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 68fb81b7bd221..18844d24973be 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -3720,6 +3720,7 @@ static int irdma_rereg_mr_trans(struct irdma_mr *iwmr, u64 start, u64 len,
 
 err:
 	ib_umem_release(region);
+	iwmr->region = NULL;
 	return err;
 }
 
-- 
2.53.0