From nobody Sun May 24 17:49:06 2026 Received: from mail-qv1-f52.google.com (mail-qv1-f52.google.com [209.85.219.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7EFB1318EE7 for ; Sun, 24 May 2026 11:55:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779623751; cv=none; b=iZReH3/eho8unNYgUaZi3f02BbsbGzP3OJWdinfRJhBRM496Ms7eYlAz5fkCYPF/VMgqsVMGuG0mSkpgIk/YuukEBccbOkq5aTzPZQgK3jTygd43DLGQhANizDbLisD0bVvIyGGoZnhkTGPUtlT+VphlsgA0iU8UQPjiU0W+H1w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779623751; c=relaxed/simple; bh=Z6otwaDFT4Vu8XH4hi9joxGSGQ0OnK7Or3PSuisf+yQ=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version; b=T/EoyiLl0GZQxc1tS0HTxavHLwp1eSATfWb/1apjoB8VIygWV9g+2igO+AhLtNPzecVJTkXrACGb3SpUH3WGQcxDHe8CbsVv2bECgSDEuGNxrekWGCWFfL7VdLVMag1mUvs6DJWUBqqP0Z4p7ff4N6lAevfaigtW/GHaB6mHEeE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=CCfxjJQ1; arc=none smtp.client-ip=209.85.219.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="CCfxjJQ1" Received: by mail-qv1-f52.google.com with SMTP id 6a1803df08f44-8b74b460d77so120988576d6.3 for ; Sun, 24 May 2026 04:55:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779623748; x=1780228548; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=FNOIhZwEAyxoSU8hZDbj9MzcmmC5fm/C4a9DIHbn8HM=; b=CCfxjJQ16UUBvBYI0OXs8MwS0VV/XrYrfd1EsrDTxgA/b8tT8nok6Gxc+r6XoaAJ/G rstoTi5jjtl+wnab8YPoqWE9A71YaQvrjDv72lXb166qGYoXTUivIjPSzt7toafCa/J2 9VGXsSrK8CHohuEsuuum6TKdS6g/xH1r7VpHq/Jh24LwTv/fuxMvuLyKnUEm6mLDzA3W /Hes9Glm3aajOAq1A/h1bRG6CoMp6Mbf/Odwj703VZ66W2zRP9JBtGBdwBtS4OdNJwC7 5RXQYja7eb8f+RrYpdqNqvDde5kb7YAqAfoKajGyzsecL2e69jnxMj3q0XFxLc5pGYIW 2KbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779623748; x=1780228548; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=FNOIhZwEAyxoSU8hZDbj9MzcmmC5fm/C4a9DIHbn8HM=; b=abZWgx2fwYf3N4ZsndIMl9kqii8TLq3Z0cTnLOLeYkUf3rOfl41nZJlSdmojkGw3q6 TeVxTYGYX0oUBTr5IsC9skN5pKhUAxEOdrovjG5mS/uDYF8QfF3WCTZ/yNGjGzh33qNB 3jkggjoSv7Xjbdm0Upt+CB05W5yhuj5EMy1O1BAaWapKgSkXE4C8Uyqwxxng83qyqUgs Ma83N5TF/UOet/7iWM+J6+V1iGjX8xga7g5o96WcQCfErVUs9El1PgZfNW3WvewBZDtE 7Qc046A1wkVgE9eoHZmU0m3CmThwqbS79+TDvrRQHUwuYc1BsEgZOVdQBD+o/gTuRw+r 2NkA== X-Forwarded-Encrypted: i=1; AFNElJ+PEi1PTjWc9fXBK6Uf+zOD0PQBBRlJDNXqb9niZHIzDii/yAN1PujcmUP5Y+d6a48cO7XH0rtL0Glgu6U=@vger.kernel.org X-Gm-Message-State: AOJu0YyQYRRGwsUvbzB8i6Cge2oAC3Q/WuOE2XM0k5E+eytARalWt3eU gpqX40sGr9BMGo4cQEMhPWXPc1OA0nm3CiUCmiIo7KZa81lt90yPumTR X-Gm-Gg: Acq92OFb1PVHyXJbLlRagDmIz8V6fDp35dn/2JQ7SFZNVsiPbfiHlBSJpfyB4SEBb7W HnXNuqjONxNhDP/fp4KpmUk8WZIlLLT7qnUlCLXcq8FhuTAFnpn/6/wsMUiyuJMLiX8iA/ElrDl IoftDLv5lYYjXxZ3DQiENrMcRXGM4mrj0KJQNRvcrIpL8w1d5/pp8Z2hLcxvzYVoDwLCdyPuc7d NYD46UXYQY4u8udPN3xqy4NEzIpUEu4D6A/+omf/Nf9/ZevhkUJeCVDQU9DmqlgPwpUPSEpTk8n PysrN3zc7XtlAHKHsPjiyHuCCjFydCGSemI+UJ+XVWRmmcTq/WG3jxYvtZGbV5t1h29C9xpea8u 6WOY90IRM+t+NZldCsdHSBAhmUMrb7LeOZD0dJu3iuXI5vqXNv7vIqkepY1+8RSN7BWWl0TvMZL l4OXifcT4r4MkBFN1zaVjiYQ034pDwIMoaQG/h4m5gEwGbsSbmoEkYSFEnWOB651U6PERT6Ysrd Ha3rRJL+RjUjzC5sW7xrNs36HTt2h8= X-Received: by 2002:a05:6214:570a:b0:89e:f35f:79f7 with SMTP id 6a1803df08f44-8cc7b606790mr171511936d6.51.1779623748030; Sun, 24 May 2026 04:55:48 -0700 (PDT) Received: from server0 (c-68-48-65-54.hsd1.mi.comcast.net. [68.48.65.54]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-8cc80decd31sm81137386d6.13.2026.05.24.04.55.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 24 May 2026 04:55:47 -0700 (PDT) From: Michael Bommarito To: Trond Myklebust , Anna Schumaker , Chuck Lever , Jeff Layton Cc: NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: [PATCH v3] lockd: pin next file across nlm_inspect_file lock-drop Date: Sun, 24 May 2026 07:55:27 -0400 Message-ID: <20260524115527.1734251-1-michael.bommarito@gmail.com> X-Mailer: git-send-email 2.53.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" nlm_traverse_files() pins the current file with f_count++ across a mutex_unlock for nlm_inspect_file(), but nothing pins the saved next pointer. A concurrent nlm_release_file() can kfree the next file during the unlock window, and the iterator dereferences freed memory on the next loop step. Pin both current and next before the lock-drop. Advance by swapping the pinned cursors at the end of each iteration so next is always held alive across the unlock. Always call nlm_file_release() after dropping the iteration pin, regardless of whether the file matched the predicate. Use nlm_file_inuse(), which does a live walk of the inode lock list, rather than the cached f_locks field, so skipped files that never ran nlm_inspect_file() are evaluated correctly. Cc: stable@vger.kernel.org Fixes: 01df9c5e918a ("LOCKD: Fix a deadlock in nlm_traverse_files()") Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito --- fs/lockd/svcsubs.c | 65 +++++++++++++++++++++++++++------------------- 1 file changed, 38 insertions(+), 27 deletions(-) Changes since v2: - Use nlm_file_inuse() (live inode lock-list walk) for the cleanup decision instead of the cached f_locks field. v2 skipped cleanup entirely for non-matching files, leaking dead entries; the v1 fix used cached f_locks which is stale for skipped files. nlm_file_inuse() is correct on both paths because it walks the actual POSIX lock list rather than relying on nlm_inspect_file() having refreshed the cache. (Chuck Lever.) Changes since v1: - Fixed premature kfree of non-matching files. Reproduced under UML + KASAN with 768 concurrent POSIX holders and parallel /proc/fs/nfsd/unlock_filesystem writes. Stock kernel trips KASAN slab-use-after-free in nlm_traverse_files. Patched v3 silent. diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c index dd0214dcb6950..0ce1e3711f003 100644 --- a/fs/lockd/svcsubs.c +++ b/fs/lockd/svcsubs.c @@ -284,47 +284,58 @@ nlm_file_inuse(struct nlm_file *file) return 0; } =20 -static void nlm_close_files(struct nlm_file *file) -{ - if (file->f_file[O_RDONLY]) - nlmsvc_ops->fclose(file->f_file[O_RDONLY]); - if (file->f_file[O_WRONLY]) - nlmsvc_ops->fclose(file->f_file[O_WRONLY]); -} - /* * Loop over all files in the file table. */ +static void nlm_file_release(struct nlm_file *file) +{ + if (!nlm_file_inuse(file)) + nlm_delete_file(file); +} + static int nlm_traverse_files(void *data, nlm_host_match_fn_t match, int (*is_failover_file)(void *data, struct nlm_file *file)) { - struct hlist_node *next; - struct nlm_file *file; + struct nlm_file *file, *next; int i, ret =3D 0; =20 mutex_lock(&nlm_file_mutex); for (i =3D 0; i < FILE_NRHASH; i++) { - hlist_for_each_entry_safe(file, next, &nlm_files[i], f_list) { - if (is_failover_file && !is_failover_file(data, file)) - continue; + file =3D hlist_entry_safe(nlm_files[i].first, + struct nlm_file, f_list); + if (file) file->f_count++; - mutex_unlock(&nlm_file_mutex); - - /* Traverse locks, blocks and shares of this file - * and update file->f_locks count */ - if (nlm_inspect_file(data, file, match)) - ret =3D 1; + while (file) { + /* + * Pin the next neighbour before we drop the mutex + * for nlm_inspect_file(); a concurrent + * nlm_release_file() under the same mutex would + * otherwise be free to unlink and kfree it during + * the unlock window, leaving us to dereference a + * freed slab when we walked to next afterwards. + */ + next =3D hlist_entry_safe(file->f_list.next, + struct nlm_file, f_list); + if (next) + next->f_count++; + + if (!is_failover_file || is_failover_file(data, file)) { + mutex_unlock(&nlm_file_mutex); + + /* + * Traverse locks, blocks and shares of this + * file and update file->f_locks count. + */ + if (nlm_inspect_file(data, file, match)) + ret =3D 1; + + mutex_lock(&nlm_file_mutex); + } =20 - mutex_lock(&nlm_file_mutex); file->f_count--; - /* No more references to this file. Let go of it. */ - if (list_empty(&file->f_blocks) && !file->f_locks - && !file->f_shares && !file->f_count) { - hlist_del(&file->f_list); - nlm_close_files(file); - kfree(file); - } + nlm_file_release(file); + file =3D next; } } mutex_unlock(&nlm_file_mutex); --=20 2.53.0