From nobody Mon Jun 8 09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E4983AF65C; Sat, 30 May 2026 13:19:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147175; cv=none; b=dTke8L2ckqAlvm1aC/onbGx0wt1GOvfGJRINruRwtABXM47INlkyc5Xpvr9zwPgPMRVAEYpuOmMGU4OcVCct/aFgM4YookpB7JGjlw7kK4E3kySkjPOI6HQL9xCR3SpFyfxJ1UuzWWsOAQPDSpD8unZC+TTfV4R47b2L+9hSmLs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147175; c=relaxed/simple; bh=K4pQil6EEQxcVQV+dvV6As3uZ3pP2sKiXvsjMGYS06A=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=u+y5HisYZRp4XTe4iwhTtkERp25C3nHsgSUAj2yU/1YBo1vuAzumZTCZIs0amJlp8Xmo2LIYS2ybgQP1x1QfEu3A4t1QAxsCQOyjMlt1jTbQLpQzua9ustCPU57fhe2h1UDecY+cmNcICFkmElfSoUNjj6HUz7YnORraL0Vy964= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=N6f4QzEy; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="N6f4QzEy" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C51791F00898; Sat, 30 May 2026 13:19:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147174; bh=nrFvrNgc+rOLYbN4gcsqJjA+wO93cTh/W5ieXGGabak=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=N6f4QzEy2Jb6tcQoiokpmfBsckRsWOoG6vTXNxO+0rd+aQqTkyHVn/sG5e3ajnfC+ mPVeSQSpoAuk0LQyP3ByQbw6Lt796+bbPZCFRFcqrIMUQh82X98a2/qI/dFx/0uYel bMElCn+soGXQqeIwhCCZmLPBHIxx5J8yystyXYP9MAckf4CjaClP3MxnZDCmJp9W7A 
s5qV8I1lpQQvtCgbrH6kSMmcmLhW0cbq7ywbKX5kpEYbcBNWtwFf8KnfZsMjmR4krg Ym9NddA4tGzzdldj0dQbRG2vbqWmYhLsc57c2bzkjeeYWDQoT5zpc96bkrcW/V38MH JUiC4AM2yRmPg== From: Jeff Layton Date: Sat, 30 May 2026 09:19:17 -0400 Subject: [PATCH v2 1/9] nfsd: fix BUG_ON in nfsd4_alloc_layout_stateid on racing delegation revoke Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-1-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=1947; i=jlayton@kernel.org; h=from:subject:message-id; bh=K4pQil6EEQxcVQV+dvV6As3uZ3pP2sKiXvsjMGYS06A=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPhXjjZvJzbMc8zZVo+KElrvOzq57vb4S4T3 u7w26UXMIGJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4QAKCRAADmhBGVaC FQGvD/9GHGEVVerX+v92RQzFq7rL6zQOJwlKulpxEFbSOlpZ+CJyqBaB0m4rwMhfmY8G3+QWXgI Z5Mkm1SRfrxtM0wOh0YhBX32cKpWLp7xD0lLDhFzgZj3b7EosXzLF4RWPtkM9hP8l7Htd/dMRh9 M2MdiqeIm9ckLJ9LnCY1ZrpHVrAKnaQuVfJFryUkGBUyMxoCqa0Bw/qk62Gti1iSd/vbgRAkVdI EtdxOTkU5NQNfQFmezNw62ZdDziIB57caeg/9QYulz4gOoATVMij110K6BWsJI8CEWbS4fezT+g thmgHVKnCMc+0fvA3bLfK+hKmzqq+DRfBkGSQyw/Pw/vY0NcRaXKAkSBb3lzeCET1cK6+6HGmtX jMWduqWVfMNmxnyb++EUNZokqPFLaQb/jg7zPGSxIAzu4jxamdv6Z0nClntYVTOTNOBRHud6duB LsFKYnrjJ2LYwynq4QDBP20s6erkDUcIFpo16jhSQnbBsBbO+jqkutxRClIp1br6sGuxNUsEjBa IFmEl4z6GA01VnrIodXIWatxkW7FsynfHHqC1X33H3C5Qo2eLMVClZkzszxpy2a/TMhsNOpDgkZ 88KSghoUpgny4xOpYLSyKL5+6f0+3k/BbjX1nR2bWxbEgLWJWK16rNPrT6uFSn7IuFvOiEQU9gZ CWNh/1EpDHw7WuA== 
X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 nfsd4_alloc_layout_stateid reads fp->fi_deleg_file without holding fi_lock when the parent stateid is a delegation. A concurrent delegation revoke via the laundromat can clear fi_deleg_file under fi_lock, causing nfsd_file_get() to return NULL and triggering the BUG_ON. This race is client-reachable: two NFS clients can trigger it by having one hold a delegation while another opens the same file to force a recall. When the first client doesn't respond to the recall, the laundromat revokes it. A concurrent LAYOUTGET from any client using the delegation stateid hits the race window. Fix this by taking the rcu_read_lock() around the fi_deleg_file read in the SC_TYPE_DELEG path, and replacing the BUG_ON with a graceful error return that cleans up the partially-initialized layout stateid. Fixes: c5c707f96fc9 ("nfsd: implement pNFS layout recalls") Assisted-by: kres:claude-opus-4-7 Reported-by: Chris Mason Signed-off-by: Jeff Layton --- fs/nfsd/nfs4layouts.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c index 9ed2e3d65062..6c4e4fdd6c05 100644 --- a/fs/nfsd/nfs4layouts.c +++ b/fs/nfsd/nfs4layouts.c @@ -247,11 +247,17 @@ nfsd4_alloc_layout_stateid(struct nfsd4_compound_stat= e *cstate, nfsd4_init_cb(&ls->ls_recall, clp, &nfsd4_cb_layout_ops, NFSPROC4_CLNT_CB_LAYOUT); =20 - if (parent->sc_type =3D=3D SC_TYPE_DELEG) - ls->ls_file =3D nfsd_file_get(rcu_dereference_protected(fp->fi_deleg_fil= e, 1)); - else + if (parent->sc_type =3D=3D SC_TYPE_DELEG) { + rcu_read_lock(); + ls->ls_file =3D nfsd_file_get(rcu_dereference(fp->fi_deleg_file)); + rcu_read_unlock(); + } else { ls->ls_file =3D find_any_file(fp); - BUG_ON(!ls->ls_file); + } + if (!ls->ls_file) { + nfs4_put_stid(stp); + return NULL; + } =20 ls->ls_fenced =3D false; ls->ls_fence_delay =3D 0; --=20 2.54.0 From nobody Mon Jun 8 09:49:04 2026 
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1204E3C345C; Sat, 30 May 2026 13:19:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147177; cv=none; b=BEp/0xyqlKydbwugCFKL29JkbM+dkcSrZ28VO/Qp/CK8QcMcM8+SA6Cr5sWh1oHiJoXRn0fl+WmTD/sH3l83rcwJluhPnyo8JXClHT7tKz+4+EM8Kwb+1goXyB7IaPH5i+/qm7U09Ppsp9Y5hfLavB6u6M08GfnEiAOvwnrrQG0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147177; c=relaxed/simple; bh=ok1w8ZTu0tkWmH62kK8ep31jnlBHpnPUqz9WmL5buLU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=F82vJUGncleOqLDMTLDpQ/45U6Ha/ENQfBRM2S/gbjGQ1uu4OiDpqqIFfcTrDhnsHQI2uWhnFgM0Dw4oSymQ0t9km3V3L/wg+/+vhO02GwY/ZbJ3uY1UI6xW3omlxmwrJLkMQ/2qvfMjLpu5uu7PZoGKuqt5uuUStBT8YhHHsrc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=fHUnqbMJ; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="fHUnqbMJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 57D281F00893; Sat, 30 May 2026 13:19:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147175; bh=Y/VxR1pWFM9uFscNT3GTwp//s18dYkhn16+UtFw+ppo=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=fHUnqbMJ4uLwWxQ1QgE4oaPTMnofbhcBnI/RXpHxMNTy1ALStveWir7kabpgP7YSX gTij/z9CcSx1gSoHpCgvF77tZBYwu2/DtznZYJlbksBVYqZ+1zITlcJ/yMS6N6R0kU ui8loKO20lTAhNz8D7osDJeJQm8SD72ZTKMx6n60mvXHPDSwPh5pgswMXcD/2np3JC G9rb5NahOerz3c3Wc3bT4OJOD+9YLCNfsO9v+iFFwlIRuwiaiWfdNAe1wkflz0RLTG 
kvjQtCtTPwn3OWgOQ5jfhOihqPHGVtAeDuJTrqwyAIO5S9A5lbKMMzat8hAGaSoznv FUcu7HZeeG4og== From: Jeff Layton Date: Sat, 30 May 2026 09:19:18 -0400 Subject: [PATCH v2 2/9] nfsd: RCU-protect cl_cb_session to fix use-after-free on session teardown Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-2-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=14826; i=jlayton@kernel.org; h=from:subject:message-id; bh=ok1w8ZTu0tkWmH62kK8ep31jnlBHpnPUqz9WmL5buLU=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPh54Zi7bh3dN1/g0uwKVDMKDxuEKd+Pb96X pDZvw/rPXGJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4QAKCRAADmhBGVaC FRsxEADLGo9wg3Txo8Zs0UlfTK0uaZTn/UWClyoAdK2ETPNRwnoB8dt1XXuXlVBaRFg0SuRfxKP PowVj/QM+yQl/kuvhYGMfsMq57nK3b7qH+lhLYZohWZZkmfyMQOR2Aa5ND3QPFferrofNIbppzM q0AX3v5rqsM76Dj9kpU+U9NdYBThdJvA8P4LuYl6JQaQ73ViQSHsRVFodPDmV9ea3ymbAJ6aRxu cOb881BRYKyF9l+e5BgHd0Zdvc1Y/k96m4ttU8IAvr3xjqsyDOgmE8izLzSp+c9n0f4POamIl3P CBMzdCZeNm7kzsd34LcNIKg8dTLWRBEOEeip05EaRZ+ZbmSmks9/c/CUBksmubklKv4ZHfnnnvF xzIb57kF6KBZjBVuuWwGGI9hE/yTDe4ihTzU40U9mUuYKWEFZSvI39MNnNKJD8oruNPYEJZxaYR mab8awJ7EIl22dul0+oIYia7z7fmMqZAP/LKBPkgaZ0irU/xlGjDJrguAnnR6SJds7y/kYiRmYp tQlMnVqVHarDgHju+7U2WJYBjFu5yUdsDrTxsjKTdMpge2IuiG++LoxigGBCjqPwVDGVhUgL9p3 nHrVP+fyIx/AVmyjJ66x1uplio01uuLwr2mxOkqxPXltGM3Xlw4bllnXtJ6U0fJohf/QGunRLum eacz4pjkMxELENw== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; 
fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 After a DESTROY_SESSION the per-session teardown path can free a session while rpciod still holds an inflight callback rpc_task that dereferences clp->cl_cb_session. nfsd4_probe_callback_sync() flushes cl_callback_wq, but once nfsd4_run_cb_work() has called rpc_call_async() the rpc_task lives on rpciod; flushing the workqueue does not wait for it. rpc_shutdown_client() does drain rpciod tasks, but uses a 1-second wait_event_timeout =E2=80=94 tasks stuck in rpc_delay() (e.g. 2-second NFS4ERR_DELAY retries) can outlive the drain. destroy path rpciod ------------ ------ unhash_session(ses) nfsd4_probe_callback_sync(clp) flush_workqueue(cl_callback_wq) /* returns; rpc_task still live */ nfsd4_put_session_locked(ses) free_session(ses) -> kfree(ses) nfsd4_cb_sequence_done() reads cb_clp->cl_cb_session /* freed slab */ A second window exists in nfsd4_process_cb_update(). When __nfsd4_find_backchannel() returns NULL because unhash_session() has already removed the destroyed session from cl_sessions, setup_callback_client() takes the v4.1 early return so clp->cl_cb_session =3D ses never fires and the field retains a pointer to the about-to-be-freed session. Fix both by converting cl_cb_session to an RCU-protected pointer: - Move the cl_cb_session =3D ses assignment in setup_callback_client() to after rpc_create() succeeds, so it is only published when a working backchannel exists. Clear cl_cb_session on the error return in nfsd4_process_cb_update(). Both stores use rcu_assign_pointer(). - Annotate cl_cb_session with __rcu. All rpciod-side readers use rcu_read_lock()/rcu_dereference() and check for NULL, bailing to the appropriate error or requeue path: encode_cb_sequence4args(), decode_cb_sequence4resok(), nfsd41_cb_get_slot(), nfsd41_cb_release_slot(), nfsd4_cb_prepare(), and nfsd4_cb_sequence_done(). 
- Switch __free_session() from kfree() to kfree_rcu() so the session slab is not reclaimed until after an RCU grace period, guaranteeing that rpciod readers inside rcu_read_lock() never dereference freed memory. - Pass the session pointer to the nfsd_cb_seq_status and nfsd_cb_free_slot tracepoints instead of having them re-read cl_cb_session. - nfsd4_cb_prepare() calls rpc_exit() when the session is NULL, routing through the done/release path to requeue the callback. Fixes: dcbeaa68dbbd ("nfsd4: allow backchannel recovery") Reported-by: Chris Mason Signed-off-by: Chris Mason Signed-off-by: Jeff Layton --- fs/nfsd/nfs4callback.c | 109 ++++++++++++++++++++++++++++++++++++++++-----= ---- fs/nfsd/nfs4state.c | 4 +- fs/nfsd/state.h | 3 +- fs/nfsd/trace.h | 14 +++---- 4 files changed, 100 insertions(+), 30 deletions(-) diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c index 1964a213f80e..2eb2278dddd1 100644 --- a/fs/nfsd/nfs4callback.c +++ b/fs/nfsd/nfs4callback.c @@ -456,13 +456,20 @@ static void encode_cb_sequence4args(struct xdr_stream= *xdr, const struct nfsd4_callback *cb, struct nfs4_cb_compound_hdr *hdr) { - struct nfsd4_session *session =3D cb->cb_clp->cl_cb_session; + struct nfsd4_session *session; struct nfsd4_referring_call_list *rcl; __be32 *p; =20 if (hdr->minorversion =3D=3D 0) return; =20 + rcu_read_lock(); + session =3D rcu_dereference(cb->cb_clp->cl_cb_session); + if (!session) { + rcu_read_unlock(); + return; + } + encode_nfs_cb_opnum4(xdr, OP_CB_SEQUENCE); encode_sessionid4(xdr, session); =20 @@ -478,6 +485,7 @@ static void encode_cb_sequence4args(struct xdr_stream *= xdr, encode_referring_call_list4(xdr, rcl); =20 hdr->nops++; + rcu_read_unlock(); } =20 static void update_cb_slot_table(struct nfsd4_session *ses, u32 target) @@ -529,21 +537,32 @@ static void update_cb_slot_table(struct nfsd4_session= *ses, u32 target) static int decode_cb_sequence4resok(struct xdr_stream *xdr, struct nfsd4_callback *cb) { - struct nfsd4_session *session =3D 
cb->cb_clp->cl_cb_session; + struct nfsd4_session *session; int status =3D -ESERVERFAULT; __be32 *p; u32 seqid, slotid, target; =20 + rcu_read_lock(); + session =3D rcu_dereference(cb->cb_clp->cl_cb_session); + if (!session) { + rcu_read_unlock(); + cb->cb_seq_status =3D -NFS4ERR_BADSESSION; + return -NFS4ERR_BADSESSION; + } + /* * If the server returns different values for sessionID, slotID or * sequence number, the server is looney tunes. */ p =3D xdr_inline_decode(xdr, NFS4_MAX_SESSIONID_LEN + 4 + 4 + 4 + 4); - if (unlikely(p =3D=3D NULL)) + if (unlikely(p =3D=3D NULL)) { + rcu_read_unlock(); goto out_overflow; + } =20 if (memcmp(p, session->se_sessionid.data, NFS4_MAX_SESSIONID_LEN)) { dprintk("NFS: %s Invalid session id\n", __func__); + rcu_read_unlock(); goto out; } p +=3D XDR_QUADLEN(NFS4_MAX_SESSIONID_LEN); @@ -551,12 +570,14 @@ static int decode_cb_sequence4resok(struct xdr_stream= *xdr, seqid =3D be32_to_cpup(p++); if (seqid !=3D session->se_cb_seq_nr[cb->cb_held_slot]) { dprintk("NFS: %s Invalid sequence number\n", __func__); + rcu_read_unlock(); goto out; } =20 slotid =3D be32_to_cpup(p++); if (slotid !=3D cb->cb_held_slot) { dprintk("NFS: %s Invalid slotid\n", __func__); + rcu_read_unlock(); goto out; } =20 @@ -564,6 +585,7 @@ static int decode_cb_sequence4resok(struct xdr_stream *= xdr, =20 target =3D be32_to_cpup(p++); update_cb_slot_table(session, target); + rcu_read_unlock(); status =3D 0; out: cb->cb_seq_status =3D status; @@ -1205,9 +1227,8 @@ static int setup_callback_client(struct nfs4_client *= clp, struct nfs4_cb_conn *c } else { if (!conn->cb_xprt || !ses) return -EINVAL; - clp->cl_cb_session =3D ses; args.bc_xprt =3D conn->cb_xprt; - args.prognumber =3D clp->cl_cb_session->se_cb_prog; + args.prognumber =3D ses->se_cb_prog; args.protocol =3D conn->cb_xprt->xpt_class->xcl_ident | XPRT_TRANSPORT_BC; args.authflavor =3D ses->se_cb_sec.flavor; @@ -1225,8 +1246,10 @@ static int setup_callback_client(struct nfs4_client = *clp, struct nfs4_cb_conn 
*c return -ENOMEM; } =20 - if (clp->cl_minorversion !=3D 0) + if (clp->cl_minorversion !=3D 0) { clp->cl_cb_conn.cb_xprt =3D conn->cb_xprt; + rcu_assign_pointer(clp->cl_cb_session, ses); + } clp->cl_cb_client =3D client; clp->cl_cb_cred =3D cred; rcu_read_lock(); @@ -1333,18 +1356,33 @@ static int grab_slot(struct nfsd4_session *ses) static bool nfsd41_cb_get_slot(struct nfsd4_callback *cb, struct rpc_task = *task) { struct nfs4_client *clp =3D cb->cb_clp; - struct nfsd4_session *ses =3D clp->cl_cb_session; + struct nfsd4_session *ses; =20 if (cb->cb_held_slot >=3D 0) return true; + + rcu_read_lock(); + ses =3D rcu_dereference(clp->cl_cb_session); + if (!ses) { + rcu_read_unlock(); + rpc_sleep_on(&clp->cl_cb_waitq, task, NULL); + return false; + } cb->cb_held_slot =3D grab_slot(ses); if (cb->cb_held_slot < 0) { + rcu_read_unlock(); rpc_sleep_on(&clp->cl_cb_waitq, task, NULL); /* Race breaker */ - cb->cb_held_slot =3D grab_slot(ses); + rcu_read_lock(); + ses =3D rcu_dereference(clp->cl_cb_session); + if (ses) + cb->cb_held_slot =3D grab_slot(ses); + rcu_read_unlock(); if (cb->cb_held_slot < 0) return false; rpc_wake_up_queued_task(&clp->cl_cb_waitq, task); + } else { + rcu_read_unlock(); } return true; } @@ -1352,12 +1390,17 @@ static bool nfsd41_cb_get_slot(struct nfsd4_callbac= k *cb, struct rpc_task *task) static void nfsd41_cb_release_slot(struct nfsd4_callback *cb) { struct nfs4_client *clp =3D cb->cb_clp; - struct nfsd4_session *ses =3D clp->cl_cb_session; + struct nfsd4_session *ses; =20 if (cb->cb_held_slot >=3D 0) { - spin_lock(&ses->se_lock); - ses->se_cb_slot_avail |=3D BIT(cb->cb_held_slot); - spin_unlock(&ses->se_lock); + rcu_read_lock(); + ses =3D rcu_dereference(clp->cl_cb_session); + if (ses) { + spin_lock(&ses->se_lock); + ses->se_cb_slot_avail |=3D BIT(cb->cb_held_slot); + spin_unlock(&ses->se_lock); + } + rcu_read_unlock(); cb->cb_held_slot =3D -1; rpc_wake_up_next(&clp->cl_cb_waitq); } @@ -1489,22 +1532,35 @@ static void nfsd4_cb_prepare(struct 
rpc_task *task,= void *calldata) trace_nfsd_cb_rpc_prepare(clp); cb->cb_seq_status =3D 1; cb->cb_status =3D 0; - if (minorversion && !nfsd41_cb_get_slot(cb, task)) - return; + if (minorversion) { + if (!rcu_access_pointer(clp->cl_cb_session)) { + rpc_exit(task, -EIO); + return; + } + if (!nfsd41_cb_get_slot(cb, task)) + return; + } rpc_call_start(task); } =20 /* Returns true if CB_COMPOUND processing should continue */ static bool nfsd4_cb_sequence_done(struct rpc_task *task, struct nfsd4_cal= lback *cb) { - struct nfsd4_session *session =3D cb->cb_clp->cl_cb_session; + struct nfsd4_session *session; bool ret =3D false; =20 if (cb->cb_held_slot < 0) goto requeue; =20 + rcu_read_lock(); + session =3D rcu_dereference(cb->cb_clp->cl_cb_session); + if (!session) { + rcu_read_unlock(); + goto requeue; + } + /* This is the operation status code for CB_SEQUENCE */ - trace_nfsd_cb_seq_status(task, cb); + trace_nfsd_cb_seq_status(task, cb, session); switch (cb->cb_seq_status) { case 0: /* @@ -1536,12 +1592,16 @@ static bool nfsd4_cb_sequence_done(struct rpc_task = *task, struct nfsd4_callback fallthrough; case -NFS4ERR_BADSESSION: nfsd4_mark_cb_fault(cb->cb_clp); + rcu_read_unlock(); goto requeue; case -NFS4ERR_DELAY: cb->cb_seq_status =3D 1; - if (RPC_SIGNALLED(task) || !rpc_restart_call(task)) + if (RPC_SIGNALLED(task) || !rpc_restart_call(task)) { + rcu_read_unlock(); goto requeue; + } rpc_delay(task, 2 * HZ); + rcu_read_unlock(); return false; case -NFS4ERR_SEQ_MISORDERED: case -NFS4ERR_BADSLOT: @@ -1553,11 +1613,13 @@ static bool nfsd4_cb_sequence_done(struct rpc_task = *task, struct nfsd4_callback */ nfsd4_mark_cb_fault(cb->cb_clp); cb->cb_held_slot =3D -1; + rcu_read_unlock(); goto retry_nowait; default: nfsd4_mark_cb_fault(cb->cb_clp); } - trace_nfsd_cb_free_slot(task, cb); + trace_nfsd_cb_free_slot(task, cb, session); + rcu_read_unlock(); nfsd41_cb_release_slot(cb); return ret; retry_nowait: @@ -1679,7 +1741,15 @@ static struct nfsd4_conn * 
__nfsd4_find_backchannel(= struct nfs4_client *clp) * Note there isn't a lot of locking in this code; instead we depend on * the fact that it is run from clp->cl_callback_wq, which won't run two * work items at once. So, for example, clp->cl_callback_wq handles all - * access of cl_cb_client and all calls to rpc_create or rpc_shutdown_clie= nt. + * access of cl_cb_client, and all calls to rpc_create or + * rpc_shutdown_client. + * + * cl_cb_session is written only from cl_callback_wq (via + * rcu_assign_pointer) and read from rpciod under rcu_read_lock (via + * rcu_dereference) by encode_cb_sequence4args(), decode_cb_sequence4resok= (), + * nfsd4_cb_sequence_done(), and the cb-slot helpers. Sessions are freed + * with kfree_rcu() so that rpciod readers in an RCU read-side critical + * section never dereference a freed session. */ static void nfsd4_process_cb_update(struct nfsd4_callback *cb) { @@ -1731,6 +1801,7 @@ static void nfsd4_process_cb_update(struct nfsd4_call= back *cb) nfsd4_mark_cb_down(clp); if (c) svc_xprt_put(c->cn_xprt); + rcu_assign_pointer(clp->cl_cb_session, NULL); return; } } diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index f4d12dbcf97b..9503859918ac 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2397,7 +2397,7 @@ static void __free_session(struct nfsd4_session *ses) { free_session_slots(ses, 0); xa_destroy(&ses->se_slots); - kfree(ses); + kfree_rcu(ses, rcu_head); } =20 static void free_session(struct nfsd4_session *ses) @@ -3689,7 +3689,7 @@ static struct nfs4_client *create_client(struct xdr_n= etobj name, clp->cl_time =3D ktime_get_boottime_seconds(); copy_verf(clp, verf); memcpy(&clp->cl_addr, sa, sizeof(struct sockaddr_storage)); - clp->cl_cb_session =3D NULL; + RCU_INIT_POINTER(clp->cl_cb_session, NULL); clp->net =3D net; clp->cl_nfsd_dentry =3D nfsd_client_mkdir( nn, &clp->cl_nfsdfs, diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h index 62a5fe3f6cc0..c26b2384d694 100644 --- a/fs/nfsd/state.h +++ 
b/fs/nfsd/state.h @@ -440,6 +440,7 @@ struct nfsd4_session { u16 se_slot_gen; bool se_dead; u32 se_target_maxslots; + struct rcu_head rcu_head; }; =20 /* formatted contents of nfs4_sessionid */ @@ -552,7 +553,7 @@ struct nfs4_client { #define NFSD4_CB_FAULT 3 int cl_cb_state; struct nfsd4_callback cl_cb_null; - struct nfsd4_session *cl_cb_session; + struct nfsd4_session __rcu *cl_cb_session; =20 /* for all client information that callback code might need: */ spinlock_t cl_lock; diff --git a/fs/nfsd/trace.h b/fs/nfsd/trace.h index d01496aa3cf8..9917c0440522 100644 --- a/fs/nfsd/trace.h +++ b/fs/nfsd/trace.h @@ -1751,9 +1751,10 @@ DEFINE_NFSD_CB_LIFETIME_EVENT(bc_shutdown); TRACE_EVENT(nfsd_cb_seq_status, TP_PROTO( const struct rpc_task *task, - const struct nfsd4_callback *cb + const struct nfsd4_callback *cb, + const struct nfsd4_session *session ), - TP_ARGS(task, cb), + TP_ARGS(task, cb, session), TP_STRUCT__entry( __field(unsigned int, task_id) __field(unsigned int, client_id) @@ -1765,8 +1766,6 @@ TRACE_EVENT(nfsd_cb_seq_status, __field(int, seq_status) ), TP_fast_assign( - const struct nfs4_client *clp =3D cb->cb_clp; - const struct nfsd4_session *session =3D clp->cl_cb_session; const struct nfsd4_sessionid *sid =3D (struct nfsd4_sessionid *)&session->se_sessionid; =20 @@ -1792,9 +1791,10 @@ TRACE_EVENT(nfsd_cb_seq_status, TRACE_EVENT(nfsd_cb_free_slot, TP_PROTO( const struct rpc_task *task, - const struct nfsd4_callback *cb + const struct nfsd4_callback *cb, + const struct nfsd4_session *session ), - TP_ARGS(task, cb), + TP_ARGS(task, cb, session), TP_STRUCT__entry( __field(unsigned int, task_id) __field(unsigned int, client_id) @@ -1805,8 +1805,6 @@ TRACE_EVENT(nfsd_cb_free_slot, __field(u32, slot_seqno) ), TP_fast_assign( - const struct nfs4_client *clp =3D cb->cb_clp; - const struct nfsd4_session *session =3D clp->cl_cb_session; const struct nfsd4_sessionid *sid =3D (struct nfsd4_sessionid *)&session->se_sessionid; =20 --=20 2.54.0 From nobody Mon Jun 8 
09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 93EDC3C3C06; Sat, 30 May 2026 13:19:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147179; cv=none; b=IhFXRNplfBAWRYItoAMUontiHP0wdkUM/NX2OZ5uPG8ykXjQc8N96sxD+m3YgarcY6lCr/fyaKZCbs53vVyig7Zbm5FKnhedDspDSRafttkWJXOS1aMsVDQCOSb2b7LIc3PP86uafp3zZ7CI6FcgbHAC9uJnjb6CuPg2qnIcN6w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147179; c=relaxed/simple; bh=6Jg2QEBnKvdkwDkYlbnJqOX6yDaZBREEszUiDD31pas=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=eI7sMFQ9MFZxb3uYfj1ix2Cfaj4YWPAbaCg7BRqoXd6sejAkb0ZXzRLAM2dxHrJ0wswSKG4klcnZLPMJQK4+8wzBM01IXtzXy24rVfPlWfiWyd3RScTey0pdKWrQJw87AZhGJzzVMSkl9ttDcV97u16h/quBXlJOOpSPJF+QxCk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=JvdOlHSl; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="JvdOlHSl" Received: by smtp.kernel.org (Postfix) with ESMTPSA id F30CB1F00899; Sat, 30 May 2026 13:19:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147177; bh=72YHRXi5QzkT6R4V3TMOaZ3tdbbBBf4DkzC4TacddQw=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=JvdOlHSlyOuWWH28trFIfhcR3vdIk5pbDE8yv/tXJzKE7iq9/cAggrgo5FGQxNRcg FUjtBnLXil1DPKCe7RaBgfgSJ2uck1px/XqZsu/vgxqvxSKyIweut1Td3omo0jOhSs /im0KaiA7SfTrlbC50y9p9/wO9SwoSKMyILJsnngSFVvukpnOa7+lQhT5MEyzFrxzA J9qSxCN6zg8l15DNPCe5UHA+VsmAd9pZyyzyauVwRw6RUAwNLjv+xSe2Cuf83D3IwB 
EC0d1R/Q70/QDMsEM+HrNsZOrsluiV6tu9YIdGVT5mQ6+J6jb6k/SK9wIoI+xQMNtP udorDqLQipikQ== From: Jeff Layton Date: Sat, 30 May 2026 09:19:19 -0400 Subject: [PATCH v2 3/9] nfsd: convert nfsd_net boolean flags to unsigned long flags word Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-3-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=13720; i=jlayton@kernel.org; h=from:subject:message-id; bh=t4y30JyLGCagVZA4QvlQFVEBQz3ko6mGCP6Zwaxeylw=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPhTH2mSTmBiNnMowSmrsKSiUnsgLs5a4RuI I5vMpzwgr2JAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4QAKCRAADmhBGVaC FXdxEACM73TyC4mErVvhhEgj7mZ3qBT+p2aLkZPqfw3+U8vILCYnS5yvTQav5at3cUQNc2/EgbN RmHCvAD6+5QmXmlnM0a4AiUcICrQYrxln23cmnfKH+2h3mriwGlUfiUk+umeNbSQh3XnBdsMmd4 5pxozLWlSE4gGgrA5WUlIk4T1dhtABUcM5xMebPtuoQ5zP0YvTnAkV0x/WRyLTEcB8Sf45e31X7 ZsVLqW78LLiQrPHyyhCoYS9+gyuIUylz4SPopURWzo6tvyePl4POm0By59BM8RIe9ZQycHRXlIw PFerRWUGFKLqBdAyZasdGS0PMGKKexyoVW4MvWwbdmcTZz5CNv7oDjLEXIWy8WE6vrw9Fji7V3W oa6SHuNXnVguMixVrHiEUa8Avb1+6o8MMjx02hYYT5PpwWb3BjGyuD57pk3i8tDyfhGyeVLQzNv 78zxn+YSL2i7b/gag10DgLsi5zXy0uCm++wYvMTiW+XZXA9vaaSaXwPG01K/A+QRzDk8uchGcM9 IlhxDr8JTeZeYLLfjftwyywerrUMKoT4t9nxt4AJhy2GlbbL1K8IQw40k+i9/wMQmKQmY7ojgD6 L6L/RyAR42J1d9ULLY3YaqGE1WOmFd4r2U/xMZhNt4CCoh8agLhwXJUWf4MLLuMJdbgh8H235Gw np3k9pDEeAnn9OQ== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; 
fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 From: Chris Mason nfsd_net contains several boolean fields that are accessed from concurrent contexts without serialization. In particular, nfsd4_end_grace() guards its drain path with a plain bool: if (nn->grace_ended) return; nn->grace_ended =3D true; The read and the write are independent, and nothing in struct nfsd_net serializes them. At least two contexts can reach this code with no lock held: laundromat path laundry_wq kworker nfs4_laundromat() nfsd4_end_grace() RECLAIM_COMPLETE path nfsd compound kthread nfsd4_reclaim_complete() inc_reclaim_complete() nfsd4_end_grace() Both callers can observe grace_ended =3D=3D false on different CPUs, both store true, and both proceed into nfsd4_record_grace_done(), which invokes the active client_tracking_ops->grace_done callback. For tracking ops that drain reclaim_str_hashtbl (legacy_tracking_ops via nfsd4_recdir_purge_old, and the cld v1+ ops via nfsd4_cld_grace_done), grace_done calls nfs4_release_reclaim(), which walks every bucket of reclaim_str_hashtbl with no lock and calls nfs4_remove_reclaim_record() (list_del + kfree) on each entry. Two concurrent walkers corrupt the list and double-free every nfs4_client_reclaim. A concurrent nfsd4_find_reclaim_client() iterating the same bucket reads through freed memory. A third call site exists in nfs4_state_start_net() on the skip_grace startup path, but it runs under nfsd_mutex before any client has connected and before the laundromat's first delayed work fires, so it cannot race with the two callers above. Replace the scattered boolean fields in nfsd_net with a single unsigned long flags word and an enum nfsd_net_flag for the bit positions. The grace_ended race is fixed by using test_and_set_bit(), which is atomic on all architectures. The remaining flags (grace_end_forced, in_grace, somebody_reclaimed, track_reclaim_completes, nfsd_net_up, lockd_up) are converted to use test_bit/set_bit/clear_bit for consistency. 
This avoids sub-word cmpxchg issues on architectures like Hexagon that only support word-sized atomic operations. Fixes: 362063a595be ("nfsd: keep a tally of RECLAIM_COMPLETE operations whe= n using nfsdcld") Assisted-by: kres:claude-opus-4-7 Reported-by: Chris Mason Signed-off-by: Chris Mason --- fs/nfsd/netns.h | 19 +++++++++++-------- fs/nfsd/nfs4proc.c | 2 +- fs/nfsd/nfs4recover.c | 12 ++++++------ fs/nfsd/nfs4state.c | 40 ++++++++++++++++++++++++---------------- fs/nfsd/nfsctl.c | 2 +- fs/nfsd/nfssvc.c | 22 +++++++++++----------- 6 files changed, 54 insertions(+), 43 deletions(-) diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index 27da1a3edacb..37dfecb9d49d 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -28,6 +28,16 @@ struct cld_net; struct nfsd_net_cb; struct nfsd4_client_tracking_ops; =20 +enum nfsd_net_flag { + NFSD_NET_GRACE_ENDED, + NFSD_NET_GRACE_END_FORCED, + NFSD_NET_IN_GRACE, + NFSD_NET_SOMEBODY_RECLAIMED, + NFSD_NET_TRACK_RECLAIM_COMPLETES, + NFSD_NET_UP, + NFSD_NET_LOCKD_UP, +}; + enum { /* cache misses due only to checksum comparison failures */ NFSD_STATS_PAYLOAD_MISSES, @@ -66,8 +76,7 @@ struct nfsd_net { struct cache_detail *nametoid_cache; =20 struct lock_manager nfsd4_manager; - bool grace_ended; - bool grace_end_forced; + unsigned long flags; time64_t boot_time; =20 struct dentry *nfsd_client_dir; @@ -117,19 +126,13 @@ struct nfsd_net { spinlock_t blocked_locks_lock; =20 struct file *rec_file; - bool in_grace; const struct nfsd4_client_tracking_ops *client_tracking_ops; =20 time64_t nfsd4_lease; time64_t nfsd4_grace; - bool somebody_reclaimed; =20 - bool track_reclaim_completes; atomic_t nr_reclaim_complete; =20 - bool nfsd_net_up; - bool lockd_up; - seqlock_t writeverf_lock; unsigned char writeverf[8]; =20 diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c index 5f2b9bfc3a84..9473aeb53f72 100644 --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -667,7 +667,7 @@ nfsd4_open(struct svc_rqst *rqstp, struct nfsd4_compoun= d_state 
*cstate, pr_warn("nfsd4_process_open2 failed to open newly-created file: status= =3D%u\n", be32_to_cpu(status)); if (reclaim && !status) - nn->somebody_reclaimed =3D true; + set_bit(NFSD_NET_SOMEBODY_RECLAIMED, &nn->flags); out: if (open->op_filp) { fput(open->op_filp); diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c index 6ea25a52d2f4..c841da585142 100644 --- a/fs/nfsd/nfs4recover.c +++ b/fs/nfsd/nfs4recover.c @@ -167,7 +167,7 @@ nfsd4_create_clid_dir(struct nfs4_client *clp) end_creating(dentry); out: if (status =3D=3D 0) { - if (nn->in_grace) + if (test_bit(NFSD_NET_IN_GRACE, &nn->flags)) __nfsd4_create_reclaim_record_grace(clp, dname, nn); vfs_fsync(nn->rec_file, 0); } else { @@ -317,7 +317,7 @@ nfsd4_remove_clid_dir(struct nfs4_client *clp) nfs4_reset_creds(original_cred); if (status =3D=3D 0) { vfs_fsync(nn->rec_file, 0); - if (nn->in_grace) + if (test_bit(NFSD_NET_IN_GRACE, &nn->flags)) __nfsd4_remove_reclaim_record_grace(dname, HEXDIR_LEN, nn); } @@ -373,7 +373,7 @@ nfsd4_recdir_purge_old(struct nfsd_net *nn) { int status; =20 - nn->in_grace =3D false; + clear_bit(NFSD_NET_IN_GRACE, &nn->flags); if (!nn->rec_file) return; status =3D mnt_want_write_file(nn->rec_file); @@ -455,7 +455,7 @@ nfsd4_init_recdir(struct net *net) =20 nfs4_reset_creds(original_cred); if (!status) - nn->in_grace =3D true; + set_bit(NFSD_NET_IN_GRACE, &nn->flags); return status; } =20 @@ -1362,7 +1362,7 @@ nfs4_cld_state_init(struct net *net) for (i =3D 0; i < CLIENT_HASH_SIZE; i++) INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]); nn->reclaim_str_hashtbl_size =3D 0; - nn->track_reclaim_completes =3D true; + set_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags); atomic_set(&nn->nr_reclaim_complete, 0); =20 return 0; @@ -1373,7 +1373,7 @@ nfs4_cld_state_shutdown(struct net *net) { struct nfsd_net *nn =3D net_generic(net, nfsd_net_id); =20 - nn->track_reclaim_completes =3D false; + clear_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags); kfree(nn->reclaim_str_hashtbl); } =20 diff 
--git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 9503859918ac..bc5216bb08ff 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2777,7 +2777,7 @@ static void inc_reclaim_complete(struct nfs4_client *= clp) { struct nfsd_net *nn =3D net_generic(clp->net, nfsd_net_id); =20 - if (!nn->track_reclaim_completes) + if (!test_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags)) return; if (!nfsd4_find_reclaim_client(clp->cl_name, nn)) return; @@ -5309,8 +5309,6 @@ nfsd4_init_leases_net(struct nfsd_net *nn) =20 nn->nfsd4_lease =3D 90; /* default lease time */ nn->nfsd4_grace =3D 90; - nn->somebody_reclaimed =3D false; - nn->track_reclaim_completes =3D false; nn->clverifier_counter =3D get_random_u32(); nn->clientid_base =3D get_random_u32(); nn->clientid_counter =3D nn->clientid_base + 1; @@ -7022,12 +7020,21 @@ nfsd4_renew(struct svc_rqst *rqstp, struct nfsd4_co= mpound_state *cstate, static void nfsd4_end_grace(struct nfsd_net *nn) { - /* do nothing if grace period already ended */ - if (nn->grace_ended) + /* + * nfsd4_end_grace() can be entered concurrently from the + * laundromat workqueue and from an nfsd compound thread + * handling RECLAIM_COMPLETE. Without serialization, both + * callers can observe grace_ended=3D=3Dfalse and proceed into + * nfsd4_record_grace_done(). For tracking ops whose + * grace_done drains reclaim_str_hashtbl, that results in + * list corruption and a double free of every + * nfs4_client_reclaim entry. Use an atomic test-and-set so + * exactly one caller proceeds. 
+ */ + if (test_and_set_bit(NFSD_NET_GRACE_ENDED, &nn->flags)) return; =20 trace_nfsd_grace_complete(nn); - nn->grace_ended =3D true; /* * If the server goes down again right now, an NFSv4 * client will still be allowed to reclaim after it comes back up, @@ -7068,10 +7075,10 @@ bool nfsd4_force_end_grace(struct nfsd_net *nn) { if (!nn->client_tracking_ops) return false; - if (READ_ONCE(nn->grace_ended)) + if (test_bit(NFSD_NET_GRACE_ENDED, &nn->flags)) return false; /* laundromat_work must be initialised now, though it might be disabled */ - WRITE_ONCE(nn->grace_end_forced, true); + set_bit(NFSD_NET_GRACE_END_FORCED, &nn->flags); /* mod_delayed_work() doesn't queue work after * nfs4_state_shutdown_net() has called disable_delayed_work_sync() */ @@ -7088,15 +7095,15 @@ static bool clients_still_reclaiming(struct nfsd_ne= t *nn) time64_t double_grace_period_end =3D nn->boot_time + 2 * nn->nfsd4_lease; =20 - if (READ_ONCE(nn->grace_end_forced)) + if (test_bit(NFSD_NET_GRACE_END_FORCED, &nn->flags)) return false; - if (nn->track_reclaim_completes && + if (test_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags) && atomic_read(&nn->nr_reclaim_complete) =3D=3D nn->reclaim_str_hashtbl_size) return false; - if (!nn->somebody_reclaimed) + if (!test_bit(NFSD_NET_SOMEBODY_RECLAIMED, &nn->flags)) return false; - nn->somebody_reclaimed =3D false; + clear_bit(NFSD_NET_SOMEBODY_RECLAIMED, &nn->flags); /* * If we've given them *two* lease times to reclaim, and they're * still not done, give up: @@ -8887,7 +8894,7 @@ nfsd4_lock(struct svc_rqst *rqstp, struct nfsd4_compo= und_state *cstate, nfs4_inc_and_copy_stateid(&lock->lk_resp_stateid, &lock_stp->st_stid); status =3D 0; if (lock->lk_reclaim) - nn->somebody_reclaimed =3D true; + set_bit(NFSD_NET_SOMEBODY_RECLAIMED, &nn->flags); break; case FILE_LOCK_DEFERRED: kref_put(&nbl->nbl_kref, free_nbl); @@ -9413,8 +9420,8 @@ static int nfs4_state_create_net(struct net *net) nn->conf_name_tree =3D RB_ROOT; nn->unconf_name_tree =3D RB_ROOT; 
nn->boot_time =3D ktime_get_real_seconds(); - nn->grace_ended =3D false; - nn->grace_end_forced =3D false; + clear_bit(NFSD_NET_GRACE_ENDED, &nn->flags); + clear_bit(NFSD_NET_GRACE_END_FORCED, &nn->flags); nn->nfsd4_manager.block_opens =3D true; INIT_LIST_HEAD(&nn->nfsd4_manager.list); INIT_LIST_HEAD(&nn->client_lru); @@ -9500,7 +9507,8 @@ nfs4_state_start_net(struct net *net) nfsd4_client_tracking_init(net); /* safe for laundromat to run now */ enable_delayed_work(&nn->laundromat_work); - if (nn->track_reclaim_completes && nn->reclaim_str_hashtbl_size =3D=3D 0) + if (test_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags) && + nn->reclaim_str_hashtbl_size =3D=3D 0) goto skip_grace; printk(KERN_INFO "NFSD: starting %lld-second grace period (net %x)\n", nn->nfsd4_grace, net->ns.inum); diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c index 468aad8c3af9..92f65ca6f667 100644 --- a/fs/nfsd/nfsctl.c +++ b/fs/nfsd/nfsctl.c @@ -1111,7 +1111,7 @@ static ssize_t write_v4_end_grace(struct file *file, = char *buf, size_t size) } =20 return scnprintf(buf, SIMPLE_TRANSACTION_LIMIT, "%c\n", - nn->grace_ended ? 'Y' : 'N'); + test_bit(NFSD_NET_GRACE_ENDED, &nn->flags) ? 
'Y' : 'N'); } =20 #endif diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c index be0add971c2d..551d3cf51036 100644 --- a/fs/nfsd/nfssvc.c +++ b/fs/nfsd/nfssvc.c @@ -351,7 +351,7 @@ static int nfsd_startup_net(struct net *net, const stru= ct cred *cred) struct nfsd_net *nn =3D net_generic(net, nfsd_net_id); int ret; =20 - if (nn->nfsd_net_up) + if (test_bit(NFSD_NET_UP, &nn->flags)) return 0; =20 ret =3D nfsd_startup_generic(); @@ -364,11 +364,11 @@ static int nfsd_startup_net(struct net *net, const st= ruct cred *cred) goto out_socks; } =20 - if (nfsd_needs_lockd(nn) && !nn->lockd_up) { + if (nfsd_needs_lockd(nn) && !test_bit(NFSD_NET_LOCKD_UP, &nn->flags)) { ret =3D lockd_up(net, cred); if (ret) goto out_socks; - nn->lockd_up =3D true; + set_bit(NFSD_NET_LOCKD_UP, &nn->flags); } =20 ret =3D nfsd_file_cache_start_net(net); @@ -386,7 +386,7 @@ static int nfsd_startup_net(struct net *net, const stru= ct cred *cred) if (ret) goto out_reply_cache; =20 - nn->nfsd_net_up =3D true; + set_bit(NFSD_NET_UP, &nn->flags); return 0; =20 out_reply_cache: @@ -394,9 +394,9 @@ static int nfsd_startup_net(struct net *net, const stru= ct cred *cred) out_filecache: nfsd_file_cache_shutdown_net(net); out_lockd: - if (nn->lockd_up) { + if (test_bit(NFSD_NET_LOCKD_UP, &nn->flags)) { lockd_down(net); - nn->lockd_up =3D false; + clear_bit(NFSD_NET_LOCKD_UP, &nn->flags); } out_socks: nfsd_shutdown_generic(); @@ -407,7 +407,7 @@ static void nfsd_shutdown_net(struct net *net) { struct nfsd_net *nn =3D net_generic(net, nfsd_net_id); =20 - if (nn->nfsd_net_up) { + if (test_bit(NFSD_NET_UP, &nn->flags)) { percpu_ref_kill_and_confirm(&nn->nfsd_net_ref, nfsd_net_done); wait_for_completion(&nn->nfsd_net_confirm_done); =20 @@ -415,18 +415,18 @@ static void nfsd_shutdown_net(struct net *net) nfs4_state_shutdown_net(net); nfsd_reply_cache_shutdown(nn); nfsd_file_cache_shutdown_net(net); - if (nn->lockd_up) { + if (test_bit(NFSD_NET_LOCKD_UP, &nn->flags)) { lockd_down(net); - nn->lockd_up =3D false; + 
clear_bit(NFSD_NET_LOCKD_UP, &nn->flags); } wait_for_completion(&nn->nfsd_net_free_done); } =20 percpu_ref_exit(&nn->nfsd_net_ref); =20 - if (nn->nfsd_net_up) + if (test_bit(NFSD_NET_UP, &nn->flags)) nfsd_shutdown_generic(); - nn->nfsd_net_up =3D false; + clear_bit(NFSD_NET_UP, &nn->flags); } =20 static DEFINE_SPINLOCK(nfsd_notifier_lock); --=20 2.54.0 From nobody Mon Jun 8 09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0FC213C3BFD; Sat, 30 May 2026 13:19:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147180; cv=none; b=h7XNWCmjspCRvBvTf7sib378iqFGFo1XHQxXtxld7qnbqzcFUkaOQBqMGRKh5XH3QLcXKcibYvGgpBp1EWIgp81Mr6B6aPO/YPAuu1T83KYyPxaDS4+BwPMfx7vuKwOl235bl13vH8lng6/l6QZOUQQaKI2252HZ0lk1xuT2a2o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147180; c=relaxed/simple; bh=uVs0ovhPONdj+jaVLIvw14XxurhtpfdRNc67sMGov74=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=tnKYFuTBHp0wH8EmClHIk/fEJbEpEdeMyFZu4JY3FZrl11mtGLIZFvvPdXDLceqkCzn7eyBeL/sWuawecHvNcHq93OGPrdjsRzd6W6V3WSmADofhNCRivXGfvowASyGA7lZHnhwAxPDYmkgYMLuLmu8yEa4ZJCf5UqzQnkXYCmI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=AlzXj3QL; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="AlzXj3QL" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8452C1F0089C; Sat, 30 May 2026 13:19:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147178; 
bh=M+5wGtzS4e6NAu+j5ObGYbaTEVrdLRmUONcdzarv2nM=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=AlzXj3QL9rbptVcXXFy8Y3TNU77mxWUAB7fZRNM9vO7EIjbpS/NOyZXiqtv3r9tIv CsaUOgqnP+8Ko6stCfAXkimVVBNtRUq4rMLQNQftoZCxHVcfRKvaIjLWn1GROraF3N DTUXwOtw4i7HQ4xEgCSBqnMikc4MXr96zuHGaiGTWPKDf8vBz5dUv1ZQzR9/WYo/Gf kvismieUAzesoCDkmUoDzQyL94WWjw+J+u6fqy/JgeEx6yAOceXPAhWfa/QIKR9GkH h7aUAiL1zZqRNIf0IG7WLCzG1K+oPb6/MEztNYyp6maJqSXv5YGzlCNkyg9VaUyZbs uTjqR3omAEUjg== From: Jeff Layton Date: Sat, 30 May 2026 09:19:20 -0400 Subject: [PATCH v2 4/9] nfsd: dedup nfs4_client_to_reclaim inserts Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-4-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. 
Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=11934; i=jlayton@kernel.org; h=from:subject:message-id; bh=uVs0ovhPONdj+jaVLIvw14XxurhtpfdRNc67sMGov74=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPhse7vXSgjRtY6+ITL6d1u6P9Qd73CuJLCG nBN7y/dPBOJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4QAKCRAADmhBGVaC FdH+EACRwFkbMiUiT21GrtzhHImpg9Gi2T62dJeHdNfYdJrJTNudwENiR8n163YvkqrJRf9deXv 8tMZUpi6NnoeeKW800EDj3wnHOjtzcGvQkuM986IdRVJn2Pi05DjSi05otK/qk7TuYjV4zmqNM7 iOsqMPW8D6DOTJxOMLumcsTskdXnBf331gZV+n53nYh5Kv0XCAWo8IQj7wCavSzpOQZb5fluf0v DGKexfFVTo1Jg0buOUnla5xPgh+ude2TaMaiQsCCfkAhe+Vx6R7MgN8Ak1lbV3ztXC7ktvi69Cz iawO8D5Nz6Jm98hpuJ0dXEpsBHsl/Q+OTehd3V/4JMRCC4Wdh74D9Ae+bAjEzxf5UNS5AxE84rq VntnpfsnwDIaGEF1DHngkGnl2AeOe6ndcGIXtwKt0RVfSZMhxBUQwoBoz/fL6PgMLXTdS5yRJq6 Ndm+c7koVnwZcCABvx1GI1VaPksXNYAP9t+rUMFE0cn3lpAWFxFSGy1uQjE7F2uCEo/VDmPlDup vexh1uQzmnmO+xTp1reN3t1YHB9FykY8FCV4tl7FL/KQgzIcIFJO0f00roE2EUSmgHnrKMOWegj uN2QzRVWz+vw/q64pQHZlkRZ0DYp1wRLpkJ8zsks6sSx7safurZbx92voQ1N5YTbSMMBhGY7LLy MebibrF4yWSovdQ== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 nfs4_client_to_reclaim() unconditionally allocates a new nfs4_client_reclaim, prepends it to reclaim_str_hashtbl[], and bumps reclaim_str_hashtbl_size with no check for an existing entry for the same client name. 
After a reboot with a populated recovery directory that inflates the counter by one for every client that reclaims: boot: load_recdir() nfs4_client_to_reclaim(name) /* entry #1, size++ */ grace: RECLAIM_COMPLETE __nfsd4_create_reclaim_record_grace() nfs4_client_to_reclaim(name) /* entry #2, size++ */ inc_reclaim_complete() ends the grace period early only when atomic_inc_return(&nn->nr_reclaim_complete) =3D=3D nn->reclaim_str_hashtbl_size With reclaim_str_hashtbl_size at 2N and nr_reclaim_complete capped at N, the equality never holds and the fast end-of-grace path is dead. The grace period always runs out the full 90-second laundromat timer, and the shadow entry left in the hash table carries a dangling cr_clp for any reader that walks it. Fix nfs4_client_to_reclaim() to look the name up with nfsd4_find_reclaim_client() first and, on a hit, fold the new princhash into the existing record (if it lacks one) and return that record without allocating or touching reclaim_str_hashtbl_size. On kmemdup() failure during the fold-in, return NULL so __cld_pipe_inprogress_downcall() surfaces -EFAULT to nfsdcld, matching the miss-path contract. Add an rw_semaphore (reclaim_str_hashtbl_lock) to struct nfsd_net that serialises all access to reclaim_str_hashtbl[] and reclaim_str_hashtbl_size. Writers (nfs4_client_to_reclaim, nfs4_remove_reclaim_record callers) hold the write side; readers (nfsd4_cld_check*, inc_reclaim_complete, clients_still_reclaiming, nfs4_has_reclaimed_state, nfsd4_check_legacy_client) hold the read side. All call sites are in sleepable context, and none is a hot path, so the rwsem cost is negligible. 
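
The lookup-before-insert rule described above can be sketched in plain userspace C. This is only an illustrative model, not the nfsd code: struct reclaim_rec, struct reclaim_tbl, tbl_find() and tbl_insert() are hypothetical names, strings stand in for xdr_netobj, and the rwsem is left out (in the patch the whole insert runs with the write side of reclaim_str_hashtbl_lock held).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TBL_SIZE 16	/* hypothetical bucket count for this sketch */

/* Hypothetical stand-in for struct nfs4_client_reclaim. */
struct reclaim_rec {
	struct reclaim_rec *next;
	char *name;
	char *princhash;	/* NULL when no principal hash is recorded */
};

/* Hypothetical stand-in for the table and its size counter. */
struct reclaim_tbl {
	struct reclaim_rec *bucket[TBL_SIZE];
	int size;		/* must equal the number of distinct names */
};

static unsigned int hash_name(const char *name)
{
	unsigned int h = 0;

	while (*name)
		h = h * 31 + (unsigned char)*name++;
	return h % TBL_SIZE;
}

static char *dup_str(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

static struct reclaim_rec *tbl_find(struct reclaim_tbl *t, const char *name)
{
	struct reclaim_rec *r;

	for (r = t->bucket[hash_name(name)]; r; r = r->next)
		if (strcmp(r->name, name) == 0)
			return r;
	return NULL;
}

/* Insert with dedup: a hit never allocates a record or bumps size. */
static struct reclaim_rec *tbl_insert(struct reclaim_tbl *t, const char *name,
				      const char *princhash)
{
	struct reclaim_rec *r = tbl_find(t, name);
	unsigned int h;

	if (r) {
		/* Fold the new hash in only if the record lacks one. */
		if (princhash && !r->princhash) {
			r->princhash = dup_str(princhash);
			if (!r->princhash)
				return NULL;
		}
		return r;
	}

	r = calloc(1, sizeof(*r));
	if (!r)
		return NULL;
	r->name = dup_str(name);
	if (!r->name) {
		free(r);
		return NULL;
	}
	if (princhash) {
		r->princhash = dup_str(princhash);
		if (!r->princhash) {
			free(r->name);
			free(r);
			return NULL;
		}
	}
	h = hash_name(name);
	r->next = t->bucket[h];
	t->bucket[h] = r;
	t->size++;
	assert(t->size > 0);
	return r;
}
```

Inserting the same name twice in this model returns the same record and leaves size at 1, which is the property the equality test in inc_reclaim_complete() depends on.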
Reported-by: Chris Mason Fixes: 362063a595be ("nfsd: keep a tally of RECLAIM_COMPLETE operations whe= n using nfsdcld") Assisted-by: kres:claude-opus-4-7 Signed-off-by: Jeff Layton --- fs/nfsd/netns.h | 6 ++++- fs/nfsd/nfs4recover.c | 36 ++++++++++++++++++++++++------ fs/nfsd/nfs4state.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++-= ---- 3 files changed, 89 insertions(+), 14 deletions(-) diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h index 37dfecb9d49d..47bbd4fb42b0 100644 --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -93,6 +93,7 @@ struct nfsd_net { */ struct list_head *reclaim_str_hashtbl; int reclaim_str_hashtbl_size; + struct rw_semaphore reclaim_str_hashtbl_lock; struct list_head *conf_id_hashtbl; struct rb_root conf_name_tree; struct list_head *unconf_id_hashtbl; @@ -105,7 +106,10 @@ struct nfsd_net { * close_lru holds (open) stateowner queue ordered by nfs4_stateowner.so_= time * for last close replay. * - * All of the above fields are protected by the client_mutex. + * reclaim_str_hashtbl[], reclaim_str_hashtbl_size are protected by + * reclaim_str_hashtbl_lock. + * + * All of the remaining fields are protected by the client_mutex. 
*/ struct list_head client_lru; struct list_head close_lru; diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c index c841da585142..d513971fb119 100644 --- a/fs/nfsd/nfs4recover.c +++ b/fs/nfsd/nfs4recover.c @@ -285,10 +285,12 @@ __nfsd4_remove_reclaim_record_grace(const char *dname= , int len, return; } name.len =3D len; + down_write(&nn->reclaim_str_hashtbl_lock); crp =3D nfsd4_find_reclaim_client(name, nn); - kfree(name.data); if (crp) nfs4_remove_reclaim_record(crp, nn); + up_write(&nn->reclaim_str_hashtbl_lock); + kfree(name.data); } =20 static void @@ -484,6 +486,7 @@ nfs4_legacy_state_init(struct net *net) for (i =3D 0; i < CLIENT_HASH_SIZE; i++) INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]); nn->reclaim_str_hashtbl_size =3D 0; + init_rwsem(&nn->reclaim_str_hashtbl_lock); =20 return 0; } @@ -598,13 +601,16 @@ nfsd4_check_legacy_client(struct nfs4_client *clp) goto out_enoent; } name.len =3D HEXDIR_LEN; + down_read(&nn->reclaim_str_hashtbl_lock); crp =3D nfsd4_find_reclaim_client(name, nn); - kfree(name.data); if (crp) { set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags); crp->cr_clp =3D clp; - return 0; } + up_read(&nn->reclaim_str_hashtbl_lock); + kfree(name.data); + if (crp) + return 0; =20 out_enoent: return -ENOENT; @@ -1176,6 +1182,7 @@ nfsd4_cld_check(struct nfs4_client *clp) return 0; =20 /* look for it in the reclaim hashtable otherwise */ + down_read(&nn->reclaim_str_hashtbl_lock); crp =3D nfsd4_find_reclaim_client(clp->cl_name, nn); if (crp) goto found; @@ -1191,6 +1198,7 @@ nfsd4_cld_check(struct nfs4_client *clp) if (!name.data) { dprintk("%s: failed to allocate memory for name.data!\n", __func__); + up_read(&nn->reclaim_str_hashtbl_lock); return -ENOENT; } name.len =3D HEXDIR_LEN; @@ -1201,9 +1209,11 @@ nfsd4_cld_check(struct nfs4_client *clp) =20 } #endif + up_read(&nn->reclaim_str_hashtbl_lock); return -ENOENT; found: crp->cr_clp =3D clp; + up_read(&nn->reclaim_str_hashtbl_lock); return 0; } =20 @@ -1215,6 +1225,7 @@ nfsd4_cld_check_v2(struct 
nfs4_client *clp) struct cld_net *cn =3D nn->cld_net; #endif struct nfs4_client_reclaim *crp; + unsigned int princhashlen; char *principal =3D NULL; =20 /* did we already find that this client is stable? */ @@ -1222,6 +1233,7 @@ nfsd4_cld_check_v2(struct nfs4_client *clp) return 0; =20 /* look for it in the reclaim hashtable otherwise */ + down_read(&nn->reclaim_str_hashtbl_lock); crp =3D nfsd4_find_reclaim_client(clp->cl_name, nn); if (crp) goto found; @@ -1237,6 +1249,7 @@ nfsd4_cld_check_v2(struct nfs4_client *clp) if (!name.data) { dprintk("%s: failed to allocate memory for name.data\n", __func__); + up_read(&nn->reclaim_str_hashtbl_lock); return -ENOENT; } name.len =3D HEXDIR_LEN; @@ -1247,23 +1260,31 @@ nfsd4_cld_check_v2(struct nfs4_client *clp) =20 } #endif + up_read(&nn->reclaim_str_hashtbl_lock); return -ENOENT; found: - if (crp->cr_princhash.len) { + princhashlen =3D crp->cr_princhash.len; + if (princhashlen) { u8 digest[SHA256_DIGEST_SIZE]; + u8 *pdata; =20 if (clp->cl_cred.cr_raw_principal) principal =3D clp->cl_cred.cr_raw_principal; else if (clp->cl_cred.cr_principal) principal =3D clp->cl_cred.cr_principal; - if (principal =3D=3D NULL) + if (principal =3D=3D NULL) { + up_read(&nn->reclaim_str_hashtbl_lock); return -ENOENT; + } sha256(principal, strlen(principal), digest); - if (memcmp(crp->cr_princhash.data, digest, - crp->cr_princhash.len)) + pdata =3D crp->cr_princhash.data; + if (memcmp(pdata, digest, princhashlen)) { + up_read(&nn->reclaim_str_hashtbl_lock); return -ENOENT; + } } crp->cr_clp =3D clp; + up_read(&nn->reclaim_str_hashtbl_lock); return 0; } =20 @@ -1362,6 +1383,7 @@ nfs4_cld_state_init(struct net *net) for (i =3D 0; i < CLIENT_HASH_SIZE; i++) INIT_LIST_HEAD(&nn->reclaim_str_hashtbl[i]); nn->reclaim_str_hashtbl_size =3D 0; + init_rwsem(&nn->reclaim_str_hashtbl_lock); set_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags); atomic_set(&nn->nr_reclaim_complete, 0); =20 diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 
bc5216bb08ff..5bbc1d2b964a 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2779,14 +2779,21 @@ static void inc_reclaim_complete(struct nfs4_client= *clp) =20 if (!test_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags)) return; - if (!nfsd4_find_reclaim_client(clp->cl_name, nn)) + + down_read(&nn->reclaim_str_hashtbl_lock); + if (!nfsd4_find_reclaim_client(clp->cl_name, nn)) { + up_read(&nn->reclaim_str_hashtbl_lock); return; + } if (atomic_inc_return(&nn->nr_reclaim_complete) =3D=3D nn->reclaim_str_hashtbl_size) { + up_read(&nn->reclaim_str_hashtbl_lock); printk(KERN_INFO "NFSD: all clients done reclaiming, ending NFSv4 grace = period (net %x)\n", clp->net->ns.inum); nfsd4_end_grace(nn); + return; } + up_read(&nn->reclaim_str_hashtbl_lock); } =20 static void expire_client(struct nfs4_client *clp) @@ -7097,10 +7104,15 @@ static bool clients_still_reclaiming(struct nfsd_ne= t *nn) =20 if (test_bit(NFSD_NET_GRACE_END_FORCED, &nn->flags)) return false; - if (test_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags) && - atomic_read(&nn->nr_reclaim_complete) =3D=3D - nn->reclaim_str_hashtbl_size) - return false; + if (test_bit(NFSD_NET_TRACK_RECLAIM_COMPLETES, &nn->flags)) { + int size; + + down_read(&nn->reclaim_str_hashtbl_lock); + size =3D nn->reclaim_str_hashtbl_size; + up_read(&nn->reclaim_str_hashtbl_lock); + if (atomic_read(&nn->nr_reclaim_complete) =3D=3D size) + return false; + } if (!test_bit(NFSD_NET_SOMEBODY_RECLAIMED, &nn->flags)) return false; clear_bit(NFSD_NET_SOMEBODY_RECLAIMED, &nn->flags); @@ -9270,9 +9282,13 @@ bool nfs4_has_reclaimed_state(struct xdr_netobj name, struct nfsd_net *nn) { struct nfs4_client_reclaim *crp; + bool found; =20 + down_read(&nn->reclaim_str_hashtbl_lock); crp =3D nfsd4_find_reclaim_client(name, nn); - return (crp && crp->cr_clp); + found =3D (crp && crp->cr_clp); + up_read(&nn->reclaim_str_hashtbl_lock); + return found; } =20 /* @@ -9285,10 +9301,39 @@ nfs4_client_to_reclaim(struct xdr_netobj name, stru= ct 
xdr_netobj princhash, unsigned int strhashval; struct nfs4_client_reclaim *crp; =20 + down_write(&nn->reclaim_str_hashtbl_lock); + + /* + * A reclaim record for this client name may already exist (for + * example, populated at boot from the recovery directory before + * an in-grace RECLAIM_COMPLETE or an nfsdcld downcall delivers + * the same name). Dedup here so reclaim_str_hashtbl_size stays + * equal to the number of distinct client names; inc_reclaim_complete + * relies on that equality to end the grace period via the fast path. + */ + crp =3D nfsd4_find_reclaim_client(name, nn); + if (crp) { + if (princhash.len && crp->cr_princhash.len =3D=3D 0) { + void *pdata =3D kmemdup(princhash.data, princhash.len, + GFP_KERNEL); + if (pdata) { + crp->cr_princhash.data =3D pdata; + crp->cr_princhash.len =3D princhash.len; + } else { + dprintk("%s: failed to allocate memory for princhash.data!\n", + __func__); + crp =3D NULL; + } + } + up_write(&nn->reclaim_str_hashtbl_lock); + return crp; + } + name.data =3D kmemdup(name.data, name.len, GFP_KERNEL); if (!name.data) { dprintk("%s: failed to allocate memory for name.data!\n", __func__); + up_write(&nn->reclaim_str_hashtbl_lock); return NULL; } if (princhash.len) { @@ -9297,6 +9342,7 @@ nfs4_client_to_reclaim(struct xdr_netobj name, struct= xdr_netobj princhash, dprintk("%s: failed to allocate memory for princhash.data!\n", __func__); kfree(name.data); + up_write(&nn->reclaim_str_hashtbl_lock); return NULL; } } else @@ -9316,6 +9362,7 @@ nfs4_client_to_reclaim(struct xdr_netobj name, struct= xdr_netobj princhash, kfree(name.data); kfree(princhash.data); } + up_write(&nn->reclaim_str_hashtbl_lock); return crp; } =20 @@ -9335,6 +9382,7 @@ nfs4_release_reclaim(struct nfsd_net *nn) struct nfs4_client_reclaim *crp =3D NULL; int i; =20 + down_write(&nn->reclaim_str_hashtbl_lock); for (i =3D 0; i < CLIENT_HASH_SIZE; i++) { while (!list_empty(&nn->reclaim_str_hashtbl[i])) { crp =3D list_entry(nn->reclaim_str_hashtbl[i].next, @@ 
-9343,6 +9391,7 @@ nfs4_release_reclaim(struct nfsd_net *nn) } } WARN_ON_ONCE(nn->reclaim_str_hashtbl_size); + up_write(&nn->reclaim_str_hashtbl_lock); } =20 /* --=20 2.54.0 From nobody Mon Jun 8 09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9DB4F3C455B; Sat, 30 May 2026 13:19:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147181; cv=none; b=FT+j1CX93QkuOqvVMXMiiTSxakLMZdPQKsBfcAXhHynGbD6LIMq4o2+Odrrx2ERapGocXdK9qOBoEA6QPnRzp5gOV6r94WNC4wlo0TTJyhDYXiUImeWgt62RH81ppJdwY39kmF4c69dI2aCSKVXYUw6boUBMub4OAYnuZ2p5p/4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147181; c=relaxed/simple; bh=sot5+tKVgtaC160u4M8TGQraPyvQTXydRcusfzEUg2I=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=TC7VTRRkKiKIZwLGps9pMigs+y8WhZorHpxMF9AIMyMEBYL18tNF1flnFr1EES1HITsTR5/PxxTPtKAc3pcS7/fykut2EFcr07BdF91d7VjwVrJT8EQdFrbmb2dv6UUVoWPJQ9YdghNGC30y3TpQ7clVRfoB86R9G56Oox56Orw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=fGrZ6HQf; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="fGrZ6HQf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 15D901F0089B; Sat, 30 May 2026 13:19:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147180; bh=sFaI/a/4xWXZcC/5jNWNbgjAN4Llr7aadrkx5DJdv58=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=fGrZ6HQfg3L+vnGq0fAjNDiZScVKSKf1jaofAzBm8XPt3hjFWTE/HcKRP/AACec+p 
anoWRoV73KlAEfaFGwiWdTPNUDBM/7FV3m3IPV0XPnyOBbQlccxXr0+isf+08BTJji x8p7zaodcfYDqg+EgXGzpPv1hgHAC093Ah+Y4nmUPluGhjilcYjBCd32SbE/cO62Ic c1I7bmqGfgLJU+Vi8OG2pT7ffjlYRDN1P+IHcv9eqpYFzufCKGLqjdgvh59YHor2Wn nf8FDj3ispwmdb8gVrhyP93FOktX+zHHycYknuHfbQtM+ELbTglVkNXrrnVksff8UE TPrKtsa0i1Emw== From: Jeff Layton Date: Sat, 30 May 2026 09:19:21 -0400 Subject: [PATCH v2 5/9] nfsd: gate nfs3 setacl by argp->mask Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-5-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=2589; i=jlayton@kernel.org; h=from:subject:message-id; bh=gO0GOBKMvg8PUpOwP9lvo2CvRA28Q1EzXJkDWsmA54g=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPhADN7oq4012TcNukxq4S4xYMcXga3Ok6b7 EifOYACBSCJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4QAKCRAADmhBGVaC FcSYD/0St2HxGCu3cNzD/54z9a3K9vshABwbubiW7ACGJoGYKkdsl6ZyKb/gcHVTRxVf9isGQ7v +DY4TCUo0WvGxV+gkJdeu9cp+6sP/0m6HaFrutNBnvOUagtPVQ9nSuhGS5qrFlyVHu0foYaP+Ny AI1iqpoSbaHKHaBhnxjmJ3WQoU9LSIPciUEb3Uc440ZPRj6bfwvlbTFFIncVc4HYGnbnBfRWJEK /ZBXFZs+rHSURJwDK81VIiUw5hj3xcuzrH7fShzGO6IdiU3LJTPC+FKT0Q3mYAAaSoPGIzAKZWm r549A/N4U2YxjCaX26AOsxRCXBd/8DEynu9EswITqgmS9r+h1bLXc1wq0MEqxizoUS/2deR2M9Q QC+1OJTeRbjH3fjbVzWQJ/o5VSczUSl6Ym9QPI43KBRiXkg3BdxZJv8nwJcta7w2lJQQYn8JyK8 oWunY13+xmSqOOAE6418lXyFh+q7srG25LyvR1iQFopmuizUSGwIpPvKmNUTGcAZIrUaeHORPoU nD2mBBLiVb9b4vgLmgHtEiaaf8UbYgldc+2FQmlKInj6u2gVd0+37HGqQG0iNlF9Cdv4kQDl0N7 
gEtkwquEkFEyePkW1HHM5Lw9Jijk8GOSM29hn5icJRh7AneUiVQ7YMuPzs8swjngk7HbgXpmi5L fqfIgaJ9cd5AiWA== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 From: Chris Mason nfsd3_proc_setacl() calls set_posix_acl() unconditionally for both ACL_TYPE_ACCESS and ACL_TYPE_DEFAULT, passing argp->acl_access and argp->acl_default verbatim. The NFSv3 ACL decoder only populates those pointers when the corresponding mask bit is set: nfs3svc_decode_setaclargs() if (args->mask & NFS_ACL) decode into acl_access if (args->mask & NFS_DFACL) decode into acl_default /* otherwise the pointer stays NULL (pc_argzero) */ nfsd3_proc_setacl() set_posix_acl(.., ACL_TYPE_ACCESS, argp->acl_access) set_posix_acl(.., ACL_TYPE_DEFAULT, argp->acl_default) set_posix_acl(idmap, dentry, type, NULL) is the VFS "remove this ACL type" operation. A NULL pointer that means "the client did not send this arm" is therefore indistinguishable from "the client asked to remove this ACL". A SETACL with mask=3DNFS_ACL silently drops the directory's default ACL; mask=3D0 drops both. The sibling nfsd3_proc_getacl() already consults argp->mask before touching each arm; mirror that in setacl. Fix by wrapping each set_posix_acl() call in the matching mask bit check and initializing error to 0 before inode_lock so that a request with neither bit set leaves the on-disk ACLs untouched and returns nfs_ok. The out_drop_lock path and the unconditional posix_acl_release() at out: are preserved; both NULL-tolerate the skipped arms. 
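
The gating described above can be modelled in a few lines of userspace C. This is a simplified sketch under stated assumptions, not the nfsd code: DEMO_ACL and DEMO_DFACL are hypothetical stand-ins for NFS_ACL and NFS_DFACL, demo_set_acl() only imitates the "NULL removes the ACL" behaviour of set_posix_acl(), and ACLs are represented as plain strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mask bits standing in for NFS_ACL and NFS_DFACL. */
#define DEMO_ACL	0x1
#define DEMO_DFACL	0x2

/* Toy inode: a NULL slot means "no ACL of this type". */
struct demo_inode {
	const char *access_acl;
	const char *default_acl;
};

/* Toy decoded arguments: a pointer is non-NULL only if its mask bit is set. */
struct demo_setacl_args {
	unsigned int mask;
	const char *acl_access;
	const char *acl_default;
};

/* Imitates set_posix_acl(): storing NULL removes the ACL. */
static int demo_set_acl(const char **slot, const char *acl)
{
	assert(slot != NULL);
	*slot = acl;
	return 0;
}

/* Each arm is applied only when the client actually sent it. */
static int demo_setacl_gated(struct demo_inode *ino,
			     const struct demo_setacl_args *argp)
{
	int error = 0;

	if (argp->mask & DEMO_ACL) {
		error = demo_set_acl(&ino->access_acl, argp->acl_access);
		if (error)
			return error;
	}
	if (argp->mask & DEMO_DFACL)
		error = demo_set_acl(&ino->default_acl, argp->acl_default);
	return error;
}
```

With mask set to DEMO_ACL only, the default ACL slot is left alone; with mask 0 neither slot changes and the call still returns 0, matching the nfs_ok behaviour the commit message describes.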
Fixes: a257cdd0e217 ("[PATCH] NFSD: Add server support for NFSv3 ACLs.") Assisted-by: kres:claude-opus-4-7 Reported-by: Chris Mason Signed-off-by: Chris Mason --- fs/nfsd/nfs3acl.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/fs/nfsd/nfs3acl.c b/fs/nfsd/nfs3acl.c index e87731380be8..a87f9d7f32be 100644 --- a/fs/nfsd/nfs3acl.c +++ b/fs/nfsd/nfs3acl.c @@ -105,12 +105,17 @@ static __be32 nfsd3_proc_setacl(struct svc_rqst *rqst= p) =20 inode_lock(inode); =20 - error =3D set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, ACL_TYPE_ACCESS, - argp->acl_access); - if (error) - goto out_drop_lock; - error =3D set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, ACL_TYPE_DEFAULT, - argp->acl_default); + error =3D 0; + if (argp->mask & NFS_ACL) { + error =3D set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, + ACL_TYPE_ACCESS, argp->acl_access); + if (error) + goto out_drop_lock; + } + if (argp->mask & NFS_DFACL) { + error =3D set_posix_acl(&nop_mnt_idmap, fh->fh_dentry, + ACL_TYPE_DEFAULT, argp->acl_default); + } =20 out_drop_lock: inode_unlock(inode); --=20 2.54.0 From nobody Mon Jun 8 09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 582EC3C5DCD; Sat, 30 May 2026 13:19:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147183; cv=none; b=kc5wZPQB6Ew5kvozMp6h1x8UhGK3ONVR/zSsbwMQl5MR+T2AsJcufkWo904RjHmTuDkRLlTE/Nwmy2xe8+s7zk9rVH7DGlQyezKdWRo9jxeiXhq1EYWJU7U0RS3FJIRUR/w7d9jgvIKw/bUE4+iYXbL/G9QdnWnwZZYhoTrIxDM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147183; c=relaxed/simple; bh=biIgfr91EnCBGwV20bIsbijQLD/T7mIzZF5vlT0w8+A=; 
h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=iqxfN2xTCITlCM/x+1V9Ta+AnCU/WWRHgGgGaHjqrTGb378FoCGEmXdq7O2dzXdozoXK0jYRm+K2V2ZGtX6PiSO+NnNPsug/LfpVWT9aV/J2Gb+8Ou9CTdnoPQEvU9J+KGr/amwKdNq7iI8Omq5jBdgf43NcU3IAAGgJEu9FqGI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=opCpewAf; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="opCpewAf" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9D25C1F00899; Sat, 30 May 2026 13:19:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147181; bh=j1SGvhP04m7LjXX1rgheDhvNoRzX0v1uPfNOcNP4jE4=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=opCpewAfPiY4+zJkcbKhWkJYD4+0o8g2T772R/ebhsQE49tukPMySYK+QX6IQ4NXY Bpp1sAkl32h8vz0b0xJjmINchyFQW6I+ohxJRF6Kmn83YZwIBG7QH5z4rwPJEy1nu+ AWXFkwBk7ntIFYWKdheabKvdS2Ugy9soQtslYBnIWMvYBkAd82eywcSQjnY+ZjsNZs 1HgQ6kmdYq1mbxtuOtiFEIT9O3FV0f4HiyK5wxlvN+UQ3VxAeJqOOq9J4C8EjhHMsf qMKQw7Gt6laYjOqkMU7Hg89xJx8PzsIbFvCW7bretSPKpSx3C9/eKdcU2kimZGvvYe IOeNQtuqKJ8Nw== From: Jeff Layton Date: Sat, 30 May 2026 09:19:22 -0400 Subject: [PATCH v2 6/9] NFSD: check truncate permission under inode lock Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-6-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. 
Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=3086; i=jlayton@kernel.org; h=from:subject:message-id; bh=ub0qxmYkGrQH3JKYormkTZYBwNi2mgWx/DdAJ5Wa+/M=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPi+2dIfA1x7nuLryLPP9WOz2LtA54itsURj 8WcUH8SDhuJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4gAKCRAADmhBGVaC FRFdD/4xTg+sA6mYjj89oepEQzGL8H5W8LabQu9GyR6NjuMpkJwrI69GNMq1buBiBTcj9jFBoJg ujE/n3skn7pAd4YnoqC/lz6/sw92TDY9rv3RTKBQjwReVvNOpNljeFhDlmJzEF3237lNMkUC02p 3kXBzEdU2Mpmw7+ucg5UFww3d/FeJ5h+GfhYnlj4aWFTbOd3deOJhuZWN9mubCXl8lvGrCflvKO y7Q58HgcGf5/huO4K4xjDe1jS7K2CPBXZrEnVAocy7LpprIRcecW08IrpjpuyPn5hPuNoosF8Dq nBzd6P4FxzHZsk/Gn+4m/6V1qYQUjgIizboWvnSsYxOQgEOdS6NJRmBD3XiVIbdegvpA+UXo9Fy 4DDssGI9TnfeekqObrK99e68qNHykkIOAAb4mcz50J4BQO3hzBUsDYwKTvfwVAbylQ/ue60r/IA rfN1J/D+JsKYzPoy+6psxElE2wJf5oVTYZ4CyHS6ER9Vozz6nXm+85wawBb9tzNPjFEJoYeC6ZG /efTFDrDlOXwAn3d1pMb8ZHk5Dn7Ep2nBLmHfVyRYRWgjRtpVKH/fxkeDrYfOKl4Q7yilFKpx80 FvwPlbq1Xy+dsob65NjSdrlxeSzmIhwsZFUAnNxoCs32aqrN1mS3V6V8qiPUROoyzuf2WEUICQA noxbk6purr5dcOg== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 From: Chuck Lever nfsd_setattr() checks whether a size update needs NFSD_MAY_TRUNC before it takes inode_lock(). The comparison uses the file size sampled by that unlocked read, but the actual ATTR_SIZE update is applied later under inode_lock() by notify_change(). This leaves a TOCTOU window for append-only files. If a client sends a SETATTR that does not shrink the file at the time of the unlocked sample, a concurrent append can extend the file before nfsd_setattr() takes inode_lock(). notify_change() then applies a real truncation without the NFSD_MAY_TRUNC check that rejects IS_APPEND(inode). 
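
The window described above can be replayed deterministically in a small userspace model. This is a sketch under stated assumptions, not the nfsd code: struct demo_file and the demo_* helpers are hypothetical, the "locked" field only models inode_lock() being held, and the concurrent append is simulated by the caller changing size between the sample and the update.

```c
#include <assert.h>

/* Toy file: "locked" models inode_lock() being held by the caller. */
struct demo_file {
	long size;
	int append_only;
	int locked;
};

/*
 * Racy shape: the shrink decision uses a size sampled before the lock
 * was taken, so an append in between is not seen.
 */
static int demo_setattr_racy(struct demo_file *f, long new_size,
			     long sampled_size)
{
	if (new_size < sampled_size && f->append_only)
		return -1;
	f->locked = 1;
	f->size = new_size;	/* may truncate an append-only file */
	f->locked = 0;
	return 0;
}

/* Permission check that must run against the locked size. */
static int demo_may_truncate(const struct demo_file *f, long new_size)
{
	assert(f->locked);
	if (new_size >= f->size)
		return 0;
	return f->append_only ? -1 : 0;
}

/* Locked shape: compare against the size that is about to be changed. */
static int demo_setattr_locked(struct demo_file *f, long new_size)
{
	int err;

	f->locked = 1;
	err = demo_may_truncate(f, new_size);
	if (!err)
		f->size = new_size;
	f->locked = 0;
	return err;
}
```

In this model, sampling a size of 100, letting the file grow to 150, and then requesting size 100 succeeds in the racy shape and shrinks the append-only file, while the locked shape refuses the same request and leaves the size at 150.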
The VFS truncate syscall paths perform their own append-only checks before calling notify_change(), so NFSD must make this decision against the locked size it is about to change. Split the write-count acquisition from the truncation permission check. Keep get_write_access() before the locked setattr work, then recheck whether the requested size is below i_size_read(inode) after inode_lock() has been acquired and before notify_change(ATTR_SIZE). This also avoids the plain unlocked inode->i_size load. Fixes: 783112f7401f ("nfsd: special case truncates some more") Assisted-by: kres:claude-opus-4-7 Reported-by: Chris Mason Signed-off-by: Chuck Lever Signed-off-by: Jeff Layton --- fs/nfsd/vfs.c | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 7e6468bdc723..07a9a68408ba 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -419,21 +419,22 @@ nfsd_sanitize_attrs(struct inode *inode, struct iattr= *iap) } =20 static __be32 -nfsd_get_write_access(struct svc_rqst *rqstp, struct svc_fh *fhp, - struct iattr *iap) +nfsd_may_truncate(struct svc_rqst *rqstp, struct svc_fh *fhp, + struct iattr *iap) { struct inode *inode =3D d_inode(fhp->fh_dentry); =20 - if (iap->ia_size < inode->i_size) { - __be32 err; + if (iap->ia_size >=3D i_size_read(inode)) + return nfs_ok; =20 - err =3D nfsd_permission(&rqstp->rq_cred, - fhp->fh_export, fhp->fh_dentry, - NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); - if (err) - return err; - } - return nfserrno(get_write_access(inode)); + return nfsd_permission(&rqstp->rq_cred, fhp->fh_export, fhp->fh_dentry, + NFSD_MAY_TRUNC | NFSD_MAY_OWNER_OVERRIDE); +} + +static __be32 +nfsd_get_write_access(struct svc_fh *fhp) +{ + return nfserrno(get_write_access(d_inode(fhp->fh_dentry))); } =20 static int __nfsd_setattr(struct dentry *dentry, struct iattr *iap) @@ -560,12 +561,17 @@ nfsd_setattr(struct svc_rqst *rqstp, struct svc_fh *f= hp, * setattr call. 
*/ if (size_change) { - err =3D nfsd_get_write_access(rqstp, fhp, iap); + err =3D nfsd_get_write_access(fhp); if (err) return err; } =20 inode_lock(inode); + if (size_change) { + err =3D nfsd_may_truncate(rqstp, fhp, iap); + if (err) + goto out_unlock; + } err =3D fh_fill_pre_attrs(fhp); if (err) goto out_unlock; --=20 2.54.0 From nobody Mon Jun 8 09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 929573C2BAC; Sat, 30 May 2026 13:19:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147184; cv=none; b=J/QAwXr8M5j2BK6ZyNTs0h6Yvz/0XYyCjX+3nReK8BhFgwpDE8fc2CMhIVyw2DYdSH170+0tRCEFYbwjH4b3mS0GxwsWoW3MA4sIDdx1y6qCPJTyPcQpxXykTD5HNqU0VN9hLpz+aHTjwJ1FEAaJ07ty03PaZIqVd5bCy7U53WY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147184; c=relaxed/simple; bh=yIZRNwjDCyh2HZes6RC2XeEH/IL+x6J+MIjmstzfZNI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=qaer4NUn2EbWWuZKYRH18Zd67MUEtlQcvWqVFwCklgaXkxVrswM37W0mZVMtrtslyyOlZsLFEUvsoUA9E6FUNc3FytC19H5FojrIOOysagRPAFFIZcacfp0VeHejYNl/BAhowUH5KeLCiaRj85u7CNxRWDfWRWyS9P0Y14V0h4Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=YfT8II0r; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="YfT8II0r" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2F0161F0089B; Sat, 30 May 2026 13:19:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147183; 
bh=vdCQo/MlgpG3cQHwUX7abR52uTne4QlqANsOSZIJmGE=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=YfT8II0rli/P42zXSc4UjNfIZxp6ThIYkT5JDoI2XwKVWGjonMe2TNvBxFZaN19OR qyZAM5BExBZeFuK0rNtdi7hjuTATknRFKyfzFgUCb0G7yNGUCMiEBODELw3tCpN68x 89tmBN2tAJojyXgqE00mhXyppdaKmOBP2cvgL8dcUU86Nxvo5b+R0Qd5T7RcpiMVJH iYXbqbGbWVTDvPnv9CJZIxEknnwObVFbeDLlOX2zeV4+JXD6unmSNLBv4lPCwwzErW coe5QJFgLLimYNHl4NanzWWgpirKRACOrYU9U57i0RCqoNrHczONuuIf3kjbkvYbBB +0zKmPR61HnrA== From: Jeff Layton Date: Sat, 30 May 2026 09:19:23 -0400 Subject: [PATCH v2 7/9] nfsd: fix partial-write detection in nfsd_direct_write Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-7-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. 
Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=2462; i=jlayton@kernel.org; h=from:subject:message-id; bh=IQbABHhBX2jwO/cRadiSto8B2Or6VNR8U8NOL2nNkTM=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPiHImNW5NnoBhOWp28KxD1i1S/Hp3X70lOz QmbH0ZrJHKJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4gAKCRAADmhBGVaC FXKID/92WtmJp44etI2SZRH+u/CbJTbe/pc3Ogem7MFrJAOTLWVKuWGNVL16oveOZgEU39KrUPl PZAb7QXPn7G+IBvPUNC0oPNNNCdNrozH9+m2ZPx2XZ3N7pphF3eJ/a8FLIocpe33SJN4T/yKCjA XszqPuRdeBlK+tnqAP+951jGF7wPwo+AXLCaCdF+xsZnxEi+1d3ujURRsWpnvfoXPVlGXslR4T9 HoK3mCFgk0QrTVquHYRDQ4Wia68wJhK/CnMJ5tuRqu5OOYLxIn5ajbK59jlqapEGrWRY3dMt+E6 eYBtItL4JzRX5dX5Vcv8Tq+RHg9/i8izaUEgtTz4RWHRezJtn/DtE1hFBhysdwC/8jLr3Hd4scL 32QupMhBpJFcoLgCk3jua7kiyzZJuncSoKQrIHTZ8G06TaoFAmXNU70/KMI8c0qTX2/I7ahPZr9 DeE6PM2Qw2Skkswl0jnT3kTiTptEpKdVMFL5H3zfCtD/2fyO4naSBoB/OJyre0/JrlxvuikSiTH 9wUrYxUKwZxsorE9s0TWczSWG11NSojSKuBIQXxlw3Yt1ilnhthfJyxHykaAngywV63ovaxx/ji LK+cgC4vAmZECBQxXuqOnm1HHI8AebDsO8tRIKo1m72M5SdGbr8Hll+zv00c7bes8l+zJ71d4iV hk1l0iM88En3Rvg== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 From: Chris Mason nfsd_direct_write() walks a list of write segments and, after each vfs_iocb_iter_write(), tries to detect a short write so the loop can stop before placing the next segment at a wrong file offset: host_err =3D vfs_iocb_iter_write(file, kiocb, &segments[i].iter); if (host_err < 0) return host_err; *cnt +=3D host_err; if (host_err < segments[i].iter.count) break; /* partial write */ vfs_iocb_iter_write() runs the iter through ->write_iter(), which advances the iter by the number of bytes written. 
By the time the check runs, segments[i].iter.count is the residual, not the original request length: before write_iter: iter.count =3D=3D original_len after write_iter: iter.count =3D=3D original_len - host_err The condition then reduces to host_err < original_len - host_err, so the break fires only when less than half of the segment was written. Any short write completing between 50% and 99% of the segment slips through; the loop advances to the next segment with kiocb->ki_pos only bumped by the short amount, writing the next segment's payload at the wrong offset and over-reporting *cnt to the NFS client. Snapshot the segment's byte count before the write and compare host_err against that snapshot so any short write breaks the loop. Fixes: 06c5c97293e3 ("NFSD: Implement NFSD_IO_DIRECT for NFS WRITE") Assisted-by: kres:claude-opus-4-7 Reported-by: Chris Mason Signed-off-by: Chris Mason --- fs/nfsd/vfs.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 07a9a68408ba..62b56d73432a 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1380,6 +1380,7 @@ nfsd_direct_write(struct svc_rqst *rqstp, struct svc_= fh *fhp, struct file *file =3D nf->nf_file; unsigned int nsegs, i; ssize_t host_err; + size_t expected; =20 nsegs =3D nfsd_write_dio_iters_init(nf, rqstp->rq_bvec, nvecs, kiocb, *cnt, segments); @@ -1401,11 +1402,13 @@ nfsd_direct_write(struct svc_rqst *rqstp, struct sv= c_fh *fhp, kiocb->ki_flags |=3D IOCB_DONTCACHE; } =20 + expected =3D iov_iter_count(&segments[i].iter); + host_err =3D vfs_iocb_iter_write(file, kiocb, &segments[i].iter); if (host_err < 0) return host_err; *cnt +=3D host_err; - if (host_err < segments[i].iter.count) + if (host_err < (ssize_t)expected) break; /* partial write */ } =20 --=20 2.54.0 From nobody Mon Jun 8 09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 
bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 576113C9ECF; Sat, 30 May 2026 13:19:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147186; cv=none; b=eB2vw0OpJumc49KLJPXaW9fKLpUnbziprZC1CkP7NaKOpYYblfDaiQz63sdiaywa7/pSSlb/de6h3Zdvze3hihWHYUNCT0Pmd9/twZkQ6KnSBVTJuF6QNpgigMP3yLe7eN76wpU9YgG3fhjE0/p+4LYtJTUSqibiQASz1HtjDk8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147186; c=relaxed/simple; bh=NbRlfS2hKj/KAZsTryBykM3B03U4/PXef6fILxnBTP8=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=gvbdmpZOaoeBPHQuWTVicF6nNyzTrIPvNIhn7sEga87Xj2Zuk6zm9neNUuaMe695h7YL5aIWJ63gfpLTrW4nJCPgHxdV6+kT8rbPt9GutQ0VFEwV1CxB9fVra2bM9QnFenPAI9S1nFc5hP3XKhSPXuFCTO4q+uXCVNnvgDIbIpo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=f9ELSfSy; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="f9ELSfSy" Received: by smtp.kernel.org (Postfix) with ESMTPSA id B62271F0089A; Sat, 30 May 2026 13:19:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147185; bh=pM0Cde0iN0QU5Gn0jKc8lGq+Fi5zQvJL/t7Pg3WNv+E=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=f9ELSfSyy5bz0IC4KE7vd5MS3jzGjbwa7Smq88HI5ytLl/UUT5C9Sakc+HihRJ2Gh wQJB+P+v6i7He7Pg0wJmUNLL9dxZHfGv1K0/qkDoHIBVi+0YoY1h6Jz4d+i0eZkIFj UVwG3LszHXJAnT0AKrTHVMuCAVtWTJyMjhvlTTcUAmMqqI/7qWQ1kXiptHvweusB8/ 1cCW9W7jHZkpz4dAAYVTNQfklKH/lRPxHc94vadX9ZZ8k82+aUl1DJuAWD1DavxYDt VsaTttybHrb59uj9WJscSzpw2vHOVOiUjWW1Fz+oH49XaRpu7ySfuwbk5pyJEipNAr UBzgcgTWGkVtQ== From: Jeff Layton Date: Sat, 30 May 2026 09:19:24 -0400 Subject: [PATCH v2 8/9] nfsd: cap decoded POSIX 
ACL count to bound sort cost Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-8-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=2297; i=jlayton@kernel.org; h=from:subject:message-id; bh=V05w/EewdcZtlzm9cb8T88Z3VDzp6smQDgz+JVDOueY=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPiH7lDADAHTadcmL9VEGOOl9xG0vvceCxys 8OrCjvybtqJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4gAKCRAADmhBGVaC FdYkD/9FkeeQiY+HHTlfVb48/pzdFSkrDYE5+OD7zqJuiUiMp+/RbaDwC8kfssbc6bYj3Yn6VWm CFX92W/c5rpJMH2Ndh4/1FqnOvfkl5FV+tkjw1xIpiclOEps8q5Qf2mgjlB3rDjnu+/D3Z9xWNA MKy4ofctNa1p3XdR1TAsYaacbTfBG8jUdvTr/Lsis8aAO0eV5YkmmJzxHY5mLqhR6okIknb1yTE q4RkamSkV/QnG75+sxuc8XESGIcRCfyA2Nr9Xf2BoMK+GKON7HDFnV0fvgC31KBiuyi5hoM4VcM xvmhDz8G3y0jTcmTO8ZPlswQ4TpWFirgWHt689XTD/uiaXAKfUn+iwSw7UeIE2LrexBIIXU+wX2 pdzE4N7lR5T5AVXyUb/nwOQDf+U+Qny9CzHXut950v0WGHHUooRdOd8SJLJOqiCBVTMBLRz4dNh pn43ICOebQXYsMqcxMqzCgIotxANXcuHyJBYT3RsGB7DFqxzdH1fg3FUq1nO3kcVSX9YrGfzS1l BA/VfH6MGz5yMEhGCEQ6q9lPvAtMRa0u67AGS2fpOo9q5d8QjWcV47014zo4fWQjXgD7BV00NBU J9I44IquI8FzF1EorfNULepAYxD2kGng5eaDvlGLH/4kugZHIkPIC7h2qFOgw3os0vs76kftVbY h7VrycLiUMx7rWg== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 From: Chris Mason nfsd4_decode_posixacl() reads a u32 entry count off the wire and passes it straight to posix_acl_alloc() and sort_pacl_range(). 
The latter is an O(n^2) bubble sort, so a client-chosen count drives unbounded CPU in the server's compound processing path. nfsd4_decode_posixacl() xdr_stream_decode_u32(&count) /* uncapped u32 */ posix_acl_alloc(count, GFP_KERNEL) sort_pacl_range(*acl, 0, count - 1) /* O(n^2) bubble sort */ The encoder side in the same file already rejects ACLs whose a_count exceeds NFS_ACL_MAX_ENTRIES, but the decoder introduced in commit 5fc51dfc2eb1 ("NFSD: Add support for XDR decoding POSIX draft ACLs") omitted the symmetric check. Fix by rejecting a wire count greater than NFS_ACL_MAX_ENTRIES with nfserr_inval, before any allocation, so the sort is bounded by NFS_ACL_MAX_ENTRIES^2 comparisons. While we're in here, also fix the nfserr_resource return if posix_acl_alloc() fails. That's not a legal error code for v4.1+. Change it to return nfserr_jukebox as that's more appropriate for memory allocation failures. Fixes: 5fc51dfc2eb1 ("NFSD: Add support for XDR decoding POSIX draft ACLs") Assisted-by: kres:claude-opus-4-7 Reported-by: Chris Mason Signed-off-by: Chris Mason --- fs/nfsd/nfs4xdr.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index c6c50c376b23..508f6986842f 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -449,9 +449,18 @@ nfsd4_decode_posixacl(struct nfsd4_compoundargs *argp,= struct posix_acl **acl) if (xdr_stream_decode_u32(argp->xdr, &count) < 0) return nfserr_bad_xdr; =20 + /* + * The NFSv4 POSIX ACL draft doesn't define a max number of ACE's, but + * the NFSACL spec does. For NFSv4, cap the number of entries to the v3 + * limit, as we want to ensure that ACLs set via NFSv4 POSIX ACL + * extensions are retrievable via NFSACL. 
+ */ + if (count > NFS_ACL_MAX_ENTRIES) + return nfserr_inval; + *acl =3D posix_acl_alloc(count, GFP_KERNEL); if (*acl =3D=3D NULL) - return nfserr_resource; + return nfserr_jukebox; =20 (*acl)->a_count =3D count; for (ace =3D (*acl)->a_entries; ace < (*acl)->a_entries + count; ace++) { --=20 2.54.0 From nobody Mon Jun 8 09:49:04 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B445E3CAA57; Sat, 30 May 2026 13:19:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147187; cv=none; b=mujNjxj38vH4SK7B5yhFQJfOpi0Oax7i62g2OcnV9G/rjmg2nAZG1cBDLGnlB25BKZkQpxn1Zur3GzkJgZxYM//ou5xviUQWEaCR24/gNRoJ5oMaxWsMdDU9VG/ztuqhsuUa1oUvUCRlpZMLD/5ShL+eOQxawmnNXsslubp5W4g= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780147187; c=relaxed/simple; bh=fnkW/YE5vOVVkFeC+zpucTPVdK+0fy6HvVg8/P9y9po=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=sreiVmT1xAe9nioEoj7tkrKtZcLbmtEpO4PLFBkXg6dOHNkgyqlqcJCj+OL6fgUeQqDNAzk6WslnaUDvYxozuiTQwl8m/1DyinjwR2iSntEeRdhSY5FrTTCdzerkArX2hsdWeGo+y1OECQZ28VkcWgxoA1aAk5j9I9UeA2/Dcus= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Z1G9OUoK; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Z1G9OUoK" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4A7AE1F00899; Sat, 30 May 2026 13:19:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780147186; bh=r/xRLR89cQUHFz9b7QBpSXvBCkBRIyacUAUlW9fTv/4=; 
h=From:Date:Subject:References:In-Reply-To:To:Cc; b=Z1G9OUoKycD/gyXGAzayuOUS8+yKiLJDUUIKl/KIuxL2Yo/v+9uFsTldYI4fKn/pP JZZF7Es4bv0IXuuFVAx2LNhrLFqarc/fGm9liKO3ZwqH7/QV7RzoVffVTk2dVvfnd2 W4sUfN8GPLFM729ZK97vtWxlgO+wID9oFAP3QysVBDZ2sJt1K201o8Zm2SCpKax/er TBgpmZFbZs8TvqoheD7bjNV9K//5KwMqZvJxu1nB3F1RL3XF4JRXJDrHJJYGSOGEcc d/c48w7s889WlUWDZjM3nq4a4AhhloQ3wB70VNlNgr35Gk54MHV4IzAh3eYNHnBzpP WrvSZCB+sK8Kw== From: Jeff Layton Date: Sat, 30 May 2026 09:19:25 -0400 Subject: [PATCH v2 9/9] nfsd: validate symlink target length in NFSv4 CREATE Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20260530-nfsd-fixes-v2-9-f27e8eb4d974@kernel.org> References: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> In-Reply-To: <20260530-nfsd-fixes-v2-0-f27e8eb4d974@kernel.org> To: Chuck Lever , NeilBrown , Olga Kornievskaia , Dai Ngo , Tom Talpey , "J. 
Bruce Fields" , Scott Mayhew , Trond Myklebust , Andreas Gruenbacher , Mike Snitzer , Rick Macklem Cc: Chris Mason , linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org, Jeff Layton X-Mailer: b4 0.14.3 X-Developer-Signature: v=1; a=openpgp-sha256; l=1290; i=jlayton@kernel.org; h=from:subject:message-id; bh=fnkW/YE5vOVVkFeC+zpucTPVdK+0fy6HvVg8/P9y9po=; b=owEBbQKS/ZANAwAKAQAOaEEZVoIVAcsmYgBqGuPiZpLiwa1VA/lErd8L5bmtV//43VMPKATsg 9iQnFhQMhGJAjMEAAEKAB0WIQRLwNeyRHGyoYTq9dMADmhBGVaCFQUCahrj4gAKCRAADmhBGVaC FQQbD/9U1CfVfTjJv7yI/KGYbHo/kArWN+nBpDNvPTivIK0qRJMSwH/Eca2pZa9oCE1OlragC3r 3eVCCg0+SmrvRXPGc2Xx8kQGfAlPQYTv2z/7Aj30oUnlHTSBaNYeCcA/daAfA/TTabCbh7jSAdU UlvN20E/fzkxjY2r9HQCtRItHE/1BH4JbuBbApg8OZ8sNRg54gHI5o6xAwRzV7yJ6GzyeGU5zxb QLEavZvKPi4SwOEhLzpEyoF3wEvfIAfQ++Dr7q3WpLChSwj/Zod/mlw7YLpMZZkamDd+i7q55cO ZMuffe6bzrid3YzDhChH3Yp6WzvfLFdgLb3GJqdXIq2ugewNyEpmuz6cgecVCpHpLsaxA53DBXD 7DKTi9AOn7aLciDjwaBiTKsrZW0JLEAriB+4d6WdnVQGmVLmDeGztYZEQeDZfJpikRgTmKxYamF JUowse8s0t4VqtftT1qxF4b9a6MLHboVdV068KiYnnfxEr/RkLmrD20t2/7BFTEJzGeCXp+zQBT Ydc/3M9ipzKH68wGLgLe7Ens+Ho6C6Kztj0DBeCvmTpNFEFVts0lch1yo5yhLifbU43oh+Edzyf L+36De3yoeJtPs/rH46R7Xk8SsPg/JjwRcyDjh+0zmbTJzl06gvY5aA9qugvKZhokcaRPHBxJan FDhVxR4AU9RpMUQ== X-Developer-Key: i=jlayton@kernel.org; a=openpgp; fpr=4BC0D7B24471B2A184EAF5D3000E684119568215 nfsd4_decode_create() accepts an unbounded cr_datalen from the wire for NF4LNK symlink targets, allowing a client to force a kmalloc of up to the RPC-max size (~1 MiB) per COMPOUND op that persists until compound teardown. The VFS rejects oversized targets with ENAMETOOLONG, but the allocation has already occurred. Reject cr_datalen =3D=3D 0 early with nfserr_inval and cr_datalen >=3D PATH_MAX with nfserr_nametoolong to bound the allocation. 
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Assisted-by: kres:claude-opus-4-7 Reported-by: Chris Mason Signed-off-by: Jeff Layton --- fs/nfsd/nfs4xdr.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index 508f6986842f..a5cfce95d2d7 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -964,6 +964,10 @@ nfsd4_decode_create(struct nfsd4_compoundargs *argp, u= nion nfsd4_op_u *u) case NF4LNK: if (xdr_stream_decode_u32(argp->xdr, &create->cr_datalen) < 0) return nfserr_bad_xdr; + if (create->cr_datalen =3D=3D 0) + return nfserr_inval; + if (create->cr_datalen > NFS4_MAXPATHLEN) + return nfserr_nametoolong; p =3D xdr_inline_decode(argp->xdr, create->cr_datalen); if (!p) return nfserr_bad_xdr; --=20 2.54.0
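A minimal user-space sketch of the length checks that patch 9/9 adds to nfsd4_decode_create() is given here for illustration. It is not the kernel code: SKETCH_MAXPATHLEN, the sketch_status values, sketch_check_linklen() and sketch_selftest() are hypothetical names, and 4096 is an assumed stand-in for NFS4_MAXPATHLEN. The comparison follows the diff, so a length exactly at the limit is accepted.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's NFS4_MAXPATHLEN. */
#define SKETCH_MAXPATHLEN 4096u

/*
 * Hypothetical status codes standing in for nfs_ok, nfserr_inval and
 * nfserr_nametoolong. SKETCH_OK is first, so it has the value zero.
 */
enum sketch_status {
	SKETCH_OK,
	SKETCH_INVAL,
	SKETCH_NAMETOOLONG,
};

/*
 * Validate a wire-supplied symlink target length before any buffer is
 * sized from it. The order of the two checks mirrors the diff: an empty
 * target is rejected first, then an oversized one.
 */
enum sketch_status sketch_check_linklen(uint32_t datalen)
{
	if (!datalen)
		return SKETCH_INVAL;
	if (datalen > SKETCH_MAXPATHLEN)
		return SKETCH_NAMETOOLONG;
	return SKETCH_OK;
}

/* Usage example: zero and oversized lengths fail, in-range ones pass. */
void sketch_selftest(void)
{
	assert(sketch_check_linklen(0));
	assert(!sketch_check_linklen(1));
	assert(!sketch_check_linklen(SKETCH_MAXPATHLEN));
	assert(sketch_check_linklen(SKETCH_MAXPATHLEN + 1));
}
```

Doing both checks before the length is handed to the decoder means a rejected request never reaches the allocation that the commit message is concerned with.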