[PATCH v3] smb/client: fix state corruption in smb3_reconfigure multichannel path

DaeMyung Kang posted 1 patch 2 months ago
fs/smb/client/fs_context.c | 46 +++++++++++++++++++++++++++++++++-----
1 file changed, 41 insertions(+), 5 deletions(-)
Posted by DaeMyung Kang 2 months ago
smb3_reconfigure() has several state-consistency bugs when handling
a multichannel remount that leave ses->chan_max or cifs_sb->ctx
inconsistent with the actual state. This patch repairs internal
state only; the userspace-visible return value of
smb3_reconfigure() is preserved to match the pre-patch behaviour:
the concurrent-scale loser path still returns -EINVAL, and a
failed smb3_update_ses_channels() is not newly propagated to
userspace by this patch.

Bugs addressed:

1) smb3_sync_ses_chan_max() is called before acquiring
   CIFS_SES_FLAG_SCALE_CHANNELS. If a concurrent operation (e.g.
   smb2_reconnect) holds the flag, the current thread takes the
   loser path and returns -EINVAL, but ses->chan_max has already
   been updated to the new value. chan_max is then out of sync
   with the actual channel state.

2) When smb3_update_ses_channels() fails, ses->chan_max is not
   rolled back. Repeated failures cause chan_max to drift further
   from reality, and subsequent reconnect/reconfigure paths use
   the drifted target.

3) Earlier in smb3_reconfigure(), STEAL_STRING moves UNC, source
   and username from cifs_sb->ctx into ctx, setting
   cifs_sb->ctx->UNC to NULL until smb3_fs_context_dup() copies
   them back near the end of the function. The pre-existing
   CIFS_SES_FLAG_SCALE_CHANNELS loser-path 'return -EINVAL' exits
   inside this window. A loser-path failure therefore permanently
   nulls cifs_sb->ctx->UNC; /proc/mounts shows the device as
   "none" and every subsequent mount.cifs-based remount is
   rejected by smb3_verify_reconfigure_ctx() because mount.cifs
   passes that "none" back as the new UNC.

4) Once (2) rolls ses->chan_max back to the old value on update
   failure, smb3_fs_context_dup() would still copy the rejected
   new ctx->max_channels into cifs_sb->ctx, creating a fresh
   cifs_sb->ctx vs ses->chan_max mismatch. This must be handled
   together with (2) to keep the rollback complete.
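
The window described in (3) can be reproduced in a small userspace
model. This is only a sketch: struct model_ctx, model_steal_unc(),
model_dup_unc() and model_reconfigure() are hypothetical names, and
the model assumes STEAL_STRING moves a pointer and NULLs the source
while the dup step makes a fresh copy, as described above. It is not
the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a mount context; only UNC is modelled. */
struct model_ctx {
	char *unc;
};

/* Portable string copy; returns NULL for NULL input or on allocation failure. */
static char *model_strdup(const char *s)
{
	size_t len;
	char *copy;

	if (!s)
		return NULL;
	len = strlen(s) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Models the steal: move src->unc into dst and leave src->unc NULL. */
static void model_steal_unc(struct model_ctx *src, struct model_ctx *dst)
{
	assert(src && dst);
	free(dst->unc);
	dst->unc = src->unc;
	src->unc = NULL;
}

/* Models the dup near the end of the function: copy src->unc into dst. */
static int model_dup_unc(struct model_ctx *dst, const struct model_ctx *src)
{
	char *copy = model_strdup(src->unc);

	if (src->unc && !copy)
		return -ENOMEM;
	free(dst->unc);
	dst->unc = copy;
	return 0;
}

/*
 * early_exit == true models the old loser path, which returns between
 * the steal and the dup. early_exit == false models routing the same
 * path through the dup before reporting -EINVAL.
 */
static int model_reconfigure(struct model_ctx *sb_ctx, struct model_ctx *req,
			     bool scale_busy, bool early_exit)
{
	int rc;

	model_steal_unc(sb_ctx, req);

	if (scale_busy && early_exit)
		return -EINVAL;	/* sb_ctx->unc is left NULL */

	rc = model_dup_unc(sb_ctx, req);
	if (!rc && scale_busy)
		rc = -EINVAL;	/* same return value, state restored */
	return rc;
}
```

With scale_busy set, both variants return -EINVAL; only the
early_exit variant leaves sb_ctx->unc NULL afterwards.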

Fix all four by:

 - Moving smb3_sync_ses_chan_max() after the SCALE_CHANNELS
   acquire so the loser path cannot corrupt chan_max.

 - Capturing old_chan_max before the sync and restoring it on
   failure while still holding SCALE_CHANNELS so a concurrent
   reconfigure cannot race with the rollback.

 - Recording any multichannel-path failure in a local mchan_rc
   and routing the loser path through a common 'out:' label so
   every exit reaches smb3_fs_context_dup() and cifs_sb->ctx is
   restored. mchan_rc is used only as internal control flow.

 - Before the dup, restoring ctx->multichannel and
   ctx->max_channels from cifs_sb->ctx when mchan_rc is set, so
   the dup does not desync cifs_sb->ctx from the already-rolled-back
   ses->chan_max.

 - Tracking the concurrent-scale loser path with a separate
   scale_busy flag. After cifs_sb->ctx is fully restored, the
   return value is forced to -EINVAL for that path so userspace
   continues to see the pre-patch rejection. dup /
   dfs_cache_remount_fs() failures still take precedence because
   they reflect real state recovery errors.
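
The control flow listed above can be sketched as a single-threaded
userspace model. All names here (struct model_ses, struct model_opts,
model_remount(), the model_update_* hooks) are hypothetical, locking
is omitted entirely, and a plain struct assignment stands in for
smb3_fs_context_dup(), so this only illustrates the ordering of the
steps and the resulting return values, not the real implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical session model; 'scaling' stands in for the SCALE_CHANNELS flag. */
struct model_ses {
	bool scaling;
	size_t chan_max;
};

/* Hypothetical subset of the mount options touched by the multichannel path. */
struct model_opts {
	bool multichannel;
	size_t max_channels;
};

/* Hypothetical channel-update hook: returns 0 or a negative errno. */
typedef int (*model_update_fn)(struct model_ses *ses);

static int model_update_ok(struct model_ses *ses)
{
	(void)ses;
	return 0;
}

static int model_update_fail(struct model_ses *ses)
{
	(void)ses;
	return -EIO;
}

static int model_remount(struct model_ses *ses, struct model_opts *cur,
			 struct model_opts *req, model_update_fn update)
{
	bool scale_busy = false;
	int mchan_rc = 0;
	size_t old_chan_max;
	int rc;

	assert(ses && cur && req && update);

	if (req->multichannel != cur->multichannel ||
	    req->max_channels != cur->max_channels) {
		if (ses->scaling) {
			/* Loser path: nothing has been modified yet. */
			scale_busy = true;
			mchan_rc = -EINVAL;
			goto out;
		}
		ses->scaling = true;

		/* Sync only after the flag is held, remembering the old value. */
		old_chan_max = ses->chan_max;
		ses->chan_max = req->max_channels;

		rc = update(ses);
		if (rc < 0) {
			/* Roll back while the flag is still held. */
			ses->chan_max = old_chan_max;
			mchan_rc = rc;
		}
		ses->scaling = false;
	}

out:
	/* On any multichannel failure, keep the old options in the request. */
	if (mchan_rc)
		*req = *cur;

	*cur = *req;	/* stands in for the dup; cannot fail in this model */
	rc = 0;

	/* Only the concurrent-scale path is reported; update failure is not. */
	if (!rc && scale_busy)
		rc = -EINVAL;
	return rc;
}
```

In this model a failed update returns 0 with chan_max and the stored
options unchanged, while a busy flag returns -EINVAL with the same
unchanged state.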

Deliberately not changed by this patch: the return value of
smb3_reconfigure() on smb3_update_ses_channels() failure. That
path is not newly propagated to userspace here, matching the
asynchronous model used by the mount path (mchan_mount_work_fn).
Those return-semantics and deferred-handling questions are left
to follow-up discussion/patches.

smb3_reconfigure() is not fully transactional: ses->password and
ses->password2 are committed before the multichannel block, so
unrelated earlier state changes may still be visible after a
failed multichannel remount. That is a structural property of
the function and out of scope here.

Tested with a QEMU VM (ksmbd + cifs) using module-parameter
based fault injection:
 - Forced smb3_update_ses_channels() failure via module param
   and verified ses->chan_max is preserved at the old value
   after the remount path runs.
 - Pre-set CIFS_SES_FLAG_SCALE_CHANNELS before entering the
   scaling path and verified the loser path still returns
   -EINVAL, no longer corrupts ses->chan_max, and no longer nulls
   cifs_sb->ctx->UNC.
 - Repeated 19 forced-failure remounts with varying max_channels
   (range 2-8) and confirmed no chan_max drift.
 - After each failure path, verified /proc/mounts continues to
   show the original UNC (//127.0.0.1/share) so subsequent
   remounts are accepted.

Reported-by: RAJASI MANDAL <rajasimandalos@gmail.com>
Closes: https://lore.kernel.org/lkml/CAEY6_V1+dzW3OD5zqXhsWyXwrDTrg5tAMGZ1AJ7_GAuRE+aevA@mail.gmail.com/
Link: https://lore.kernel.org/lkml/xkr2dlvgibq5j6gkcxd3yhhnj4atgxw2uy4eug2pxm7wy7nbms@iq6cf5taa65v/
Fixes: ef529f655a2c ("cifs: client: allow changing multichannel mount options on remount")
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
---
v3: (feedback from Henrique Carvalho)
 - Drop propagation of smb3_update_ses_channels() failure to
   userspace; preserve the pre-patch best-effort semantics on that
   path.
 - Keep the pre-existing CIFS_SES_FLAG_SCALE_CHANNELS loser-path
   -EINVAL via a separate scale_busy flag so that return is not
   silently converted into success.
 - Reword commit message to describe accurately what userspace-
   visible semantics are preserved and what changed internally.

v2: (feedback from Rajasi Mandal)
 - Route loser-path and update-failure exits through a common
   'out:' label so smb3_fs_context_dup() always runs and
   cifs_sb->ctx->UNC is restored after STEAL_STRING.
 - Restore ctx->multichannel/ctx->max_channels from cifs_sb->ctx
   before the dup so dup does not re-desync cifs_sb->ctx from the
   rolled-back ses->chan_max.

 fs/smb/client/fs_context.c | 46 +++++++++++++++++++++++++++++++++-----
 1 file changed, 41 insertions(+), 5 deletions(-)

diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c
index b9544eb0381b..aaa364a3f60d 100644
--- a/fs/smb/client/fs_context.c
+++ b/fs/smb/client/fs_context.c
@@ -1085,10 +1085,12 @@ static int smb3_reconfigure(struct fs_context *fc)
 	struct dentry *root = fc->root;
 	struct cifs_sb_info *cifs_sb = CIFS_SB(root->d_sb);
 	struct cifs_ses *ses = cifs_sb_master_tcon(cifs_sb)->ses;
+	unsigned int old_chan_max;
 	unsigned int rsize = ctx->rsize, wsize = ctx->wsize;
 	char *new_password = NULL, *new_password2 = NULL;
 	bool need_recon = false;
-	int rc;
+	bool scale_busy = false;
+	int rc, mchan_rc = 0;
 
 	if (ses->expired_pwd)
 		need_recon = true;
@@ -1170,25 +1172,38 @@ static int smb3_reconfigure(struct fs_context *fc)
 	if ((ctx->multichannel != cifs_sb->ctx->multichannel) ||
 	    (ctx->max_channels != cifs_sb->ctx->max_channels)) {
 
-		/* Synchronize ses->chan_max with the new mount context */
-		smb3_sync_ses_chan_max(ses, ctx->max_channels);
-		/* Now update the session's channels to match the new configuration */
 		/* Prevent concurrent scaling operations */
 		spin_lock(&ses->ses_lock);
 		if (ses->flags & CIFS_SES_FLAG_SCALE_CHANNELS) {
 			spin_unlock(&ses->ses_lock);
 			mutex_unlock(&ses->session_mutex);
-			return -EINVAL;
+			scale_busy = true;
+			mchan_rc = -EINVAL;
+			goto out;
 		}
 		ses->flags |= CIFS_SES_FLAG_SCALE_CHANNELS;
 		spin_unlock(&ses->ses_lock);
 
+		old_chan_max = ses->chan_max;
+		/* Synchronize ses->chan_max with the new mount context */
+		smb3_sync_ses_chan_max(ses, ctx->max_channels);
+
 		mutex_unlock(&ses->session_mutex);
 
 		rc = smb3_update_ses_channels(ses, ses->server,
 					       false /* from_reconnect */,
 					       false /* disable_mchan */);
 
+		/*
+		 * On failure, restore chan_max while still holding
+		 * CIFS_SES_FLAG_SCALE_CHANNELS so a concurrent reconfigure
+		 * cannot observe or race with the rollback.
+		 */
+		if (rc < 0) {
+			smb3_sync_ses_chan_max(ses, old_chan_max);
+			mchan_rc = rc;
+		}
+
 		/* Clear scaling flag after operation */
 		spin_lock(&ses->ses_lock);
 		ses->flags &= ~CIFS_SES_FLAG_SCALE_CHANNELS;
@@ -1197,6 +1212,7 @@ static int smb3_reconfigure(struct fs_context *fc)
 		mutex_unlock(&ses->session_mutex);
 	}
 
+out:
 	STEAL_STRING(cifs_sb, ctx, domainname);
 	STEAL_STRING(cifs_sb, ctx, nodename);
 	STEAL_STRING(cifs_sb, ctx, iocharset);
@@ -1205,6 +1221,16 @@ static int smb3_reconfigure(struct fs_context *fc)
 	ctx->rsize = rsize ? CIFS_ALIGN_RSIZE(fc, rsize) : cifs_sb->ctx->rsize;
 	ctx->wsize = wsize ? CIFS_ALIGN_WSIZE(fc, wsize) : cifs_sb->ctx->wsize;
 
+	/*
+	 * If the multichannel update failed, restore the old multichannel
+	 * settings in ctx so smb3_fs_context_dup() does not desync
+	 * cifs_sb->ctx from ses->chan_max (which was already rolled back).
+	 */
+	if (mchan_rc) {
+		ctx->multichannel = cifs_sb->ctx->multichannel;
+		ctx->max_channels = cifs_sb->ctx->max_channels;
+	}
+
 	smb3_cleanup_fs_context_contents(cifs_sb->ctx);
 	rc = smb3_fs_context_dup(cifs_sb->ctx, ctx);
 	smb3_update_mnt_flags(cifs_sb);
@@ -1213,6 +1239,16 @@ static int smb3_reconfigure(struct fs_context *fc)
 		rc = dfs_cache_remount_fs(cifs_sb);
 #endif
 
+	/*
+	 * Preserve the pre-existing loser-path semantics: a concurrent
+	 * scaling operation causes the remount to be rejected with
+	 * -EINVAL. smb3_fs_context_dup() / dfs_cache_remount_fs()
+	 * failures take precedence because they reflect real state
+	 * recovery errors. Other multichannel failures remain best-effort.
+	 */
+	if (!rc && scale_busy)
+		rc = -EINVAL;
+
 	return rc;
 }
 
-- 
2.43.0
Re: [PATCH v3] smb/client: fix state corruption in smb3_reconfigure multichannel path
Posted by RAJASI MANDAL 1 month, 2 weeks ago
Hi DaeMyung,

Thanks for the v3. One minor suggestion:

> + old_chan_max = ses->chan_max;
> + /* Synchronize ses->chan_max with the new mount context */
> + smb3_sync_ses_chan_max(ses, ctx->max_channels);

This reads ses->chan_max without holding chan_lock.
smb3_sync_ses_chan_max() itself takes chan_lock for the write, and
cifs_try_adding_channels() / cifs_chan_skip_or_disable() also access
chan_max under chan_lock. So there is a potential data race between
this unlocked read and a concurrent reconnect path writing chan_max.

Other than that, this patch looks good to me.

Thanks,
Rajasi
Re: [PATCH v3] smb/client: fix state corruption in smb3_reconfigure multichannel path
Posted by CharSyam 1 month, 2 weeks ago
Hi Rajasi,

Thanks a lot for the careful review — you're right, that read of
ses->chan_max was unprotected and races with chan_lock writers in
cifs_try_adding_channels() / cifs_chan_skip_or_disable().

v4 fixes this by folding the read into smb3_sync_ses_chan_max()
itself: the helper now returns the previous chan_max value, so the
read+write happens atomically under chan_lock. The caller no longer
reads ses->chan_max outside the lock. I also switched the helper's
parameter/return type to size_t to match struct cifs_ses::chan_max.
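
For illustration, the shape of such a helper in a userspace model
could look like the sketch below. This is not the v4 code: struct
locked_ses and model_sync_chan_max() are hypothetical names, and a
pthread mutex stands in for chan_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical session model; the mutex stands in for chan_lock. */
struct locked_ses {
	pthread_mutex_t chan_lock;
	size_t chan_max;
};

/*
 * Set chan_max and return the value it replaced. The read and the
 * write share one critical section, so the caller never has to read
 * chan_max outside the lock to remember the old value.
 */
static size_t model_sync_chan_max(struct locked_ses *ses, size_t max_channels)
{
	size_t old;

	assert(ses);
	pthread_mutex_lock(&ses->chan_lock);
	old = ses->chan_max;
	ses->chan_max = max_channels;
	pthread_mutex_unlock(&ses->chan_lock);
	return old;
}
```

A caller in this model would keep the return value of the first call
and pass it back to the same helper on the rollback path.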

While I was at it, I also restored ctx->multichannel_specified and
ctx->max_channels_specified together with multichannel /
max_channels on the rollback path, so the user-specified flags don't
end up out of sync with the restored values once
smb3_fs_context_dup() copies the ctx back.

v4 just went out; I would appreciate another look when you get a
chance.

Thanks,
DaeMyung


On Wed, Apr 29, 2026 at 12:17 PM, RAJASI MANDAL <rajasimandalos@gmail.com> wrote:
>
> Hi DaeMyung,
>
> Thanks for the v3. One minor suggestion:
>
> > + old_chan_max = ses->chan_max;
> > + /* Synchronize ses->chan_max with the new mount context */
> > + smb3_sync_ses_chan_max(ses, ctx->max_channels);
>
> This reads ses->chan_max without holding chan_lock.
> smb3_sync_ses_chan_max() itself takes chan_lock for the write, and
> cifs_try_adding_channels() / cifs_chan_skip_or_disable() also access
> chan_max under chan_lock. So there is a potential data race between
> this unlocked read and a concurrent reconnect path writing chan_max.
>
> Other than that, this patch looks good to me.
>
> Thanks,
> Rajasi