From: NeilBrown
To: "Alexander Viro", "Christian Brauner", "Amir Goldstein"
Cc: "Jan Kara", linux-fsdevel@vger.kernel.org, Jeff Layton, Chris Mason,
 David Sterba, David Howells, Greg Kroah-Hartman, "Rafael J. Wysocki",
 Danilo Krummrich, Tyler Hicks, Miklos Szeredi, Chuck Lever,
 Olga Kornievskaia, Dai Ngo, Namjae Jeon, Steve French,
 Sergey Senozhatsky, Carlos Maiolino, John Johansen, Paul Moore,
 James Morris, "Serge E. Hallyn", Stephen Smalley, Ondrej Mosnacek,
 Mateusz Guzik, Lorenzo Stoakes, Stefan Berger, "Darrick J.
 Wong", linux-kernel@vger.kernel.org, netfs@lists.linux.dev,
 ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org,
 linux-unionfs@vger.kernel.org, linux-cifs@vger.kernel.org,
 linux-xfs@vger.kernel.org, linux-security-module@vger.kernel.org,
 selinux@vger.kernel.org
Subject: [PATCH v6 02/15] VFS: introduce start_dirop() and end_dirop()
Date: Thu, 13 Nov 2025 11:18:25 +1100
Message-ID: <20251113002050.676694-3-neilb@ownmail.net>
X-Mailer: git-send-email 2.50.0.107.gf914562f5916.dirty
In-Reply-To: <20251113002050.676694-1-neilb@ownmail.net>
References: <20251113002050.676694-1-neilb@ownmail.net>
X-Mailing-List: linux-kernel@vger.kernel.org

From: NeilBrown

The fact that directory operations (create, remove, rename) are protected
by a lock on the parent is known widely throughout the kernel.
In order to change this - to instead lock the target dentry - it is
best to centralise this knowledge so it can be changed in one place.

This patch introduces start_dirop() which is local to VFS code.
It performs the required locking for create and remove.  Rename
will be handled separately.

Various functions with names like start_creating() or
start_removing_path(), some of which already exist, will export this
functionality beyond the VFS.

end_dirop() is the partner of start_dirop().  It drops the lock and
releases the reference on the dentry.
It *is* exported so that various end_creating etc functions can be inline.

As vfs_mkdir() drops the dentry on error we cannot use end_dirop() as
that won't unlock when the dentry IS_ERR().  For now we need an explicit
unlock when dentry IS_ERR().  I hope to change vfs_mkdir() to unlock
when it drops a dentry so that explicit unlock can go away.

end_dirop() can always be called on the result of start_dirop(), but not
after vfs_mkdir().
After a vfs_mkdir() we still may need the explicit unlock as seen in
end_creating_path().

As well as adding start_dirop() and end_dirop() this patch uses them in:
 - simple_start_creating (which requires sharing lookup_noperm_common()
   with libfs.c)
 - start_removing_path / start_removing_user_path_at
 - filename_create / end_creating_path()
 - do_rmdir(), do_unlinkat()

Reviewed-by: Amir Goldstein
Reviewed-by: Jeff Layton
Signed-off-by: NeilBrown
---
 fs/internal.h      |  3 ++
 fs/libfs.c         | 36 ++++++++---------
 fs/namei.c         | 98 ++++++++++++++++++++++++++++++++++------------
 include/linux/fs.h |  2 +
 4 files changed, 95 insertions(+), 44 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 9b2b4d116880..d08d5e2235e9 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -67,6 +67,9 @@ int vfs_tmpfile(struct mnt_idmap *idmap,
 		const struct path *parentpath,
 		struct file *file, umode_t mode);
 struct dentry *d_hash_and_lookup(struct dentry *, struct qstr *);
+struct dentry *start_dirop(struct dentry *parent, struct qstr *name,
+			   unsigned int lookup_flags);
+int lookup_noperm_common(struct qstr *qname, struct dentry *base);
 
 /*
  * namespace.c
diff --git a/fs/libfs.c b/fs/libfs.c
index 1661dcb7d983..2d6657947abd 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -2290,27 +2290,25 @@ void stashed_dentry_prune(struct dentry *dentry)
 	cmpxchg(stashed, dentry, NULL);
 }
 
-/* parent must be held exclusive */
+/**
+ * simple_start_creating - prepare to create a given name
+ * @parent: directory in which to prepare to create the name
+ * @name: the name to be created
+ *
+ * Required lock is taken and a lookup is performed prior to creating an
+ * object in a directory. No permission checking is performed.
+ *
+ * Returns: a negative dentry on which vfs_create() or similar may
+ * be attempted, or an error.
+ */
 struct dentry *simple_start_creating(struct dentry *parent, const char *name)
 {
-	struct dentry *dentry;
-	struct inode *dir = d_inode(parent);
+	struct qstr qname = QSTR(name);
+	int err;
 
-	inode_lock(dir);
-	if (unlikely(IS_DEADDIR(dir))) {
-		inode_unlock(dir);
-		return ERR_PTR(-ENOENT);
-	}
-	dentry = lookup_noperm(&QSTR(name), parent);
-	if (IS_ERR(dentry)) {
-		inode_unlock(dir);
-		return dentry;
-	}
-	if (dentry->d_inode) {
-		dput(dentry);
-		inode_unlock(dir);
-		return ERR_PTR(-EEXIST);
-	}
-	return dentry;
+	err = lookup_noperm_common(&qname, parent);
+	if (err)
+		return ERR_PTR(err);
+	return start_dirop(parent, &qname, LOOKUP_CREATE | LOOKUP_EXCL);
 }
 EXPORT_SYMBOL(simple_start_creating);
diff --git a/fs/namei.c b/fs/namei.c
index 39c4d52f5b54..231e1ffd4b8d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2765,6 +2765,48 @@ static int filename_parentat(int dfd, struct filename *name,
 	return __filename_parentat(dfd, name, flags, parent, last, type, NULL);
 }
 
+/**
+ * start_dirop - begin a create or remove dirop, performing locking and lookup
+ * @parent: the dentry of the parent in which the operation will occur
+ * @name: a qstr holding the name within that parent
+ * @lookup_flags: intent and other lookup flags.
+ *
+ * The lookup is performed and necessary locks are taken so that, on success,
+ * the returned dentry can be operated on safely.
+ * The qstr must already have the hash value calculated.
+ *
+ * Returns: a locked dentry, or an error.
+ *
+ */
+struct dentry *start_dirop(struct dentry *parent, struct qstr *name,
+			   unsigned int lookup_flags)
+{
+	struct dentry *dentry;
+	struct inode *dir = d_inode(parent);
+
+	inode_lock_nested(dir, I_MUTEX_PARENT);
+	dentry = lookup_one_qstr_excl(name, parent, lookup_flags);
+	if (IS_ERR(dentry))
+		inode_unlock(dir);
+	return dentry;
+}
+
+/**
+ * end_dirop - signal completion of a dirop
+ * @de: the dentry which was returned by start_dirop or similar.
+ *
+ * If the de is an error, nothing happens. Otherwise any lock taken to
+ * protect the dentry is dropped and the dentry itself is released (dput()).
+ */
+void end_dirop(struct dentry *de)
+{
+	if (!IS_ERR(de)) {
+		inode_unlock(de->d_parent->d_inode);
+		dput(de);
+	}
+}
+EXPORT_SYMBOL(end_dirop);
+
 /* does lookup, returns the object with parent locked */
 static struct dentry *__start_removing_path(int dfd, struct filename *name,
 					    struct path *path)
@@ -2781,10 +2823,9 @@ static struct dentry *__start_removing_path(int dfd, struct filename *name,
 		return ERR_PTR(-EINVAL);
 	/* don't fail immediately if it's r/o, at least try to report other errors */
 	error = mnt_want_write(parent_path.mnt);
-	inode_lock_nested(parent_path.dentry->d_inode, I_MUTEX_PARENT);
-	d = lookup_one_qstr_excl(&last, parent_path.dentry, 0);
+	d = start_dirop(parent_path.dentry, &last, 0);
 	if (IS_ERR(d))
-		goto unlock;
+		goto drop;
 	if (error)
 		goto fail;
 	path->dentry = no_free_ptr(parent_path.dentry);
@@ -2792,10 +2833,9 @@ static struct dentry *__start_removing_path(int dfd, struct filename *name,
 	return d;
 
 fail:
-	dput(d);
+	end_dirop(d);
 	d = ERR_PTR(error);
-unlock:
-	inode_unlock(parent_path.dentry->d_inode);
+drop:
 	if (!error)
 		mnt_drop_write(parent_path.mnt);
 	return d;
@@ -2910,7 +2950,7 @@ int vfs_path_lookup(struct dentry *dentry, struct vfsmount *mnt,
 }
 EXPORT_SYMBOL(vfs_path_lookup);
 
-static int lookup_noperm_common(struct qstr *qname, struct dentry *base)
+int lookup_noperm_common(struct qstr *qname, struct dentry *base)
 {
 	const char *name = qname->name;
 	u32 len = qname->len;
@@ -4223,21 +4263,18 @@ static struct dentry *filename_create(int dfd, struct filename *name,
 	 */
 	if (last.name[last.len] && !want_dir)
 		create_flags &= ~LOOKUP_CREATE;
-	inode_lock_nested(path->dentry->d_inode, I_MUTEX_PARENT);
-	dentry = lookup_one_qstr_excl(&last, path->dentry,
-				      reval_flag | create_flags);
+	dentry = start_dirop(path->dentry, &last, reval_flag | create_flags);
 	if (IS_ERR(dentry))
-		goto unlock;
+		goto out_drop_write;
 
 	if (unlikely(error))
 		goto fail;
 
 	return dentry;
 fail:
-	dput(dentry);
+	end_dirop(dentry);
 	dentry = ERR_PTR(error);
-unlock:
-	inode_unlock(path->dentry->d_inode);
+out_drop_write:
 	if (!error)
 		mnt_drop_write(path->mnt);
 out:
@@ -4256,11 +4293,26 @@ struct dentry *start_creating_path(int dfd, const char *pathname,
 }
 EXPORT_SYMBOL(start_creating_path);
 
+/**
+ * end_creating_path - finish a code section started by start_creating_path()
+ * @path: the path instantiated by start_creating_path()
+ * @dentry: the dentry returned by start_creating_path()
+ *
+ * end_creating_path() will unlock any locks taken by start_creating_path()
+ * and drop any references that were taken. It should only be called
+ * if start_creating_path() returned a non-error.
+ * If vfs_mkdir() was called and it returned an error, that error *should*
+ * be passed to end_creating_path() together with the path.
+ */
 void end_creating_path(const struct path *path, struct dentry *dentry)
 {
-	if (!IS_ERR(dentry))
-		dput(dentry);
-	inode_unlock(path->dentry->d_inode);
+	if (IS_ERR(dentry))
+		/* The parent is still locked despite the error from
+		 * vfs_mkdir() - must unlock it.
+		 */
+		inode_unlock(path->dentry->d_inode);
+	else
+		end_dirop(dentry);
 	mnt_drop_write(path->mnt);
 	path_put(path);
 }
@@ -4592,8 +4644,7 @@ int do_rmdir(int dfd, struct filename *name)
 	if (error)
 		goto exit2;
 
-	inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT);
-	dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags);
+	dentry = start_dirop(path.dentry, &last, lookup_flags);
 	error = PTR_ERR(dentry);
 	if (IS_ERR(dentry))
 		goto exit3;
@@ -4602,9 +4653,8 @@ int do_rmdir(int dfd, struct filename *name)
 		goto exit4;
 	error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry);
 exit4:
-	dput(dentry);
+	end_dirop(dentry);
 exit3:
-	inode_unlock(path.dentry->d_inode);
 	mnt_drop_write(path.mnt);
 exit2:
 	path_put(&path);
@@ -4721,8 +4771,7 @@ int do_unlinkat(int dfd, struct filename *name)
 	if (error)
 		goto exit2;
 retry_deleg:
-	inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT);
-	dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags);
+	dentry = start_dirop(path.dentry, &last, lookup_flags);
 	error = PTR_ERR(dentry);
 	if (!IS_ERR(dentry)) {
 
@@ -4737,9 +4786,8 @@ int do_unlinkat(int dfd, struct filename *name)
 		error = vfs_unlink(mnt_idmap(path.mnt), path.dentry->d_inode,
 				   dentry, &delegated_inode);
 exit3:
-		dput(dentry);
+		end_dirop(dentry);
 	}
-	inode_unlock(path.dentry->d_inode);
 	if (inode)
 		iput(inode);	/* truncate the inode here */
 	inode = NULL;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 03e450dd5211..9e7556e79d19 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3196,6 +3196,8 @@ extern void iterate_supers_type(struct file_system_type *,
 void filesystems_freeze(void);
 void filesystems_thaw(void);
 
+void end_dirop(struct dentry *de);
+
 extern int dcache_dir_open(struct inode *, struct file *);
 extern int dcache_dir_close(struct inode *, struct file *);
 extern loff_t dcache_dir_lseek(struct file *, loff_t, int);
-- 
2.50.0.107.gf914562f5916.dirty
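
The pairing described in the commit message (start_dirop() returns with the
parent locked only on success, and end_dirop() does nothing for an error
pointer) can be sketched in userspace roughly as below. This is only an
illustration, not kernel code: every mock_* type and function is a
hypothetical stand-in, the inode lock is modelled as a plain counter, and
the lookup is reduced to a single child compared by name.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct inode / struct dentry. */
struct mock_inode {
	int lock_depth;			/* 1 while "locked", 0 otherwise */
};

struct mock_dentry {
	struct mock_dentry *parent;
	struct mock_dentry *child;	/* at most one child, for brevity */
	struct mock_inode *inode;
	const char *name;
	int refcount;
};

/* Minimal imitation of the kernel's ERR_PTR()/IS_ERR() convention. */
#define MOCK_MAX_ERRNO 4095

static void *mock_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MOCK_MAX_ERRNO;
}

/* Assumed lookup: returns the child with an extra reference, or -ENOENT. */
static struct mock_dentry *mock_lookup(struct mock_dentry *parent,
				       const char *name)
{
	struct mock_dentry *c = parent->child;

	if (c && strcmp(c->name, name) == 0) {
		c->refcount++;
		return c;
	}
	return mock_err_ptr(-ENOENT);
}

/* Lock the parent, look up the name, and unlock again only on failure. */
static struct mock_dentry *mock_start_dirop(struct mock_dentry *parent,
					    const char *name)
{
	struct mock_dentry *d;

	parent->inode->lock_depth++;
	d = mock_lookup(parent, name);
	if (mock_is_err(d))
		parent->inode->lock_depth--;
	return d;
}

/* Partner of mock_start_dirop(): a no-op when handed an error pointer. */
static void mock_end_dirop(struct mock_dentry *d)
{
	if (!mock_is_err(d)) {
		d->parent->inode->lock_depth--;
		d->refcount--;
	}
}

/* Exercises the success path and the error path; returns 0 when done. */
int run_demo(void)
{
	struct mock_inode dir_inode = { .lock_depth = 0 };
	struct mock_dentry dir = { .inode = &dir_inode, .name = "dir",
				   .refcount = 1 };
	struct mock_dentry file = { .parent = &dir, .name = "file",
				    .refcount = 1 };
	struct mock_dentry *d;

	dir.child = &file;

	/* Success: parent stays locked and a reference is held. */
	d = mock_start_dirop(&dir, "file");
	assert(!mock_is_err(d));
	assert(dir_inode.lock_depth == 1 && file.refcount == 2);
	mock_end_dirop(d);
	assert(dir_inode.lock_depth == 0 && file.refcount == 1);

	/* Failure: nothing is left locked, so end is safe but does nothing. */
	d = mock_start_dirop(&dir, "missing");
	assert(mock_is_err(d));
	assert(dir_inode.lock_depth == 0);
	mock_end_dirop(d);
	assert(dir_inode.lock_depth == 0);

	return 0;
}
```

The point of the sketch is the symmetry: a caller may pass whatever
mock_start_dirop() returned straight to mock_end_dirop() without checking
it first. It does not model the vfs_mkdir() case from the commit message,
where an error pointer arrives while the parent is still locked and an
explicit unlock is needed.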