From nobody Fri Dec 19 14:21:18 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7EF8E288534 for ; Mon, 19 May 2025 16:12:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747671178; cv=none; b=AvC7GRbrtIEjb/6ufe1OHSxLpmCFMY7h4cJQ+HixbnZSLh4wUjZib/PLFAr4hXvTLi45yN/7tQSoPGMkHvG4SpYmjbNtFyGbDaCxBaq15F/4AIyMb+n0rjBuAE5q3TQjp90GPWB+SXVcPrIHOQg9RVbMyxWKuKcWx/Bag6qm+rs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747671178; c=relaxed/simple; bh=ISTczGxadcUclkDXLZV3Eskuo03uqtJ5PbSQWhYAFf8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kV3nW9+YveOWk4f0JInjmzDnHTedHwenBWhEt3o3zElf6hgPfCUGRdKxho1oXMZCz1BZaZjC9z5WdWVN97A4PiR2UzcKRxe6ebJoVDQQ8u82sFAA/cGqPyecVl4u+PA/AOeeU7YMWJyeaUF6Gl+t0TRMVi0pkcPN+sXBN0fv8W0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=iHBJdZwG; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="iHBJdZwG" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1747671174; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=twf/BYD5BbQPuvJN3ljwkR3vFlDrn0agjdPwKHC7ADw=; b=iHBJdZwGUEgbKi1vSXbI+xVKnbdCKKzdbrozbpwOkWruKPCK/drZ9sMPeY/0kL/fTwWrCq DYfXJ8/j159mcP1kvkS6TfHsbV2JlEAFtgFy4vtzQTao5qNIacdhONhb2F3D4VJWe+4Igc qLMV6sjcrQFgLxl2lKudRwE1PxCNF5M= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-491-K69R6ugYPU-q2QQQMtf50Q-1; Mon, 19 May 2025 12:11:44 -0400 X-MC-Unique: K69R6ugYPU-q2QQQMtf50Q-1 X-Mimecast-MFC-AGG-ID: K69R6ugYPU-q2QQQMtf50Q_1747671102 Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id B30A01956046; Mon, 19 May 2025 16:11:41 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.188]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 74A8830001AA; Mon, 19 May 2025 16:11:37 +0000 (UTC) From: David Howells To: Christian Brauner Cc: David Howells , Marc Dionne , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Etienne Champetier , Jeffrey Altman , Chet Ramey , Cheyenne Wills , Alexander Viro , Christian Brauner , Steve French , openafs-devel@openafs.org, linux-cifs@vger.kernel.org Subject: [PATCH 1/2] afs, bash: Fix open(O_CREAT) on an extant AFS file in a sticky dir Date: Mon, 19 May 2025 17:11:22 +0100 Message-ID: <20250519161125.2981681-2-dhowells@redhat.com> In-Reply-To: <20250519161125.2981681-1-dhowells@redhat.com> References: <20250519161125.2981681-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 Content-Type: text/plain; charset="utf-8" Since version 1.11 (January 1992) Bash has a work around in redir_open() that causes open(O_CREAT) of a file to be retried without O_CREAT if open() fails with an EACCES error if bash was built with AFS workarounds configured: #if defined (AFS) if ((fd < 0) && (errno =3D=3D EACCES)) { fd =3D open (filename, flags & ~O_CREAT, mode); errno =3D EACCES; /* restore errno */ } #endif /* AFS */ The ~O_CREAT fallback logic was introduced to workaround a bug[1] in the IBM AFS 3.1 cache manager and server which can return EACCES in preference to EEXIST if the requested file exists but the caller is neither granted explicit PRSFS_READ permission nor is the file owner and is granted PRSFS_INSERT permission on the directory. IBM AFS 3.2 altered the cache manager permission checks but failed to correct the permission checks in the AFS server. As of this writing, all IBM AFS derived servers continue to return EACCES in preference to EEXIST when these conditions are met. Bug reports have been filed with all implementations. As an unintended side effect, the Bash fallback logic also undermines the Linux kernel protections against O_CREAT opening FIFOs and regular files not owned by the user in world writeable sticky directories - unless the owner is the same as that of the directory - as was added in commit 30aba6656f61e ("namei: allow restricted O_CREAT of FIFOs and regular files"). As a result the Bash fallback logic masks an incompatibility between the ownership checks performed by may_create_in_sticky() and network filesystems such as AFS where the uid namespace is disjoint from the uid namespace of the local system. However, the bash work around is going to be removed[2]. Fix this in the kernel by: (1) Provide an ->is_owned_by_me() inode op, similar to ->permission(), and, if provided, call that to determine if the caller owns the file instead of checking the i_uid to current_fsuid(). (2) Provide a ->have_same_owner() inode op, that, if provided, can be called to see if an inode has the same owner as the parent on the path walked. For kafs, use the first hook to check to see if the server indicated the ADMINISTER bit in the access rights returned by the FS.FetchStatus and suchlike and the second hook to compare the AFS user IDs retrieved by FS.FetchStatus (which may not fit in a kuid if AuriStor's YFS variant). These hooks should probably be used in other places too, and a follow-up patch will be submitted for that. This can be tested by creating a sticky directory (the user must have a token to do this) and creating a file in it. Then strace bash doing "echo foo >>file" and look at whether bash does a single, successful O_CREAT open on the file or whether that one fails and then bash does one without O_CREAT that succeeds. Fixes: 30aba6656f61 ("namei: allow restricted O_CREAT of FIFOs and regular = files") Reported-by: Etienne Champetier Signed-off-by: David Howells cc: Marc Dionne cc: Jeffrey Altman cc: Chet Ramey cc: Cheyenne Wills cc: Alexander Viro cc: Christian Brauner cc: Steve French cc: linux-afs@lists.infradead.org cc: openafs-devel@openafs.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Link: https://groups.google.com/g/gnu.bash.bug/c/6PPTfOgFdL4/m/2AQU-S1N76UJ= [1] Link: https://git.savannah.gnu.org/cgit/bash.git/tree/redir.c?h=3Dbash-5.3-= rc1#n733 [2] --- fs/afs/dir.c | 2 ++ fs/afs/file.c | 2 ++ fs/afs/internal.h | 3 +++ fs/afs/security.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++ fs/internal.h | 1 + fs/namei.c | 50 +++++++++++++++++++++++++++++++++++--------- include/linux/fs.h | 3 +++ 7 files changed, 103 insertions(+), 10 deletions(-) diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 9e7b1fe82c27..6360db1673b0 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -65,6 +65,8 @@ const struct inode_operations afs_dir_inode_operations = =3D { .permission =3D afs_permission, .getattr =3D afs_getattr, .setattr =3D afs_setattr, + .is_owned_by_me =3D afs_is_owned_by_me, + .have_same_owner =3D afs_have_same_owner, }; =20 const struct address_space_operations afs_dir_aops =3D { diff --git a/fs/afs/file.c b/fs/afs/file.c index fc15497608c6..0317f0a36cf2 100644 --- a/fs/afs/file.c +++ b/fs/afs/file.c @@ -47,6 +47,8 @@ const struct inode_operations afs_file_inode_operations = =3D { .getattr =3D afs_getattr, .setattr =3D afs_setattr, .permission =3D afs_permission, + .is_owned_by_me =3D afs_is_owned_by_me, + .have_same_owner =3D afs_have_same_owner, }; =20 const struct address_space_operations afs_file_aops =3D { diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 440b0e731093..fbfbf615abe3 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -1495,6 +1495,9 @@ extern struct key *afs_request_key(struct afs_cell *); extern struct key *afs_request_key_rcu(struct afs_cell *); extern int afs_check_permit(struct afs_vnode *, struct key *, afs_access_t= *); extern int afs_permission(struct mnt_idmap *, struct inode *, int); +int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode); +int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode, + struct dentry *dentry); extern void __exit afs_clean_up_permit_cache(void); =20 /* diff --git a/fs/afs/security.c b/fs/afs/security.c index 6a7744c9e2a2..a49070c8342d 100644 --- a/fs/afs/security.c +++ b/fs/afs/security.c @@ -477,6 +477,58 @@ int afs_permission(struct mnt_idmap *idmap, struct ino= de *inode, return ret; } =20 +/* + * Determine if an inode is owned by 'me' - whatever that means for the + * filesystem. In the case of AFS, this means that the file is owned by t= he + * AFS user represented by the token (e.g. from a kerberos server) held in= a + * key. Returns 0 if owned by me, 1 if not; can also return an error. + */ +int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode) +{ + struct afs_vnode *vnode =3D AFS_FS_I(inode); + afs_access_t access; + struct key *key; + int ret; + + key =3D afs_request_key(vnode->volume->cell); + if (IS_ERR(key)) + return PTR_ERR(key); + + /* Get the access rights for the key on this file. */ + ret =3D afs_check_permit(vnode, key, &access); + if (ret < 0) + goto error; + + /* We get the ADMINISTER bit if we own the file. */ + ret =3D (access & AFS_ACE_ADMINISTER) ? 0 : 1; +error: + key_put(key); + return ret; +} + +/* + * Determine if a file has the same owner as its parent - whatever that me= ans + * for the filesystem. In the case of AFS, this means comparing their AFS + * UIDs. Returns 0 if same, 1 if not same; can also return an error. + */ +int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode, + struct dentry *dentry) +{ + struct afs_vnode *vnode =3D AFS_FS_I(inode), *dvnode; + struct dentry *parent; + s64 owner; + + /* Get the owner's ID for the directory. Ideally, we'd use RCU to + * access the parent rather than getting a ref. + */ + parent =3D dget_parent(dentry); + dvnode =3D AFS_FS_I(d_backing_inode(parent)); + owner =3D dvnode->status.owner; + dput(parent); + + return vnode->status.owner !=3D owner; +} + void __exit afs_clean_up_permit_cache(void) { int i; diff --git a/fs/internal.h b/fs/internal.h index b9b3e29a73fd..9e84bfc5aee6 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -52,6 +52,7 @@ extern int finish_clean_context(struct fs_context *fc); /* * namei.c */ +int vfs_inode_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode); extern int filename_lookup(int dfd, struct filename *name, unsigned flags, struct path *path, struct path *root); int do_rmdir(int dfd, struct filename *name); diff --git a/fs/namei.c b/fs/namei.c index 84a0e0b0111c..9f42dc46322f 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -53,8 +53,8 @@ * The new code replaces the old recursive symlink resolution with * an iterative one (in case of non-nested symlink chains). It does * this with calls to _follow_link(). - * As a side effect, dir_namei(), _namei() and follow_link() are now=20 - * replaced with a single function lookup_dentry() that can handle all=20 + * As a side effect, dir_namei(), _namei() and follow_link() are now + * replaced with a single function lookup_dentry() that can handle all * the special cases of the former code. * * With the new dcache, the pathname is stored at each inode, at least as @@ -1149,6 +1149,36 @@ fs_initcall(init_fs_namei_sysctls); =20 #endif /* CONFIG_SYSCTL */ =20 +/* + * Determine if an inode is owned by the process (allowing for fsuid overr= ide), + * returning 0 if so, 1 if not and a negative error code if there was a pr= oblem + * making the determination. + */ +int vfs_inode_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode) +{ + if (unlikely(inode->i_op->is_owned_by_me)) + return inode->i_op->is_owned_by_me(idmap, inode); + + return vfsuid_eq_kuid(i_uid_into_vfsuid(idmap, inode), + current_fsuid()) ? 1 : 0; +} + +/* + * Determine if two inodes have the same owner, returning 0 if so, 1 if no= t and + * a negative error code if there was a problem making the determination. + */ +static int vfs_inodes_have_same_owner(struct mnt_idmap *idmap, struct inod= e *inode, + const struct nameidata *nd) +{ + if (unlikely(inode->i_op->have_same_owner)) + return inode->i_op->have_same_owner(idmap, inode, nd->path.dentry); + + if (vfsuid_valid(nd->dir_vfsuid) && + vfsuid_eq(i_uid_into_vfsuid(idmap, inode), nd->dir_vfsuid)) + return 0; + return 1; /* Not same. */ +} + /** * may_follow_link - Check symlink following for unsafe situations * @nd: nameidata pathwalk data @@ -1302,10 +1332,10 @@ int may_linkat(struct mnt_idmap *idmap, const struc= t path *link) * Returns 0 if the open is allowed, -ve on error. */ static int may_create_in_sticky(struct mnt_idmap *idmap, struct nameidata = *nd, - struct inode *const inode) + struct inode *inode) { umode_t dir_mode =3D nd->dir_mode; - vfsuid_t dir_vfsuid =3D nd->dir_vfsuid, i_vfsuid; + int ret; =20 if (likely(!(dir_mode & S_ISVTX))) return 0; @@ -1316,13 +1346,13 @@ static int may_create_in_sticky(struct mnt_idmap *i= dmap, struct nameidata *nd, if (S_ISFIFO(inode->i_mode) && !sysctl_protected_fifos) return 0; =20 - i_vfsuid =3D i_uid_into_vfsuid(idmap, inode); - - if (vfsuid_eq(i_vfsuid, dir_vfsuid)) - return 0; + ret =3D vfs_inodes_have_same_owner(idmap, inode, nd); + if (ret <=3D 0) + return ret; =20 - if (vfsuid_eq_kuid(i_vfsuid, current_fsuid())) - return 0; + ret =3D vfs_inode_is_owned_by_me(idmap, inode); + if (ret <=3D 0) + return ret; =20 if (likely(dir_mode & 0002)) { audit_log_path_denied(AUDIT_ANOM_CREAT, "sticky_create"); diff --git a/include/linux/fs.h b/include/linux/fs.h index 016b0fe1536e..ec278d2d362a 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2236,6 +2236,9 @@ struct inode_operations { struct dentry *dentry, struct fileattr *fa); int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa); struct offset_ctx *(*get_offset_ctx)(struct inode *inode); + int (*is_owned_by_me)(struct mnt_idmap *idmap, struct inode *inode); + int (*have_same_owner)(struct mnt_idmap *idmap, struct inode *inode, + struct dentry *dentry); } ____cacheline_aligned; =20 static inline int call_mmap(struct file *file, struct vm_area_struct *vma) From nobody Fri Dec 19 14:21:18 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C512828937C for ; Mon, 19 May 2025 16:11:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747671120; cv=none; b=oJRZzri1mLu8+GieEX2w3jQ804mZ3hOgm5xHmaPjcjJ/C1Zk/ZkEZ6lh0pzbCFijcEaJ2wjTqOvlqxX+cdz/OVATMwCvobE3Ex5eEaT2TbFr8jiW1+vri5AvkzwKasSYAkTzEACz9+2nT1pIevb/iBvUNUgL5iJRakS1N1/B9+A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747671120; c=relaxed/simple; bh=qdUEjr0o3Loh5jJb54EHdWN7nxxaG+JTaVGIjKDdt6k=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Fv+RmT7WzfwN4d/YhM6+ajhG5tg2Kl+qO9uXmhhBeAwnO4TkJiUxteVanMNlxeST80NB+N/E+wvdr0aCrwBxMMHSDLM30/roIeo27Ulf2z31yQZyPFJffB3O/AtrV3A7s+YT69WaatY3Wjfks1W1Pncm+OT48NqSNYV6axJZOnM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=M68sPAoz; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="M68sPAoz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1747671116; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UKCLX6wUxa17ZeR6w14OcNdKezZe2LB8Wh0QVRX/Q1o=; b=M68sPAozsImE6+VrxtDNPKg0GuUz5KxgTOsT+blEXaZGURnGDagklZMSEEHJV2imHaA44s ZCSASzILxu3Edr92RpLWQQGZh9wSP9X2aeMrW9Mg8jQF361kDhnX03ste1/st1trAocmRX xcbaivJKB/wQRfDy0Z3rVVpvoOqlYVQ= Received: from mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (ec2-35-165-154-97.us-west-2.compute.amazonaws.com [35.165.154.97]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-661-dAhIxSivP-2MC3fpxUnfHA-1; Mon, 19 May 2025 12:11:51 -0400 X-MC-Unique: dAhIxSivP-2MC3fpxUnfHA-1 X-Mimecast-MFC-AGG-ID: dAhIxSivP-2MC3fpxUnfHA_1747671108 Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-06.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7E5841800360; Mon, 19 May 2025 16:11:47 +0000 (UTC) Received: from warthog.procyon.org.com (unknown [10.42.28.188]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 36B0B1956095; Mon, 19 May 2025 16:11:43 +0000 (UTC) From: David Howells To: Christian Brauner Cc: David Howells , Marc Dionne , linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Etienne Champetier , Jeffrey Altman , Chet Ramey , Cheyenne Wills , Alexander Viro , Christian Brauner , Steve French , Mimi Zohar , openafs-devel@openafs.org, linux-cifs@vger.kernel.org, linux-integrity@vger.kernel.org Subject: [PATCH 2/2] vfs: Fix inode ownership checks with regard to foreign ownership Date: Mon, 19 May 2025 17:11:23 +0100 Message-ID: <20250519161125.2981681-3-dhowells@redhat.com> In-Reply-To: <20250519161125.2981681-1-dhowells@redhat.com> References: <20250519161125.2981681-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 Content-Type: text/plain; charset="utf-8" Fix a number of ownership checks made by the VFS that assume that inode->i_uid is meaningful with respect to the UID space of the system performing the check. Network filesystems, however, may violate this assumption - and, indeed, a network filesystem may not even have an actual concept of a UNIX integer UID (cifs, for example). There are a number of places within the VFS where UID checks are made and some of these should be deferring the interpretation to the filesystem by way of the previously added vfs_inode_is_owned_by_me() and vfs_inodes_have_same_owner(): (*) chown_ok() (*) chgrp_ok() These should call vfs_inode_is_owned_by_me(). Possibly these need to defer all their checks to the network filesystem as the interpretation of the new UID/GID depends on the netfs too, but the ->setattr() method gets a chance to deal with that. (*) do_coredump() Should probably call vfs_is_owned_by_me() to check that the file created is owned by the caller - but the check that's there might be sufficient. (*) inode_owner_or_capable() Should call vfs_is_owned_by_me(). I'm not sure whether the namespace mapping makes sense in such a case, but it probably could be used. (*) vfs_setlease() Should call vfs_is_owned_by_me(). Actually, it should query if leasing is permitted. Also, setting locks could perhaps do with a permission call to the filesystem driver as AFS, for example, has a lock permission bit in the ACL, but since the AFS server checks that when the RPC call is made, it's probably unnecessary. (*) acl_permission_check() (*) posix_acl_permission() These functions are only used by generic_permission() which is overridden if ->permission() is supplied, and when evaluating a POSIX ACL, it should arguably be checking the UID anyway. AFS, for example, implements its own ACLs and evaluates them in ->permission() and on the server. (*) may_follow_link() Should call vfs_is_owned_by_me() and also vfs_have_same_owner() on the the link and its parent dir. The latter only applies on world-writable sticky dirs. (*) may_create_in_sticky() The initial subject of this patch. Should call vfs_is_owned_by_me() and also vfs_have_same_owner() both. (*) __check_sticky() Should call vfs_is_owned_by_me() on both the dir and the inode. (*) may_dedupe_file() Should call vfs_is_owned_by_me(). (*) IMA policy ops. I'm not sure what the best way to deal with this is - if, indeed, it needs any changes. Note that wrapping stuff up into vfs_inode_is_owned_by_me() isn't necessarily the most efficient as it means we may end up doing the uid idmapping an extra time - though mostly this is in places where I'm not sure it matters so much. Signed-off-by: David Howells cc: Etienne Champetier cc: Marc Dionne cc: Jeffrey Altman cc: Chet Ramey cc: Cheyenne Wills cc: Alexander Viro cc: Christian Brauner cc: Steve French cc: Mimi Zohar cc: linux-afs@lists.infradead.org cc: openafs-devel@openafs.org cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org cc: linux-integrity@vger.kernel.org Link: https://groups.google.com/g/gnu.bash.bug/c/6PPTfOgFdL4/m/2AQU-S1N76UJ Link: https://git.savannah.gnu.org/cgit/bash.git/tree/redir.c?h=3Dbash-5.3-= rc1#n733 Reviewed-by: Jeffrey Altman --- fs/attr.c | 58 +++++++++++++++++++++++++++++------------------- fs/coredump.c | 3 +-- fs/inode.c | 8 +++++-- fs/locks.c | 7 ++++-- fs/namei.c | 30 +++++++++++++------------ fs/remap_range.c | 20 +++++++++-------- 6 files changed, 74 insertions(+), 52 deletions(-) diff --git a/fs/attr.c b/fs/attr.c index 9caf63d20d03..fefd92af56a2 100644 --- a/fs/attr.c +++ b/fs/attr.c @@ -16,6 +16,7 @@ #include #include #include +#include "internal.h" =20 /** * setattr_should_drop_sgid - determine whether the setgid bit needs to be @@ -91,19 +92,21 @@ EXPORT_SYMBOL(setattr_should_drop_suidgid); * permissions. On non-idmapped mounts or if permission checking is to be * performed on the raw inode simply pass @nop_mnt_idmap. */ -static bool chown_ok(struct mnt_idmap *idmap, - const struct inode *inode, vfsuid_t ia_vfsuid) +static int chown_ok(struct mnt_idmap *idmap, + const struct inode *inode, vfsuid_t ia_vfsuid) { vfsuid_t vfsuid =3D i_uid_into_vfsuid(idmap, inode); - if (vfsuid_eq_kuid(vfsuid, current_fsuid()) && - vfsuid_eq(ia_vfsuid, vfsuid)) - return true; + int ret; + + ret =3D vfs_inode_is_owned_by_me(idmap, inode); + if (ret <=3D 0) + return ret; if (capable_wrt_inode_uidgid(idmap, inode, CAP_CHOWN)) - return true; + return 0; if (!vfsuid_valid(vfsuid) && ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN)) - return true; - return false; + return 0; + return -EPERM; } =20 /** @@ -118,23 +121,27 @@ static bool chown_ok(struct mnt_idmap *idmap, * permissions. On non-idmapped mounts or if permission checking is to be * performed on the raw inode simply pass @nop_mnt_idmap. */ -static bool chgrp_ok(struct mnt_idmap *idmap, - const struct inode *inode, vfsgid_t ia_vfsgid) +static int chgrp_ok(struct mnt_idmap *idmap, + const struct inode *inode, vfsgid_t ia_vfsgid) { vfsgid_t vfsgid =3D i_gid_into_vfsgid(idmap, inode); - vfsuid_t vfsuid =3D i_uid_into_vfsuid(idmap, inode); - if (vfsuid_eq_kuid(vfsuid, current_fsuid())) { + int ret; + + ret =3D vfs_inode_is_owned_by_me(idmap, inode); + if (ret < 0) + return ret; + if (ret =3D=3D 0) { if (vfsgid_eq(ia_vfsgid, vfsgid)) - return true; + return 0; if (vfsgid_in_group_p(ia_vfsgid)) - return true; + return 0; } if (capable_wrt_inode_uidgid(idmap, inode, CAP_CHOWN)) - return true; + return 0; if (!vfsgid_valid(vfsgid) && ns_capable(inode->i_sb->s_user_ns, CAP_CHOWN)) - return true; - return false; + return 0; + return -EPERM; } =20 /** @@ -163,6 +170,7 @@ int setattr_prepare(struct mnt_idmap *idmap, struct den= try *dentry, { struct inode *inode =3D d_inode(dentry); unsigned int ia_valid =3D attr->ia_valid; + int ret; =20 /* * First check size constraints. These can't be overriden using @@ -179,14 +187,18 @@ int setattr_prepare(struct mnt_idmap *idmap, struct d= entry *dentry, goto kill_priv; =20 /* Make sure a caller can chown. */ - if ((ia_valid & ATTR_UID) && - !chown_ok(idmap, inode, attr->ia_vfsuid)) - return -EPERM; + if (ia_valid & ATTR_UID) { + ret =3D chown_ok(idmap, inode, attr->ia_vfsuid); + if (ret < 0) + return ret; + } =20 /* Make sure caller can chgrp. */ - if ((ia_valid & ATTR_GID) && - !chgrp_ok(idmap, inode, attr->ia_vfsgid)) - return -EPERM; + if (ia_valid & ATTR_GID) { + ret =3D chgrp_ok(idmap, inode, attr->ia_vfsgid); + if (ret < 0) + return ret; + } =20 /* Make sure a caller can chmod. */ if (ia_valid & ATTR_MODE) { diff --git a/fs/coredump.c b/fs/coredump.c index c33c177a701b..ded3936b2067 100644 --- a/fs/coredump.c +++ b/fs/coredump.c @@ -720,8 +720,7 @@ void do_coredump(const kernel_siginfo_t *siginfo) * filesystem. */ idmap =3D file_mnt_idmap(cprm.file); - if (!vfsuid_eq_kuid(i_uid_into_vfsuid(idmap, inode), - current_fsuid())) { + if (vfs_inode_is_owned_by_me(idmap, inode) !=3D 0) { coredump_report_failure("Core dump to %s aborted: " "cannot preserve file owner", cn.corename); goto close_fail; diff --git a/fs/inode.c b/fs/inode.c index 99318b157a9a..7e91b6f87757 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2586,11 +2586,15 @@ bool inode_owner_or_capable(struct mnt_idmap *idmap, { vfsuid_t vfsuid; struct user_namespace *ns; + int ret; =20 - vfsuid =3D i_uid_into_vfsuid(idmap, inode); - if (vfsuid_eq_kuid(vfsuid, current_fsuid())) + ret =3D vfs_inode_is_owned_by_me(idmap, inode); + if (ret =3D=3D 0) return true; + if (ret < 0) + return false; =20 + vfsuid =3D i_uid_into_vfsuid(idmap, inode); ns =3D current_user_ns(); if (vfsuid_has_mapping(ns, vfsuid) && ns_capable(ns, CAP_FOWNER)) return true; diff --git a/fs/locks.c b/fs/locks.c index 1619cddfa7a4..b7a2a3ab7529 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -68,6 +68,7 @@ #include =20 #include +#include "internal.h" =20 static struct file_lock *file_lock(struct file_lock_core *flc) { @@ -2013,10 +2014,12 @@ int vfs_setlease(struct file *filp, int arg, struct file_lease **lease, void *= *priv) { struct inode *inode =3D file_inode(filp); - vfsuid_t vfsuid =3D i_uid_into_vfsuid(file_mnt_idmap(filp), inode); int error; =20 - if ((!vfsuid_eq_kuid(vfsuid, current_fsuid())) && !capable(CAP_LEASE)) + error =3D vfs_inode_is_owned_by_me(file_mnt_idmap(filp), inode); + if (error < 0) + return error; + if (error !=3D 0 && !capable(CAP_LEASE)) return -EACCES; if (!S_ISREG(inode->i_mode)) return -EINVAL; diff --git a/fs/namei.c b/fs/namei.c index 9f42dc46322f..6ede952d424a 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1195,26 +1195,26 @@ static int vfs_inodes_have_same_owner(struct mnt_id= map *idmap, struct inode *ino * * Returns 0 if following the symlink is allowed, -ve on error. */ -static inline int may_follow_link(struct nameidata *nd, const struct inode= *inode) +static inline int may_follow_link(struct nameidata *nd, struct inode *inod= e) { struct mnt_idmap *idmap; - vfsuid_t vfsuid; + int ret; =20 if (!sysctl_protected_symlinks) return 0; =20 - idmap =3D mnt_idmap(nd->path.mnt); - vfsuid =3D i_uid_into_vfsuid(idmap, inode); - /* Allowed if owner and follower match. */ - if (vfsuid_eq_kuid(vfsuid, current_fsuid())) - return 0; - /* Allowed if parent directory not sticky and world-writable. */ if ((nd->dir_mode & (S_ISVTX|S_IWOTH)) !=3D (S_ISVTX|S_IWOTH)) return 0; =20 + idmap =3D mnt_idmap(nd->path.mnt); + /* Allowed if owner and follower match. */ + ret =3D vfs_inode_is_owned_by_me(idmap, inode); + if (ret <=3D 0) + return ret; + /* Allowed if parent directory and link owner match. */ - if (vfsuid_valid(nd->dir_vfsuid) && vfsuid_eq(nd->dir_vfsuid, vfsuid)) + if (vfs_inodes_have_same_owner(idmap, inode, nd)) return 0; =20 if (nd->flags & LOOKUP_RCU) @@ -3157,12 +3157,14 @@ EXPORT_SYMBOL(user_path_at); int __check_sticky(struct mnt_idmap *idmap, struct inode *dir, struct inode *inode) { - kuid_t fsuid =3D current_fsuid(); + int ret; =20 - if (vfsuid_eq_kuid(i_uid_into_vfsuid(idmap, inode), fsuid)) - return 0; - if (vfsuid_eq_kuid(i_uid_into_vfsuid(idmap, dir), fsuid)) - return 0; + ret =3D vfs_inode_is_owned_by_me(idmap, inode); + if (ret <=3D 0) + return ret; + ret =3D vfs_inode_is_owned_by_me(idmap, dir); + if (ret <=3D 0) + return ret; return !capable_wrt_inode_uidgid(idmap, inode, CAP_FOWNER); } EXPORT_SYMBOL(__check_sticky); diff --git a/fs/remap_range.c b/fs/remap_range.c index 26afbbbfb10c..9eee93c27001 100644 --- a/fs/remap_range.c +++ b/fs/remap_range.c @@ -413,20 +413,22 @@ loff_t vfs_clone_file_range(struct file *file_in, lof= f_t pos_in, EXPORT_SYMBOL(vfs_clone_file_range); =20 /* Check whether we are allowed to dedupe the destination file */ -static bool may_dedupe_file(struct file *file) +static int may_dedupe_file(struct file *file) { struct mnt_idmap *idmap =3D file_mnt_idmap(file); struct inode *inode =3D file_inode(file); + int ret; =20 if (capable(CAP_SYS_ADMIN)) - return true; + return 0; if (file->f_mode & FMODE_WRITE) - return true; - if (vfsuid_eq_kuid(i_uid_into_vfsuid(idmap, inode), current_fsuid())) - return true; + return 0; + ret =3D vfs_inode_is_owned_by_me(idmap, inode); + if (ret <=3D 0) + return ret; if (!inode_permission(idmap, inode, MAY_WRITE)) - return true; - return false; + return 0; + return -EPERM; } =20 loff_t vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos, @@ -459,8 +461,8 @@ loff_t vfs_dedupe_file_range_one(struct file *src_file,= loff_t src_pos, if (ret) return ret; =20 - ret =3D -EPERM; - if (!may_dedupe_file(dst_file)) + ret =3D may_dedupe_file(dst_file); + if (ret < 0) goto out_drop_write; =20 ret =3D -EXDEV;