[PATCH v2 5/8] landlock: Always allow signals between threads of the same process

Mickaël Salaün posted 8 patches 9 months ago
[PATCH v2 5/8] landlock: Always allow signals between threads of the same process
Posted by Mickaël Salaün 9 months ago
Because Linux credentials are managed per thread, user space relies on
some hack to synchronize credential update across threads from the same
process.  This is required by the Native POSIX Threads Library and
implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to
synchronize threads.  See nptl(7) and libpsx(3).  Furthermore, some
runtimes like Go do not enable developers to have control over threads
[1].

To avoid potential issues, and because threads are not security
boundaries, let's relax the Landlock (optional) signal scoping to always
allow signals sent between threads of the same process.  This exception
is similar to the __ptrace_may_access() one.

hook_file_set_fowner() now checks if the target task is part of the same
process as the caller.  If this is the case, then the related signal
triggered by the socket will always be allowed.

Scoping of abstract UNIX sockets is not changed because kernel objects
(e.g. sockets) should be tied to their creator's domain at creation
time.

Note that creating one Landlock domain per thread puts each of these
threads (and their future children) in their own scope, which is
probably not what users expect, especially in Go where we do not control
threads.  However, being able to drop permissions on all threads should
not be restricted by signal scoping.  We are working on a way to make it
possible to atomically restrict all threads of a process with the same
domain [2].

Add erratum for signal scoping.

Closes: https://github.com/landlock-lsm/go-landlock/issues/36
Fixes: 54a6e6bbf3be ("landlock: Add signal scoping")
Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads")
Depends-on: 26f204380a3c ("fs: Fix file_set_fowner LSM hook inconsistencies")
Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1]
Link: https://github.com/landlock-lsm/linux/issues/2 [2]
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Serge Hallyn <serge@hallyn.com>
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net
---

Changes since v1:
- Add Acked-by Christian.
- Add Landlock erratum.
- Update subject.
---
 security/landlock/errata/abi-6.h              | 19 ++++++++++++++++
 security/landlock/fs.c                        | 22 +++++++++++++++----
 security/landlock/task.c                      | 12 ++++++++++
 .../selftests/landlock/scoped_signal_test.c   |  2 +-
 4 files changed, 50 insertions(+), 5 deletions(-)
 create mode 100644 security/landlock/errata/abi-6.h

diff --git a/security/landlock/errata/abi-6.h b/security/landlock/errata/abi-6.h
new file mode 100644
index 000000000000..df7bc0e1fdf4
--- /dev/null
+++ b/security/landlock/errata/abi-6.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/**
+ * DOC: erratum_2
+ *
+ * Erratum 2: Scoped signal handling
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ * This fix addresses an issue where signal scoping was overly restrictive,
+ * preventing sandboxed threads from signaling other threads within the same
+ * process if they belonged to different domains.  Because threads are not
+ * security boundaries, user space might assume that any thread within the same
+ * process can send signals between themselves (see :manpage:`nptl(7)` and
+ * :manpage:`libpsx(3)`).  Consistent with :manpage:`ptrace(2)` behavior, direct
+ * interaction between threads of the same process should always be allowed.
+ * This change ensures that any thread is allowed to send signals to any other
+ * thread within the same process, regardless of their domain.
+ */
+LANDLOCK_ERRATUM(2)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index 71b9dc331aae..47c862fe14e4 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -27,7 +27,9 @@
 #include <linux/mount.h>
 #include <linux/namei.h>
 #include <linux/path.h>
+#include <linux/pid.h>
 #include <linux/rcupdate.h>
+#include <linux/sched/signal.h>
 #include <linux/spinlock.h>
 #include <linux/stat.h>
 #include <linux/types.h>
@@ -1630,15 +1632,27 @@ static int hook_file_ioctl_compat(struct file *file, unsigned int cmd,
 
 static void hook_file_set_fowner(struct file *file)
 {
-	struct landlock_ruleset *new_dom, *prev_dom;
+	struct fown_struct *fown = file_f_owner(file);
+	struct landlock_ruleset *new_dom = NULL;
+	struct landlock_ruleset *prev_dom;
+	struct task_struct *p;
 
 	/*
 	 * Lock already held by __f_setown(), see commit 26f204380a3c ("fs: Fix
 	 * file_set_fowner LSM hook inconsistencies").
 	 */
-	lockdep_assert_held(&file_f_owner(file)->lock);
-	new_dom = landlock_get_current_domain();
-	landlock_get_ruleset(new_dom);
+	lockdep_assert_held(&fown->lock);
+
+	/*
+	 * Always allow sending signals between threads of the same process.  This
+	 * ensures consistency with hook_task_kill().
+	 */
+	p = pid_task(fown->pid, fown->pid_type);
+	if (!same_thread_group(p, current)) {
+		new_dom = landlock_get_current_domain();
+		landlock_get_ruleset(new_dom);
+	}
+
 	prev_dom = landlock_file(file)->fown_domain;
 	landlock_file(file)->fown_domain = new_dom;
 
diff --git a/security/landlock/task.c b/security/landlock/task.c
index dc7dab78392e..4578ce6e319d 100644
--- a/security/landlock/task.c
+++ b/security/landlock/task.c
@@ -13,6 +13,7 @@
 #include <linux/lsm_hooks.h>
 #include <linux/rcupdate.h>
 #include <linux/sched.h>
+#include <linux/sched/signal.h>
 #include <net/af_unix.h>
 #include <net/sock.h>
 
@@ -264,6 +265,17 @@ static int hook_task_kill(struct task_struct *const p,
 		/* Dealing with USB IO. */
 		dom = landlock_cred(cred)->domain;
 	} else {
+		/*
+		 * Always allow sending signals between threads of the same process.
+		 * This is required for process credential changes by the Native POSIX
+		 * Threads Library and implemented by the set*id(2) wrappers and
+		 * libcap(3) with tgkill(2).  See nptl(7) and libpsx(3).
+		 *
+		 * This exception is similar to the __ptrace_may_access() one.
+		 */
+		if (same_thread_group(p, current))
+			return 0;
+
 		dom = landlock_get_current_domain();
 	}
 	dom = landlock_get_applicable_domain(dom, signal_scope);
diff --git a/tools/testing/selftests/landlock/scoped_signal_test.c b/tools/testing/selftests/landlock/scoped_signal_test.c
index 475ee62a832d..767f117703b7 100644
--- a/tools/testing/selftests/landlock/scoped_signal_test.c
+++ b/tools/testing/selftests/landlock/scoped_signal_test.c
@@ -281,7 +281,7 @@ TEST(signal_scoping_threads)
 	/* Restricts the domain after creating the first thread. */
 	create_scoped_domain(_metadata, LANDLOCK_SCOPE_SIGNAL);
 
-	ASSERT_EQ(EPERM, pthread_kill(no_sandbox_thread, 0));
+	ASSERT_EQ(0, pthread_kill(no_sandbox_thread, 0));
 	ASSERT_EQ(1, write(thread_pipe[1], ".", 1));
 
 	ASSERT_EQ(0, pthread_create(&scoped_thread, NULL, thread_func, NULL));
-- 
2.48.1

Re: [PATCH v2 5/8] landlock: Always allow signals between threads of the same process
Posted by kernel test robot 8 months, 3 weeks ago

Hello,

kernel test robot noticed "Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN" on:

commit: b9fb5bfdb361fd6d945c09c93d351740310a26c7 ("[PATCH v2 5/8] landlock: Always allow signals between threads of the same process")
url: https://github.com/intel-lab-lkp/linux/commits/Micka-l-Sala-n/landlock-Move-code-to-ease-future-backports/20250319-003737
patch link: https://lore.kernel.org/all/20250318161443.279194-6-mic@digikod.net/
patch subject: [PATCH v2 5/8] landlock: Always allow signals between threads of the same process

in testcase: trinity
version: trinity-x86_64-ba2360ed-1_20241228
with following parameters:

	runtime: 300s
	group: group-03
	nr_groups: 5



config: x86_64-randconfig-005-20250325
compiler: gcc-12
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


we noticed the issue happens randomly (35 times out of 200 runs as below).
but parent keeps clean.


37897789c51dd898 b9fb5bfdb361fd6d945c09c93d3
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :200         18%          35:200   dmesg.KASAN:null-ptr-deref_in_range[#-#]
           :200         18%          35:200   dmesg.Kernel_panic-not_syncing:Fatal_exception
           :200         18%          35:200   dmesg.Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN
           :200         18%          35:200   dmesg.RIP:hook_file_set_fowner



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202503261534.22d970e8-lkp@intel.com


[  354.738531][  T222]
[  355.199494][  T222] [main] 2245715 iterations. [F:1644455 S:601688 HI:11581]
[  355.199514][  T222]
[  355.934630][  T222] [main] 2273938 iterations. [F:1665198 S:609188 HI:11581]
[  355.934650][  T222]
[  356.308897][ T3147] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000151: 0000 [#1] SMP KASAN
[  356.309510][ T3147] KASAN: null-ptr-deref in range [0x0000000000000a88-0x0000000000000a8f]
[  356.309910][ T3147] CPU: 1 UID: 65534 PID: 3147 Comm: trinity-c2 Not tainted 6.14.0-rc5-00005-gb9fb5bfdb361 #1 145c38dc5407add8933da653ccf9cf31d58da93c
[  356.310560][ T3147] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 356.311050][ T3147] RIP: 0010:hook_file_set_fowner (kbuild/src/consumer/include/linux/sched/signal.h:707 (discriminator 9) kbuild/src/consumer/security/landlock/fs.c:1651 (discriminator 9)) 
[ 356.311345][ T3147] Code: 49 8b 7c 24 50 65 4c 8b 25 e7 e4 0c 7e e8 52 63 33 ff 48 ba 00 00 00 00 00 fc ff df 48 8d b8 88 0a 00 00 48 89 f9 48 c1 e9 03 <80> 3c 11 00 0f 85 7e 02 00 00 49 8d bc 24 88 0a 00 00 4c 8b a8 88
All code
========
   0:	49 8b 7c 24 50       	mov    0x50(%r12),%rdi
   5:	65 4c 8b 25 e7 e4 0c 	mov    %gs:0x7e0ce4e7(%rip),%r12        # 0x7e0ce4f4
   c:	7e 
   d:	e8 52 63 33 ff       	call   0xffffffffff336364
  12:	48 ba 00 00 00 00 00 	movabs $0xdffffc0000000000,%rdx
  19:	fc ff df 
  1c:	48 8d b8 88 0a 00 00 	lea    0xa88(%rax),%rdi
  23:	48 89 f9             	mov    %rdi,%rcx
  26:	48 c1 e9 03          	shr    $0x3,%rcx
  2a:*	80 3c 11 00          	cmpb   $0x0,(%rcx,%rdx,1)		<-- trapping instruction
  2e:	0f 85 7e 02 00 00    	jne    0x2b2
  34:	49 8d bc 24 88 0a 00 	lea    0xa88(%r12),%rdi
  3b:	00 
  3c:	4c                   	rex.WR
  3d:	8b                   	.byte 0x8b
  3e:	a8 88                	test   $0x88,%al

Code starting with the faulting instruction
===========================================
   0:	80 3c 11 00          	cmpb   $0x0,(%rcx,%rdx,1)
   4:	0f 85 7e 02 00 00    	jne    0x288
   a:	49 8d bc 24 88 0a 00 	lea    0xa88(%r12),%rdi
  11:	00 
  12:	4c                   	rex.WR
  13:	8b                   	.byte 0x8b
  14:	a8 88                	test   $0x88,%al
[  356.312254][ T3147] RSP: 0018:ffffc9000883fd20 EFLAGS: 00010002
[  356.312556][ T3147] RAX: 0000000000000000 RBX: ffff88816ee4c700 RCX: 0000000000000151
[  356.312933][ T3147] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000a88
[  356.313310][ T3147] RBP: ffffc9000883fd48 R08: 0000000000000000 R09: 0000000000000000
[  356.313687][ T3147] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88814f0c8000
[  356.314063][ T3147] R13: ffff88814f92b700 R14: ffff888161e71450 R15: ffff888161e71408
[  356.314440][ T3147] FS:  00007f3c72136740(0000) GS:ffff8883af000000(0000) knlGS:0000000000000000
[  356.314879][ T3147] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  356.315194][ T3147] CR2: 00007f3c708bd000 CR3: 0000000165606000 CR4: 00000000000406f0
[  356.315573][ T3147] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  356.315950][ T3147] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  356.316334][ T3147] Call Trace:
[  356.316498][ T3147]  <TASK>
[ 356.316645][ T3147] ? show_regs (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:479) 
[ 356.316859][ T3147] ? die_addr (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:421 kbuild/src/consumer/arch/x86/kernel/dumpstack.c:460) 
[ 356.317066][ T3147] ? exc_general_protection (kbuild/src/consumer/arch/x86/kernel/traps.c:751 kbuild/src/consumer/arch/x86/kernel/traps.c:693) 
[ 356.317349][ T3147] ? asm_exc_general_protection (kbuild/src/consumer/arch/x86/include/asm/idtentry.h:574) 


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250326/202503261534.22d970e8-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH v2 5/8] landlock: Always allow signals between threads of the same process
Posted by Mickaël Salaün 8 months, 3 weeks ago
Thanks for the report.

I assumed __f_setown() was only called in an RCU read-side critical section but
that's not the case (e.g. fcntl_dirnotify).  I moved the pid_task() call in a
dedicated function to protect it with an RCU guard.  Here is the new hunk:


@@ -1628,21 +1630,46 @@ static int hook_file_ioctl_compat(struct file *file, unsigned int cmd,
        return -EACCES;
 }

-static void hook_file_set_fowner(struct file *file)
+/*
+ * Always allow sending signals between threads of the same process.  This
+ * ensures consistency with hook_task_kill().
+ */
+static bool control_current_fowner(struct fown_struct *const fown)
 {
-       struct landlock_ruleset *new_dom, *prev_dom;
+       struct task_struct *p;

        /*
         * Lock already held by __f_setown(), see commit 26f204380a3c ("fs: Fix
         * file_set_fowner LSM hook inconsistencies").
         */
-       lockdep_assert_held(&file_f_owner(file)->lock);
-       new_dom = landlock_get_current_domain();
-       landlock_get_ruleset(new_dom);
+       lockdep_assert_held(&fown->lock);
+
+       /*
+        * Some callers (e.g. fcntl_dirnotify) may not be in an RCU read-side
+        * critical section.
+        */
+       guard(rcu)();
+       p = pid_task(fown->pid, fown->pid_type);
+       if (!p)
+               return true;
+
+       return !same_thread_group(p, current);
+}
+
+static void hook_file_set_fowner(struct file *file)
+{
+       struct landlock_ruleset *prev_dom;
+       struct landlock_ruleset *new_dom = NULL;
+
+       if (control_current_fowner(file_f_owner(file))) {
+               new_dom = landlock_get_current_domain();
+               landlock_get_ruleset(new_dom);
+       }
+
        prev_dom = landlock_file(file)->fown_domain;
        landlock_file(file)->fown_domain = new_dom;

-       /* Called in an RCU read-side critical section. */
+       /* May be called in an RCU read-side critical section. */
        landlock_put_ruleset_deferred(prev_dom);
 }


This other report gives more details:
https://lore.kernel.org/all/202503261510.f9652c11-lkp@intel.com/


On Wed, Mar 26, 2025 at 03:51:50PM +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed "Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN" on:
> 
> commit: b9fb5bfdb361fd6d945c09c93d351740310a26c7 ("[PATCH v2 5/8] landlock: Always allow signals between threads of the same process")
> url: https://github.com/intel-lab-lkp/linux/commits/Micka-l-Sala-n/landlock-Move-code-to-ease-future-backports/20250319-003737
> patch link: https://lore.kernel.org/all/20250318161443.279194-6-mic@digikod.net/
> patch subject: [PATCH v2 5/8] landlock: Always allow signals between threads of the same process
> 
> in testcase: trinity
> version: trinity-x86_64-ba2360ed-1_20241228
> with following parameters:
> 
> 	runtime: 300s
> 	group: group-03
> 	nr_groups: 5
> 
> 
> 
> config: x86_64-randconfig-005-20250325
> compiler: gcc-12
> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> we noticed the issue happens randomly (35 times out of 200 runs as below).
> but parent keeps clean.
> 
> 
> 37897789c51dd898 b9fb5bfdb361fd6d945c09c93d3
> ---------------- ---------------------------
>        fail:runs  %reproduction    fail:runs
>            |             |             |
>            :200         18%          35:200   dmesg.KASAN:null-ptr-deref_in_range[#-#]
>            :200         18%          35:200   dmesg.Kernel_panic-not_syncing:Fatal_exception
>            :200         18%          35:200   dmesg.Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN
>            :200         18%          35:200   dmesg.RIP:hook_file_set_fowner
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202503261534.22d970e8-lkp@intel.com
> 
> 
> [  354.738531][  T222]
> [  355.199494][  T222] [main] 2245715 iterations. [F:1644455 S:601688 HI:11581]
> [  355.199514][  T222]
> [  355.934630][  T222] [main] 2273938 iterations. [F:1665198 S:609188 HI:11581]
> [  355.934650][  T222]
> [  356.308897][ T3147] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000151: 0000 [#1] SMP KASAN
> [  356.309510][ T3147] KASAN: null-ptr-deref in range [0x0000000000000a88-0x0000000000000a8f]
> [  356.309910][ T3147] CPU: 1 UID: 65534 PID: 3147 Comm: trinity-c2 Not tainted 6.14.0-rc5-00005-gb9fb5bfdb361 #1 145c38dc5407add8933da653ccf9cf31d58da93c
> [  356.310560][ T3147] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 356.311050][ T3147] RIP: 0010:hook_file_set_fowner (kbuild/src/consumer/include/linux/sched/signal.h:707 (discriminator 9) kbuild/src/consumer/security/landlock/fs.c:1651 (discriminator 9)) 
> [ 356.311345][ T3147] Code: 49 8b 7c 24 50 65 4c 8b 25 e7 e4 0c 7e e8 52 63 33 ff 48 ba 00 00 00 00 00 fc ff df 48 8d b8 88 0a 00 00 48 89 f9 48 c1 e9 03 <80> 3c 11 00 0f 85 7e 02 00 00 49 8d bc 24 88 0a 00 00 4c 8b a8 88
> All code
> ========
>    0:	49 8b 7c 24 50       	mov    0x50(%r12),%rdi
>    5:	65 4c 8b 25 e7 e4 0c 	mov    %gs:0x7e0ce4e7(%rip),%r12        # 0x7e0ce4f4
>    c:	7e 
>    d:	e8 52 63 33 ff       	call   0xffffffffff336364
>   12:	48 ba 00 00 00 00 00 	movabs $0xdffffc0000000000,%rdx
>   19:	fc ff df 
>   1c:	48 8d b8 88 0a 00 00 	lea    0xa88(%rax),%rdi
>   23:	48 89 f9             	mov    %rdi,%rcx
>   26:	48 c1 e9 03          	shr    $0x3,%rcx
>   2a:*	80 3c 11 00          	cmpb   $0x0,(%rcx,%rdx,1)		<-- trapping instruction
>   2e:	0f 85 7e 02 00 00    	jne    0x2b2
>   34:	49 8d bc 24 88 0a 00 	lea    0xa88(%r12),%rdi
>   3b:	00 
>   3c:	4c                   	rex.WR
>   3d:	8b                   	.byte 0x8b
>   3e:	a8 88                	test   $0x88,%al
> 
> Code starting with the faulting instruction
> ===========================================
>    0:	80 3c 11 00          	cmpb   $0x0,(%rcx,%rdx,1)
>    4:	0f 85 7e 02 00 00    	jne    0x288
>    a:	49 8d bc 24 88 0a 00 	lea    0xa88(%r12),%rdi
>   11:	00 
>   12:	4c                   	rex.WR
>   13:	8b                   	.byte 0x8b
>   14:	a8 88                	test   $0x88,%al
> [  356.312254][ T3147] RSP: 0018:ffffc9000883fd20 EFLAGS: 00010002
> [  356.312556][ T3147] RAX: 0000000000000000 RBX: ffff88816ee4c700 RCX: 0000000000000151
> [  356.312933][ T3147] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000a88
> [  356.313310][ T3147] RBP: ffffc9000883fd48 R08: 0000000000000000 R09: 0000000000000000
> [  356.313687][ T3147] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88814f0c8000
> [  356.314063][ T3147] R13: ffff88814f92b700 R14: ffff888161e71450 R15: ffff888161e71408
> [  356.314440][ T3147] FS:  00007f3c72136740(0000) GS:ffff8883af000000(0000) knlGS:0000000000000000
> [  356.314879][ T3147] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  356.315194][ T3147] CR2: 00007f3c708bd000 CR3: 0000000165606000 CR4: 00000000000406f0
> [  356.315573][ T3147] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  356.315950][ T3147] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  356.316334][ T3147] Call Trace:
> [  356.316498][ T3147]  <TASK>
> [ 356.316645][ T3147] ? show_regs (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:479) 
> [ 356.316859][ T3147] ? die_addr (kbuild/src/consumer/arch/x86/kernel/dumpstack.c:421 kbuild/src/consumer/arch/x86/kernel/dumpstack.c:460) 
> [ 356.317066][ T3147] ? exc_general_protection (kbuild/src/consumer/arch/x86/kernel/traps.c:751 kbuild/src/consumer/arch/x86/kernel/traps.c:693) 
> [ 356.317349][ T3147] ? asm_exc_general_protection (kbuild/src/consumer/arch/x86/include/asm/idtentry.h:574) 
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20250326/202503261534.22d970e8-lkp@intel.com
> 
> 
> 
> -- 
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
> 
>
Re: [linux-next:master] [landlock] 9d65581539: WARNING:suspicious_RCU_usage
Posted by Mickaël Salaün 8 months, 3 weeks ago
On Fri, Mar 28, 2025 at 05:26:32PM +0800, Oliver Sang wrote:
> hi, Mickaël Salaün,
> 
> On Fri, Mar 28, 2025 at 09:31:15AM +0100, Mickaël Salaün wrote:
> > On Fri, Mar 28, 2025 at 02:05:59PM +0800, Oliver Sang wrote:
> > > hi, Mickaël Salaün,
> > > 
> > > On Fri, Mar 28, 2025 at 11:00:37AM +0800, Oliver Sang wrote:
> > > > hi, Mickaël Salaün,
> > > > 
> > > > On Thu, Mar 27, 2025 at 07:41:08PM +0100, Mickaël Salaün wrote:
> > > > > Hi Olivier,
> > > > > 
> > > > > I pushed an updated yesterday in linux-next that should fix this issue
> > > > > and this other issue too:
> > > > > https://lore.kernel.org/all/20250326.yee0ba6Yai3m@digikod.net/
> > > > > 
> > > > > Could you please confirm that these issues are really fixed? Or
> > > > > otherwise, please let me know when I should expect (or not) an email
> > > > > from kernel test robot. :)
> > > > 
> > > > ok, we've started the tests for both issues. upon the commit:
> > > > db8da9da41bce (tag: next-20250327, linux-next/master) Add linux-next specific files for 20250327
> > > 
> > > sorry that due to unknown reason, we cannot build sucessfully upon db8da9da41bce
> > > for both tests (which are using randconfig). could you give us the commit-id
> > > of your fix? we could try test upon that fix commit again.
> > 
> > The new commit is 18eb75f3af40 ("landlock: Always allow signals between
> > threads of the same process").
> 
> it turned out the build failure is due to my typo. shamed...
> 
> we finished tests still upon db8da9da41bce, for the WARNING:suspicious_RCU_usage
> issue in this report, we run the tests 20 times, cannot reproduce now.
> 
> for the random issues we reported in
> https://lore.kernel.org/all/202503261534.22d970e8-lkp@intel.com/
> now we cannot reproduce it with db8da9da41bce by 500 runs.

Yes, that makes sense with my fix.

> 
> we think both issues are solved.
> 
> and since db8da9da41bce includes the 18eb75f3af40, we won't test again upon
> 18eb75f3af40. thanks!

Thanks for the confirmation!

> 
> 
> > 
> > > 
> > > > 
> > > > if this is not the correct commit to check, please let us know. thanks!
> > > > 
> > > > > 
> > > > > Regards,
> > > > >  Mickaël
> > > > > 
> > > > > On Wed, Mar 26, 2025 at 04:00:12PM +0800, kernel test robot wrote:
> > > > > > 
> > > > > > hi, Mickaël Salaün,
> > > > > > 
> > > > > > we just reported a random
> > > > > > "Oops:general_protection_fault,probably_for_non-canonical_address#:#[##]SMP_KASAN"
> > > > > > issue in
> > > > > > https://lore.kernel.org/all/202503261534.22d970e8-lkp@intel.com/
> > > > > > 
> > > > > > now we noticed this commit is also in linux-next/master.
> > > > > > 
> > > > > > we don't have enough knowledge to check the difference, but we found a
> > > > > > persistent issue for this commit.
> > > > > > 
> > > > > > 6d9ac5e4d70eba3e 9d65581539252fdb1666917a095
> > > > > > ---------------- ---------------------------
> > > > > >        fail:runs  %reproduction    fail:runs
> > > > > >            |             |             |
> > > > > >            :6          100%           6:6     dmesg.WARNING:suspicious_RCU_usage
> > > > > >            :6          100%           6:6     dmesg.boot_failures
> > > > > >            :6          100%           6:6     dmesg.kernel/pid.c:#suspicious_rcu_dereference_check()usage
> > > > > > 
> > > > > > below full report FYI.
> > > > > > 
> > > > > > 
> > > > > > Hello,
> > > > > > 
> > > > > > kernel test robot noticed "WARNING:suspicious_RCU_usage" on:
> > > > > > 
> > > > > > commit: 9d65581539252fdb1666917a09549c13090fe9e5 ("landlock: Always allow signals between threads of the same process")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > > > 
> > > > > > [test failed on linux-next/master eb4bc4b07f66f01618d9cb1aa4eaef59b1188415]
> > > > > > 
> > > > > > in testcase: trinity
> > > > > > version: trinity-x86_64-ba2360ed-1_20241228
> > > > > > with following parameters:
> > > > > > 
> > > > > > 	runtime: 300s
> > > > > > 	group: group-00
> > > > > > 	nr_groups: 5
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > config: x86_64-randconfig-101-20250325
> > > > > > compiler: gcc-12
> > > > > > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > > > > > 
> > > > > > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > > > the same patch/commit), kindly add following tags
> > > > > > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > > > > > | Closes: https://lore.kernel.org/oe-lkp/202503261510.f9652c11-lkp@intel.com
> > > > > > 
> > > > > > 
> > > > > > [  166.893101][ T3747] WARNING: suspicious RCU usage
> > > > > > [  166.893462][ T3747] 6.14.0-rc5-00006-g9d6558153925 #1 Not tainted
> > > > > > [  166.893895][ T3747] -----------------------------
> > > > > > [  166.894239][ T3747] kernel/pid.c:414 suspicious rcu_dereference_check() usage!
> > > > > > [  166.894747][ T3747]
> > > > > > [  166.894747][ T3747] other info that might help us debug this:
> > > > > > [  166.894747][ T3747]
> > > > > > [  166.895450][ T3747]
> > > > > > [  166.895450][ T3747] rcu_scheduler_active = 2, debug_locks = 1
> > > > > > [  166.896030][ T3747] 3 locks held by trinity-c2/3747:
> > > > > > [ 166.896415][ T3747] #0: ffff888114a5a930 (&group->mark_mutex){+.+.}-{4:4}, at: fcntl_dirnotify (include/linux/sched/mm.h:332 include/linux/sched/mm.h:386 include/linux/fsnotify_backend.h:279 fs/notify/dnotify/dnotify.c:329) 
> > > > > > [ 166.897165][ T3747] #1: ffff888148bbea60 (&mark->lock){+.+.}-{3:3}, at: fcntl_dirnotify (fs/notify/dnotify/dnotify.c:349) 
> > > > > > [ 166.897831][ T3747] #2: ffff888108a53220 (&f_owner->lock){....}-{3:3}, at: __f_setown (fs/fcntl.c:137) 
> > > > > > [  166.898481][ T3747]
> > > > > > [  166.898481][ T3747] stack backtrace:
> > > > > > [  166.898901][ T3747] CPU: 0 UID: 65534 PID: 3747 Comm: trinity-c2 Not tainted 6.14.0-rc5-00006-g9d6558153925 #1
> > > > > > [  166.898908][ T3747] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > > > > [  166.898912][ T3747] Call Trace:
> > > > > > [  166.898916][ T3747]  <TASK>
> > > > > > [ 166.898921][ T3747] dump_stack_lvl (lib/dump_stack.c:123) 
> > > > > > [ 166.898932][ T3747] lockdep_rcu_suspicious (kernel/locking/lockdep.c:6848) 
> > > > > > [ 166.898945][ T3747] pid_task (kernel/pid.c:414 (discriminator 11)) 
> > > > > > [ 166.898954][ T3747] hook_file_set_fowner (security/landlock/fs.c:1651 (discriminator 9)) 
> > > > > > [ 166.898963][ T3747] security_file_set_fowner (arch/x86/include/asm/atomic.h:23 (discriminator 4) include/linux/atomic/atomic-arch-fallback.h:457 (discriminator 4) include/linux/jump_label.h:262 (discriminator 4) security/security.c:3062 (discriminator 4)) 
> > > > > > [ 166.898969][ T3747] __f_setown (fs/fcntl.c:145) 
> > > > > > [ 166.898980][ T3747] fcntl_dirnotify (fs/notify/dnotify/dnotify.c:233 fs/notify/dnotify/dnotify.c:371) 
> > > > > > [ 166.898996][ T3747] do_fcntl (fs/fcntl.c:539) 
> > > > > > [ 166.899002][ T3747] ? f_getown (fs/fcntl.c:448) 
> > > > > > [ 166.899007][ T3747] ? check_prev_add (kernel/locking/lockdep.c:3862) 
> > > > > > [ 166.899011][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) 
> > > > > > [ 166.899023][ T3747] ? syscall_exit_to_user_mode (include/linux/entry-common.h:361 kernel/entry/common.c:220) 
> > > > > > [ 166.899038][ T3747] __x64_sys_fcntl (fs/fcntl.c:591 fs/fcntl.c:576 fs/fcntl.c:576) 
> > > > > > [ 166.899050][ T3747] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83) 
> > > > > > [ 166.899062][ T3747] ? find_held_lock (kernel/locking/lockdep.c:5341) 
> > > > > > [ 166.899072][ T3747] ? __lock_release+0x10b/0x440 
> > > > > > [ 166.899076][ T3747] ? __task_pid_nr_ns (include/linux/rcupdate.h:347 include/linux/rcupdate.h:880 kernel/pid.c:514) 
> > > > > > [ 166.899082][ T3747] ? reacquire_held_locks (kernel/locking/lockdep.c:5502) 
> > > > > > [ 166.899087][ T3747] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4470) 
> > > > > > [ 166.899093][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) 
> > > > > > [ 166.899099][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) 
> > > > > > [ 166.899111][ T3747] ? syscall_exit_to_user_mode (include/linux/entry-common.h:361 kernel/entry/common.c:220) 
> > > > > > [ 166.899119][ T3747] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:469 kernel/locking/lockdep.c:4409) 
> > > > > > [ 166.899124][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) 
> > > > > > [ 166.899129][ T3747] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4470) 
> > > > > > [ 166.899134][ T3747] ? do_syscall_64 (arch/x86/entry/common.c:102) 
> > > > > > [ 166.899139][ T3747] ? do_int80_emulation (arch/x86/include/asm/atomic.h:23 include/linux/atomic/atomic-arch-fallback.h:457 include/linux/jump_label.h:262 arch/x86/entry/common.c:230) 
> > > > > > [ 166.899149][ T3747] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) 
> > > > > > [  166.899155][ T3747] RIP: 0033:0x7f55ad007719
> > > > > > [ 166.899159][ T3747] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b7 06 0d 00 f7 d8 64 89 01 48
> > > > > > All code
> > > > > > ========
> > > > > >    0:	08 89 e8 5b 5d c3    	or     %cl,-0x3ca2a418(%rcx)
> > > > > >    6:	66 2e 0f 1f 84 00 00 	cs nopw 0x0(%rax,%rax,1)
> > > > > >    d:	00 00 00 
> > > > > >   10:	90                   	nop
> > > > > >   11:	48 89 f8             	mov    %rdi,%rax
> > > > > >   14:	48 89 f7             	mov    %rsi,%rdi
> > > > > >   17:	48 89 d6             	mov    %rdx,%rsi
> > > > > >   1a:	48 89 ca             	mov    %rcx,%rdx
> > > > > >   1d:	4d 89 c2             	mov    %r8,%r10
> > > > > >   20:	4d 89 c8             	mov    %r9,%r8
> > > > > >   23:	4c 8b 4c 24 08       	mov    0x8(%rsp),%r9
> > > > > >   28:	0f 05                	syscall
> > > > > >   2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
> > > > > >   30:	73 01                	jae    0x33
> > > > > >   32:	c3                   	ret
> > > > > >   33:	48 8b 0d b7 06 0d 00 	mov    0xd06b7(%rip),%rcx        # 0xd06f1
> > > > > >   3a:	f7 d8                	neg    %eax
> > > > > >   3c:	64 89 01             	mov    %eax,%fs:(%rcx)
> > > > > >   3f:	48                   	rex.W
> > > > > > 
> > > > > > Code starting with the faulting instruction
> > > > > > ===========================================
> > > > > >    0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
> > > > > >    6:	73 01                	jae    0x9
> > > > > >    8:	c3                   	ret
> > > > > >    9:	48 8b 0d b7 06 0d 00 	mov    0xd06b7(%rip),%rcx        # 0xd06c7
> > > > > >   10:	f7 d8                	neg    %eax
> > > > > >   12:	64 89 01             	mov    %eax,%fs:(%rcx)
> > > > > >   15:	48                   	rex.W
> > > > > > [  166.899164][ T3747] RSP: 002b:00007ffff6eefb48 EFLAGS: 00000246 ORIG_RAX: 0000000000000048
> > > > > > [  166.899168][ T3747] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f55ad007719
> > > > > > [  166.899172][ T3747] RDX: 0000000000000027 RSI: 0000000000000402 RDI: 0000000000000043
> > > > > > [  166.899174][ T3747] RBP: 00007f55ab92f058 R08: 0000000099999999 R09: 00000000377dd000
> > > > > > [  166.899177][ T3747] R10: 0000000084848484 R11: 0000000000000246 R12: 0000000000000048
> > > > > > [  166.899180][ T3747] R13: 00007f55acf036c0 R14: 00007f55ab92f058 R15: 00007f55ab92f000
> > > > > > [  166.899203][ T3747]  </TASK>
> > > > > > 
> > > > > > 
> > > > > > The kernel config and materials to reproduce are available at:
> > > > > > https://download.01.org/0day-ci/archive/20250326/202503261510.f9652c11-lkp@intel.com
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > -- 
> > > > > > 0-DAY CI Kernel Test Service
> > > > > > https://github.com/intel/lkp-tests/wiki
> > > > > > 
> > > > > > 
Re: [PATCH v2 5/8] landlock: Always allow signals between threads of the same process
Posted by Mickaël Salaün 9 months ago
On Tue, Mar 18, 2025 at 05:14:40PM +0100, Mickaël Salaün wrote:
> Because Linux credentials are managed per thread, user space relies on
> some hack to synchronize credential update across threads from the same
> process.  This is required by the Native POSIX Threads Library and
> implemented by set*id(2) wrappers and libcap(3) to use tgkill(2) to
> synchronize threads.  See nptl(7) and libpsx(3).  Furthermore, some
> runtimes like Go do not enable developers to have control over threads
> [1].
> 
> To avoid potential issues, and because threads are not security
> boundaries, let's relax the Landlock (optional) signal scoping to always
> allow signals sent between threads of the same process.  This exception
> is similar to the __ptrace_may_access() one.
> 
> hook_file_set_fowner() now checks if the target task is part of the same
> process as the caller.  If this is the case, then the related signal
> triggered by the socket will always be allowed.
> 
> Scoping of abstract UNIX sockets is not changed because kernel objects
> (e.g. sockets) should be tied to their creator's domain at creation
> time.
> 
> Note that creating one Landlock domain per thread puts each of these
> threads (and their future children) in their own scope, which is
> probably not what users expect, especially in Go where we do not control
> threads.  However, being able to drop permissions on all threads should
> not be restricted by signal scoping.  We are working on a way to make it
> possible to atomically restrict all threads of a process with the same
> domain [2].
> 
> Add erratum for signal scoping.
> 
> Closes: https://github.com/landlock-lsm/go-landlock/issues/36
> Fixes: 54a6e6bbf3be ("landlock: Add signal scoping")
> Fixes: c8994965013e ("selftests/landlock: Test signal scoping for threads")
> Depends-on: 26f204380a3c ("fs: Fix file_set_fowner LSM hook inconsistencies")
> Link: https://pkg.go.dev/kernel.org/pub/linux/libs/security/libcap/psx [1]
> Link: https://github.com/landlock-lsm/linux/issues/2 [2]
> Cc: Günther Noack <gnoack@google.com>
> Cc: Paul Moore <paul@paul-moore.com>
> Cc: Serge Hallyn <serge@hallyn.com>
> Cc: Tahera Fahimi <fahimitahera@gmail.com>
> Cc: stable@vger.kernel.org
> Acked-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Mickaël Salaün <mic@digikod.net>
> Link: https://lore.kernel.org/r/20250318161443.279194-6-mic@digikod.net

> index 71b9dc331aae..47c862fe14e4 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -27,7 +27,9 @@
>  #include <linux/mount.h>
>  #include <linux/namei.h>
>  #include <linux/path.h>
> +#include <linux/pid.h>
>  #include <linux/rcupdate.h>
> +#include <linux/sched/signal.h>
>  #include <linux/spinlock.h>
>  #include <linux/stat.h>
>  #include <linux/types.h>
> @@ -1630,15 +1632,27 @@ static int hook_file_ioctl_compat(struct file *file, unsigned int cmd,
>  
>  static void hook_file_set_fowner(struct file *file)
>  {
> -	struct landlock_ruleset *new_dom, *prev_dom;
> +	struct fown_struct *fown = file_f_owner(file);
> +	struct landlock_ruleset *new_dom = NULL;
> +	struct landlock_ruleset *prev_dom;
> +	struct task_struct *p;
>  
>  	/*
>  	 * Lock already held by __f_setown(), see commit 26f204380a3c ("fs: Fix
>  	 * file_set_fowner LSM hook inconsistencies").
>  	 */
> -	lockdep_assert_held(&file_f_owner(file)->lock);
> -	new_dom = landlock_get_current_domain();
> -	landlock_get_ruleset(new_dom);
> +	lockdep_assert_held(&fown->lock);
> +
> +	/*
> +	 * Always allow sending signals between threads of the same process.  This
> +	 * ensures consistency with hook_task_kill().
> +	 */
> +	p = pid_task(fown->pid, fown->pid_type);
> +	if (!same_thread_group(p, current)) {

There is a missing pointer check.  I'll apply this:

-       if (!same_thread_group(p, current)) {
+       if (!p || !same_thread_group(p, current)) {

> +		new_dom = landlock_get_current_domain();
> +		landlock_get_ruleset(new_dom);
> +	}
> +
>  	prev_dom = landlock_file(file)->fown_domain;
>  	landlock_file(file)->fown_domain = new_dom;
>