[PATCH RFC v3 19/26] fs: stop sharing fs_struct between init_task and pid 1

Posted by Christian Brauner 3 weeks, 5 days ago
Spawn kernel_init (PID 1) via kernel_clone() directly instead of
user_mode_thread(), without CLONE_FS. This gives PID 1 its own private
copy of init_task's fs_struct rather than sharing it.

This is a prerequisite for isolating kthreads in nullfs: when
init_task's fs is later pointed at nullfs, PID 1 must not share it
or init_userspace_fs() would modify init_task's fs as well, defeating
the isolation.

At this stage PID 1 still gets rootfs (a private copy rather than a
shared reference), so there is no functional change.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
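Note: the CLONE_FS semantics the commit message relies on can be observed
from userspace. The sketch below is illustrative only and not part of the
patch. It uses the umask, which lives in fs_struct next to root and pwd,
as the observable state. The helper name child_umask_leaks() and the two
mask constants are hypothetical, and the raw clone call assumes the
flags-first argument order used on x86-64 and arm64.

```c
#include <linux/sched.h>   /* CLONE_FS */
#include <signal.h>        /* SIGCHLD */
#include <sys/stat.h>      /* umask() */
#include <sys/syscall.h>   /* SYS_clone */
#include <sys/types.h>
#include <sys/wait.h>      /* waitpid() */
#include <unistd.h>        /* syscall(), _exit() */

#define PARENT_MASK 022    /* hypothetical value set by the parent */
#define CHILD_MASK  077    /* hypothetical value set by the child */

/*
 * Create a child with or without CLONE_FS, let it change its umask,
 * and report whether the parent observes the change.
 *
 * Returns 1 if the child's umask leaked into the parent (shared
 * fs_struct), 0 if it did not (private copy), -1 on error.
 */
static int child_umask_leaks(int share_fs)
{
	unsigned long flags = SIGCHLD | (share_fs ? CLONE_FS : 0);
	mode_t saved, seen;
	int status;
	long pid;

	/* Put the parent into a known state, remembering the old mask. */
	saved = umask(PARENT_MASK);

	/*
	 * Raw clone with no new stack behaves like fork(): the child
	 * resumes here with a return value of 0.
	 * Assumption: flags is the first syscall argument on this arch.
	 */
	pid = syscall(SYS_clone, flags, 0, 0, 0, 0);
	if (pid == 0) {
		umask(CHILD_MASK);
		_exit(0);
	}

	if (pid < 0 || waitpid(pid, &status, 0) != pid) {
		umask(saved);
		return -1;
	}

	/* umask() returns the previous mask, so this reads and restores. */
	seen = umask(saved);
	return seen == CHILD_MASK;
}
```

Calling child_umask_leaks(1) models the old user_mode_thread(..., CLONE_FS)
behaviour, where a change made by the child is visible to the parent.
child_umask_leaks(0) models the behaviour after this patch, where the child
works on its own copy.
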
 init/main.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/init/main.c b/init/main.c
index 5ccc642a5aa7..6633d4bea52b 100644
--- a/init/main.c
+++ b/init/main.c
@@ -714,6 +714,11 @@ static __initdata DECLARE_COMPLETION(kthreadd_done);
 
 static noinline void __ref __noreturn rest_init(void)
 {
+	struct kernel_clone_args init_args = {
+		.flags		= (CLONE_VM | CLONE_UNTRACED),
+		.fn		= kernel_init,
+		.fn_arg		= NULL,
+	};
 	struct task_struct *tsk;
 	int pid;
 
@@ -723,7 +728,7 @@ static noinline void __ref __noreturn rest_init(void)
 	 * the init task will end up wanting to create kthreads, which, if
 	 * we schedule it before we create kthreadd, will OOPS.
 	 */
-	pid = user_mode_thread(kernel_init, NULL, CLONE_FS);
+	pid = kernel_clone(&init_args);
 	/*
 	 * Pin init on the boot CPU. Task migration is not properly working
 	 * until sched_init_smp() has been run. It will set the allowed

-- 
2.47.3