From nobody Sat Feb 7 04:47:11 2026 Received: from mail-pj1-f43.google.com (mail-pj1-f43.google.com [209.85.216.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 017EA29B8DC for ; Thu, 18 Sep 2025 05:16:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758172605; cv=none; b=LOLW5+ejRLScsr/LwYjvzYLInkqWsSO25Za95VIOrS3djVNRN6gIDUqEF3Pt8+4lb/zSLob+cePdwZD0AunWcC7pcq3ZkAtOyiKop3i7mnVvjpGjSPA0/KDx3q/bibFoiQW5E0wzDpeoTcyqhowreN/dt0i7eZuLia1IwkwTnqA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758172605; c=relaxed/simple; bh=+Waag5s2KV4AtkQ8FWm0y4/xw2uolW4OhcFUtzX2NSU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gzMRk1yTwztZISM8tGdIlCLm1l0zSvMZRWBYdWceXzJbBd2XGGOHJ2ugGR4JLpR5FmpWzt6K2zYIRrw6MIZGvq72ykGDsim3m12OTA0OI6MAsS5tuRmP+AWCrTQVukGnLm6QAvnSHP4GkmC0ozjKdOstBdDutTU2m2iNAW0rcWo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=myBWHlpM; arc=none smtp.client-ip=209.85.216.43 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="myBWHlpM" Received: by mail-pj1-f43.google.com with SMTP id 98e67ed59e1d1-32dc6616f7dso355424a91.1 for ; Wed, 17 Sep 2025 22:16:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1758172603; x=1758777403; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=V2ZhXR0YE7Nfxy83gEc2kisuCO87If9KaW5qC95ko0I=; b=myBWHlpMMa0G0F1NSxpcNSAzBPE9WkNSOpsoEwneUN2qbuseyF3SelOblShvIX8Cys aoX0Y8akm/9QPdrdEZs30QZubWXU+AUgeOe16zGJ3TScprATffaOaEICEYEYiWmOCQeY JuP6f658ghfsNbzI9mYxBjnTZ/Wtj6/UJulAcUh8YkEAvUpwH26qX4dQrwWxOZqe7GZQ 41yAmDWiemEmq0NujnE0mPhEDgoq6uFui7doVL8rB8TScY/zEcSTHVz/ZKrk7Ip7SvvI MnwFPViyDSAh24e3d1KFL8tOjiYizT2dSiLzm2mef/0sJ9TZ2KEHhgbA+bWAWaXGdICq zXDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1758172603; x=1758777403; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=V2ZhXR0YE7Nfxy83gEc2kisuCO87If9KaW5qC95ko0I=; b=oBkSiS66V1Ag4mNuPFBWeJ4iPvu1JCzj3aAHN2lDx2gadDLaFKao5oFKUTxgBF3Rxz cgI55o+ojxQ1xFyPhaVgtYJBYnp9Aj1Se+buK57utQWg9nJEjpkabkEPvPh9NGiFSY7T N0wkSW1PmNVHFRz4MWGCVqc8TxfJTWLSSHwOVz7ZRLrEIg2kjWH+a2hgN/baMGYNEguS ayfD8FlomEuowGSg8KBIPZQxZWNY0d4lAG9LupBoY3v0o1ne2O14ffs7mRR6zbd/aJcv rMl5HBsAUjl30+Op3lUL+Yy7qHizzXYXyhZDM05ajwyaxFQh0BNFFMtbJs4E8LW7YbsC IXGQ== X-Forwarded-Encrypted: i=1; AJvYcCVE6ARXKlbLR3c31GFBUiS+LQ3Wfb7DIHaKb3/z5ah+tlIUT8nOWz8raUxXBND5TzuRoi73doE+DXuN+K4=@vger.kernel.org X-Gm-Message-State: AOJu0YyHs7sWA8pbW6maZiTbaDQwyrMosKt0nG1FfL//cJ3OnObVimx7 gREff6+87OyrHC8ch1mfuZBT6AygpBWpxvHks/ms9OCb0V/FyIOrmzjb X-Gm-Gg: ASbGnctUle8oX4ITnIWhsw1Nxjm26ynzSBY/aJm67RLnN9bKE/9vq0pfW1yTPs0lxK+ lIXw4J0Jdj92L3RE9t6VB3XRibi6ElJXKeEirDvvnjX7HxKk/NzQzkCVF3aqypNKoH1LXZHCCiP y+XY1FdnKdyyLZ+4u782j3tIaW5kxWjBtWkpa1rmgYPnF5iYY+O39XF1s8xz1Aihiczh0E1zR11 yWVjeHc3pghoZdeuMfhev7ry9a9RAT22o8HFTlr3MAgZpoggMlYYG04PqqN7xebhF+kQwDMx6sj TU/+uzS9FWLp4Pc+BlB3ZDjmbrmvQZYB01RZ0pKfFAq7oDM8Y4hOfp1Yw4ZKofliGN2G6gfIXTN x0EiBTzVJZ21zXUbFtqjYnDV6uWYQyodCn723Rcj9IZP/10jLx5GOzHh/9HmmxYpeYsa3SFMW9x zCrqE9MkaAquU= X-Google-Smtp-Source: AGHT+IG8kVGxyfKRfwX/JNXisY3HoiZuY6aX/okVnX+nb2qOOqX+U51XZ6lCxRUZ1cTr4GcOddepSA== X-Received: by 2002:a17:90b:3b4b:b0:32e:37af:b012 with SMTP id 98e67ed59e1d1-32ee3d89bccmr6251146a91.0.1758172602923; Wed, 17 Sep 2025 22:16:42 -0700 (PDT) Received: from ikb-h07-29-noble.in.iijlab.net ([202.214.97.5]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-330607e9475sm1178905a91.19.2025.09.17.22.16.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Sep 2025 22:16:42 -0700 (PDT) Received: by ikb-h07-29-noble.in.iijlab.net (Postfix, from userid 1010) id 4FF1B1061EF0; Thu, 18 Sep 2025 14:16:40 +0900 (JST) From: Hajime Tazaki To: linux-um@lists.infradead.org Cc: thehajime@gmail.com, ricarkol@google.com, Liam.Howlett@oracle.com, linux-kernel@vger.kernel.org Subject: [PATCH v11 08/14] um: nommu: configure fs register on host syscall invocation Date: Thu, 18 Sep 2025 14:16:01 +0900 Message-ID: <8f953c5284f2bf5339540e18844531a146a8051f.1758171893.git.thehajime@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" As userspace on UML/!MMU also need to configure %fs register when it is running to correctly access thread structure, host syscalls implemented in os-Linux drivers may be puzzled when they are called. Thus it has to configure %fs register via arch_prctl(SET_FS) on every host syscalls. Signed-off-by: Hajime Tazaki Signed-off-by: Ricardo Koller --- arch/um/include/shared/os.h | 6 +++ arch/um/os-Linux/process.c | 6 +++ arch/um/os-Linux/start_up.c | 21 +++++++++ arch/x86/um/nommu/do_syscall_64.c | 37 ++++++++++++++++ arch/x86/um/nommu/syscalls_64.c | 71 +++++++++++++++++++++++++++++++ 5 files changed, 141 insertions(+) diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h index 1251f08e26d0..7c6a8bc0447c 100644 --- a/arch/um/include/shared/os.h +++ b/arch/um/include/shared/os.h @@ -189,6 +189,7 @@ extern void check_host_supports_tls(int *supports_tls, = int *tls_min); extern void get_host_cpu_features( void (*flags_helper_func)(char *line), void (*cache_helper_func)(char *line)); +extern int host_has_fsgsbase; =20 /* mem.c */ extern int create_mem_file(unsigned long long len); @@ -213,6 +214,11 @@ extern int os_protect_memory(void *addr, unsigned long= len, extern int os_unmap_memory(void *addr, int len); extern int os_drop_memory(void *addr, int length); extern int can_drop_memory(void); +extern int os_arch_prctl(int pid, int option, unsigned long *arg); +#ifndef CONFIG_MMU +extern long long host_fs; +#endif + =20 void os_set_pdeathsig(void); =20 diff --git a/arch/um/os-Linux/process.c b/arch/um/os-Linux/process.c index bc598e661ec6..0a1d078dce19 100644 --- a/arch/um/os-Linux/process.c +++ b/arch/um/os-Linux/process.c @@ -16,6 +16,7 @@ #include #include #include +#include /* For SYS_xxx definitions */ #include #include #include @@ -177,6 +178,11 @@ int __init can_drop_memory(void) return ok; } =20 +int os_arch_prctl(int pid, int option, unsigned long *arg2) +{ + return syscall(SYS_arch_prctl, option, arg2); +} + void init_new_thread_signals(void) { set_handler(SIGSEGV); diff --git a/arch/um/os-Linux/start_up.c b/arch/um/os-Linux/start_up.c index 4e1f05360c49..55dd92bd2a0b 100644 --- a/arch/um/os-Linux/start_up.c +++ b/arch/um/os-Linux/start_up.c @@ -20,6 +20,8 @@ #include #include #include +#include +#include #include #include #include @@ -36,6 +38,8 @@ #include #include "internal.h" =20 +int host_has_fsgsbase; + static void ptrace_child(void) { int ret; @@ -459,6 +463,20 @@ __uml_setup("seccomp=3D", uml_seccomp_config, " This is insecure and should only be used with a trusted userspace\n\n" ); =20 +static void __init check_fsgsbase(void) +{ + unsigned long auxv =3D getauxval(AT_HWCAP2); + + os_info("Checking FSGSBASE instructions..."); + if (auxv & HWCAP2_FSGSBASE) { + host_has_fsgsbase =3D 1; + os_info("OK\n"); + } else { + host_has_fsgsbase =3D 0; + os_info("disabled\n"); + } +} + void __init os_early_checks(void) { int pid; @@ -484,6 +502,9 @@ void __init os_early_checks(void) using_seccomp =3D 0; check_ptrace(); =20 + /* probe fsgsbase instruction */ + check_fsgsbase(); + pid =3D start_ptraced_child(); if (init_pid_registers(pid)) fatal("Failed to initialize default registers"); diff --git a/arch/x86/um/nommu/do_syscall_64.c b/arch/x86/um/nommu/do_sysca= ll_64.c index 292d7c578622..9bc630995df9 100644 --- a/arch/x86/um/nommu/do_syscall_64.c +++ b/arch/x86/um/nommu/do_syscall_64.c @@ -2,10 +2,38 @@ =20 #include #include +#include +#include #include #include #include =20 +static int os_x86_arch_prctl(int pid, int option, unsigned long *arg2) +{ + if (!host_has_fsgsbase) + return os_arch_prctl(pid, option, arg2); + + switch (option) { + case ARCH_SET_FS: + wrfsbase(*arg2); + break; + case ARCH_SET_GS: + wrgsbase(*arg2); + break; + case ARCH_GET_FS: + *arg2 =3D rdfsbase(); + break; + case ARCH_GET_GS: + *arg2 =3D rdgsbase(); + break; + default: + pr_warn("%s: unsupported option: 0x%x", __func__, option); + break; + } + + return 0; +} + __visible void do_syscall_64(struct pt_regs *regs) { int syscall; @@ -13,6 +41,9 @@ __visible void do_syscall_64(struct pt_regs *regs) syscall =3D PT_SYSCALL_NR(regs->regs.gp); UPT_SYSCALL_NR(®s->regs) =3D syscall; =20 + /* set fs register to the original host one */ + os_x86_arch_prctl(0, ARCH_SET_FS, (void *)host_fs); + if (likely(syscall < NR_syscalls)) { unsigned long ret; =20 @@ -29,4 +60,10 @@ __visible void do_syscall_64(struct pt_regs *regs) =20 /* handle tasks and signals at the end */ interrupt_end(); + + /* restore back fs register to userspace configured one */ + os_x86_arch_prctl(0, ARCH_SET_FS, + (void *)(current->thread.regs.regs.gp[FS_BASE + / sizeof(unsigned long)])); + } diff --git a/arch/x86/um/nommu/syscalls_64.c b/arch/x86/um/nommu/syscalls_6= 4.c index d56027ebc651..19d23686fc5b 100644 --- a/arch/x86/um/nommu/syscalls_64.c +++ b/arch/x86/um/nommu/syscalls_64.c @@ -13,8 +13,70 @@ #include /* XXX This should get the constants from libc */ #include #include +#include +#include #include "syscalls.h" =20 +/* + * The guest libc can change FS, which confuses the host libc. + * In fact, changing FS directly is not supported (check + * man arch_prctl). So, whenever we make a host syscall, + * we should be changing FS to the original FS (not the + * one set by the guest libc). This original FS is stored + * in host_fs. + */ +long long host_fs =3D -1; + +long arch_prctl(struct task_struct *task, int option, + unsigned long __user *arg2) +{ + long ret =3D -EINVAL; + unsigned long *ptr =3D arg2, tmp; + + switch (option) { + case ARCH_SET_FS: + if (host_fs =3D=3D -1) + os_arch_prctl(0, ARCH_GET_FS, (void *)&host_fs); + ret =3D 0; + break; + case ARCH_SET_GS: + ret =3D 0; + break; + case ARCH_GET_FS: + case ARCH_GET_GS: + ptr =3D &tmp; + break; + } + + ret =3D os_arch_prctl(0, option, ptr); + if (ret) + return ret; + + switch (option) { + case ARCH_SET_FS: + current->thread.regs.regs.gp[FS_BASE / sizeof(unsigned long)] =3D + (unsigned long) arg2; + break; + case ARCH_SET_GS: + current->thread.regs.regs.gp[GS_BASE / sizeof(unsigned long)] =3D + (unsigned long) arg2; + break; + case ARCH_GET_FS: + ret =3D put_user(current->thread.regs.regs.gp[FS_BASE / sizeof(unsigned = long)], arg2); + break; + case ARCH_GET_GS: + ret =3D put_user(current->thread.regs.regs.gp[GS_BASE / sizeof(unsigned = long)], arg2); + break; + } + + return ret; +} + +SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2) +{ + return arch_prctl(current, option, (unsigned long __user *) arg2); +} + void arch_set_stack_to_current(void) { current_top_of_stack =3D task_top_of_stack(current); @@ -48,3 +110,12 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned lon= g, len, =20 return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT); } + +static int __init um_nommu_setup_hostfs(void) +{ + /* initialize the host_fs value at boottime */ + os_arch_prctl(0, ARCH_GET_FS, (void *)&host_fs); + + return 0; +} +arch_initcall(um_nommu_setup_hostfs); --=20 2.43.0