From nobody Fri Dec 19 04:19:26 2025 Received: from mail-qv1-f54.google.com (mail-qv1-f54.google.com [209.85.219.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32C3F33EC for ; Wed, 30 Apr 2025 04:36:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745987768; cv=none; b=Y8KJpr220sxdP1xYCOY5EKZA1yt4Z+ztB8OMtDP2z/juE017ZADt4BmZ74gT8NRc1YK9O4wubfxP7WzIFJ/1fk3XfZNtiYlgCb7QClpcRigu42ZE7fsY2mS9rd8+Twhh+iRvAVXis5YMAInpoEtx4UzfpxL0ZuDKlFAA28FVQNs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745987768; c=relaxed/simple; bh=khQbOphnaTF3Rbt52w+eeGcF1LQO4WOApscpm1Ts69I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XQxSvMJqEjSdBl5KWcDDtOt9x67qOzudvAV89+sOErVD22V6aDCtmzYuAY/Ho0zA3bEk0B03+wVKPdQcTCl/qlOjFx+WOfEaPEz5bKDBxdGV8RjoS9PfF703nXmYo7h5LZzu012QU3jfRgISlEmnEab5sVsC0dPtXGauIzhkhdU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=BL3N+bEa; arc=none smtp.client-ip=209.85.219.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="BL3N+bEa" Received: by mail-qv1-f54.google.com with SMTP id 6a1803df08f44-6ecf99dd567so99784276d6.0 for ; Tue, 29 Apr 2025 21:36:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1745987765; x=1746592565; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=RhZelJwiTUJ/UkMrQlTULCdNhOyleWw7wnwmeZqJPjo=; b=BL3N+bEaGvfZ/pb9AMiBGlwTTaazthGNUEKZP9634MzsVhmNrymnaA5hYQh2iQwAag iHGu07hCHuSnpezDZAGNfNgjDIPkZxR8rEnCNTNlobbBQQsfuFADEwTsevCc7O3Uo1hx t3hBbYunQFlaPKPsuE7HOhlkQ5djDBXRIeP0tGtRUhFfvvmTay4nf4es4yxfiTn53yOS /AUfM8Ds5/uvRM1g2eQ8XDXdq18syXkK99oUfHxtSfJ3vJenyZ6JTnEeVWozRVfezbUL dCQX5TeZYZBTKqJDYkp09mMieaLguUWcFQQcgSvI+I3tWRiw1VlDTLX7kjpxGNxp/UwY ffUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745987765; x=1746592565; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RhZelJwiTUJ/UkMrQlTULCdNhOyleWw7wnwmeZqJPjo=; b=CI3j6h9+hYMFmdx55C+BrOPbck3tjfv36XQdY59LXx7yg0Q9U2OAQXldRbnhRB+5tg rok0yS3CUxaiJgM+mdkCO5mNmuB6Fv21b+QgHnvN+V9SvEkDRF8CgO0ukNnR848GK+eU 67BTvIDGoasE0sUDtn46XTILjgp0uhQ3i7X7akWOM7HzVu6VH+IufE82vlFMyDl5Pit1 sjpwoJKLW07Fe3BLltKCO/4HehGsZhmaFiTwPkhO9rVSP4B5NL4wdhX8n+3xBxZRVk7w o4bkLjEElSGmdyzoEPuDrvAEKeqRFgIhIdMT6r5AGNQlksZNOGPuTlxnvz0biRHGsRdh AtkQ== X-Forwarded-Encrypted: i=1; AJvYcCVPcjwQz8pcRKQOoaoahT1LIiFTTuuL5QlGLD5+c53VEAdEB483fVqQhVkk3EwSlAEWmKGhzUFRglVo62s=@vger.kernel.org X-Gm-Message-State: AOJu0Yz7XrhpCLF+dF+jVEvPx+Cwpne5U05VDmQ/nlN7U3qiidws1/jO TWPbrD5/zRcD0wMOOnKqis4OSjETJnrunRbWmVt9itzJTDiveQNDP7La6v/h X-Gm-Gg: ASbGncurwVb7X2sEylRNGfcz+U+N/ITVJC77a53CHENssWWDsy0/0v7ru5F40rZWg8I 7kdcydzkDg2Jj1lz39B+u4VDMLEuIK9WD9b7h0Dz8f29v96JVu+kDXB6aPDT5V0PbOUbhqCjIUK Z2q6Jp/bHtxYD3hgKS4PCRK5aRXeHAVuxOIfbpFRAseWQ2QJF9sqYjxwjhI3wB7+xjlNZW7qG0m 4+b3/ofPN7AbsoiKBRZHkQvi7e5Kepluw7+ufqH2RoZesCubsYiyeFgJDtYndHXy8yXoqPDfgxf CLDPLTQZR+w67kz7lkrMDy2Ps43ylGsyWwusZzoW19Mtk53cwqprg/UA5IakwgYrsXzC1WcfsnX zQU2jAJAIIiE= X-Google-Smtp-Source: AGHT+IF7i/I3IdOLHAQ2T3n6rN/DkuQz7/uZDc2c33Jl4R9I8AWjlNgT7XTdUlU6shCJ7XEQ1alnEg== X-Received: by 2002:a17:90b:6d0:b0:2fe:b9be:216 with SMTP id 98e67ed59e1d1-30a3336ba48mr2266861a91.31.1745987275453; Tue, 29 Apr 2025 21:27:55 -0700 (PDT) Received: from ikb-h07-29-noble.in.iijlab.net ([202.214.97.5]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-30a349e3c2csm566258a91.2.2025.04.29.21.27.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Apr 2025 21:27:54 -0700 (PDT) Received: by ikb-h07-29-noble.in.iijlab.net (Postfix, from userid 1010) id CFEAFE94820; Wed, 30 Apr 2025 13:27:52 +0900 (JST) From: Hajime Tazaki To: linux-um@lists.infradead.org Cc: thehajime@gmail.com, ricarkol@google.com, Liam.Howlett@oracle.com, linux-kernel@vger.kernel.org Subject: [PATCH v8 07/12] um: nommu: configure fs register on host syscall invocation Date: Wed, 30 Apr 2025 13:27:13 +0900 Message-ID: <3b21fc4fab514b550076b10e571a371dfee54a0d.1745980082.git.thehajime@gmail.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" As userspace on UML/!MMU also need to configure %fs register when it is running to correctly access thread structure, host syscalls implemented in os-Linux drivers may be puzzled when they are called. Thus it has to configure %fs register via arch_prctl(SET_FS) on every host syscalls. Signed-off-by: Hajime Tazaki Signed-off-by: Ricardo Koller --- arch/um/include/shared/os.h | 6 +++ arch/um/os-Linux/process.c | 6 +++ arch/um/os-Linux/start_up.c | 20 +++++++++ arch/x86/um/nommu/do_syscall_64.c | 37 ++++++++++++++++ arch/x86/um/nommu/syscalls_64.c | 71 +++++++++++++++++++++++++++++++ 5 files changed, 140 insertions(+) diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h index 7f6703869dde..2e972ca55213 100644 --- a/arch/um/include/shared/os.h +++ b/arch/um/include/shared/os.h @@ -190,6 +190,7 @@ extern void check_host_supports_tls(int *supports_tls, = int *tls_min); extern void get_host_cpu_features( void (*flags_helper_func)(char *line), void (*cache_helper_func)(char *line)); +extern int host_has_fsgsbase; =20 /* mem.c */ extern int create_mem_file(unsigned long long len); @@ -213,6 +214,11 @@ extern int os_protect_memory(void *addr, unsigned long= len, extern int os_unmap_memory(void *addr, int len); extern int os_drop_memory(void *addr, int length); extern int can_drop_memory(void); +extern int os_arch_prctl(int pid, int option, unsigned long *arg); +#ifndef CONFIG_MMU +extern long long host_fs; +#endif + =20 void os_set_pdeathsig(void); =20 diff --git a/arch/um/os-Linux/process.c b/arch/um/os-Linux/process.c index 3ac01881e905..7806b51e38b3 100644 --- a/arch/um/os-Linux/process.c +++ b/arch/um/os-Linux/process.c @@ -16,6 +16,7 @@ #include #include #include +#include /* For SYS_xxx definitions */ #include #include #include @@ -155,6 +156,11 @@ int __init can_drop_memory(void) return ok; } =20 +int os_arch_prctl(int pid, int option, unsigned long *arg2) +{ + return syscall(SYS_arch_prctl, option, arg2); +} + void init_new_thread_signals(void) { set_handler(SIGSEGV); diff --git a/arch/um/os-Linux/start_up.c b/arch/um/os-Linux/start_up.c index 93fc82c01aba..dbab091892b3 100644 --- a/arch/um/os-Linux/start_up.c +++ b/arch/um/os-Linux/start_up.c @@ -19,6 +19,8 @@ #include #include #include +#include +#include #include #include #include @@ -28,6 +30,8 @@ #include #include "internal.h" =20 +int host_has_fsgsbase; + static void ptrace_child(void) { int ret; @@ -278,6 +282,19 @@ void __init get_host_cpu_features( } } =20 +static void __init check_fsgsbase(void) +{ + unsigned long auxv =3D getauxval(AT_HWCAP2); + + os_info("Checking FSGSBASE instructions..."); + if (auxv & HWCAP2_FSGSBASE) { + host_has_fsgsbase =3D 1; + os_info("OK\n"); + } else { + host_has_fsgsbase =3D 0; + os_info("disabled\n"); + } +} =20 void __init os_early_checks(void) { @@ -293,6 +310,9 @@ void __init os_early_checks(void) */ check_tmpexec(); =20 + /* probe fsgsbase instruction */ + check_fsgsbase(); + pid =3D start_ptraced_child(); if (init_pid_registers(pid)) fatal("Failed to initialize default registers"); diff --git a/arch/x86/um/nommu/do_syscall_64.c b/arch/x86/um/nommu/do_sysca= ll_64.c index 5d0fa83e7fdc..796beb0089fc 100644 --- a/arch/x86/um/nommu/do_syscall_64.c +++ b/arch/x86/um/nommu/do_syscall_64.c @@ -2,10 +2,38 @@ =20 #include #include +#include +#include #include #include #include =20 +static int os_x86_arch_prctl(int pid, int option, unsigned long *arg2) +{ + if (!host_has_fsgsbase) + return os_arch_prctl(pid, option, arg2); + + switch (option) { + case ARCH_SET_FS: + wrfsbase(*arg2); + break; + case ARCH_SET_GS: + wrgsbase(*arg2); + break; + case ARCH_GET_FS: + *arg2 =3D rdfsbase(); + break; + case ARCH_GET_GS: + *arg2 =3D rdgsbase(); + break; + default: + pr_warn("%s: unsupported option: 0x%x", __func__, option); + break; + } + + return 0; +} + __visible void do_syscall_64(struct pt_regs *regs) { int syscall; @@ -17,6 +45,9 @@ __visible void do_syscall_64(struct pt_regs *regs) syscall, (unsigned long)current, (unsigned long)sys_call_table[syscall]); =20 + /* set fs register to the original host one */ + os_x86_arch_prctl(0, ARCH_SET_FS, (void *)host_fs); + if (likely(syscall < NR_syscalls)) { PT_REGS_SET_SYSCALL_RETURN(regs, EXECUTE_SYSCALL(syscall, regs)); @@ -34,4 +65,10 @@ __visible void do_syscall_64(struct pt_regs *regs) /* force do_signal() --> is_syscall() */ set_thread_flag(TIF_SIGPENDING); interrupt_end(); + + /* restore back fs register to userspace configured one */ + os_x86_arch_prctl(0, ARCH_SET_FS, + (void *)(current->thread.regs.regs.gp[FS_BASE + / sizeof(unsigned long)])); + } diff --git a/arch/x86/um/nommu/syscalls_64.c b/arch/x86/um/nommu/syscalls_6= 4.c index c78c442aed1d..5bb6d55b4bb5 100644 --- a/arch/x86/um/nommu/syscalls_64.c +++ b/arch/x86/um/nommu/syscalls_64.c @@ -13,8 +13,70 @@ #include /* XXX This should get the constants from libc */ #include #include +#include +#include #include "syscalls.h" =20 +/* + * The guest libc can change FS, which confuses the host libc. + * In fact, changing FS directly is not supported (check + * man arch_prctl). So, whenever we make a host syscall, + * we should be changing FS to the original FS (not the + * one set by the guest libc). This original FS is stored + * in host_fs. + */ +long long host_fs =3D -1; + +long arch_prctl(struct task_struct *task, int option, + unsigned long __user *arg2) +{ + long ret =3D -EINVAL; + unsigned long *ptr =3D arg2, tmp; + + switch (option) { + case ARCH_SET_FS: + if (host_fs =3D=3D -1) + os_arch_prctl(0, ARCH_GET_FS, (void *)&host_fs); + ret =3D 0; + break; + case ARCH_SET_GS: + ret =3D 0; + break; + case ARCH_GET_FS: + case ARCH_GET_GS: + ptr =3D &tmp; + break; + } + + ret =3D os_arch_prctl(0, option, ptr); + if (ret) + return ret; + + switch (option) { + case ARCH_SET_FS: + current->thread.regs.regs.gp[FS_BASE / sizeof(unsigned long)] =3D + (unsigned long) arg2; + break; + case ARCH_SET_GS: + current->thread.regs.regs.gp[GS_BASE / sizeof(unsigned long)] =3D + (unsigned long) arg2; + break; + case ARCH_GET_FS: + ret =3D put_user(current->thread.regs.regs.gp[FS_BASE / sizeof(unsigned = long)], arg2); + break; + case ARCH_GET_GS: + ret =3D put_user(current->thread.regs.regs.gp[GS_BASE / sizeof(unsigned = long)], arg2); + break; + } + + return ret; +} + +SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2) +{ + return arch_prctl(current, option, (unsigned long __user *) arg2); +} + void arch_switch_to(struct task_struct *to) { /* @@ -42,3 +104,12 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned lon= g, len, =20 return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT); } + +static int __init um_nommu_setup_hostfs(void) +{ + /* initialize the host_fs value at boottime */ + os_arch_prctl(0, ARCH_GET_FS, (void *)&host_fs); + + return 0; +} +arch_initcall(um_nommu_setup_hostfs); --=20 2.43.0