From: Hajime Tazaki
To: linux-um@lists.infradead.org
Cc: thehajime@gmail.com, ricarkol@google.com, Liam.Howlett@oracle.com, linux-kernel@vger.kernel.org
Subject: [PATCH RESEND v11 04/13] x86/um: nommu: syscall handling
Date: Thu, 18 Sep 2025 16:38:59 +0900
Message-ID: <99a54b96cee85275692ca01fe4a0ef0d83996f0f.1758181109.git.thehajime@gmail.com>
X-Mailer: git-send-email 2.43.0

This commit introduces the syscall entry point for !MMU mode. The entry
function, __kernel_vsyscall, is a kernel-wide global symbol accessible
from any location.

Although it is outside the scope of this commit, the entry point can
also be exposed via the vdso image, which is directly accessible from
userspace. A standard library (e.g., libc) can use this entry point to
implement its syscall wrappers; it can also be used by hooking syscalls
for unmodified userspace applications and libraries, which will be
implemented in a subsequent commit.

Only the 64-bit mode of the x86 architecture is supported.

Signed-off-by: Hajime Tazaki
Signed-off-by: Ricardo Koller
---
 arch/x86/um/Makefile              |   4 ++
 arch/x86/um/asm/syscall.h         |   6 ++
 arch/x86/um/nommu/Makefile        |   8 +++
 arch/x86/um/nommu/do_syscall_64.c |  32 +++++++++
 arch/x86/um/nommu/entry_64.S      | 112 ++++++++++++++++++++++++++++++
 arch/x86/um/nommu/syscalls.h      |  16 +++++
 6 files changed, 178 insertions(+)
 create mode 100644 arch/x86/um/nommu/Makefile
 create mode 100644 arch/x86/um/nommu/do_syscall_64.c
 create mode 100644 arch/x86/um/nommu/entry_64.S
 create mode 100644 arch/x86/um/nommu/syscalls.h

diff --git a/arch/x86/um/Makefile b/arch/x86/um/Makefile
index b42c31cd2390..227af2a987e2 100644
--- a/arch/x86/um/Makefile
+++ b/arch/x86/um/Makefile
@@ -32,6 +32,10 @@ obj-y += syscalls_64.o vdso/
 subarch-y = ../lib/csum-partial_64.o ../lib/memcpy_64.o \
 	../lib/memmove_64.o ../lib/memset_64.o
 
+ifneq ($(CONFIG_MMU),y)
+obj-y += nommu/
+endif
+
 endif
 
 subarch-$(CONFIG_MODULES) += ../kernel/module.o
diff --git a/arch/x86/um/asm/syscall.h b/arch/x86/um/asm/syscall.h
index d6208d0fad51..bb4f6f011667 100644
--- a/arch/x86/um/asm/syscall.h
+++ b/arch/x86/um/asm/syscall.h
@@ -20,4 +20,10 @@ static inline int syscall_get_arch(struct task_struct *task)
 #endif
 }
 
+#ifndef CONFIG_MMU
+extern void do_syscall_64(struct pt_regs *regs);
+extern long __kernel_vsyscall(int64_t a0, int64_t a1, int64_t a2, int64_t a3,
+			      int64_t a4, int64_t a5, int64_t a6);
+#endif
+
 #endif /* __UM_ASM_SYSCALL_H */
diff --git a/arch/x86/um/nommu/Makefile b/arch/x86/um/nommu/Makefile
new file mode 100644
index 000000000000..d72c63afffa5
--- /dev/null
+++ b/arch/x86/um/nommu/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0
+ifeq ($(CONFIG_X86_32),y)
+	BITS := 32
+else
+	BITS := 64
+endif
+
+obj-y = do_syscall_$(BITS).o entry_$(BITS).o
diff --git a/arch/x86/um/nommu/do_syscall_64.c b/arch/x86/um/nommu/do_syscall_64.c
new file mode 100644
index 000000000000..292d7c578622
--- /dev/null
+++ b/arch/x86/um/nommu/do_syscall_64.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+#include
+#include
+#include
+
+__visible void do_syscall_64(struct pt_regs *regs)
+{
+	int syscall;
+
+	syscall = PT_SYSCALL_NR(regs->regs.gp);
+	UPT_SYSCALL_NR(&regs->regs) = syscall;
+
+	if (likely(syscall < NR_syscalls)) {
+		unsigned long ret;
+
+		ret = (*sys_call_table[syscall])(UPT_SYSCALL_ARG1(&regs->regs),
+						 UPT_SYSCALL_ARG2(&regs->regs),
+						 UPT_SYSCALL_ARG3(&regs->regs),
+						 UPT_SYSCALL_ARG4(&regs->regs),
+						 UPT_SYSCALL_ARG5(&regs->regs),
+						 UPT_SYSCALL_ARG6(&regs->regs));
+		PT_REGS_SET_SYSCALL_RETURN(regs, ret);
+	}
+
+	PT_REGS_SYSCALL_RET(regs) = regs->regs.gp[HOST_AX];
+
+	/* handle tasks and signals at the end */
+	interrupt_end();
+}
diff --git a/arch/x86/um/nommu/entry_64.S b/arch/x86/um/nommu/entry_64.S
new file mode 100644
index 000000000000..485c578aae64
--- /dev/null
+++ b/arch/x86/um/nommu/entry_64.S
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include
+
+#include
+#include
+#include
+
+#include "../entry/calling.h"
+
+#ifdef CONFIG_SMP
+#error need to stash these variables somewhere else
+#endif
+
+#define UM_GLOBAL_VAR(x) .data; .align 8; .globl x; x:; .long 0
+
+UM_GLOBAL_VAR(current_top_of_stack)
+UM_GLOBAL_VAR(current_ptregs)
+
+.code64
+.section .entry.text, "ax"
+
+.align 8
+#undef ENTRY
+#define ENTRY(x) .text; .globl x; .type x,%function; x:
+#undef END
+#define END(x)   .size x, . - x
+
+/*
+ * %rcx has the return address (we set it before entering __kernel_vsyscall).
+ *
+ * Registers on entry:
+ * rax  system call number
+ * rcx  return address
+ * rdi  arg0
+ * rsi  arg1
+ * rdx  arg2
+ * r10  arg3
+ * r8   arg4
+ * r9   arg5
+ *
+ * (note: we are allowed to mess with r11: r11 is callee-clobbered
+ * register in C ABI)
+ */
+ENTRY(__kernel_vsyscall)
+
+	movq	%rsp, %r11
+
+	/* Point rsp to the top of the ptregs array, so we can
+	   just fill it with a bunch of push'es. */
+	movq	current_ptregs, %rsp
+
+	/* 8 bytes * 20 registers (plus 8 for the push) */
+	addq	$168, %rsp
+
+	/* Construct struct pt_regs on stack */
+	pushq	$0		/* pt_regs->ss (index 20) */
+	pushq	%r11		/* pt_regs->sp */
+	pushfq			/* pt_regs->flags */
+	pushq	$0		/* pt_regs->cs */
+	pushq	%rcx		/* pt_regs->ip */
+	pushq	%rax		/* pt_regs->orig_ax */
+
+	PUSH_AND_CLEAR_REGS rax=$-ENOSYS
+
+	mov %rsp, %rdi
+
+	/*
+	 * Switch to current top of stack, so "current->" points
+	 * to the right task.
+	 */
+	movq	current_top_of_stack, %rsp
+
+	call	do_syscall_64
+
+	jmp	userspace
+
+END(__kernel_vsyscall)
+
+/*
+ * common userspace returning routine
+ *
+ * all procedures like syscalls, signal handlers, umh processes, will gate
+ * this routine to properly configure registers/stacks.
+ *
+ * void userspace(struct uml_pt_regs *regs)
+ */
+ENTRY(userspace)
+
+	/* clear direction flag to meet ABI */
+	cld
+	/* align the stack for x86_64 ABI */
+	and	$-0x10, %rsp
+	/* Handle any immediate reschedules or signals */
+	call	interrupt_end
+
+	movq	current_ptregs, %rsp
+
+	POP_REGS
+
+	addq	$8, %rsp	/* skip orig_ax */
+	popq	%rcx		/* pt_regs->ip */
+	addq	$8, %rsp	/* skip cs */
+	addq	$8, %rsp	/* skip flags */
+	popq	%rsp
+
+	/*
+	 * not return w/ ret but w/ jmp as the stack is already popped before
+	 * entering __kernel_vsyscall
+	 */
+	jmp	*%rcx
+
+END(userspace)
diff --git a/arch/x86/um/nommu/syscalls.h b/arch/x86/um/nommu/syscalls.h
new file mode 100644
index 000000000000..a2433756b1fc
--- /dev/null
+++ b/arch/x86/um/nommu/syscalls.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __UM_NOMMU_SYSCALLS_H
+#define __UM_NOMMU_SYSCALLS_H
+
+
+#define task_top_of_stack(task) \
+({								\
+	unsigned long __ptr = (unsigned long)task->stack;	\
+	__ptr += THREAD_SIZE;					\
+	__ptr;							\
+})
+
+extern long current_top_of_stack;
+extern long current_ptregs;
+
+#endif
-- 
2.43.0
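
Illustration only, not part of the patch: the control flow of
__kernel_vsyscall and do_syscall_64() can be modelled in plain user-space
C. The sketch below assumes nothing beyond what the patch shows: the entry
stub preloads the saved rax slot with -ENOSYS, and the dispatcher
overwrites it only when the syscall number indexes a valid table entry.
Every identifier here (model_regs, model_table, model_vsyscall, the toy
handlers) is hypothetical and does not exist in the kernel; the extra
check for negative numbers is an assumption of the sketch, not something
the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved frame; NOT the kernel's struct pt_regs. */
struct model_regs {
	long ax;	/* return-value slot, preloaded with -ENOSYS */
	long orig_ax;	/* syscall number as passed in rax */
	long args[6];	/* rdi, rsi, rdx, r10, r8, r9 */
};

typedef long (*model_syscall_fn)(long, long, long, long, long, long);

/* Two toy handlers standing in for real sys_call_table entries. */
static long model_sys_zero(long a, long b, long c, long d, long e, long f)
{
	(void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
	return 0;
}

static long model_sys_add(long a, long b, long c, long d, long e, long f)
{
	(void)c; (void)d; (void)e; (void)f;
	return a + b;
}

static const model_syscall_fn model_table[] = { model_sys_zero, model_sys_add };
#define MODEL_NR_SYSCALLS ((long)(sizeof(model_table) / sizeof(model_table[0])))

/* Models the entry stub: save number and arguments, preload ax with -ENOSYS. */
static void model_entry(struct model_regs *regs, long nr, const long args[6])
{
	size_t i;

	assert(regs != NULL);
	regs->orig_ax = nr;
	regs->ax = -ENOSYS;
	for (i = 0; i < 6; i++)
		regs->args[i] = args[i];
}

/*
 * Models the dispatch step: only an in-range number overwrites ax.
 * The nr >= 0 check is an addition of this sketch.
 */
static void model_do_syscall(struct model_regs *regs)
{
	long nr = regs->orig_ax;

	if (nr >= 0 && nr < MODEL_NR_SYSCALLS)
		regs->ax = model_table[nr](regs->args[0], regs->args[1],
					   regs->args[2], regs->args[3],
					   regs->args[4], regs->args[5]);
}

/* Models the round trip; returns what the caller would see in rax. */
long model_vsyscall(long nr, long a0, long a1, long a2,
		    long a3, long a4, long a5)
{
	const long args[6] = { a0, a1, a2, a3, a4, a5 };
	struct model_regs regs;

	model_entry(&regs, nr, args);
	model_do_syscall(&regs);
	return regs.ax;
}
```

In this model, model_vsyscall(1, 2, 3, 0, 0, 0, 0) dispatches to the toy
add handler, while an out-of-range number such as 99 leaves the preloaded
-ENOSYS in place, which is the same reason the stub uses
PUSH_AND_CLEAR_REGS rax=$-ENOSYS before calling do_syscall_64().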