From: Moritz Haase
CC: Moritz Haase
Subject: [PATCH] linux-user: add option to intercept execve() syscalls
Date: Tue, 25 Nov 2025 11:38:59 +0100
Message-ID: <20251125103859.1449760-1-Moritz.Haase@bmw.de>
X-Mailer: git-send-email 2.51.0
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Using QEMU user mode emulation under a chroot normally requires
binfmt_misc. This can be avoided if QEMU never issues a raw execve() to
the host system, which is especially useful in environments where
binfmt_misc can't be used.

Introduce a new option, -execve, that uses the current QEMU interpreter
to intercept execve(). In addition, execve mode can be enabled and
disabled using the 'QEMU_EXECVE' env var.

qemu_execve() will prepend the interpreter path, similar to what
binfmt_misc would do, and then pass the modified execve() to the host.

It is necessary to parse hashbang scripts in that function; otherwise
the kernel would try to run a script's interpreter without QEMU and
fail with an invalid exec format error.

Note that a previous incarnation of this patch was submitted a few
years ago (see [0]) by Petros Angelatos, the original author, who
confirmed that it's OK to resubmit it.

CC: petrosagg@resin.io
CC: nghiant2710@gmail.com
CC: forumi0721@gmail.com
CC: laurent@vivier.eu
Signed-off-by: Moritz Haase
---

We've been using this feature internally for at least five years by now.
Prior to submission, the code was updated to (hopefully) conform to the
current QEMU coding style. I'd be happy to add test cases for this
feature, but I'd need some pointers given that I'm a first-time
contributor. Thanks!

[0]: https://patchwork.kernel.org/project/qemu-devel/patch/1453091602-21843-1-git-send-email-petrosagg@gmail.com/

---
 linux-user/linuxload.c      | 119 ++++++++++++++++++++++++++++++++++--
 linux-user/loader.h         |   1 +
 linux-user/main.c           |  54 ++++++++++++++++
 linux-user/syscall.c        |  94 ++++++++++++++++++++++++----
 linux-user/user-internals.h |   1 +
 5 files changed, 252 insertions(+), 17 deletions(-)

diff --git a/linux-user/linuxload.c b/linux-user/linuxload.c
index 85d700953e..eb1fdf3f85 100644
--- a/linux-user/linuxload.c
+++ b/linux-user/linuxload.c
@@ -138,15 +138,124 @@ abi_ulong loader_build_argptr(int envc, int argc, abi_ulong sp,
     return sp;
 }
 
+int load_script_file(const char *filename, struct linux_binprm *bprm)
+{
+    int retval, fd;
+    char *i_arg = NULL, *i_name = NULL;
+    char **new_argv;
+    char *cp;
+    char buf[BPRM_BUF_SIZE];
+
+    /* Check if it is a script */
+    fd = open(filename, O_RDONLY);
+    if (fd == -1) {
+        return fd;
+    }
+
+    retval = read(fd, buf, BPRM_BUF_SIZE);
+    if (retval == -1) {
+        close(fd);
+        return retval;
+    }
+
+    /* if we have less than 2 bytes, we can guess it is not executable */
+    if (retval < 2) {
+        close(fd);
+        return -ENOEXEC;
+    }
+
+    close(fd);
+    /*
+     * adapted from the kernel
+     * https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/binfmt_script.c
+     */
+    if ((buf[0] == '#') && (buf[1] == '!')) {
+        buf[BPRM_BUF_SIZE - 1] = '\0';
+        cp = strchr(buf, '\n');
+        if (cp == NULL) {
+            cp = buf + BPRM_BUF_SIZE - 1;
+        }
+        *cp = '\0';
+        while (cp > buf) {
+            cp--;
+            if ((*cp == ' ') || (*cp == '\t')) {
+                *cp = '\0';
+            } else {
+                break;
+            }
+        }
+        for (cp = buf + 2; (*cp == ' ') || (*cp == '\t'); cp++) {
+            /* nothing */ ;
+        }
+        if (*cp == '\0') {
+            return -ENOEXEC; /* No interpreter name found */
+        }
+        i_name = cp;
+        i_arg = NULL;
+        for ( ; *cp && (*cp != ' ') && (*cp != '\t'); cp++) {
+            /* nothing */ ;
+        }
+        while ((*cp == ' ') || (*cp == '\t')) {
+            *cp++ = '\0';
+        }
+
+        new_argv = NULL;
+        if (*cp) {
+            i_arg = cp;
+        }
+
+        if (i_arg) {
+            new_argv = g_alloca(sizeof(void *));
+            new_argv[0] = i_arg;
+        }
+        bprm->argv = new_argv;
+        bprm->filename = i_name;
+    } else {
+        return 1;
+    }
+    return 0;
+}
+
 int loader_exec(int fdexec, const char *filename, char **argv, char **envp,
                 struct image_info *infop, struct linux_binprm *bprm)
 {
-    int retval;
+    int retval, fd, offset = 1, argc = count(argv);
+    char **new_argv;
+
+    retval = load_script_file(filename, bprm);
+    if (retval == 0) {
+        if (bprm->argv != NULL) {
+            offset = 2;
+        }
+        new_argv = g_alloca((argc + offset + 1) * sizeof(void *));
+
+        new_argv[0] = (char *)filename;
+        if (bprm->argv != NULL) {
+            new_argv[1] = bprm->argv[0];
+        }
+        /* Copy the original arguments with offset */
+        for (int i = 0; i < argc; i++) {
+            new_argv[i + offset] = argv[i];
+        }
+        new_argv[argc + offset] = NULL;
+
+        bprm->argc = count(new_argv);
+        bprm->argv = new_argv;
+        fd = open(bprm->filename, O_RDONLY);
+        if (fd < 0) {
+            printf("Error while loading %s: %s\n",
+                   bprm->filename,
+                   strerror(errno));
+            _exit(EXIT_FAILURE);
+        }
+        bprm->src.fd = fd;
+    } else {
+        bprm->filename = (char *)filename;
+        bprm->argc = count(argv);
+        bprm->argv = argv;
+        bprm->src.fd = fdexec;
+    }
 
-    bprm->src.fd = fdexec;
-    bprm->filename = (char *)filename;
-    bprm->argc = count(argv);
-    bprm->argv = argv;
     bprm->envc = count(envp);
     bprm->envp = envp;
 
diff --git a/linux-user/loader.h b/linux-user/loader.h
index da9ad28db5..2beedc5f0d 100644
--- a/linux-user/loader.h
+++ b/linux-user/loader.h
@@ -90,6 +90,7 @@ int loader_exec(int fdexec, const char *filename, char **argv, char **envp,
 uint32_t get_elf_eflags(int fd);
 int load_elf_binary(struct linux_binprm *bprm, struct image_info *info);
 int load_flt_binary(struct linux_binprm *bprm, struct image_info *info);
+int load_script_file(const char *filename, struct linux_binprm *bprm);
 
 abi_long memcpy_to_target(abi_ulong dest, const void *src,
                           unsigned long len);
diff --git a/linux-user/main.c b/linux-user/main.c
index db751c0757..3a8a748fda 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -128,6 +128,7 @@ static void usage(int exitcode);
 
 static const char *interp_prefix = CONFIG_QEMU_INTERP_PREFIX;
 const char *qemu_uname_release;
+const char *qemu_execve_path;
 
 #if !defined(TARGET_DEFAULT_STACK_SIZE)
 /* XXX: on x86 MAP_GROWSDOWN only works if ESP <= address + 32, so
@@ -367,6 +368,56 @@ static void handle_arg_guest_base(const char *arg)
     have_guest_base = true;
 }
 
+static void handle_arg_execve(const char *arg)
+{
+    const char *execfn;
+    char buf[PATH_MAX];
+    char *ret;
+    int len;
+
+    /*
+     * Since the 'execve' command line option has no argument ('has_arg' is
+     * 'false'), this function will always receive NULL for 'arg' during
+     * argument parsing. If 'arg' is non-NULL, we are being called during env
+     * var handling, because QEMU_EXECVE is set.
+     */
+    if (arg != NULL) {
+        /*
+         * If the env var is set, check whether its value is '0'. In this case,
+         * we don't want to enable 'execve' mode and thus bail out. Please note
+         * that an empty value will NOT disable 'execve' mode.
+         */
+        if (!strcmp(arg, "0")) {
+            return;
+        }
+    }
+
+    /* try getauxval() */
+    execfn = (const char *)qemu_getauxval(AT_EXECFN);
+
+    if (execfn != 0) {
+        ret = realpath(execfn, buf);
+
+        if (ret != NULL) {
+            qemu_execve_path = g_strdup(buf);
+            return;
+        }
+    }
+
+    /* try /proc/self/exe */
+    len = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
+
+    if (len != -1) {
+        buf[len] = '\0';
+        qemu_execve_path = g_strdup(buf);
+        return;
+    }
+
+    fprintf(stderr, "qemu_execve: unable to determine interpreter's path\n");
+    exit(EXIT_FAILURE);
+}
+
+
 static void handle_arg_reserved_va(const char *arg)
 {
     char *p;
@@ -497,6 +548,9 @@ static const struct qemu_argument arg_table[] = {
      "uname",      "set qemu uname release string to 'uname'"},
     {"B",          "QEMU_GUEST_BASE",  true,  handle_arg_guest_base,
      "address",    "set guest_base address to 'address'"},
+    {"execve",     "QEMU_EXECVE",      false, handle_arg_execve,
+     "",           "use this interpreter when a process calls execve() "
+     "(disabled if env var is '0', enabled for all other values / when empty)"},
     {"R",          "QEMU_RESERVED_VA", true,  handle_arg_reserved_va,
      "size",       "reserve 'size' bytes for guest virtual address space"},
     {"t",          "QEMU_RTSIG_MAP",   true,  handle_arg_rtsig_map,
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 2060e561a2..bf9e084975 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -127,6 +127,7 @@
 #include
 #include
 #endif
+#include
 #include "linux_loop.h"
 #include "uname.h"
 
@@ -8726,6 +8727,86 @@ ssize_t do_guest_readlink(const char *pathname, char *buf, size_t bufsiz)
     return ret;
 }
 
+static int qemu_execve(const char *filename, char *argv[],
+                       char *envp[])
+{
+    char **new_argv;
+    const char *new_filename;
+    int argc, ret, i, offset = 3;
+    struct linux_binprm *bprm;
+
+    /* normal execve case */
+    if (qemu_execve_path == NULL || *qemu_execve_path == 0) {
+        new_filename = filename;
+        new_argv = argv;
+    } else {
+        new_filename = qemu_execve_path;
+
+        for (argc = 0; argv[argc] != NULL; argc++) {
+            /* nothing */ ;
+        }
+
+        bprm = g_alloca(sizeof(struct linux_binprm));
+        ret = load_script_file(filename, bprm);
+
+        if (ret < 0) {
+            if (ret == -1) {
+                return get_errno(ret);
+            } else {
+                return -host_to_target_errno(ENOEXEC);
+            }
+        }
+
+        if (ret == 0) {
+            if (bprm->argv != NULL) {
+                offset = 5;
+            } else {
+                offset = 4;
+            }
+        }
+
+        /* Need to store execve argument */
+        offset++;
+
+        new_argv = g_alloca((argc + offset + 1) * sizeof(void *));
+
+        /* Copy the original arguments with offset */
+        for (i = 0; i < argc; i++) {
+            new_argv[i + offset] = argv[i];
+        }
+
+        new_argv[0] = g_strdup(qemu_execve_path);
+        new_argv[1] = g_strdup("-execve"); /* Add execve argument */
+        new_argv[2] = g_strdup("-0");
+        new_argv[offset] = g_strdup(filename);
+        new_argv[argc + offset] = NULL;
+
+        if (ret == 0) {
+            new_argv[3] = bprm->filename;
+            new_argv[4] = bprm->filename;
+
+            if (bprm->argv != NULL) {
+                new_argv[5] = bprm->argv[0];
+            }
+        } else {
+            new_argv[3] = argv[0];
+        }
+    }
+
+    /*
+     * Although execve() is not an interruptible syscall it is
+     * a special case where we must use the safe_syscall wrapper:
+     * if we allow a signal to happen before we make the host
+     * syscall then we will 'lose' it, because at the point of
+     * execve the process leaves QEMU's control. So we use the
+     * safe syscall wrapper to ensure that we either take the
+     * signal as a guest signal, or else it does not happen
+     * before the execve completes and makes it the other
+     * program's problem.
+     */
+    return safe_execve(new_filename, new_argv, envp);
+}
+
 static int do_execv(CPUArchState *cpu_env, int dirfd,
                     abi_long pathname, abi_long guest_argp,
                     abi_long guest_envp, int flags, bool is_execveat)
@@ -8791,17 +8872,6 @@ static int do_execv(CPUArchState *cpu_env, int dirfd,
     }
     *q = NULL;
 
-    /*
-     * Although execve() is not an interruptible syscall it is
-     * a special case where we must use the safe_syscall wrapper:
-     * if we allow a signal to happen before we make the host
-     * syscall then we will 'lose' it, because at the point of
-     * execve the process leaves QEMU's control. So we use the
-     * safe syscall wrapper to ensure that we either take the
-     * signal as a guest signal, or else it does not happen
-     * before the execve completes and makes it the other
-     * program's problem.
-     */
     p = lock_user_string(pathname);
     if (!p) {
         goto execve_efault;
@@ -8813,7 +8883,7 @@ static int do_execv(CPUArchState *cpu_env, int dirfd,
     }
     ret = is_execveat
         ? safe_execveat(dirfd, exe, argp, envp, flags)
-        : safe_execve(exe, argp, envp);
+        : qemu_execve(exe, argp, envp);
     ret = get_errno(ret);
 
     unlock_user(p, pathname, 0);
diff --git a/linux-user/user-internals.h b/linux-user/user-internals.h
index 7099349ec8..0fd97cdb4f 100644
--- a/linux-user/user-internals.h
+++ b/linux-user/user-internals.h
@@ -69,6 +69,7 @@ abi_long get_errno(abi_long ret);
 const char *target_strerror(int err);
 int get_osversion(void);
 void init_qemu_uname_release(void);
+extern const char *qemu_execve_path;
 void fork_start(void);
 void fork_end(pid_t pid);
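
For readers following the argument handling in qemu_execve(), the vector it
builds for a plain (non-script) binary can be sketched in isolation as below.
This is only an illustration under stated assumptions, not part of the patch:
build_wrapped_argv() is a hypothetical helper, it uses malloc() where the
patch uses g_alloca() and g_strdup(), and it covers only the non-script case
(for a hashbang script the patch additionally inserts the script interpreter
and its optional argument).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): build the argument vector that
 * qemu_execve() passes to the host for a non-script binary, i.e.
 *   { qemu_path, "-execve", "-0", argv[0], filename, argv[1], ..., NULL }
 * Strings are borrowed; only the returned array is heap-allocated and must
 * be released with free(). Returns NULL if malloc() fails.
 * Assumes argv holds at least argv[0].
 */
static char **build_wrapped_argv(const char *qemu_path, const char *filename,
                                 char *const argv[])
{
    size_t argc = 0;
    const size_t offset = 4; /* qemu_path, "-execve", "-0", argv[0] */
    char **new_argv;

    assert(argv != NULL && argv[0] != NULL);

    while (argv[argc] != NULL) {
        argc++;
    }

    new_argv = malloc((argc + offset + 1) * sizeof(*new_argv));
    if (new_argv == NULL) {
        return NULL;
    }

    /*
     * Copy the original arguments behind the prefix. The slot at 'offset'
     * (the copy of argv[0]) is overwritten with the program path below.
     */
    for (size_t i = 0; i < argc; i++) {
        new_argv[i + offset] = argv[i];
    }

    new_argv[0] = (char *)qemu_path;
    new_argv[1] = (char *)"-execve"; /* keep interception on in the child */
    new_argv[2] = (char *)"-0";      /* next entry is the guest's argv[0] */
    new_argv[3] = argv[0];
    new_argv[offset] = (char *)filename;
    new_argv[argc + offset] = NULL;

    return new_argv;
}
```

With a made-up interpreter path "/usr/bin/qemu-arm", filename "/bin/ls" and
argv { "ls", "-l" }, the helper yields
{ "/usr/bin/qemu-arm", "-execve", "-0", "ls", "/bin/ls", "-l", NULL }.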