From: Sasha Levin
To: linux-kernel@vger.kernel.org
Cc: linux-api@vger.kernel.org, workflows@vger.kernel.org, tools@kernel.org, Sasha Levin
Subject: [RFC 09/19] exec: add API specification for execveat
Date: Sat, 14 Jun 2025 09:48:48 -0400
Message-Id: <20250614134858.790460-10-sashal@kernel.org>
X-Mailer: git-send-email 2.39.5
In-Reply-To: <20250614134858.790460-1-sashal@kernel.org>
References: <20250614134858.790460-1-sashal@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add comprehensive kernel API specification for the execveat() system call.

Signed-off-by: Sasha Levin
---
 fs/exec.c | 245 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 245 insertions(+)

diff --git a/fs/exec.c b/fs/exec.c
index 3d006105ab23d..49d8647c053ef 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -2223,6 +2223,251 @@ SYSCALL_DEFINE3(execve,
 	return do_execve(getname(filename), argv, envp);
 }
 
+
+/* Valid flag combinations for execveat */
+static const s64 execveat_valid_flags[] = {
+	0,
+	AT_EMPTY_PATH,
+	AT_SYMLINK_NOFOLLOW,
+	AT_EXECVE_CHECK,
+	AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW,
+	AT_EMPTY_PATH | AT_EXECVE_CHECK,
+	AT_SYMLINK_NOFOLLOW | AT_EXECVE_CHECK,
+	AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW | AT_EXECVE_CHECK,
+};
+
+DEFINE_KERNEL_API_SPEC(sys_execveat)
+	KAPI_DESCRIPTION("Execute a new program relative to a directory file descriptor")
+	KAPI_LONG_DESC("Executes the program referred to by the combination of fd and filename. "
+		       "This system call is useful when implementing a secure execution environment "
+		       "or when the calling process has an open file descriptor but no access to "
+		       "the corresponding pathname. Like execve(), it replaces the current process "
+		       "image with a new process image.")
+	KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
+
+	KAPI_PARAM(0, "fd", "int", "Directory file descriptor")
+		KAPI_PARAM_FLAGS(KAPI_PARAM_IN)
+		.type = KAPI_TYPE_FD,
+		.constraint_type = KAPI_CONSTRAINT_NONE,
+		.constraints = "AT_FDCWD for current directory, or valid directory file descriptor",
+	KAPI_PARAM_END
+
+	KAPI_PARAM(1, "filename", "const char __user *", "Pathname of the program to execute")
+		KAPI_PARAM_FLAGS(KAPI_PARAM_IN | KAPI_PARAM_USER | KAPI_PARAM_OPTIONAL)
+		.type = KAPI_TYPE_PATH,
+		.constraint_type = KAPI_CONSTRAINT_NONE,
+		.constraints = "Relative or absolute path; empty string with AT_EMPTY_PATH to use fd directly",
+	KAPI_PARAM_END
+
+	KAPI_PARAM(2, "argv", "const char __user *const __user *", "Array of argument strings passed to the new program")
+		KAPI_PARAM_FLAGS(KAPI_PARAM_IN | KAPI_PARAM_USER)
+		.type = KAPI_TYPE_USER_PTR,
+		.constraint_type = KAPI_CONSTRAINT_NONE,
+		.constraints = "NULL-terminated array of pointers to null-terminated strings",
+	KAPI_PARAM_END
+
+	KAPI_PARAM(3, "envp", "const char __user *const __user *", "Array of environment strings for the new program")
+		KAPI_PARAM_FLAGS(KAPI_PARAM_IN | KAPI_PARAM_USER)
+		.type = KAPI_TYPE_USER_PTR,
+		.constraint_type = KAPI_CONSTRAINT_NONE,
+		.constraints = "NULL-terminated array of pointers to null-terminated strings in form key=value",
+	KAPI_PARAM_END
+
+	KAPI_PARAM(4, "flags", "int", "Execution flags")
+		KAPI_PARAM_FLAGS(KAPI_PARAM_IN)
+		.type = KAPI_TYPE_INT,
+		.constraint_type = KAPI_CONSTRAINT_MASK,
+		.valid_mask = AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW | AT_EXECVE_CHECK,
+		.constraints = "0 or combination of AT_EMPTY_PATH, AT_SYMLINK_NOFOLLOW, and AT_EXECVE_CHECK",
+	KAPI_PARAM_END
+
+	/* Return specification */
+	KAPI_RETURN("long", "Does not return on success (except with AT_EXECVE_CHECK which returns 0); returns -1 on error")
+		.type = KAPI_TYPE_INT,
+		.check_type = KAPI_RETURN_ERROR_CHECK,
+	KAPI_RETURN_END
+
+	/* Error codes */
+	KAPI_ERROR(0, -E2BIG, "E2BIG", "Argument list too long", "The total size of argv and envp exceeds the system limit.")
+	KAPI_ERROR(1, -EACCES, "EACCES", "Permission denied", "Search permission denied on a component of the path, file is not regular, or execute permission denied for file or interpreter.")
+	KAPI_ERROR(2, -EBADF, "EBADF", "Bad file descriptor", "fd is not a valid file descriptor.")
+	KAPI_ERROR(3, -EFAULT, "EFAULT", "Bad address", "filename, argv, or envp points outside accessible address space.")
+	KAPI_ERROR(4, -EINVAL, "EINVAL", "Invalid flags or executable format", "Invalid flags specified, or ELF executable has more than one PT_INTERP segment.")
+	KAPI_ERROR(5, -EIO, "EIO", "I/O error", "An I/O error occurred while reading from the file system.")
+	KAPI_ERROR(6, -EISDIR, "EISDIR", "Is a directory", "An ELF interpreter was a directory.")
+	KAPI_ERROR(7, -ELIBBAD, "ELIBBAD", "Invalid ELF interpreter", "An ELF interpreter was not in a recognized format.")
+	KAPI_ERROR(8, -ELOOP, "ELOOP", "Too many symbolic links", "Too many symbolic links encountered, or AT_SYMLINK_NOFOLLOW was specified but filename refers to a symbolic link.")
+	KAPI_ERROR(9, -EMFILE, "EMFILE", "Too many open files", "The per-process limit on open file descriptors has been reached.")
+	KAPI_ERROR(10, -ENAMETOOLONG, "ENAMETOOLONG", "Filename too long", "filename or one of the strings in argv or envp is too long.")
+	KAPI_ERROR(11, -ENFILE, "ENFILE", "System file table overflow", "The system-wide limit on open files has been reached.")
+	KAPI_ERROR(12, -ENOENT, "ENOENT", "File not found", "The file filename or an interpreter does not exist, or filename is empty and AT_EMPTY_PATH was not specified in flags.")
+	KAPI_ERROR(13, -ENOEXEC, "ENOEXEC", "Exec format error", "An executable is not in a recognized format, is for wrong architecture, or has other format errors preventing execution.")
+	KAPI_ERROR(14, -ENOMEM, "ENOMEM", "Out of memory", "Insufficient kernel memory available.")
+	KAPI_ERROR(15, -ENOTDIR, "ENOTDIR", "Not a directory", "A component of the path prefix is not a directory, or fd is not a directory when a relative path is given.")
+	KAPI_ERROR(16, -EPERM, "EPERM", "Operation not permitted", "The filesystem is mounted nosuid, the user is not root, and the file has set-user-ID or set-group-ID bit set.")
+	KAPI_ERROR(17, -ETXTBSY, "ETXTBSY", "Text file busy", "The executable was open for writing by one or more processes.")
+	KAPI_ERROR(18, -EAGAIN, "EAGAIN", "Resource temporarily unavailable", "RLIMIT_NPROC limit exceeded - too many processes for this user.")
+	KAPI_ERROR(19, -EINTR, "EINTR", "Interrupted by signal", "The exec was interrupted by a signal during setup phase.")
+
+	/* Signal specifications */
+	KAPI_SIGNAL(0, 0, "FATAL_SIGNALS", KAPI_SIGNAL_RECEIVE, KAPI_SIGNAL_ACTION_TERMINATE)
+		KAPI_SIGNAL_CONDITION("Fatal signal pending during exec setup")
+		KAPI_SIGNAL_DESC("Fatal signals (checked via fatal_signal_pending()) can interrupt exec during setup phases like de_thread(). This causes exec to fail and the process to exit.")
+		KAPI_SIGNAL_RESTARTABLE
+	KAPI_SIGNAL_END
+
+	KAPI_SIGNAL(1, SIGKILL, "SIGKILL", KAPI_SIGNAL_SEND, KAPI_SIGNAL_ACTION_TERMINATE)
+		KAPI_SIGNAL_TARGET("All other threads in the thread group")
+		KAPI_SIGNAL_CONDITION("Multi-threaded process doing exec")
+		KAPI_SIGNAL_DESC("During de_thread(), zap_other_threads() sends SIGKILL to all other threads in the thread group to ensure only the execing thread survives.")
+	KAPI_SIGNAL_END
+
+	KAPI_SIGNAL(2, 0, "ALL_HANDLERS", KAPI_SIGNAL_HANDLE, KAPI_SIGNAL_ACTION_CUSTOM)
+		KAPI_SIGNAL_CONDITION("Signal has a handler installed")
+		KAPI_SIGNAL_DESC("flush_signal_handlers() resets all signal handlers to SIG_DFL except for signals that are ignored (SIG_IGN). This happens after de_thread() completes.")
+	KAPI_SIGNAL_END
+
+	KAPI_SIGNAL(3, 0, "IGNORED_SIGNALS", KAPI_SIGNAL_IGNORE, KAPI_SIGNAL_ACTION_CUSTOM)
+		KAPI_SIGNAL_CONDITION("Signal disposition is SIG_IGN")
+		KAPI_SIGNAL_DESC("Signals set to SIG_IGN are preserved across exec. This is POSIX-compliant behavior allowing parent processes to ignore signals in children.")
+	KAPI_SIGNAL_END
+
+	KAPI_SIGNAL(4, 0, "PENDING_SIGNALS", KAPI_SIGNAL_HANDLE, KAPI_SIGNAL_ACTION_CUSTOM)
+		KAPI_SIGNAL_CONDITION("Any pending signals")
+		KAPI_SIGNAL_DESC("All pending signals are cleared during exec. This includes both thread-specific and process-wide pending signals.")
+	KAPI_SIGNAL_END
+
+	KAPI_SIGNAL(5, 0, "TIMER_SIGNALS", KAPI_SIGNAL_HANDLE, KAPI_SIGNAL_ACTION_CUSTOM)
+		KAPI_SIGNAL_CONDITION("Timer-generated signals pending")
+		KAPI_SIGNAL_DESC("flush_itimer_signals() clears any pending timer signals (SIGALRM, SIGVTALRM, SIGPROF) to prevent confusion in the new program.")
+	KAPI_SIGNAL_END
+
+	KAPI_SIGNAL(6, SIGCHLD, "SIGCHLD", KAPI_SIGNAL_SEND, KAPI_SIGNAL_ACTION_DEFAULT)
+		KAPI_SIGNAL_TARGET("Parent process when this process exits")
+		KAPI_SIGNAL_CONDITION("Process exit after exec")
+		KAPI_SIGNAL_DESC("The exit_signal is set to SIGCHLD during exec, ensuring the parent will receive SIGCHLD when this process terminates.")
+	KAPI_SIGNAL_END
+
+	KAPI_SIGNAL(7, 0, "SIGALTSTACK", KAPI_SIGNAL_HANDLE, KAPI_SIGNAL_ACTION_CUSTOM)
+		KAPI_SIGNAL_CONDITION("Process had alternate signal stack")
+		KAPI_SIGNAL_DESC("Any alternate signal stack (sigaltstack) is not preserved across exec. The new program starts with no alternate stack.")
+	KAPI_SIGNAL_END
+
+	/* Side effects */
+	KAPI_SIDE_EFFECT(0, KAPI_EFFECT_PROCESS_STATE | KAPI_EFFECT_FREE_MEMORY | KAPI_EFFECT_ALLOC_MEMORY,
+			 "process image",
+			 "Replaces entire process image including code, data, heap, and stack")
+	KAPI_SIDE_EFFECT_END
+
+	KAPI_SIDE_EFFECT(1, KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_RESOURCE_DESTROY,
+			 "file descriptors",
+			 "Closes all file descriptors with close-on-exec flag set")
+		KAPI_EFFECT_CONDITION("FD_CLOEXEC flag set")
+	KAPI_SIDE_EFFECT_END
+
+	KAPI_SIDE_EFFECT(2, KAPI_EFFECT_MODIFY_STATE,
+			 "signal handlers",
+			 "Resets all signal handlers to default, preserves ignored signals")
+	KAPI_SIDE_EFFECT_END
+
+	KAPI_SIDE_EFFECT(3, KAPI_EFFECT_PROCESS_STATE | KAPI_EFFECT_SIGNAL_SEND,
+			 "thread group",
+			 "Kills all other threads in the thread group with SIGKILL")
+		KAPI_EFFECT_CONDITION("Multi-threaded process")
+	KAPI_SIDE_EFFECT_END
+
+	KAPI_SIDE_EFFECT(4, KAPI_EFFECT_MODIFY_STATE,
+			 "process attributes",
+			 "Clears pending signals, timers, alternate signal stack, and various process attributes")
+	KAPI_SIDE_EFFECT_END
+
+	KAPI_SIDE_EFFECT(5, KAPI_EFFECT_FILESYSTEM,
+			 "executable file",
+			 "Opens and reads the executable file, may trigger filesystem operations")
+	KAPI_SIDE_EFFECT_END
+
+	KAPI_SIDE_EFFECT(6, KAPI_EFFECT_MODIFY_STATE,
+			 "security context",
+			 "May change SELinux/AppArmor context based on file labels and transitions")
+		KAPI_EFFECT_CONDITION("LSM enabled")
+	KAPI_SIDE_EFFECT_END
+
+	/* State transitions */
+	KAPI_STATE_TRANS(0, "process memory",
+			 "old program image", "new program image",
+			 "Complete replacement of process address space with new program")
+	KAPI_STATE_TRANS_END
+
+	KAPI_STATE_TRANS(1, "process credentials",
+			 "current credentials", "potentially modified credentials",
+			 "May change effective UID/GID based on file permissions")
+		KAPI_STATE_TRANS_COND("setuid/setgid binary")
+	KAPI_STATE_TRANS_END
+
+	KAPI_STATE_TRANS(2, "thread state",
+			 "multi-threaded", "single-threaded",
+			 "Process becomes single-threaded after killing other threads")
+		KAPI_STATE_TRANS_COND("Multi-threaded process")
+	KAPI_STATE_TRANS_END
+
+	KAPI_STATE_TRANS(3, "signal state",
+			 "custom handlers and pending signals", "default handlers, no pending signals",
+			 "Signal handling reset to clean state for new program")
+	KAPI_STATE_TRANS_END
+
+	KAPI_STATE_TRANS(4, "file descriptor table",
+			 "contains close-on-exec FDs", "close-on-exec FDs closed",
+			 "All file descriptors marked FD_CLOEXEC are closed during exec")
+		KAPI_STATE_TRANS_COND("FDs with FD_CLOEXEC")
+	KAPI_STATE_TRANS_END
+
+	KAPI_STATE_TRANS(5, "working directory",
+			 "fd-relative operations", "resolved to absolute paths",
+			 "Directory fd operations resolved before exec completes")
+		KAPI_STATE_TRANS_COND("Using dirfd != AT_FDCWD")
+	KAPI_STATE_TRANS_END
+
+	/* Locking information */
+	KAPI_LOCK(0, "cred_guard_mutex", KAPI_LOCK_MUTEX)
+		KAPI_LOCK_DESC("Protects against concurrent credential changes during exec")
+		KAPI_LOCK_ACQUIRED
+		KAPI_LOCK_DESC("Ensures atomic credential transition during exec process")
+	KAPI_LOCK_END
+
+	KAPI_LOCK(1, "sighand->siglock", KAPI_LOCK_SPINLOCK)
+		KAPI_LOCK_DESC("Protects signal handler modifications")
+		KAPI_LOCK_ACQUIRED
+		KAPI_LOCK_RELEASED
+		KAPI_LOCK_DESC("Taken during signal handler reset and pending signal clearing")
+	KAPI_LOCK_END
+
+	KAPI_SIDE_EFFECT_COUNT(7)
+	KAPI_STATE_TRANS_COUNT(6)
+
+	.error_count = 20,
+	.param_count = 5,
+	.since_version = "3.19",
+	.examples = "/* Execute /bin/echo using AT_FDCWD */\n"
+		    "char *argv[] = { \"echo\", \"hello\", NULL };\n"
+		    "char *envp[] = { \"PATH=/bin\", NULL };\n"
+		    "execveat(AT_FDCWD, \"/bin/echo\", argv, envp, 0);\n\n"
+		    "/* Execute via file descriptor */\n"
+		    "int fd = open(\"/bin/echo\", O_PATH);\n"
+		    "execveat(fd, \"\", argv, envp, AT_EMPTY_PATH);\n\n"
+		    "/* Execute relative to directory fd */\n"
+		    "int dirfd = open(\"/bin\", O_RDONLY | O_DIRECTORY);\n"
+		    "execveat(dirfd, \"echo\", argv, envp, 0);",
+	.notes = "execveat() was added to allow fexecve() to be implemented on systems that "
+		 "do not have /proc mounted. When filename is an empty string and AT_EMPTY_PATH "
+		 "is specified, the file descriptor fd specifies the file to be executed. "
+		 "AT_SYMLINK_NOFOLLOW prevents following symbolic links. "
+		 "AT_EXECVE_CHECK (since Linux 6.12) only checks if execution would be allowed "
+		 "without actually executing. Like execve(), on success execveat() does not return "
+		 "(except with AT_EXECVE_CHECK which returns 0).",
+	.signal_count = 8,
+	.lock_count = 2,
+KAPI_END_SPEC;
+
 SYSCALL_DEFINE5(execveat,
 		int, fd, const char __user *, filename,
 		const char __user *const __user *, argv,
-- 
2.39.5
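
The spec states the flags constraint twice: as the eight-entry execveat_valid_flags[] table and as .valid_mask. Those eight entries are exactly the subsets of the three mask bits, so a single mask test expresses the same rule. The userspace sketch below illustrates that check; it is not taken from the patch. The helper name execveat_flags_ok() and the EXECVEAT_VALID_MASK macro are hypothetical, and the fallback flag values are assumed to match include/uapi/linux/fcntl.h for libc headers that do not yet define them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Fallbacks for older libc headers; values assumed from the kernel uapi header. */
#ifndef AT_SYMLINK_NOFOLLOW
#define AT_SYMLINK_NOFOLLOW 0x100
#endif
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif
#ifndef AT_EXECVE_CHECK
#define AT_EXECVE_CHECK 0x10000
#endif

/* Hypothetical name; same value as .valid_mask in the spec above. */
#define EXECVEAT_VALID_MASK \
	(AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW | AT_EXECVE_CHECK)

/* The table-vs-mask equivalence only holds if the three bits are distinct. */
static_assert((AT_EMPTY_PATH & AT_SYMLINK_NOFOLLOW) == 0 &&
	      (AT_EMPTY_PATH & AT_EXECVE_CHECK) == 0 &&
	      (AT_SYMLINK_NOFOLLOW & AT_EXECVE_CHECK) == 0,
	      "execveat flag bits must not overlap");

/*
 * Hypothetical helper: true when flags is 0 or any combination of the
 * three documented bits, i.e. when no bit outside the mask is set.
 */
static inline bool execveat_flags_ok(int flags)
{
	return (flags & ~EXECVEAT_VALID_MASK) == 0;
}
```

A caller could run execveat_flags_ok(flags) before issuing the syscall and treat a false result the way the spec documents EINVAL for invalid flags.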