From nobody Sun Dec 14 20:29:46 2025
From: jeffxu@chromium.org
To: skhan@linuxfoundation.org, keescook@chromium.org
Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com,
 dverkamp@chromium.org, hughd@google.com, jeffxu@google.com,
 jorgelo@chromium.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com,
 linux-hardening@vger.kernel.org, linux-security-module@vger.kernel.org
Subject: [PATCH v8 1/5] mm/memfd: add F_SEAL_EXEC
Date: Thu, 15 Dec 2022 00:12:01 +0000
Message-Id: <20221215001205.51969-2-jeffxu@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
In-Reply-To: <20221215001205.51969-1-jeffxu@google.com>
References: <20221215001205.51969-1-jeffxu@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Daniel Verkamp

The new F_SEAL_EXEC flag will prevent modification of the exec bits:
written as traditional octal mask, 0111, or as named flags, S_IXUSR |
S_IXGRP | S_IXOTH. Any chmod(2) or similar call that attempts to modify
any of these bits after the seal is applied will fail with errno EPERM.

This will preserve the execute bits as they are at the time of sealing,
so the memfd will become either permanently executable or permanently
un-executable.

Signed-off-by: Daniel Verkamp
Co-developed-by: Jeff Xu
Signed-off-by: Jeff Xu
Reviewed-by: Kees Cook
---
 include/uapi/linux/fcntl.h | 1 +
 mm/memfd.c                 | 2 ++
 mm/shmem.c                 | 6 ++++++
 3 files changed, 9 insertions(+)

diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 2f86b2ad6d7e..e8c07da58c9f 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -43,6 +43,7 @@
 #define F_SEAL_GROW	0x0004	/* prevent file from growing */
 #define F_SEAL_WRITE	0x0008	/* prevent writes */
 #define F_SEAL_FUTURE_WRITE	0x0010  /* prevent future writes while mapped */
+#define F_SEAL_EXEC	0x0020  /* prevent chmod modifying exec bits */
 /* (1U << 31) is reserved for signed error codes */
 
 /*
diff --git a/mm/memfd.c b/mm/memfd.c
index 08f5f8304746..4ebeab94aa74 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -147,6 +147,7 @@ static unsigned int *memfd_file_seals_ptr(struct file *file)
 }
 
 #define F_ALL_SEALS (F_SEAL_SEAL | \
+		     F_SEAL_EXEC | \
 		     F_SEAL_SHRINK | \
 		     F_SEAL_GROW | \
 		     F_SEAL_WRITE | \
@@ -175,6 +176,7 @@ static int memfd_add_seals(struct file *file, unsigned int seals)
 	 *   SEAL_SHRINK: Prevent the file from shrinking
 	 *   SEAL_GROW: Prevent the file from growing
 	 *   SEAL_WRITE: Prevent write access to the file
+	 *   SEAL_EXEC: Prevent modification of the exec bits in the file mode
 	 *
 	 * As we don't require any trust relationship between two parties, we
 	 * must prevent seals from being removed. Therefore, sealing a file
diff --git a/mm/shmem.c b/mm/shmem.c
index c1d8b8a1aa3b..e18a9cf9d937 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1085,6 +1085,12 @@ static int shmem_setattr(struct user_namespace *mnt_userns,
 	if (error)
 		return error;
 
+	if ((info->seals & F_SEAL_EXEC) && (attr->ia_valid & ATTR_MODE)) {
+		if ((inode->i_mode ^ attr->ia_mode) & 0111) {
+			return -EPERM;
+		}
+	}
+
 	if (S_ISREG(inode->i_mode) && (attr->ia_valid & ATTR_SIZE)) {
 		loff_t oldsize = inode->i_size;
 		loff_t newsize = attr->ia_size;
-- 
2.39.0.rc1.256.g54fd8350bd-goog

From nobody Sun Dec 14 20:29:46 2025
From: jeffxu@chromium.org
To: skhan@linuxfoundation.org, keescook@chromium.org
Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com,
 dverkamp@chromium.org, hughd@google.com, jeffxu@google.com,
 jorgelo@chromium.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com,
 linux-hardening@vger.kernel.org, linux-security-module@vger.kernel.org
Subject: [PATCH v8 2/5] selftests/memfd: add tests for F_SEAL_EXEC
Date: Thu, 15 Dec 2022 00:12:02 +0000
Message-Id: <20221215001205.51969-3-jeffxu@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
In-Reply-To: <20221215001205.51969-1-jeffxu@google.com>
References: <20221215001205.51969-1-jeffxu@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Daniel Verkamp

Basic tests to ensure that user/group/other execute bits cannot be
changed after applying F_SEAL_EXEC to a memfd.

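The rule these tests exercise can be modelled in plain user-space C. The sketch below is not kernel code: seal_exec_blocks_chmod() and EXEC_BITS are hypothetical names chosen for illustration, and the function only mirrors the comparison that patch 1/5 adds to shmem_setattr(), namely that a mode change is refused once F_SEAL_EXEC is set and any of the 0111 bits would differ.

```c
#include <stdbool.h>

#define F_SEAL_EXEC 0x0020 /* value taken from patch 1/5 */
#define EXEC_BITS   0111   /* S_IXUSR | S_IXGRP | S_IXOTH */

/*
 * Hypothetical user-space model, not a kernel function: returns true
 * when a chmod from old_mode to new_mode would be refused with EPERM.
 */
static bool seal_exec_blocks_chmod(unsigned int seals,
				   unsigned int old_mode,
				   unsigned int new_mode)
{
	/* Without the seal, the exec bits may change freely. */
	if (!(seals & F_SEAL_EXEC))
		return false;

	/* With the seal, any difference in the exec bits is refused. */
	return ((old_mode ^ new_mode) & EXEC_BITS) != 0;
}
```

Under this model a sealed memfd at mode 0600 accepts a change to 0666 but refuses 0777, 0670, 0605, 0700 and 0100, which is the pattern test_seal_exec() expects.
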
Signed-off-by: Daniel Verkamp
Co-developed-by: Jeff Xu
Signed-off-by: Jeff Xu
Reviewed-by: Kees Cook
---
 tools/testing/selftests/memfd/memfd_test.c | 123 ++++++++++++++++++++-
 1 file changed, 122 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index 94df2692e6e4..f18a15a1f275 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -28,12 +28,38 @@
 #define MFD_DEF_SIZE 8192
 #define STACK_SIZE 65536
 
+#define F_SEAL_EXEC	0x0020
+
 /*
  * Default is not to test hugetlbfs
  */
 static size_t mfd_def_size = MFD_DEF_SIZE;
 static const char *memfd_str = MEMFD_STR;
 
+static ssize_t fd2name(int fd, char *buf, size_t bufsize)
+{
+	char buf1[PATH_MAX];
+	int size;
+	ssize_t nbytes;
+
+	size = snprintf(buf1, PATH_MAX, "/proc/self/fd/%d", fd);
+	if (size < 0) {
+		printf("snprintf(%d) failed on %m\n", fd);
+		abort();
+	}
+
+	/*
+	 * reserve one byte for string termination.
+	 */
+	nbytes = readlink(buf1, buf, bufsize-1);
+	if (nbytes == -1) {
+		printf("readlink(%s) failed %m\n", buf1);
+		abort();
+	}
+	buf[nbytes] = '\0';
+	return nbytes;
+}
+
 static int mfd_assert_new(const char *name, loff_t sz, unsigned int flags)
 {
 	int r, fd;
@@ -98,11 +124,14 @@ static unsigned int mfd_assert_get_seals(int fd)
 
 static void mfd_assert_has_seals(int fd, unsigned int seals)
 {
+	char buf[PATH_MAX];
+	int nbytes;
 	unsigned int s;
+	fd2name(fd, buf, PATH_MAX);
 
 	s = mfd_assert_get_seals(fd);
 	if (s != seals) {
-		printf("%u != %u = GET_SEALS(%d)\n", seals, s, fd);
+		printf("%u != %u = GET_SEALS(%s)\n", seals, s, buf);
 		abort();
 	}
 }
@@ -594,6 +623,64 @@ static void mfd_fail_grow_write(int fd)
 	}
 }
 
+static void mfd_assert_mode(int fd, int mode)
+{
+	struct stat st;
+	char buf[PATH_MAX];
+	int nbytes;
+
+	fd2name(fd, buf, PATH_MAX);
+
+	if (fstat(fd, &st) < 0) {
+		printf("fstat(%s) failed: %m\n", buf);
+		abort();
+	}
+
+	if ((st.st_mode & 07777) != mode) {
+		printf("fstat(%s) wrong file mode 0%04o, but expected 0%04o\n",
+		       buf, (int)st.st_mode & 07777, mode);
+		abort();
+	}
+}
+
+static void mfd_assert_chmod(int fd, int mode)
+{
+	char buf[PATH_MAX];
+	int nbytes;
+
+	fd2name(fd, buf, PATH_MAX);
+
+	if (fchmod(fd, mode) < 0) {
+		printf("fchmod(%s, 0%04o) failed: %m\n", buf, mode);
+		abort();
+	}
+
+	mfd_assert_mode(fd, mode);
+}
+
+static void mfd_fail_chmod(int fd, int mode)
+{
+	struct stat st;
+	char buf[PATH_MAX];
+	int nbytes;
+
+	fd2name(fd, buf, PATH_MAX);
+
+	if (fstat(fd, &st) < 0) {
+		printf("fstat(%s) failed: %m\n", buf);
+		abort();
+	}
+
+	if (fchmod(fd, mode) == 0) {
+		printf("fchmod(%s, 0%04o) didn't fail as expected\n",
+		       buf, mode);
+		abort();
+	}
+
+	/* verify that file mode bits did not change */
+	mfd_assert_mode(fd, st.st_mode & 07777);
+}
+
 static int idle_thread_fn(void *arg)
 {
 	sigset_t set;
@@ -880,6 +967,39 @@ static void test_seal_resize(void)
 	close(fd);
 }
 
+/*
+ * Test SEAL_EXEC
+ * Test that chmod() cannot change x bits after sealing
+ */
+static void test_seal_exec(void)
+{
+	int fd;
+
+	printf("%s SEAL-EXEC\n", memfd_str);
+
+	fd = mfd_assert_new("kern_memfd_seal_exec",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_ALLOW_SEALING);
+
+	mfd_assert_mode(fd, 0777);
+
+	mfd_assert_chmod(fd, 0644);
+
+	mfd_assert_has_seals(fd, 0);
+	mfd_assert_add_seals(fd, F_SEAL_EXEC);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+
+	mfd_assert_chmod(fd, 0600);
+	mfd_fail_chmod(fd, 0777);
+	mfd_fail_chmod(fd, 0670);
+	mfd_fail_chmod(fd, 0605);
+	mfd_fail_chmod(fd, 0700);
+	mfd_fail_chmod(fd, 0100);
+	mfd_assert_chmod(fd, 0666);
+
+	close(fd);
+}
+
 /*
  * Test sharing via dup()
  * Test that seals are shared between dupped FDs and they're all equal.
@@ -1059,6 +1179,7 @@ int main(int argc, char **argv)
 	test_seal_shrink();
 	test_seal_grow();
 	test_seal_resize();
+	test_seal_exec();
 
 	test_share_dup("SHARE-DUP", "");
 	test_share_mmap("SHARE-MMAP", "");
-- 
2.39.0.rc1.256.g54fd8350bd-goog

From nobody Sun Dec 14 20:29:46 2025
From: jeffxu@chromium.org
To: skhan@linuxfoundation.org, keescook@chromium.org
Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com,
 dverkamp@chromium.org, hughd@google.com, jeffxu@google.com,
 jorgelo@chromium.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com,
 linux-hardening@vger.kernel.org, linux-security-module@vger.kernel.org,
 kernel test robot
Subject: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC
Date: Thu, 15 Dec 2022 00:12:03 +0000
Message-Id: <20221215001205.51969-4-jeffxu@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
In-Reply-To: <20221215001205.51969-1-jeffxu@google.com>
References: <20221215001205.51969-1-jeffxu@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Jeff Xu

The new MFD_NOEXEC_SEAL and MFD_EXEC flags allow an application to set
the executable bit at creation time (memfd_create).

When MFD_NOEXEC_SEAL is set, the memfd is created without the executable
bit (mode: 0666) and sealed with F_SEAL_EXEC, so it can't be chmod-ed to
be executable (mode: 0777) after creation.

When the MFD_EXEC flag is set, the memfd is created with the executable
bit (mode: 0777); this is the same as the old behavior of memfd_create.

The new pid namespaced sysctl vm.memfd_noexec has 3 values:
0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
   MFD_EXEC was set.
1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
   MFD_NOEXEC_SEAL was set.
2: memfd_create() without MFD_NOEXEC_SEAL will be rejected.

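The interaction between the two flags and the sysctl can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel implementation: resolve_memfd_exec_flags() is a hypothetical name, and the flag values are the ones this patch adds to include/uapi/linux/memfd.h. The sketch follows the memfd_create() hunk of this patch as written, where the sysctl is consulted only when neither flag is passed; whether an explicit MFD_EXEC should also be rejected under value 2 is not modelled here.

```c
#include <errno.h>

/* Flag values as added by this patch to include/uapi/linux/memfd.h. */
#define MFD_NOEXEC_SEAL 0x0008U
#define MFD_EXEC        0x0010U

/*
 * Hypothetical user-space model, not the kernel function. Given the
 * flags passed to memfd_create() and the namespace's vm.memfd_noexec
 * value, return the effective flags, or -EINVAL if the call would be
 * rejected.
 */
static long resolve_memfd_exec_flags(unsigned int flags, int memfd_noexec)
{
	/* Asking for both behaviours at once is invalid. */
	if ((flags & MFD_EXEC) && (flags & MFD_NOEXEC_SEAL))
		return -EINVAL;

	/* An explicit choice is passed through unchanged in this model. */
	if (flags & (MFD_EXEC | MFD_NOEXEC_SEAL))
		return (long)flags;

	/* Neither flag given: the sysctl value picks the default. */
	switch (memfd_noexec) {
	case 0:
		return (long)(flags | MFD_EXEC);
	case 1:
		return (long)(flags | MFD_NOEXEC_SEAL);
	default:
		return -EINVAL;
	}
}
```

In this model a caller that passes neither flag gets an executable memfd under value 0, a sealed non-executable one under value 1, and an error under value 2.
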
The sysctl allows finer control of memfd_create for old software that
doesn't set the executable bit. For example, a container with
vm.memfd_noexec=1 means the old software will create non-executable
memfds by default.

Also, the value of memfd_noexec is passed to the child namespace at
creation time. For example, if the init namespace has vm.memfd_noexec=2,
all of its child namespaces will be created with 2.

Signed-off-by: Jeff Xu
Co-developed-by: Daniel Verkamp
Signed-off-by: Daniel Verkamp
Reported-by: kernel test robot
Reviewed-by: Kees Cook
---
 include/linux/pid_namespace.h | 19 +++++++++++
 include/uapi/linux/memfd.h    |  4 +++
 kernel/pid_namespace.c        |  5 +++
 kernel/pid_sysctl.h           | 59 +++++++++++++++++++++++++++++++++++
 mm/memfd.c                    | 48 ++++++++++++++++++++++++++--
 5 files changed, 133 insertions(+), 2 deletions(-)
 create mode 100644 kernel/pid_sysctl.h

diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 07481bb87d4e..c758809d5bcf 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -16,6 +16,21 @@
 
 struct fs_pin;
 
+#if defined(CONFIG_SYSCTL) && defined(CONFIG_MEMFD_CREATE)
+/*
+ * sysctl for vm.memfd_noexec
+ * 0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL
+ *	acts like MFD_EXEC was set.
+ * 1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL
+ *	acts like MFD_NOEXEC_SEAL was set.
+ * 2: memfd_create() without MFD_NOEXEC_SEAL will be
+ *	rejected.
+ */
+#define MEMFD_NOEXEC_SCOPE_EXEC			0
+#define MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL		1
+#define MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED	2
+#endif
+
 struct pid_namespace {
 	struct idr idr;
 	struct rcu_head rcu;
@@ -31,6 +46,10 @@ struct pid_namespace {
 	struct ucounts *ucounts;
 	int reboot;	/* group exit code if this pidns was rebooted */
 	struct ns_common ns;
+#if defined(CONFIG_SYSCTL) && defined(CONFIG_MEMFD_CREATE)
+	/* sysctl for vm.memfd_noexec */
+	int memfd_noexec_scope;
+#endif
 } __randomize_layout;
 
 extern struct pid_namespace init_pid_ns;
diff --git a/include/uapi/linux/memfd.h b/include/uapi/linux/memfd.h
index 7a8a26751c23..273a4e15dfcf 100644
--- a/include/uapi/linux/memfd.h
+++ b/include/uapi/linux/memfd.h
@@ -8,6 +8,10 @@
 #define MFD_CLOEXEC		0x0001U
 #define MFD_ALLOW_SEALING	0x0002U
 #define MFD_HUGETLB		0x0004U
+/* not executable and sealed to prevent changing to executable. */
+#define MFD_NOEXEC_SEAL		0x0008U
+/* executable */
+#define MFD_EXEC		0x0010U
 
 /*
  * Huge page size encoding when MFD_HUGETLB is specified, and a huge page
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index f4f8cb0435b4..8a98b1af9376 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -23,6 +23,7 @@
 #include
 #include
 #include
+#include "pid_sysctl.h"
 
 static DEFINE_MUTEX(pid_caches_mutex);
 static struct kmem_cache *pid_ns_cachep;
@@ -110,6 +111,8 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns
 	ns->ucounts = ucounts;
 	ns->pid_allocated = PIDNS_ADDING;
 
+	initialize_memfd_noexec_scope(ns);
+
 	return ns;
 
 out_free_idr:
@@ -455,6 +458,8 @@ static __init int pid_namespaces_init(void)
 #ifdef CONFIG_CHECKPOINT_RESTORE
 	register_sysctl_paths(kern_path, pid_ns_ctl_table);
 #endif
+
+	register_pid_ns_sysctl_table_vm();
 	return 0;
 }
 
diff --git a/kernel/pid_sysctl.h b/kernel/pid_sysctl.h
new file mode 100644
index 000000000000..90a93161a122
--- /dev/null
+++ b/kernel/pid_sysctl.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef LINUX_PID_SYSCTL_H
+#define LINUX_PID_SYSCTL_H
+
+#include
+
+#if defined(CONFIG_SYSCTL) && defined(CONFIG_MEMFD_CREATE)
+static inline void initialize_memfd_noexec_scope(struct pid_namespace *ns)
+{
+	ns->memfd_noexec_scope =
+		task_active_pid_ns(current)->memfd_noexec_scope;
+}
+
+static int pid_mfd_noexec_dointvec_minmax(struct ctl_table *table,
+	int write, void *buf, size_t *lenp, loff_t *ppos)
+{
+	struct pid_namespace *ns = task_active_pid_ns(current);
+	struct ctl_table table_copy;
+
+	if (write && !ns_capable(ns->user_ns, CAP_SYS_ADMIN))
+		return -EPERM;
+
+	table_copy = *table;
+	if (ns != &init_pid_ns)
+		table_copy.data = &ns->memfd_noexec_scope;
+
+	/*
+	 * Set the minimum to the current value, so that only an equal
+	 * or bigger value is accepted.
+	 */
+	if (*(int *)table_copy.data > *(int *)table_copy.extra1)
+		table_copy.extra1 = table_copy.data;
+
+	return proc_dointvec_minmax(&table_copy, write, buf, lenp, ppos);
+}
+
+static struct ctl_table pid_ns_ctl_table_vm[] = {
+	{
+		.procname	= "memfd_noexec",
+		.data		= &init_pid_ns.memfd_noexec_scope,
+		.maxlen		= sizeof(init_pid_ns.memfd_noexec_scope),
+		.mode		= 0644,
+		.proc_handler	= pid_mfd_noexec_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_TWO,
+	},
+	{ }
+};
+static struct ctl_path vm_path[] = { { .procname = "vm", }, { } };
+static inline void register_pid_ns_sysctl_table_vm(void)
+{
+	register_sysctl_paths(vm_path, pid_ns_ctl_table_vm);
+}
+#else
+static inline void initialize_memfd_noexec_scope(struct pid_namespace *ns) {}
+static inline void register_pid_ns_sysctl_table_vm(void) {}
+#endif
+
+#endif /* LINUX_PID_SYSCTL_H */
diff --git a/mm/memfd.c b/mm/memfd.c
index 4ebeab94aa74..ec70675a7069 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 
 /*
@@ -263,12 +264,14 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
 #define MFD_NAME_PREFIX_LEN (sizeof(MFD_NAME_PREFIX) - 1)
 #define MFD_NAME_MAX_LEN (NAME_MAX - MFD_NAME_PREFIX_LEN)
 
-#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB)
+#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | MFD_NOEXEC_SEAL | MFD_EXEC)
 
 SYSCALL_DEFINE2(memfd_create,
 		const char __user *, uname,
 		unsigned int, flags)
 {
+	char comm[TASK_COMM_LEN];
+	struct pid_namespace *ns;
 	unsigned int *file_seals;
 	struct file *file;
 	int fd, error;
@@ -285,6 +288,39 @@ SYSCALL_DEFINE2(memfd_create,
 			return -EINVAL;
 	}
 
+	/* Invalid if both EXEC and NOEXEC_SEAL are set.*/
+	if ((flags & MFD_EXEC) && (flags & MFD_NOEXEC_SEAL))
+		return -EINVAL;
+
+	if (!(flags & (MFD_EXEC | MFD_NOEXEC_SEAL))) {
+#ifdef CONFIG_SYSCTL
+		int sysctl = MEMFD_NOEXEC_SCOPE_EXEC;
+
+		ns = task_active_pid_ns(current);
+		if (ns)
+			sysctl = ns->memfd_noexec_scope;
+
+		switch (sysctl) {
+		case MEMFD_NOEXEC_SCOPE_EXEC:
+			flags |= MFD_EXEC;
+			break;
+		case MEMFD_NOEXEC_SCOPE_NOEXEC_SEAL:
+			flags |= MFD_NOEXEC_SEAL;
+			break;
+		default:
+			pr_warn_ratelimited(
+				"memfd_create(): MFD_NOEXEC_SEAL is enforced, pid=%d '%s'\n",
+				task_pid_nr(current), get_task_comm(comm, current));
+			return -EINVAL;
+		}
+#else
+		flags |= MFD_EXEC;
+#endif
+		pr_warn_ratelimited(
+			"memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=%d '%s'\n",
+			task_pid_nr(current), get_task_comm(comm, current));
+	}
+
 	/* length includes terminating zero */
 	len = strnlen_user(uname, MFD_NAME_MAX_LEN + 1);
 	if (len <= 0)
@@ -328,7 +364,15 @@ SYSCALL_DEFINE2(memfd_create,
 	file->f_mode |= FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE;
 	file->f_flags |= O_LARGEFILE;
 
-	if (flags & MFD_ALLOW_SEALING) {
+	if (flags & MFD_NOEXEC_SEAL) {
+		struct inode *inode = file_inode(file);
+
+		inode->i_mode &= ~0111;
+		file_seals = memfd_file_seals_ptr(file);
+		*file_seals &= ~F_SEAL_SEAL;
+		*file_seals |= F_SEAL_EXEC;
+	} else if (flags & MFD_ALLOW_SEALING) {
+		/* MFD_EXEC and MFD_ALLOW_SEALING are set */
 		file_seals = memfd_file_seals_ptr(file);
 		*file_seals &= ~F_SEAL_SEAL;
 	}
-- 
2.39.0.rc1.256.g54fd8350bd-goog

From nobody Sun Dec 14 20:29:46 2025
From: jeffxu@chromium.org
To: skhan@linuxfoundation.org, keescook@chromium.org
Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com,
 dverkamp@chromium.org, hughd@google.com, jeffxu@google.com,
 jorgelo@chromium.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com,
 linux-hardening@vger.kernel.org, linux-security-module@vger.kernel.org
Subject: [PATCH v8 4/5] mm/memfd: Add write seals when apply SEAL_EXEC to executable memfd
Date: Thu, 15 Dec 2022 00:12:04 +0000
Message-Id: <20221215001205.51969-5-jeffxu@google.com>
X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog
In-Reply-To: <20221215001205.51969-1-jeffxu@google.com>
References: <20221215001205.51969-1-jeffxu@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Jeff Xu

In order to avoid WX mappings, add F_SEAL_WRITE when applying
F_SEAL_EXEC to an executable memfd, so that it is W^X from the start.

This implies that the application needs to fill the content of the memfd
first; after F_SEAL_EXEC is applied, the application can no longer
modify the content of the memfd.

Typically, the application seals the memfd right after writing to it.
For example:
1. memfd_create(MFD_EXEC).
2. write() code to the memfd.
3. fcntl(F_ADD_SEALS, F_SEAL_EXEC) to convert the memfd to W^X.
4. call exec() on the memfd.

Signed-off-by: Jeff Xu
Reviewed-by: Kees Cook
---
 mm/memfd.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/memfd.c b/mm/memfd.c
index ec70675a7069..92f0a5765f7c 100644
--- a/mm/memfd.c
+++ b/mm/memfd.c
@@ -222,6 +222,12 @@ static int memfd_add_seals(struct file *file, unsigned int seals)
 		}
 	}
 
+	/*
+	 * SEAL_EXEC implies SEAL_WRITE, making W^X from the start.
+	 */
+	if (seals & F_SEAL_EXEC && inode->i_mode & 0111)
+		seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE;
+
 	*file_seals |= seals;
 	error = 0;
 
-- 
2.39.0.rc1.256.g54fd8350bd-goog

From nobody Sun Dec 14 20:29:46 2025
[34.168.202.30]) by smtp.gmail.com with ESMTPSA id 3-20020a17090a08c300b0021937b2118bsm1845738pjn.54.2022.12.14.16.12.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 14 Dec 2022 16:12:18 -0800 (PST) From: jeffxu@chromium.org To: skhan@linuxfoundation.org, keescook@chromium.org Cc: akpm@linux-foundation.org, dmitry.torokhov@gmail.com, dverkamp@chromium.org, hughd@google.com, jeffxu@google.com, jorgelo@chromium.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, jannh@google.com, linux-hardening@vger.kernel.org, linux-security-module@vger.kernel.org Subject: [PATCH v8 5/5] selftests/memfd: add tests for MFD_NOEXEC_SEAL MFD_EXEC Date: Thu, 15 Dec 2022 00:12:05 +0000 Message-Id: <20221215001205.51969-6-jeffxu@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog In-Reply-To: <20221215001205.51969-1-jeffxu@google.com> References: <20221215001205.51969-1-jeffxu@google.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Jeff Xu Tests to verify MFD_NOEXEC, MFD_EXEC and vm.memfd_noexec sysctl. 

Signed-off-by: Jeff Xu
Co-developed-by: Daniel Verkamp
Signed-off-by: Daniel Verkamp
Reviewed-by: Kees Cook
---
 tools/testing/selftests/memfd/fuse_test.c  |   1 +
 tools/testing/selftests/memfd/memfd_test.c | 228 ++++++++++++++++++++-
 2 files changed, 224 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/memfd/fuse_test.c b/tools/testing/selftests/memfd/fuse_test.c
index be675002f918..93798c8c5d54 100644
--- a/tools/testing/selftests/memfd/fuse_test.c
+++ b/tools/testing/selftests/memfd/fuse_test.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/tools/testing/selftests/memfd/memfd_test.c b/tools/testing/selftests/memfd/memfd_test.c
index f18a15a1f275..ae71f15f790d 100644
--- a/tools/testing/selftests/memfd/memfd_test.c
+++ b/tools/testing/selftests/memfd/memfd_test.c
@@ -30,6 +30,14 @@
 
 #define F_SEAL_EXEC	0x0020
 
+#define F_WX_SEALS (F_SEAL_SHRINK | \
+		    F_SEAL_GROW | \
+		    F_SEAL_WRITE | \
+		    F_SEAL_FUTURE_WRITE | \
+		    F_SEAL_EXEC)
+
+#define MFD_NOEXEC_SEAL	0x0008U
+
 /*
  * Default is not to test hugetlbfs
  */
@@ -80,6 +88,37 @@ static int mfd_assert_new(const char *name, loff_t sz, unsigned int flags)
 	return fd;
 }
 
+static void sysctl_assert_write(const char *val)
+{
+	int fd = open("/proc/sys/vm/memfd_noexec", O_WRONLY | O_CLOEXEC);
+
+	if (fd < 0) {
+		printf("open sysctl failed\n");
+		abort();
+	}
+
+	if (write(fd, val, strlen(val)) < 0) {
+		printf("write sysctl failed\n");
+		abort();
+	}
+}
+
+static void sysctl_fail_write(const char *val)
+{
+	int fd = open("/proc/sys/vm/memfd_noexec", O_WRONLY | O_CLOEXEC);
+
+	if (fd < 0) {
+		printf("open sysctl failed\n");
+		abort();
+	}
+
+	if (write(fd, val, strlen(val)) >= 0) {
+		printf("write sysctl %s succeeded, but failure expected\n",
+				val);
+		abort();
+	}
+}
+
 static int mfd_assert_reopen_fd(int fd_in)
 {
 	int fd;
@@ -758,6 +797,9 @@ static void test_create(void)
 	mfd_fail_new("", ~0);
 	mfd_fail_new("", 0x80000000U);
 
+	/* verify EXEC and NOEXEC_SEAL can't both be set */
+	mfd_fail_new("", MFD_EXEC | MFD_NOEXEC_SEAL);
+
 	/* verify MFD_CLOEXEC is allowed */
 	fd = mfd_assert_new("", 0, MFD_CLOEXEC);
 	close(fd);
@@ -969,20 +1011,21 @@ static void test_seal_resize(void)
 
 /*
  * Test SEAL_EXEC
- * Test that chmod() cannot change x bits after sealing
+ * Test fd is created with exec and allow sealing.
+ * chmod() cannot change x bits after sealing.
  */
-static void test_seal_exec(void)
+static void test_exec_seal(void)
 {
 	int fd;
 
 	printf("%s SEAL-EXEC\n", memfd_str);
 
+	printf("%s Apply SEAL_EXEC\n", memfd_str);
 	fd = mfd_assert_new("kern_memfd_seal_exec",
 			    mfd_def_size,
-			    MFD_CLOEXEC | MFD_ALLOW_SEALING);
+			    MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_EXEC);
 
 	mfd_assert_mode(fd, 0777);
-
 	mfd_assert_chmod(fd, 0644);
 
 	mfd_assert_has_seals(fd, 0);
@@ -996,10 +1039,181 @@ static void test_seal_exec(void)
 	mfd_fail_chmod(fd, 0700);
 	mfd_fail_chmod(fd, 0100);
 	mfd_assert_chmod(fd, 0666);
+	mfd_assert_write(fd);
+	close(fd);
+
+	printf("%s Apply ALL_SEALS\n", memfd_str);
+	fd = mfd_assert_new("kern_memfd_seal_exec",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_EXEC);
+
+	mfd_assert_mode(fd, 0777);
+	mfd_assert_chmod(fd, 0700);
+
+	mfd_assert_has_seals(fd, 0);
+	mfd_assert_add_seals(fd, F_SEAL_EXEC);
+	mfd_assert_has_seals(fd, F_WX_SEALS);
 
+	mfd_fail_chmod(fd, 0711);
+	mfd_fail_chmod(fd, 0600);
+	mfd_fail_write(fd);
+	close(fd);
+}
+
+/*
+ * Test EXEC_NO_SEAL
+ * Test fd is created with exec and not allow sealing.
+ */
+static void test_exec_no_seal(void)
+{
+	int fd;
+
+	printf("%s EXEC_NO_SEAL\n", memfd_str);
+
+	/* Create with EXEC but without ALLOW_SEALING */
+	fd = mfd_assert_new("kern_memfd_exec_no_sealing",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_EXEC);
+	mfd_assert_mode(fd, 0777);
+	mfd_assert_has_seals(fd, F_SEAL_SEAL);
+	mfd_assert_chmod(fd, 0666);
 	close(fd);
 }
 
+/*
+ * Test memfd_create with MFD_NOEXEC flag
+ */
+static void test_noexec_seal(void)
+{
+	int fd;
+
+	printf("%s NOEXEC_SEAL\n", memfd_str);
+
+	/* Create with NOEXEC and ALLOW_SEALING */
+	fd = mfd_assert_new("kern_memfd_noexec",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_NOEXEC_SEAL);
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_fail_chmod(fd, 0777);
+	close(fd);
+
+	/* Create with NOEXEC but without ALLOW_SEALING */
+	fd = mfd_assert_new("kern_memfd_noexec",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_NOEXEC_SEAL);
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_fail_chmod(fd, 0777);
+	close(fd);
+}
+
+static void test_sysctl_child(void)
+{
+	int fd;
+
+	printf("%s sysctl 0\n", memfd_str);
+	sysctl_assert_write("0");
+	fd = mfd_assert_new("kern_memfd_sysctl_0",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_ALLOW_SEALING);
+
+	mfd_assert_mode(fd, 0777);
+	mfd_assert_has_seals(fd, 0);
+	mfd_assert_chmod(fd, 0644);
+	close(fd);
+
+	printf("%s sysctl 1\n", memfd_str);
+	sysctl_assert_write("1");
+	fd = mfd_assert_new("kern_memfd_sysctl_1",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_ALLOW_SEALING);
+
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_fail_chmod(fd, 0777);
+	sysctl_fail_write("0");
+	close(fd);
+
+	printf("%s sysctl 2\n", memfd_str);
+	sysctl_assert_write("2");
+	mfd_fail_new("kern_memfd_sysctl_2",
+		MFD_CLOEXEC | MFD_ALLOW_SEALING);
+	sysctl_fail_write("0");
+	sysctl_fail_write("1");
+}
+
+static int newpid_thread_fn(void *arg)
+{
+	test_sysctl_child();
+	return 0;
+}
+
+static void test_sysctl_child2(void)
+{
+	int fd;
+
+	sysctl_fail_write("0");
+	fd = mfd_assert_new("kern_memfd_sysctl_1",
+			    mfd_def_size,
+			    MFD_CLOEXEC | MFD_ALLOW_SEALING);
+
+	mfd_assert_mode(fd, 0666);
+	mfd_assert_has_seals(fd, F_SEAL_EXEC);
+	mfd_fail_chmod(fd, 0777);
+	close(fd);
+}
+
+static int newpid_thread_fn2(void *arg)
+{
+	test_sysctl_child2();
+	return 0;
+}
+static pid_t spawn_newpid_thread(unsigned int flags, int (*fn)(void *))
+{
+	uint8_t *stack;
+	pid_t pid;
+
+	stack = malloc(STACK_SIZE);
+	if (!stack) {
+		printf("malloc(STACK_SIZE) failed: %m\n");
+		abort();
+	}
+
+	pid = clone(fn,
+		    stack + STACK_SIZE,
+		    SIGCHLD | flags,
+		    NULL);
+	if (pid < 0) {
+		printf("clone() failed: %m\n");
+		abort();
+	}
+
+	return pid;
+}
+
+static void join_newpid_thread(pid_t pid)
+{
+	waitpid(pid, NULL, 0);
+}
+
+/*
+ * Test sysctl
+ * A very basic sealing test to see whether setting/retrieving seals works.
+ */
+static void test_sysctl(void)
+{
+	int pid = spawn_newpid_thread(CLONE_NEWPID, newpid_thread_fn);
+
+	join_newpid_thread(pid);
+
+	printf("%s child ns\n", memfd_str);
+	sysctl_assert_write("1");
+
+	pid = spawn_newpid_thread(CLONE_NEWPID, newpid_thread_fn2);
+	join_newpid_thread(pid);
+}
+
 /*
  * Test sharing via dup()
  * Test that seals are shared between dupped FDs and they're all equal.
@@ -1173,13 +1387,15 @@ int main(int argc, char **argv)
 
 	test_create();
 	test_basic();
+	test_exec_seal();
+	test_exec_no_seal();
+	test_noexec_seal();
 
 	test_seal_write();
 	test_seal_future_write();
 	test_seal_shrink();
 	test_seal_grow();
 	test_seal_resize();
-	test_seal_exec();
 
 	test_share_dup("SHARE-DUP", "");
 	test_share_mmap("SHARE-MMAP", "");
@@ -1195,6 +1411,8 @@ int main(int argc, char **argv)
 	test_share_fork("SHARE-FORK", SHARED_FT_STR);
 	join_idle_thread(pid);
 
+	test_sysctl();
+
 	printf("memfd: DONE\n");
 
 	return 0;
-- 
2.39.0.rc1.256.g54fd8350bd-goog
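
For readers following the series from userspace, the create / write / seal sequence that the commit messages describe and that test_exec_seal() exercises might look roughly like the sketch below. It is not part of the patches. The helper name create_sealed_exec_memfd() is hypothetical, the MFD_EXEC fallback value (0x0010U) is assumed from the kernel UAPI header rather than taken from this excerpt, and the F_SEAL_EXEC fallback mirrors the selftest's own define. The final exec step is left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older userspace headers (assumed to match the kernel UAPI). */
#ifndef MFD_EXEC
#define MFD_EXEC 0x0010U
#endif
#ifndef F_SEAL_EXEC
#define F_SEAL_EXEC 0x0020
#endif

/*
 * Hypothetical helper, for illustration only: create an executable memfd,
 * fill it with len bytes from buf, then apply F_SEAL_EXEC.
 * Returns the sealed fd on success, or -errno on failure (for example
 * -EINVAL on a kernel that does not know MFD_EXEC).
 */
static int create_sealed_exec_memfd(const void *buf, size_t len)
{
	const char *p = buf;
	int fd, err;

	assert(buf != NULL || len == 0);

	/* 1. Executable memfd that still accepts seals. */
	fd = memfd_create("sealed_exec_demo",
			  MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_EXEC);
	if (fd < 0)
		return -errno;

	/* 2. Fill the content before sealing. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0) {
			if (n == 0)
				errno = EIO;
			goto fail;
		}
		p += n;
		len -= (size_t)n;
	}

	/* 3. Seal the exec bits; with this series the write seals follow. */
	if (fcntl(fd, F_ADD_SEALS, F_SEAL_EXEC) < 0)
		goto fail;

	return fd;

fail:
	err = errno;
	close(fd);
	return -err;
}
```

On a kernel carrying this series, fcntl(fd, F_GET_SEALS) on the returned fd would be expected to report the full F_WX_SEALS set, as test_exec_seal() checks. On older kernels the helper simply returns a negative errno.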