From nobody Fri Sep 19 05:36:00 2025
From: Alexander Larsson
To: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, gscrivan@redhat.com, alexl@redhat.com
Subject: [PATCH 1/6] fsverity: Export fsverity_get_digest
Date: Mon, 28 Nov 2022 12:13:32 +0100
X-Mailer: git-send-email 2.38.1
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Composefs needs to call this when built in module form, so we need to export the symbol.
This uses EXPORT_SYMBOL_GPL like the other fsverity functions do.

Signed-off-by: Alexander Larsson
---
 fs/verity/measure.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/verity/measure.c b/fs/verity/measure.c
index e99c00350c28..2f0f7e369bf2 100644
--- a/fs/verity/measure.c
+++ b/fs/verity/measure.c
@@ -100,3 +100,4 @@ int fsverity_get_digest(struct inode *inode,
=20 return 0; } +EXPORT_SYMBOL_GPL(fsverity_get_digest);
--=20
2.38.1

From nobody Fri Sep 19 05:36:00 2025
From: Alexander Larsson
To: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, gscrivan@redhat.com, alexl@redhat.com
Subject: [PATCH 2/6] composefs: Add on-disk layout
Date: Mon, 28 Nov 2022 12:16:23 +0100
X-Mailer: git-send-email 2.38.1
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This commit adds the on-disk layout header file of composefs.

Signed-off-by: Alexander Larsson
Signed-off-by: Giuseppe Scrivano
---
 fs/composefs/cfs.h | 242 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 242 insertions(+)
 create mode 100644 fs/composefs/cfs.h

diff --git a/fs/composefs/cfs.h b/fs/composefs/cfs.h
new file mode 100644
index 000000000000..8f001fd28d6b
--- /dev/null
+++ b/fs/composefs/cfs.h
@@ -0,0 +1,242 @@
+/* SPDX-License-Identifier: GPL-2.0 */ +/* + * composefs + * + * Copyright (C) 2021 Giuseppe Scrivano + * Copyright (C) 2022 Alexander Larsson + * + * This file is released under the GPL.
+ */ + +#ifndef _CFS_H +#define _CFS_H + +#include +#include +#include +#include +#include + +#define CFS_VERSION 1 + +#define CFS_MAGIC 0xc078629aU + +#define CFS_MAX_DIR_CHUNK_SIZE 4096 +#define CFS_MAX_XATTRS_SIZE 4096 + +static inline u16 cfs_u16_to_file(u16 val) +{ + return cpu_to_le16(val); +} + +static inline u32 cfs_u32_to_file(u32 val) +{ + return cpu_to_le32(val); +} + +static inline u64 cfs_u64_to_file(u64 val) +{ + return cpu_to_le64(val); +} + +static inline u16 cfs_u16_from_file(u16 val) +{ + return le16_to_cpu(val); +} + +static inline u32 cfs_u32_from_file(u32 val) +{ + return le32_to_cpu(val); +} + +static inline u64 cfs_u64_from_file(u64 val) +{ + return le64_to_cpu(val); +} + +static inline int cfs_xdigit_value(char c) +{ + if (c >=3D '0' && c <=3D '9') + return c - '0'; + if (c >=3D 'A' && c <=3D 'F') + return c - 'A' + 10; + if (c >=3D 'a' && c <=3D 'f') + return c - 'a' + 10; + return -1; +} + +static inline int cfs_digest_from_payload(const char *payload, + size_t payload_len, + u8 digest_out[SHA256_DIGEST_SIZE]) +{ + const char *p, *end; + u8 last_digit =3D 0; + int digit =3D 0; + size_t n_nibbles =3D 0; + + end =3D payload + payload_len; + for (p =3D payload; p !=3D end; p++) { + /* Skip subdir structure */ + if (*p =3D=3D '/') + continue; + + /* Break at (and ignore) extension */ + if (*p =3D=3D '.') + break; + + if (n_nibbles =3D=3D SHA256_DIGEST_SIZE * 2) + return -1; /* Too long */ + + digit =3D cfs_xdigit_value(*p); + if (digit =3D=3D -1) + return -1; /* Not hex digit */ + + n_nibbles++; + if ((n_nibbles % 2) =3D=3D 0) { + digest_out[n_nibbles / 2 - 1] =3D + (last_digit << 4) | digit; + } + last_digit =3D digit; + } + + if (n_nibbles !=3D SHA256_DIGEST_SIZE * 2) + return -1; /* Too short */ + + return 0; +} + +struct cfs_vdata_s { + u64 off; + u32 len; +} __packed; + +struct cfs_header_s { + u8 version; + u8 unused1; + u16 unused2; + + u32 magic; + u64 data_offset; + u64 root_inode; + + u64 unused3[2]; +} __packed; + +enum 
cfs_inode_flags { + CFS_INODE_FLAGS_NONE =3D 0, + CFS_INODE_FLAGS_PAYLOAD =3D 1 << 0, + CFS_INODE_FLAGS_MODE =3D 1 << 1, + CFS_INODE_FLAGS_NLINK =3D 1 << 2, + CFS_INODE_FLAGS_UIDGID =3D 1 << 3, + CFS_INODE_FLAGS_RDEV =3D 1 << 4, + CFS_INODE_FLAGS_TIMES =3D 1 << 5, + CFS_INODE_FLAGS_TIMES_NSEC =3D 1 << 6, + CFS_INODE_FLAGS_LOW_SIZE =3D 1 << 7, /* Low 32bit of st_size */ + CFS_INODE_FLAGS_HIGH_SIZE =3D 1 << 8, /* High 32bit of st_size */ + CFS_INODE_FLAGS_XATTRS =3D 1 << 9, + CFS_INODE_FLAGS_DIGEST =3D 1 + << 10, /* fs-verity sha256 digest of content */ + CFS_INODE_FLAGS_DIGEST_FROM_PAYLOAD =3D + 1 << 11, /* Compute digest from payload */ +}; + +#define CFS_INODE_FLAG_CHECK(_flag, _name) \ + (((_flag) & (CFS_INODE_FLAGS_##_name)) !=3D 0) +#define CFS_INODE_FLAG_CHECK_SIZE(_flag, _name, _size) \ + (CFS_INODE_FLAG_CHECK(_flag, _name) ? (_size) : 0) + +#define CFS_INODE_DEFAULT_MODE 0100644 +#define CFS_INODE_DEFAULT_NLINK 1 +#define CFS_INODE_DEFAULT_NLINK_DIR 2 +#define CFS_INODE_DEFAULT_UIDGID 0 +#define CFS_INODE_DEFAULT_RDEV 0 +#define CFS_INODE_DEFAULT_TIMES 0 + +struct cfs_inode_s { + u32 flags; + + /* Optional data: (selected by flags) */ + + /* This is the size of the type specific data that comes directly after + * the inode in the file. Of this type: + * + * directory: cfs_dir_s + * regular file: the backing filename + * symlink: the target link + * + * Canonically payload_length is 0 for empty dir/file/symlink. + */ + u32 payload_length; + + u32 st_mode; /* File type and mode. */ + u32 st_nlink; /* Number of hard links, only for regular files. */ + u32 st_uid; /* User ID of owner. */ + u32 st_gid; /* Group ID of owner. */ + u32 st_rdev; /* Device ID (if special file). */ + u64 st_size; /* Size of file, only used for regular files */ + + struct cfs_vdata_s xattrs; /* ref to variable data */ + + u8 digest[SHA256_DIGEST_SIZE]; /* fs-verity digest */ + + struct timespec64 st_mtim; /* Time of last modification.
*/ + struct timespec64 st_ctim; /* Time of last status change. */ +}; + +static inline u32 cfs_inode_encoded_size(u32 flags) +{ + return sizeof(u32) /* flags */ + + CFS_INODE_FLAG_CHECK_SIZE(flags, PAYLOAD, sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, MODE, sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, NLINK, sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, UIDGID, + sizeof(u32) + sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, RDEV, sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, TIMES, sizeof(u64) * 2) + + CFS_INODE_FLAG_CHECK_SIZE(flags, TIMES_NSEC, sizeof(u32) * 2) + + CFS_INODE_FLAG_CHECK_SIZE(flags, LOW_SIZE, sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, HIGH_SIZE, sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, XATTRS, + sizeof(u64) + sizeof(u32)) + + CFS_INODE_FLAG_CHECK_SIZE(flags, DIGEST, SHA256_DIGEST_SIZE); +} + +struct cfs_dentry_s { + /* Index of struct cfs_inode_s */ + u64 inode_index; + u8 d_type; + u8 name_len; + u16 name_offset; +} __packed; + +struct cfs_dir_chunk_s { + u16 n_dentries; + u16 chunk_size; + u64 chunk_offset; +} __packed; + +struct cfs_dir_s { + u32 n_chunks; + struct cfs_dir_chunk_s chunks[]; +} __packed; + +#define cfs_dir_size(_n_chunks) \ + (sizeof(struct cfs_dir_s) + \ + (_n_chunks) * sizeof(struct cfs_dir_chunk_s)) + +/* xattr representation.
*/ +struct cfs_xattr_element_s { + u16 key_length; + u16 value_length; +} __packed; + +struct cfs_xattr_header_s { + u16 n_attr; + struct cfs_xattr_element_s attr[0]; +} __packed; + +#define cfs_xattr_header_size(_n_element) \ + (sizeof(struct cfs_xattr_header_s) + \ + (_n_element) * sizeof(struct cfs_xattr_element_s)) + +#endif
--=20
2.38.1

From nobody Fri Sep 19 05:36:00 2025
From: Alexander Larsson
To: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, gscrivan@redhat.com, alexl@redhat.com
Subject: [PATCH 3/6] composefs: Add descriptor parsing code
Date: Mon, 28 Nov 2022 12:16:59 +0100
Message-Id: <1c4c49fac5bb6406a8cb55ca71f8060703aa63f6.1669631086.git.alexl@redhat.com>
X-Mailer: git-send-email 2.38.1
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This adds the code to load and decode the filesystem descriptor file format.

Signed-off-by: Alexander Larsson
Signed-off-by: Giuseppe Scrivano
---
 fs/composefs/cfs-internals.h | 65 +++
 fs/composefs/cfs-reader.c | 958 +++++++++++++++++++++++++++++++++++
 2 files changed, 1023 insertions(+)
 create mode 100644 fs/composefs/cfs-internals.h
 create mode 100644 fs/composefs/cfs-reader.c

diff --git a/fs/composefs/cfs-internals.h b/fs/composefs/cfs-internals.h
new file mode 100644
index 000000000000..f4cb50eec9b8
--- /dev/null
+++ b/fs/composefs/cfs-internals.h
@@ -0,0 +1,65 @@
+/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _CFS_INTERNALS_H +#define _CFS_INTERNALS_H + +#include "cfs.h" + +#define EFSCORRUPTED EUCLEAN /* Filesystem is corrupted */ + +#define CFS_N_PRELOAD_DIR_CHUNKS 4 + +struct cfs_inode_data_s { + u32 payload_length; + char *path_payload; /* Real pathname for files, target for symlinks */ + u32 n_dir_chunks; + struct cfs_dir_chunk_s preloaded_dir_chunks[CFS_N_PRELOAD_DIR_CHUNKS]; + + u64 xattrs_offset; + u32 xattrs_len; + + bool has_digest; + u8 digest[SHA256_DIGEST_SIZE]; /* fs-verity digest */ +}; + +struct cfs_context_s { + struct cfs_header_s header; + struct file *descriptor; + + u64
descriptor_len; +}; + +int cfs_init_ctx(const char *descriptor_path, const u8 *required_digest, + struct cfs_context_s *ctx); + +void cfs_ctx_put(struct cfs_context_s *ctx); + +void cfs_inode_data_put(struct cfs_inode_data_s *inode_data); + +struct cfs_inode_s *cfs_get_root_ino(struct cfs_context_s *ctx, + struct cfs_inode_s *ino_buf, u64 *index); + +struct cfs_inode_s *cfs_get_ino_index(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_s *buffer); + +int cfs_init_inode_data(struct cfs_context_s *ctx, struct cfs_inode_s *ino, + u64 index, struct cfs_inode_data_s *data); + +ssize_t cfs_list_xattrs(struct cfs_context_s *ctx, + struct cfs_inode_data_s *inode_data, char *names, + size_t size); +int cfs_get_xattr(struct cfs_context_s *ctx, + struct cfs_inode_data_s *inode_data, const char *name, + void *value, size_t size); + +typedef bool (*cfs_dir_iter_cb)(void *private, const char *name, int namelen, + u64 ino, unsigned int dtype); + +int cfs_dir_iterate(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_data_s *inode_data, loff_t first, + cfs_dir_iter_cb cb, void *private); + +int cfs_dir_lookup(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_data_s *inode_data, const char *name, + size_t name_len, u64 *index_out); + +#endif diff --git a/fs/composefs/cfs-reader.c b/fs/composefs/cfs-reader.c new file mode 100644 index 000000000000..ad77ef0bd4d4 --- /dev/null +++ b/fs/composefs/cfs-reader.c @@ -0,0 +1,958 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * composefs + * + * Copyright (C) 2021 Giuseppe Scrivano + * Copyright (C) 2022 Alexander Larsson + * + * This file is released under the GPL.
+ */ + +#include "cfs-internals.h" + +#include +#include +#include +#include + +struct cfs_buf { + struct page *page; + void *base; +}; + +#define CFS_VDATA_BUF_INIT \ + { \ + NULL, NULL \ + } + +static void cfs_buf_put(struct cfs_buf *buf) +{ + if (buf->page) { + if (buf->base) + kunmap(buf->page); + put_page(buf->page); + buf->base =3D NULL; + buf->page =3D NULL; + } +} + +static void *cfs_get_buf(struct cfs_context_s *ctx, u64 offset, u32 size, + struct cfs_buf *buf) +{ + u64 index =3D offset >> PAGE_SHIFT; + u32 page_offset =3D offset & (PAGE_SIZE - 1); + struct page *page =3D buf->page; + struct inode *inode =3D ctx->descriptor->f_inode; + struct address_space *const mapping =3D inode->i_mapping; + + if (offset > ctx->descriptor_len) + return ERR_PTR(-EFSCORRUPTED); + + if ((offset + size < offset) || (offset + size > ctx->descriptor_len)) + return ERR_PTR(-EFSCORRUPTED); + + if (size > PAGE_SIZE) + return ERR_PTR(-EINVAL); + + if (PAGE_SIZE - page_offset < size) + return ERR_PTR(-EINVAL); + + if (!page || page->index !=3D index) { + cfs_buf_put(buf); + + page =3D read_cache_page(mapping, index, NULL, NULL); + if (IS_ERR(page)) + return page; + + buf->page =3D page; + buf->base =3D kmap(page); + } + + return buf->base + page_offset; +} + +static void *cfs_read_data(struct cfs_context_s *ctx, u64 offset, u64 size, + u8 *dest) +{ + size_t copied; + loff_t pos =3D offset; + + if (offset > ctx->descriptor_len) + return ERR_PTR(-EFSCORRUPTED); + + if ((offset + size < offset) || (offset + size > ctx->descriptor_len)) + return ERR_PTR(-EFSCORRUPTED); + + copied =3D 0; + while (copied < size) { + ssize_t bytes; + + bytes =3D kernel_read(ctx->descriptor, dest + copied, + size - copied, &pos); + if (bytes < 0) + return ERR_PTR(bytes); + if (bytes =3D=3D 0) + return ERR_PTR(-EINVAL); + + copied +=3D bytes; + } + + if (copied !=3D size) + return ERR_PTR(-EFSCORRUPTED); + return dest; +} + +int cfs_init_ctx(const char *descriptor_path, const u8 *required_digest, +
struct cfs_context_s *ctx_out) +{ + struct cfs_header_s *header; + struct file *descriptor; + loff_t i_size; + u8 verity_digest[FS_VERITY_MAX_DIGEST_SIZE]; + enum hash_algo verity_algo; + struct cfs_context_s ctx; + int res; + + descriptor =3D filp_open(descriptor_path, O_RDONLY, 0); + if (IS_ERR(descriptor)) + return PTR_ERR(descriptor); + + if (required_digest) { + res =3D fsverity_get_digest(d_inode(descriptor->f_path.dentry), + verity_digest, &verity_algo); + if (res < 0) { + pr_err("ERROR: composefs descriptor has no fs-verity digest\n"); + goto fail; + } + if (verity_algo !=3D HASH_ALGO_SHA256 || + memcmp(required_digest, verity_digest, + SHA256_DIGEST_SIZE) !=3D 0) { + pr_err("ERROR: composefs descriptor has wrong fs-verity digest\n"); + res =3D -EINVAL; + goto fail; + } + } + + i_size =3D i_size_read(file_inode(descriptor)); + if (i_size <=3D + (sizeof(struct cfs_header_s) + sizeof(struct cfs_inode_s))) { + res =3D -EINVAL; + goto fail; + } + + /* Need this temporary ctx for cfs_read_data() */ + ctx.descriptor =3D descriptor; + ctx.descriptor_len =3D i_size; + + header =3D cfs_read_data(&ctx, 0, sizeof(struct cfs_header_s), + (u8 *)&ctx.header); + if (IS_ERR(header)) { + res =3D PTR_ERR(header); + goto fail; + } + header->magic =3D cfs_u32_from_file(header->magic); + header->data_offset =3D cfs_u64_from_file(header->data_offset); + header->root_inode =3D cfs_u64_from_file(header->root_inode); + + if (header->magic !=3D CFS_MAGIC || + header->data_offset > ctx.descriptor_len || + sizeof(struct cfs_header_s) + header->root_inode > + ctx.descriptor_len) { + res =3D -EINVAL; + goto fail; + } + + *ctx_out =3D ctx; + return 0; + +fail: + fput(descriptor); + return res; +} + +void cfs_ctx_put(struct cfs_context_s *ctx) +{ + if (ctx->descriptor) { + fput(ctx->descriptor); + ctx->descriptor =3D NULL; + } +} + +static void *cfs_get_inode_data(struct cfs_context_s *ctx, u64 offset, u64 size, + u8 *dest) +{ + return cfs_read_data(ctx, offset + sizeof(struct
cfs_header_s), size, + dest); +} + +static void *cfs_get_inode_data_max(struct cfs_context_s *ctx, u64 offset, + u64 max_size, u64 *read_size, u8 *dest) +{ + u64 remaining =3D ctx->descriptor_len - sizeof(struct cfs_header_s); + u64 size; + + if (offset > remaining) + return ERR_PTR(-EINVAL); + remaining -=3D offset; + + /* Read at most remaining bytes, and no more than max_size */ + size =3D min(remaining, max_size); + *read_size =3D size; + + return cfs_get_inode_data(ctx, offset, size, dest); +} + +static void *cfs_get_inode_payload_w_len(struct cfs_context_s *ctx, + u32 payload_length, u64 index, + u8 *dest, u64 offset, size_t len) +{ + /* Payload is stored before the inode, check it fits */ + if (payload_length > index) + return ERR_PTR(-EINVAL); + + if (offset > payload_length) + return ERR_PTR(-EINVAL); + + if (offset + len > payload_length) + return ERR_PTR(-EINVAL); + + return cfs_get_inode_data(ctx, index - payload_length + offset, len, + dest); +} + +static void *cfs_get_inode_payload(struct cfs_context_s *ctx, + struct cfs_inode_s *ino, u64 index, u8 *dest) +{ + return cfs_get_inode_payload_w_len(ctx, ino->payload_length, index, + dest, 0, ino->payload_length); +} + +static void *cfs_get_vdata_buf(struct cfs_context_s *ctx, u64 offset, u32 len, + struct cfs_buf *buf) +{ + if (offset > ctx->descriptor_len - ctx->header.data_offset) + return ERR_PTR(-EINVAL); + + if (len > ctx->descriptor_len - ctx->header.data_offset - offset) + return ERR_PTR(-EINVAL); + + return cfs_get_buf(ctx, ctx->header.data_offset + offset, len, buf); +} + +static u32 cfs_read_u32(u8 **data) +{ + u32 v =3D cfs_u32_from_file(__get_unaligned_cpu32(*data)); + *data +=3D sizeof(u32); + return v; +} + +static u64 cfs_read_u64(u8 **data) +{ + u64 v =3D cfs_u64_from_file(__get_unaligned_cpu64(*data)); + *data +=3D sizeof(u64); + return v; +} + +struct cfs_inode_s *cfs_get_ino_index(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_s *ino) +{ + u64 offset =3D index; + /* Buffer
that fits the maximal encoded size: */ + u8 buffer[sizeof(struct cfs_inode_s)]; + u64 read_size; + u64 inode_size; + u8 *data; + + data =3D cfs_get_inode_data_max(ctx, offset, sizeof(buffer), &read_size, + buffer); + if (IS_ERR(data)) + return ERR_CAST(data); + + /* Need to fit at least flags to decode */ + if (read_size < sizeof(u32)) + return ERR_PTR(-EFSCORRUPTED); + + memset(ino, 0, sizeof(struct cfs_inode_s)); + ino->flags =3D cfs_read_u32(&data); + + inode_size =3D cfs_inode_encoded_size(ino->flags); + /* Shouldn't happen, but lets check */ + if (inode_size > sizeof(buffer)) + return ERR_PTR(-EFSCORRUPTED); + + if (CFS_INODE_FLAG_CHECK(ino->flags, PAYLOAD)) + ino->payload_length =3D cfs_read_u32(&data); + else + ino->payload_length =3D 0; + + if (CFS_INODE_FLAG_CHECK(ino->flags, MODE)) + ino->st_mode =3D cfs_read_u32(&data); + else + ino->st_mode =3D CFS_INODE_DEFAULT_MODE; + + if (CFS_INODE_FLAG_CHECK(ino->flags, NLINK)) { + ino->st_nlink =3D cfs_read_u32(&data); + } else { + if ((ino->st_mode & S_IFMT) =3D=3D S_IFDIR) + ino->st_nlink =3D CFS_INODE_DEFAULT_NLINK_DIR; + else + ino->st_nlink =3D CFS_INODE_DEFAULT_NLINK; + } + + if (CFS_INODE_FLAG_CHECK(ino->flags, UIDGID)) { + ino->st_uid =3D cfs_read_u32(&data); + ino->st_gid =3D cfs_read_u32(&data); + } else { + ino->st_uid =3D CFS_INODE_DEFAULT_UIDGID; + ino->st_gid =3D CFS_INODE_DEFAULT_UIDGID; + } + + if (CFS_INODE_FLAG_CHECK(ino->flags, RDEV)) + ino->st_rdev =3D cfs_read_u32(&data); + else + ino->st_rdev =3D CFS_INODE_DEFAULT_RDEV; + + if (CFS_INODE_FLAG_CHECK(ino->flags, TIMES)) { + ino->st_mtim.tv_sec =3D cfs_read_u64(&data); + ino->st_ctim.tv_sec =3D cfs_read_u64(&data); + } else { + ino->st_mtim.tv_sec =3D CFS_INODE_DEFAULT_TIMES; + ino->st_ctim.tv_sec =3D CFS_INODE_DEFAULT_TIMES; + } + + if (CFS_INODE_FLAG_CHECK(ino->flags, TIMES_NSEC)) { + ino->st_mtim.tv_nsec =3D cfs_read_u32(&data); + ino->st_ctim.tv_nsec =3D cfs_read_u32(&data); + } else { + ino->st_mtim.tv_nsec =3D 0; + ino->st_ctim.tv_nsec =3D 
0; + } + + if (CFS_INODE_FLAG_CHECK(ino->flags, LOW_SIZE)) + ino->st_size =3D cfs_read_u32(&data); + else + ino->st_size =3D 0; + + if (CFS_INODE_FLAG_CHECK(ino->flags, HIGH_SIZE)) + ino->st_size +=3D (u64)cfs_read_u32(&data) << 32; + + if (CFS_INODE_FLAG_CHECK(ino->flags, XATTRS)) { + ino->xattrs.off =3D cfs_read_u64(&data); + ino->xattrs.len =3D cfs_read_u32(&data); + } else { + ino->xattrs.off =3D 0; + ino->xattrs.len =3D 0; + } + + if (CFS_INODE_FLAG_CHECK(ino->flags, DIGEST)) { + memcpy(ino->digest, data, SHA256_DIGEST_SIZE); + data +=3D 32; + } + + return ino; +} + +struct cfs_inode_s *cfs_get_root_ino(struct cfs_context_s *ctx, + struct cfs_inode_s *ino_buf, u64 *index) +{ + u64 root_ino =3D ctx->header.root_inode; + + *index =3D root_ino; + return cfs_get_ino_index(ctx, root_ino, ino_buf); +} + +static int cfs_get_digest(struct cfs_context_s *ctx, struct cfs_inode_s *ino, + const char *payload, + u8 digest_out[SHA256_DIGEST_SIZE]) +{ + int r; + + if (CFS_INODE_FLAG_CHECK(ino->flags, DIGEST)) { + memcpy(digest_out, ino->digest, SHA256_DIGEST_SIZE); + return 1; + } + + if (payload && CFS_INODE_FLAG_CHECK(ino->flags, DIGEST_FROM_PAYLOAD)) { + r =3D cfs_digest_from_payload(payload, ino->payload_length, + digest_out); + if (r < 0) + return r; + return 1; + } + + return 0; +} + +static bool cfs_validate_filename(const char *name, size_t name_len) +{ + if (name_len =3D=3D 0) + return false; + + if (name_len =3D=3D 1 && name[0] =3D=3D '.') + return false; + + if (name_len =3D=3D 2 && name[0] =3D=3D '.'
&& name[1] =3D=3D '.') + return false; + + if (memchr(name, '/', name_len)) + return false; + + return true; +} + +static struct cfs_dir_s *cfs_dir_read_chunk_header(struct cfs_context_s *ctx, + size_t payload_length, + u64 index, u8 *chunk_buf, + size_t chunk_buf_size, + size_t max_n_chunks) +{ + size_t n_chunks, i; + struct cfs_dir_s *dir; + + /* Payload and buffer should be large enough to fit the n_chunks */ + if (payload_length < sizeof(struct cfs_dir_s) || + chunk_buf_size < sizeof(struct cfs_dir_s)) + return ERR_PTR(-EFSCORRUPTED); + + /* Make sure we fit max_n_chunks in buffer before reading it */ + if (chunk_buf_size < cfs_dir_size(max_n_chunks)) + return ERR_PTR(-EINVAL); + + dir =3D cfs_get_inode_payload_w_len(ctx, payload_length, index, chunk_buf, + 0, + min(chunk_buf_size, payload_length)); + if (IS_ERR(dir)) + return ERR_CAST(dir); + + n_chunks =3D cfs_u32_from_file(dir->n_chunks); + dir->n_chunks =3D n_chunks; + + /* Don't support n_chunks =3D=3D 0, the canonical version of that is payload_length =3D=3D 0 */ + if (n_chunks =3D=3D 0) + return ERR_PTR(-EFSCORRUPTED); + + if (payload_length !=3D cfs_dir_size(n_chunks)) + return ERR_PTR(-EFSCORRUPTED); + + max_n_chunks =3D min(n_chunks, max_n_chunks); + + /* Verify data (up to max_n_chunks) */ + for (i =3D 0; i < max_n_chunks; i++) { + struct cfs_dir_chunk_s *chunk =3D &dir->chunks[i]; + + chunk->n_dentries =3D cfs_u16_from_file(chunk->n_dentries); + chunk->chunk_size =3D cfs_u16_from_file(chunk->chunk_size); + chunk->chunk_offset =3D cfs_u64_from_file(chunk->chunk_offset); + + if (chunk->chunk_size < + sizeof(struct cfs_dentry_s) * chunk->n_dentries) + return ERR_PTR(-EFSCORRUPTED); + + if (chunk->chunk_size > CFS_MAX_DIR_CHUNK_SIZE) + return ERR_PTR(-EFSCORRUPTED); + + if (chunk->n_dentries =3D=3D 0) + return ERR_PTR(-EFSCORRUPTED); + + if (chunk->chunk_size =3D=3D 0) + return ERR_PTR(-EFSCORRUPTED); + + if (chunk->chunk_offset > + ctx->descriptor_len - ctx->header.data_offset) + return
ERR_PTR(-EFSCORRUPTED); + } + + return dir; +} + +static char *cfs_dup_payload_path(struct cfs_context_s *ctx, + struct cfs_inode_s *ino, u64 index) +{ + const char *v; + u8 *path; + + if ((ino->st_mode & S_IFMT) !=3D S_IFREG && + (ino->st_mode & S_IFMT) !=3D S_IFLNK) { + return ERR_PTR(-EINVAL); + } + + if (ino->payload_length =3D=3D 0 || ino->payload_length > PATH_MAX) + return ERR_PTR(-EFSCORRUPTED); + + path =3D kmalloc(ino->payload_length + 1, GFP_KERNEL); + if (!path) + return ERR_PTR(-ENOMEM); + + v =3D cfs_get_inode_payload(ctx, ino, index, path); + if (IS_ERR(v)) { + kfree(path); + return ERR_CAST(v); + } + + /* zero terminate */ + path[ino->payload_length] =3D 0; + + return (char *)path; +} + +int cfs_init_inode_data(struct cfs_context_s *ctx, struct cfs_inode_s *ino, + u64 index, struct cfs_inode_data_s *inode_data) +{ + u8 buf[cfs_dir_size(CFS_N_PRELOAD_DIR_CHUNKS)]; + struct cfs_dir_s *dir; + int ret =3D 0; + size_t i; + char *path_payload =3D NULL; + + inode_data->payload_length =3D ino->payload_length; + + if ((ino->st_mode & S_IFMT) !=3D S_IFDIR || ino->payload_length =3D=3D 0)= { + inode_data->n_dir_chunks =3D 0; + } else { + u32 n_chunks; + + dir =3D cfs_dir_read_chunk_header(ctx, ino->payload_length, index, + buf, sizeof(buf), + CFS_N_PRELOAD_DIR_CHUNKS); + if (IS_ERR(dir)) + return PTR_ERR(dir); + + n_chunks =3D dir->n_chunks; + inode_data->n_dir_chunks =3D n_chunks; + + for (i =3D 0; i < n_chunks && i < CFS_N_PRELOAD_DIR_CHUNKS; i++) + inode_data->preloaded_dir_chunks[i] =3D dir->chunks[i]; + } + + if ((ino->st_mode & S_IFMT) =3D=3D S_IFLNK || + ((ino->st_mode & S_IFMT) =3D=3D S_IFREG && ino->payload_length > 0)) { + path_payload =3D cfs_dup_payload_path(ctx, ino, index); + if (IS_ERR(path_payload)) { + ret =3D PTR_ERR(path_payload); + goto fail; + } + } + inode_data->path_payload =3D path_payload; + + ret =3D cfs_get_digest(ctx, ino, path_payload, inode_data->digest); + if (ret < 0) + goto fail; + + inode_data->has_digest =3D ret !=3D 0; + + 
inode_data->xattrs_offset =3D ino->xattrs.off; + inode_data->xattrs_len =3D ino->xattrs.len; + + if (inode_data->xattrs_len !=3D 0) { + /* Validate xattr size */ + if (inode_data->xattrs_len < + sizeof(struct cfs_xattr_header_s) || + inode_data->xattrs_len > CFS_MAX_XATTRS_SIZE) { + ret =3D -EFSCORRUPTED; + goto fail; + } + } + + return 0; + +fail: + cfs_inode_data_put(inode_data); + return ret; +} + +void cfs_inode_data_put(struct cfs_inode_data_s *inode_data) +{ + inode_data->n_dir_chunks =3D 0; + kfree(inode_data->path_payload); + inode_data->path_payload =3D NULL; +} + +ssize_t cfs_list_xattrs(struct cfs_context_s *ctx, + struct cfs_inode_data_s *inode_data, char *names, + size_t size) +{ + u8 *data, *data_end; + size_t n_xattrs =3D 0, i; + ssize_t copied =3D 0; + const struct cfs_xattr_header_s *xattrs; + struct cfs_buf vdata_buf =3D CFS_VDATA_BUF_INIT; + + if (inode_data->xattrs_len =3D=3D 0) + return 0; + + /* xattrs_len basic size req was verified in cfs_init_inode_data */ + + xattrs =3D cfs_get_vdata_buf(ctx, inode_data->xattrs_offset, + inode_data->xattrs_len, &vdata_buf); + if (IS_ERR(xattrs)) + return PTR_ERR(xattrs); + + n_xattrs =3D cfs_u16_from_file(xattrs->n_attr); + + /* Verify that array fits */ + if (inode_data->xattrs_len < cfs_xattr_header_size(n_xattrs)) { + copied =3D -EFSCORRUPTED; + goto exit; + } + + data =3D ((u8 *)xattrs) + cfs_xattr_header_size(n_xattrs); + data_end =3D ((u8 *)xattrs) + inode_data->xattrs_len; + + for (i =3D 0; i < n_xattrs; i++) { + const struct cfs_xattr_element_s *e =3D &xattrs->attr[i]; + u16 this_key_len =3D cfs_u16_from_file(e->key_length); + u16 this_value_len =3D cfs_u16_from_file(e->value_length); + const char *this_key, *this_value; + + if (this_key_len > XATTR_NAME_MAX || + /* key and data needs to fit in data */ + data_end - data < this_key_len + this_value_len) { + copied =3D -EFSCORRUPTED; + goto exit; + } + + this_key =3D data; + this_value =3D data + this_key_len; + data +=3D this_key_len + 
this_value_len; + + if (size) { + if (size - copied < this_key_len + 1) { + copied =3D -E2BIG; + goto exit; + } + + memcpy(names + copied, this_key, this_key_len); + names[copied + this_key_len] =3D '\0'; + } + + copied +=3D this_key_len + 1; + } + +exit: + cfs_buf_put(&vdata_buf); + + return copied; +} + +int cfs_get_xattr(struct cfs_context_s *ctx, + struct cfs_inode_data_s *inode_data, const char *name, + void *value, size_t size) +{ + size_t name_len =3D strlen(name); + size_t n_xattrs =3D 0, i; + struct cfs_xattr_header_s *xattrs; + u8 *data, *data_end; + struct cfs_buf vdata_buf =3D CFS_VDATA_BUF_INIT; + int res; + + if (inode_data->xattrs_len =3D=3D 0) + return -ENODATA; + + /* xattrs_len basic size req was verified in cfs_init_inode_data */ + + xattrs =3D cfs_get_vdata_buf(ctx, inode_data->xattrs_offset, + inode_data->xattrs_len, &vdata_buf); + if (IS_ERR(xattrs)) + return PTR_ERR(xattrs); + + n_xattrs =3D cfs_u16_from_file(xattrs->n_attr); + + /* Verify that array fits */ + if (inode_data->xattrs_len < cfs_xattr_header_size(n_xattrs)) { + res =3D -EFSCORRUPTED; + goto exit; + } + + data =3D ((u8 *)xattrs) + cfs_xattr_header_size(n_xattrs); + data_end =3D ((u8 *)xattrs) + inode_data->xattrs_len; + + for (i =3D 0; i < n_xattrs; i++) { + const struct cfs_xattr_element_s *e =3D &xattrs->attr[i]; + u16 this_key_len =3D cfs_u16_from_file(e->key_length); + u16 this_value_len =3D cfs_u16_from_file(e->value_length); + const char *this_key, *this_value; + + if (this_key_len > XATTR_NAME_MAX || + /* key and data needs to fit in data */ + data_end - data < this_key_len + this_value_len) { + res =3D -EFSCORRUPTED; + goto exit; + } + + this_key =3D data; + this_value =3D data + this_key_len; + data +=3D this_key_len + this_value_len; + + if (this_key_len !=3D name_len || + memcmp(this_key, name, name_len) !=3D 0) + continue; + + if (size > 0) { + if (size < this_value_len) { + res =3D -E2BIG; + goto exit; + } + memcpy(value, this_value, this_value_len); + } + + res =3D 
this_value_len; + goto exit; + } + + res =3D -ENODATA; + +exit: + return res; +} + +static struct cfs_dir_s * +cfs_dir_read_chunk_header_alloc(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_data_s *inode_data) +{ + size_t chunk_buf_size =3D cfs_dir_size(inode_data->n_dir_chunks); + u8 *chunk_buf; + struct cfs_dir_s *dir; + + chunk_buf =3D kmalloc(chunk_buf_size, GFP_KERNEL); + if (!chunk_buf) + return ERR_PTR(-ENOMEM); + + dir =3D cfs_dir_read_chunk_header(ctx, inode_data->payload_length, index, + chunk_buf, chunk_buf_size, + inode_data->n_dir_chunks); + if (IS_ERR(dir)) { + kfree(chunk_buf); + return ERR_CAST(dir); + } + + return dir; +} + +static struct cfs_dir_chunk_s * +cfs_dir_get_chunk_info(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_data_s *inode_data, void **chunks_buf) +{ + struct cfs_dir_s *full_dir; + + if (inode_data->n_dir_chunks <=3D CFS_N_PRELOAD_DIR_CHUNKS) { + *chunks_buf =3D NULL; + return inode_data->preloaded_dir_chunks; + } + + full_dir =3D cfs_dir_read_chunk_header_alloc(ctx, index, inode_data); + if (IS_ERR(full_dir)) + return ERR_CAST(full_dir); + + *chunks_buf =3D full_dir; + return full_dir->chunks; +} + +static inline int memcmp2(const void *a, const size_t a_size, const void *= b, + size_t b_size) +{ + size_t common_size =3D min(a_size, b_size); + int res; + + res =3D memcmp(a, b, common_size); + if (res !=3D 0 || a_size =3D=3D b_size) + return res; + + return a_size < b_size ? 
-1 : 1; +} + +int cfs_dir_iterate(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_data_s *inode_data, loff_t first, + cfs_dir_iter_cb cb, void *private) +{ + size_t i, j, n_chunks; + char *namedata, *namedata_end; + struct cfs_dir_chunk_s *chunks; + struct cfs_dentry_s *dentries; + struct cfs_buf vdata_buf =3D CFS_VDATA_BUF_INIT; + void *chunks_buf; + loff_t pos; + int res; + + n_chunks =3D inode_data->n_dir_chunks; + if (n_chunks =3D=3D 0) + return 0; + + chunks =3D cfs_dir_get_chunk_info(ctx, index, inode_data, &chunks_buf); + if (IS_ERR(chunks)) + return PTR_ERR(chunks); + + pos =3D 0; + for (i =3D 0; i < n_chunks; i++) { + /* Chunks info are verified/converted in cfs_dir_read_chunk_header */ + u64 chunk_offset =3D chunks[i].chunk_offset; + size_t chunk_size =3D chunks[i].chunk_size; + size_t n_dentries =3D chunks[i].n_dentries; + + /* Do we need to look at this chunk */ + if (first >=3D pos + n_dentries) { + pos +=3D n_dentries; + continue; + } + + /* Read chunk dentries from page cache */ + dentries =3D cfs_get_vdata_buf(ctx, chunk_offset, chunk_size, + &vdata_buf); + if (IS_ERR(dentries)) { + res =3D PTR_ERR(dentries); + goto exit; + } + + namedata =3D ((char *)dentries) + + sizeof(struct cfs_dentry_s) * n_dentries; + namedata_end =3D ((char *)dentries) + chunk_size; + + for (j =3D 0; j < n_dentries; j++) { + struct cfs_dentry_s *dentry =3D &dentries[j]; + size_t dentry_name_len =3D dentry->name_len; + char *dentry_name =3D + (char *)namedata + dentry->name_offset; + + /* name needs to fit in namedata */ + if (dentry_name >=3D namedata_end || + namedata_end - dentry_name < dentry_name_len) { + res =3D -EFSCORRUPTED; + goto exit; + } + + if (!cfs_validate_filename(dentry_name, + dentry_name_len)) { + res =3D -EFSCORRUPTED; + goto exit; + } + + if (pos++ < first) + continue; + + if (!cb(private, dentry_name, dentry_name_len, + cfs_u64_from_file(dentry->inode_index), + dentry->d_type)) { + res =3D 0; + goto exit; + } + } + } + + res =3D 0; +exit: + 
kfree(chunks_buf); + cfs_buf_put(&vdata_buf); + return res; +} + +#define BEFORE_CHUNK 1 +#define AFTER_CHUNK 2 +/* <0 =3D> error, 0 =3D=3D hit, 1 =3D=3D name is before chunk, 2 =3D=3D na= me is after chunk */ +static int cfs_dir_lookup_in_chunk(const char *name, size_t name_len, + struct cfs_dentry_s *dentries, + size_t n_dentries, char *namedata, + char *namedata_end, u64 *index_out) +{ + int start_dentry, end_dentry; + int cmp; + + // This should not happen in a valid fs, and if it does we don't know if + // the name is before or after the chunk. + if (n_dentries =3D=3D 0) + return -EFSCORRUPTED; + + start_dentry =3D 0; + end_dentry =3D n_dentries - 1; + while (start_dentry <=3D end_dentry) { + int mid_dentry =3D start_dentry + (end_dentry - start_dentry) / 2; + struct cfs_dentry_s *dentry =3D &dentries[mid_dentry]; + size_t dentry_name_len =3D dentry->name_len; + char *dentry_name =3D (char *)namedata + dentry->name_offset; + + /* name needs to fit in namedata */ + if (dentry_name >=3D namedata_end || + namedata_end - dentry_name < dentry_name_len) { + return -EFSCORRUPTED; + } + + cmp =3D memcmp2(name, name_len, dentry_name, dentry_name_len); + if (cmp =3D=3D 0) { + *index_out =3D cfs_u64_from_file(dentry->inode_index); + return 0; + } + + if (cmp > 0) + start_dentry =3D mid_dentry + 1; + else + end_dentry =3D mid_dentry - 1; + } + + return cmp > 0 ?
AFTER_CHUNK : BEFORE_CHUNK; +} + +int cfs_dir_lookup(struct cfs_context_s *ctx, u64 index, + struct cfs_inode_data_s *inode_data, const char *name, + size_t name_len, u64 *index_out) +{ + int n_chunks, start_chunk, end_chunk; + char *namedata, *namedata_end; + struct cfs_dir_chunk_s *chunks; + struct cfs_dentry_s *dentries; + void *chunks_buf; + struct cfs_buf vdata_buf =3D CFS_VDATA_BUF_INIT; + int res, r; + + n_chunks =3D inode_data->n_dir_chunks; + if (n_chunks =3D=3D 0) + return 0; + + chunks =3D cfs_dir_get_chunk_info(ctx, index, inode_data, &chunks_buf); + if (IS_ERR(chunks)) + return PTR_ERR(chunks); + + start_chunk =3D 0; + end_chunk =3D n_chunks - 1; + + while (start_chunk <=3D end_chunk) { + int mid_chunk =3D start_chunk + (end_chunk - start_chunk) / 2; + + /* Chunk info is verified/converted in cfs_dir_read_chunk_header */ + u64 chunk_offset =3D chunks[mid_chunk].chunk_offset; + size_t chunk_size =3D chunks[mid_chunk].chunk_size; + size_t n_dentries =3D chunks[mid_chunk].n_dentries; + + /* Read chunk dentries from page cache */ + dentries =3D cfs_get_vdata_buf(ctx, chunk_offset, chunk_size, + &vdata_buf); + if (IS_ERR(dentries)) { + res =3D PTR_ERR(dentries); + goto exit; + } + + namedata =3D ((char *)dentries) + + sizeof(struct cfs_dentry_s) * n_dentries; + namedata_end =3D ((char *)dentries) + chunk_size; + + r =3D cfs_dir_lookup_in_chunk(name, name_len, dentries, + n_dentries, namedata, namedata_end, + index_out); + if (r < 0) { + res =3D r; /* error */ + goto exit; + } else if (r =3D=3D 0) { + res =3D 1; /* found it */ + goto exit; + } else if (r =3D=3D AFTER_CHUNK) { + start_chunk =3D mid_chunk + 1; + } else { /* before */ + end_chunk =3D mid_chunk - 1; + } + } + + /* not found */ + res =3D 0; + +exit: + kfree(chunks_buf); + cfs_buf_put(&vdata_buf); + return res; +} --=20 2.38.1 From nobody Fri Sep 19 05:36:00 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 609ECC433FE for ; Mon, 28 Nov 2022 11:19:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231245AbiK1LTQ (ORCPT ); Mon, 28 Nov 2022 06:19:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36438 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230325AbiK1LSa (ORCPT ); Mon, 28 Nov 2022 06:18:30 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E679A1BE89 for ; Mon, 28 Nov 2022 03:17:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669634245; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5mRa4+C6/Po3ncGUY4jy5hlW4ieneowaCP9/UlmJZtQ=; b=O+swSm3Qt+RnXzDz66sbyjTUe7bM2sooijgF9j9Jkh2+MdKktvhMIs+nlISImNqrDMtTcf C+FFNxclfmJcR62Q55HWNj9Npl5c3a/VH9H+uCkf3umB2AzxLwLOYzIn1bF7J0cY5PwJ/F W74H+bYWzJaANy67aEG96+IEzvoXwJU= Received: from mail-lj1-f200.google.com (mail-lj1-f200.google.com [209.85.208.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-135-EndmgiIFMtCwaEC4aMAdaA-1; Mon, 28 Nov 2022 06:17:23 -0500 X-MC-Unique: EndmgiIFMtCwaEC4aMAdaA-1 Received: by mail-lj1-f200.google.com with SMTP id y3-20020a05651c106300b002799d3aae9bso834900ljm.9 for ; Mon, 28 Nov 2022 03:17:23 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5mRa4+C6/Po3ncGUY4jy5hlW4ieneowaCP9/UlmJZtQ=; 
b=RbsjSdT1KgFFs1OnRFPwUt6OSISXlARCx1hGz5EEcsXFhbC/Nxv2c8hzOL6rozwYha 6BUrmI2Z3sDeGKA/EMH19O+MvDsGwDDP2NEOhlOFbkz9b3QI9FloCVgyyEV/AP9H8k3n cOMawzkKHFaj5AcXE2g4tmPs6uNH7QsyqKMjEkclHR9zBbEoeLsazwkW5Whj2ggV3QHi akzGUiTxidjJVDM7jJn39edTMPKX2ppbH5RMpynXe/DWd+Y7jfReu4g15wcHOpWggZ9A 8Id/51KQubI0sQrCbIh2kvLw1Rvmh2H6iimI5s8SfFAS+nekZzqIG9O4btiu8pfV9XBq fSpA== X-Gm-Message-State: ANoB5plgvEpvY3WZVuQIf6NsJw/e916COY8/v/MMhCPKmnLnxodxgi78 gWw0BMBOz0FLKxeQ2ClDnOi35PPbmSKR+hY5iBwC0jAZ6CxHGKHS+b2C+o7QmJ/BrtpAA6EbYl1 dcnXu+NNFaxaDB3p7TZrwOLXP X-Received: by 2002:a19:6411:0:b0:4b4:7093:9ed0 with SMTP id y17-20020a196411000000b004b470939ed0mr12907061lfb.106.1669634241653; Mon, 28 Nov 2022 03:17:21 -0800 (PST) X-Google-Smtp-Source: AA0mqf4KvrUUC2Hr+80m4Yebe5nl63nn6blJt+gINMpXLHG4FBFS8XCb05jDYhEy3lY2vUr4Gq8UPw== X-Received: by 2002:a19:6411:0:b0:4b4:7093:9ed0 with SMTP id y17-20020a196411000000b004b470939ed0mr12907052lfb.106.1669634241313; Mon, 28 Nov 2022 03:17:21 -0800 (PST) Received: from localhost.localdomain (c-e6a5e255.022-110-73746f36.bbcust.telenor.se. [85.226.165.230]) by smtp.googlemail.com with ESMTPSA id m16-20020a056512359000b00497a1f92a72sm1682055lfr.221.2022.11.28.03.17.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 03:17:20 -0800 (PST) From: Alexander Larsson To: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, gscrivan@redhat.com, alexl@redhat.com Subject: [PATCH 4/6] composefs: Add filesystem implementation Date: Mon, 28 Nov 2022 12:17:12 +0100 Message-Id: <1f0bd3e3a0c68ee19dd96ee0d573bb113428f1b6.1669631086.git.alexl@redhat.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This is the basic inode and filesystem implementation. 
Signed-off-by: Alexander Larsson Signed-off-by: Giuseppe Scrivano --- fs/composefs/cfs.c | 941 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 941 insertions(+) create mode 100644 fs/composefs/cfs.c diff --git a/fs/composefs/cfs.c b/fs/composefs/cfs.c new file mode 100644 index 000000000000..4ed355ab079d --- /dev/null +++ b/fs/composefs/cfs.c @@ -0,0 +1,941 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * composefs + * + * Copyright (C) 2000 Linus Torvalds. + * 2000 Transmeta Corp. + * Copyright (C) 2021 Giuseppe Scrivano + * Copyright (C) 2022 Alexander Larsson + * + * This file is released under the GPL. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "cfs-internals.h" + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Giuseppe Scrivano "); + +#define CFS_MAX_STACK 500 + +struct cfs_info { + struct cfs_context_s cfs_ctx; + + char *base_path; + + size_t n_bases; + struct vfsmount **bases; + + u32 verity_check; /* 0 =3D=3D none, 1 =3D=3D if specified in image, 2 =3D= =3D require in image */ + bool has_digest; + u8 digest[SHA256_DIGEST_SIZE]; /* fs-verity digest */ +}; + +struct cfs_inode { + /* must be first for clear in cfs_alloc_inode to work */ + struct inode vfs_inode; + + struct cfs_inode_data_s inode_data; +}; + +static inline struct cfs_inode *CFS_I(struct inode *inode) +{ + return container_of(inode, struct cfs_inode, vfs_inode); +} + +static struct file empty_file; + +static const struct file_operations cfs_file_operations; +static const struct vm_operations_struct generic_file_vm_ops; + +static const struct super_operations cfs_ops; +static const struct file_operations cfs_dir_operations; +static const struct inode_operations cfs_dir_inode_operations; +static const struct inode_operations cfs_file_inode_operations; +static const struct inode_operations cfs_link_inode_operations; + +static const struct xattr_handler *cfs_xattr_handlers[]; +static const struct export_operations 
cfs_export_operations; + +static const struct address_space_operations cfs_aops =3D { + .direct_IO =3D noop_direct_IO, +}; + +static ssize_t cfs_listxattr(struct dentry *dentry, char *names, size_t si= ze); + +/* copied from overlayfs. */ +static unsigned int cfs_split_basedirs(char *str) +{ + unsigned int ctr =3D 1; + char *s, *d; + + for (s =3D d =3D str;; s++, d++) { + if (*s =3D=3D '\\') { + s++; + } else if (*s =3D=3D ':') { + *d =3D '\0'; + ctr++; + continue; + } + *d =3D *s; + if (!*s) + break; + } + return ctr; +} + +static struct inode *cfs_make_inode(struct cfs_context_s *ctx, + struct super_block *sb, ino_t ino_num, + struct cfs_inode_s *ino, + const struct inode *dir) +{ + struct cfs_xattr_header_s *xattrs =3D NULL; + struct cfs_inode *cino; + struct inode *inode =3D NULL; + struct cfs_inode_data_s inode_data =3D { 0 }; + int ret, res; + + res =3D cfs_init_inode_data(ctx, ino, ino_num, &inode_data); + if (res < 0) { + ret =3D res; + goto fail; + } + + inode =3D new_inode(sb); + if (inode) { + inode_init_owner(&init_user_ns, inode, dir, ino->st_mode); + inode->i_mapping->a_ops =3D &cfs_aops; + + cino =3D CFS_I(inode); + cino->inode_data =3D inode_data; + + inode->i_ino =3D ino_num; + set_nlink(inode, ino->st_nlink); + inode->i_rdev =3D ino->st_rdev; + inode->i_uid =3D make_kuid(current_user_ns(), ino->st_uid); + inode->i_gid =3D make_kgid(current_user_ns(), ino->st_gid); + inode->i_mode =3D ino->st_mode; + inode->i_atime =3D ino->st_mtim; + inode->i_mtime =3D ino->st_mtim; + inode->i_ctime =3D ino->st_ctim; + + switch (ino->st_mode & S_IFMT) { + case S_IFREG: + inode->i_op =3D &cfs_file_inode_operations; + inode->i_fop =3D &cfs_file_operations; + inode->i_size =3D ino->st_size; + break; + case S_IFLNK: + inode->i_link =3D cino->inode_data.path_payload; + inode->i_op =3D &cfs_link_inode_operations; + inode->i_fop =3D &cfs_file_operations; + break; + case S_IFDIR: + inode->i_op =3D &cfs_dir_inode_operations; + inode->i_fop =3D &cfs_dir_operations; + 
inode->i_size =3D 4096; + break; + case S_IFCHR: + case S_IFBLK: + if (current_user_ns() !=3D &init_user_ns) { + ret =3D -EPERM; + goto fail; + } + fallthrough; + default: + inode->i_op =3D &cfs_file_inode_operations; + init_special_inode(inode, ino->st_mode, ino->st_rdev); + break; + } + } + return inode; + +fail: + if (inode) + iput(inode); + kfree(xattrs); + cfs_inode_data_put(&inode_data); + return ERR_PTR(ret); +} + +static struct inode *cfs_get_root_inode(struct super_block *sb) +{ + struct cfs_info *fsi =3D sb->s_fs_info; + struct cfs_inode_s ino_buf; + struct cfs_inode_s *ino; + u64 index; + + ino =3D cfs_get_root_ino(&fsi->cfs_ctx, &ino_buf, &index); + if (IS_ERR(ino)) + return ERR_CAST(ino); + + return cfs_make_inode(&fsi->cfs_ctx, sb, index, ino, NULL); +} + +static bool cfs_iterate_cb(void *private, const char *name, int name_len, + u64 ino, unsigned int dtype) +{ + struct dir_context *ctx =3D private; + + if (!dir_emit(ctx, name, name_len, ino, dtype)) + return false; + + ctx->pos++; + return true; +} + +static int cfs_iterate(struct file *file, struct dir_context *ctx) +{ + struct inode *inode =3D file->f_inode; + struct cfs_info *fsi =3D inode->i_sb->s_fs_info; + struct cfs_inode *cino =3D CFS_I(inode); + + if (!dir_emit_dots(file, ctx)) + return 0; + + return cfs_dir_iterate(&fsi->cfs_ctx, inode->i_ino, &cino->inode_data, + ctx->pos - 2, cfs_iterate_cb, ctx); +} + +static struct dentry *cfs_lookup(struct inode *dir, struct dentry *dentry, + unsigned int flags) +{ + struct cfs_info *fsi =3D dir->i_sb->s_fs_info; + struct cfs_inode *cino =3D CFS_I(dir); + struct cfs_inode_s ino_buf; + struct inode *inode; + struct cfs_inode_s *ino_s; + u64 index; + int ret; + + if (dentry->d_name.len > NAME_MAX) + return ERR_PTR(-ENAMETOOLONG); + + ret =3D cfs_dir_lookup(&fsi->cfs_ctx, dir->i_ino, &cino->inode_data, + dentry->d_name.name, dentry->d_name.len, &index); + if (ret < 0) + return ERR_PTR(ret); + if (ret =3D=3D 0) + goto return_negative; + + ino_s =3D
cfs_get_ino_index(&fsi->cfs_ctx, index, &ino_buf); + if (IS_ERR(ino_s)) + return ERR_CAST(ino_s); + + inode =3D cfs_make_inode(&fsi->cfs_ctx, dir->i_sb, index, ino_s, dir); + if (IS_ERR(inode)) + return ERR_CAST(inode); + + return d_splice_alias(inode, dentry); + +return_negative: + d_add(dentry, NULL); + return NULL; +} + +static const struct file_operations cfs_dir_operations =3D { + .llseek =3D generic_file_llseek, + .read =3D generic_read_dir, + .iterate_shared =3D cfs_iterate, +}; + +static const struct inode_operations cfs_dir_inode_operations =3D { + .lookup =3D cfs_lookup, + .listxattr =3D cfs_listxattr, +}; + +static const struct inode_operations cfs_link_inode_operations =3D { + .get_link =3D simple_get_link, + .listxattr =3D cfs_listxattr, +}; + +static void digest_to_string(const u8 *digest, char *buf) +{ + static const char hexchars[] =3D "0123456789abcdef"; + u32 i, j; + + for (i =3D 0, j =3D 0; i < SHA256_DIGEST_SIZE; i++, j +=3D 2) { + u8 byte =3D digest[i]; + + buf[j] =3D hexchars[byte >> 4]; + buf[j + 1] =3D hexchars[byte & 0xF]; + } + buf[j] =3D '\0'; +} + +static int digest_from_string(const char *digest_str, u8 *digest) +{ + size_t i, j; + + for (i =3D 0, j =3D 0; i < SHA256_DIGEST_SIZE; i +=3D 1, j +=3D 2) { + int big, little; + + if (digest_str[j] =3D=3D 0 || digest_str[j + 1] =3D=3D 0) + return -EINVAL; /* Too short string */ + + big =3D cfs_xdigit_value(digest_str[j]); + little =3D cfs_xdigit_value(digest_str[j + 1]); + + if (big =3D=3D -1 || little =3D=3D -1) + return -EINVAL; /* Not hex digit */ + + digest[i] =3D (big << 4) | little; + } + + if (digest_str[j] !=3D 0) + return -EINVAL; /* Too long string */ + + return 0; +} + +/* + * Display the mount options in /proc/mounts. 
+ */ +static int cfs_show_options(struct seq_file *m, struct dentry *root) +{ + struct cfs_info *fsi =3D root->d_sb->s_fs_info; + + if (fsi->base_path) + seq_show_option(m, "basedir", fsi->base_path); + if (fsi->has_digest) { + char buf[SHA256_DIGEST_SIZE * 2 + 1]; + + digest_to_string(fsi->digest, buf); + seq_show_option(m, "digest", buf); + } + if (fsi->verity_check !=3D 0) + seq_printf(m, ",verity_check=3D%u", fsi->verity_check); + + return 0; +} + +static struct kmem_cache *cfs_inode_cachep; + +static struct inode *cfs_alloc_inode(struct super_block *sb) +{ + struct cfs_inode *cino =3D + alloc_inode_sb(sb, cfs_inode_cachep, GFP_KERNEL); + + if (!cino) + return NULL; + + memset((u8 *)cino + sizeof(struct inode), 0, + sizeof(struct cfs_inode) - sizeof(struct inode)); + + return &cino->vfs_inode; +} + +static void cfs_destroy_inode(struct inode *inode) +{ + struct cfs_inode *cino =3D CFS_I(inode); + + cfs_inode_data_put(&cino->inode_data); +} + +static void cfs_free_inode(struct inode *inode) +{ + struct cfs_inode *cino =3D CFS_I(inode); + + kmem_cache_free(cfs_inode_cachep, cino); +} + +static void cfs_put_super(struct super_block *sb) +{ + struct cfs_info *fsi =3D sb->s_fs_info; + + cfs_ctx_put(&fsi->cfs_ctx); + if (fsi->bases) { + kern_unmount_array(fsi->bases, fsi->n_bases); + kfree(fsi->bases); + } + kfree(fsi->base_path); + + kfree(fsi); +} + +static int cfs_statfs(struct dentry *dentry, struct kstatfs *buf) +{ + struct cfs_info *fsi =3D dentry->d_sb->s_fs_info; + int err =3D 0; + + /* We return the free space, etc from the first base dir. 
*/ + if (fsi->n_bases > 0) { + struct path root =3D { .mnt =3D fsi->bases[0], + .dentry =3D fsi->bases[0]->mnt_root }; + err =3D vfs_statfs(&root, buf); + } + + if (!err) { + buf->f_namelen =3D NAME_MAX; + buf->f_type =3D dentry->d_sb->s_magic; + } + + return err; +} + +static const struct super_operations cfs_ops =3D { + .statfs =3D cfs_statfs, + .drop_inode =3D generic_delete_inode, + .show_options =3D cfs_show_options, + .put_super =3D cfs_put_super, + .destroy_inode =3D cfs_destroy_inode, + .alloc_inode =3D cfs_alloc_inode, + .free_inode =3D cfs_free_inode, +}; + +enum cfs_param { + Opt_base_path, + Opt_digest, + Opt_verity_check, +}; + +const struct fs_parameter_spec cfs_parameters[] =3D { + fsparam_string("basedir", Opt_base_path), + fsparam_string("digest", Opt_digest), + fsparam_u32("verity_check", Opt_verity_check), + {} +}; + +static int cfs_parse_param(struct fs_context *fc, struct fs_parameter *par= am) +{ + struct fs_parse_result result; + struct cfs_info *fsi =3D fc->s_fs_info; + int opt, r; + + opt =3D fs_parse(fc, cfs_parameters, param, &result); + if (opt =3D=3D -ENOPARAM) + return vfs_parse_fs_param_source(fc, param); + if (opt < 0) + return opt; + + switch (opt) { + case Opt_base_path: + kfree(fsi->base_path); + /* Take ownership. 
*/ + fsi->base_path =3D param->string; + param->string =3D NULL; + break; + case Opt_digest: + r =3D digest_from_string(param->string, fsi->digest); + if (r < 0) + return r; + fsi->has_digest =3D true; + fsi->verity_check =3D 2; /* Default to full verity check */ + break; + case Opt_verity_check: + if (result.uint_32 > 2) + return invalfc(fc, "Invalid verity_check mode"); + fsi->verity_check =3D result.uint_32; + break; + } + + return 0; +} + +static struct vfsmount *resolve_basedir(const char *name) +{ + struct path path =3D {}; + struct vfsmount *mnt; + int err =3D -EINVAL; + + if (!*name) { + pr_err("empty basedir\n"); + goto out; + } + err =3D kern_path(name, LOOKUP_FOLLOW, &path); + if (err) { + pr_err("failed to resolve '%s': %i\n", name, err); + goto out; + } + + mnt =3D clone_private_mount(&path); + err =3D PTR_ERR(mnt); + if (IS_ERR(mnt)) { + pr_err("failed to clone basedir\n"); + goto out_put; + } + + path_put(&path); + + /* Don't inherit atime flags */ + mnt->mnt_flags &=3D ~(MNT_NOATIME | MNT_NODIRATIME | MNT_RELATIME); + + return mnt; + +out_put: + path_put(&path); +out: + return ERR_PTR(err); +} + +static int cfs_fill_super(struct super_block *sb, struct fs_context *fc) +{ + struct cfs_info *fsi =3D sb->s_fs_info; + struct vfsmount **bases =3D NULL; + size_t numbasedirs =3D 0; + struct inode *inode; + struct vfsmount *mnt; + int ret; + + if (sb->s_root) + return -EINVAL; + + /* Set up the inode allocator early */ + sb->s_op =3D &cfs_ops; + sb->s_flags |=3D SB_RDONLY; + sb->s_magic =3D CFS_MAGIC; + sb->s_xattr =3D cfs_xattr_handlers; + sb->s_export_op =3D &cfs_export_operations; + + if (fsi->base_path) { + char *lower, *splitlower =3D NULL; + size_t i; + + ret =3D -ENOMEM; + splitlower =3D kstrdup(fsi->base_path, GFP_KERNEL); + if (!splitlower) + goto fail; + + ret =3D -EINVAL; + numbasedirs =3D cfs_split_basedirs(splitlower); + if (numbasedirs > CFS_MAX_STACK) { + pr_err("too many lower directories, limit is %d\n", + CFS_MAX_STACK); + 
kfree(splitlower); + goto fail; + } + + ret =3D -ENOMEM; + bases =3D kcalloc(numbasedirs, sizeof(struct vfsmount *), + GFP_KERNEL); + if (!bases) { + kfree(splitlower); + goto fail; + } + + lower =3D splitlower; + for (i =3D 0; i < numbasedirs; i++) { + mnt =3D resolve_basedir(lower); + if (IS_ERR(mnt)) { + ret =3D PTR_ERR(mnt); + kfree(splitlower); + goto fail; + } + bases[i] =3D mnt; + + lower =3D strchr(lower, '\0') + 1; + } + kfree(splitlower); + } + + /* Must be initialized before calling cfs_get_root_inode. */ + ret =3D cfs_init_ctx(fc->source, fsi->has_digest ? fsi->digest : NULL, + &fsi->cfs_ctx); + if (ret < 0) + goto fail; + + inode =3D cfs_get_root_inode(sb); + if (IS_ERR(inode)) { + ret =3D PTR_ERR(inode); + goto fail; + } + sb->s_root =3D d_make_root(inode); + + ret =3D -ENOMEM; + if (!sb->s_root) + goto fail; + + sb->s_maxbytes =3D MAX_LFS_FILESIZE; + sb->s_blocksize =3D PAGE_SIZE; + sb->s_blocksize_bits =3D PAGE_SHIFT; + + sb->s_time_gran =3D 1; + + fsi->bases =3D bases; + fsi->n_bases =3D numbasedirs; + return 0; +fail: + if (bases) { + size_t i; + + for (i =3D 0; i < numbasedirs; i++) { + if (bases[i]) + kern_unmount(bases[i]); + } + kfree(bases); + } + cfs_ctx_put(&fsi->cfs_ctx); + return ret; +} + +static int cfs_get_tree(struct fs_context *fc) +{ + return get_tree_nodev(fc, cfs_fill_super); +} + +static const struct fs_context_operations cfs_context_ops =3D { + .parse_param =3D cfs_parse_param, + .get_tree =3D cfs_get_tree, +}; + +static struct file *open_base_file(struct cfs_info *fsi, struct inode *ino= de, + struct file *file) +{ + struct cfs_inode *cino =3D CFS_I(inode); + struct file *real_file; + char *real_path =3D cino->inode_data.path_payload; + size_t i; + + for (i =3D 0; i < fsi->n_bases; i++) { + real_file =3D file_open_root_mnt(fsi->bases[i], real_path, + file->f_flags, 0); + if (!IS_ERR(real_file) || PTR_ERR(real_file) !=3D -ENOENT) + return real_file; + } + + return ERR_PTR(-ENOENT); +} + +static int cfs_open_file(struct inode *inode,
struct file *file) +{ + struct cfs_inode *cino =3D CFS_I(inode); + struct cfs_info *fsi =3D inode->i_sb->s_fs_info; + struct file *real_file; + struct file *faked_file; + char *real_path =3D cino->inode_data.path_payload; + + if (WARN_ON(!file)) + return -EIO; + + if (file->f_flags & (O_WRONLY | O_RDWR | O_CREAT | O_EXCL | O_TRUNC)) + return -EROFS; + + if (!real_path) { + file->private_data =3D &empty_file; + return 0; + } + + if (fsi->verity_check >=3D 2 && !cino->inode_data.has_digest) { + pr_warn("WARNING: composefs image file '%pd' specified no fs-verity dige= st\n", + file->f_path.dentry); + return -EIO; + } + + real_file =3D open_base_file(fsi, inode, file); + + if (IS_ERR(real_file)) + return PTR_ERR(real_file); + + /* If metadata records a digest for the file, ensure it is there + * and correct before using the contents. + */ + if (cino->inode_data.has_digest && fsi->verity_check >=3D 1) { + u8 verity_digest[FS_VERITY_MAX_DIGEST_SIZE]; + enum hash_algo verity_algo; + int res; + + res =3D fsverity_get_digest(d_inode(real_file->f_path.dentry), + verity_digest, &verity_algo); + if (res < 0) { + pr_warn("WARNING: composefs backing file '%pd' has no fs-verity digest\= n", + real_file->f_path.dentry); + fput(real_file); + return -EIO; + } + if (verity_algo !=3D HASH_ALGO_SHA256 || + memcmp(cino->inode_data.digest, verity_digest, + SHA256_DIGEST_SIZE) !=3D 0) { + pr_warn("WARNING: composefs backing file '%pd' has the wrong fs-verity = digest\n", + real_file->f_path.dentry); + fput(real_file); + return -EIO; + } + } + + faked_file =3D open_with_fake_path(&file->f_path, file->f_flags, + real_file->f_inode, current_cred()); + fput(real_file); + + if (IS_ERR(faked_file)) + return PTR_ERR(faked_file); + + file->private_data =3D faked_file; + return 0; +} + +static unsigned long cfs_mmu_get_unmapped_area(struct file *file, + unsigned long addr, + unsigned long len, + unsigned long pgoff, + unsigned long flags) +{ + struct file *realfile =3D file->private_data; + + if 
(realfile =3D=3D &empty_file) + return 0; + + return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); +} + +static int cfs_release_file(struct inode *inode, struct file *file) +{ + struct file *realfile =3D file->private_data; + + if (WARN_ON(!realfile)) + return -EIO; + + if (realfile =3D=3D &empty_file) + return 0; + + fput(file->private_data); + + return 0; +} + +static int cfs_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct file *realfile =3D file->private_data; + int ret; + + if (realfile =3D=3D &empty_file) + return 0; + + if (!realfile->f_op->mmap) + return -ENODEV; + + if (WARN_ON(file !=3D vma->vm_file)) + return -EIO; + + vma_set_file(vma, realfile); + + ret =3D call_mmap(vma->vm_file, vma); + + return ret; +} + +static ssize_t cfs_read_iter(struct kiocb *iocb, struct iov_iter *iter) +{ + struct file *file =3D iocb->ki_filp; + struct file *realfile =3D file->private_data; + int ret; + + if (realfile =3D=3D &empty_file) + return 0; + + if (!realfile->f_op->read_iter) + return -ENODEV; + + iocb->ki_filp =3D realfile; + ret =3D call_read_iter(realfile, iocb, iter); + iocb->ki_filp =3D file; + + return ret; +} + +static int cfs_fadvise(struct file *file, loff_t offset, loff_t len, int a= dvice) +{ + struct file *realfile =3D file->private_data; + + if (realfile =3D=3D &empty_file) + return 0; + + return vfs_fadvise(realfile, offset, len, advice); +} + +static int cfs_encode_fh(struct inode *inode, u32 *fh, int *max_len, + struct inode *parent) +{ + int len =3D 3; + u64 nodeid; + u32 generation; + + if (*max_len < len) { + *max_len =3D len; + return FILEID_INVALID; + } + + nodeid =3D inode->i_ino; + generation =3D inode->i_generation; + + fh[0] =3D (u32)(nodeid >> 32); + fh[1] =3D (u32)(nodeid & 0xffffffff); + fh[2] =3D generation; + + *max_len =3D len; + + return 0x91; +} + +static struct dentry *cfs_fh_to_dentry(struct super_block *sb, struct fid = *fid, + int fh_len, int fh_type) +{ + struct cfs_info *fsi =3D sb->s_fs_info; + 
struct inode *ino; + u64 inode_index; + u32 generation; + + if (fh_type !=3D 0x91 || fh_len < 3) + return NULL; + + inode_index =3D (u64)(fid->raw[0]) << 32; + inode_index |=3D fid->raw[1]; + generation =3D fid->raw[2]; + + ino =3D ilookup(sb, inode_index); + if (!ino) { + struct cfs_inode_s inode_buf; + struct cfs_inode_s *inode; + + inode =3D cfs_get_ino_index(&fsi->cfs_ctx, inode_index, + &inode_buf); + if (IS_ERR(inode)) + return ERR_CAST(inode); + + ino =3D cfs_make_inode(&fsi->cfs_ctx, sb, inode_index, inode, + NULL); + if (IS_ERR(ino)) + return ERR_CAST(ino); + } + if (ino->i_generation !=3D generation) { + iput(ino); + return ERR_PTR(-ESTALE); + } + return d_obtain_alias(ino); +} + +static struct dentry *cfs_fh_to_parent(struct super_block *sb, struct fid = *fid, + int fh_len, int fh_type) +{ + return ERR_PTR(-EACCES); +} + +static int cfs_get_name(struct dentry *parent, char *name, struct dentry *= child) +{ + WARN_ON_ONCE(1); + return -EIO; +} + +static struct dentry *cfs_get_parent(struct dentry *dentry) +{ + WARN_ON_ONCE(1); + return ERR_PTR(-EIO); +} + +static const struct export_operations cfs_export_operations =3D { + .fh_to_dentry =3D cfs_fh_to_dentry, + .fh_to_parent =3D cfs_fh_to_parent, + .encode_fh =3D cfs_encode_fh, + .get_parent =3D cfs_get_parent, + .get_name =3D cfs_get_name, +}; + +static int cfs_getxattr(const struct xattr_handler *handler, + struct dentry *unused2, struct inode *inode, + const char *name, void *value, size_t size) +{ + struct cfs_info *fsi =3D inode->i_sb->s_fs_info; + struct cfs_inode *cino =3D CFS_I(inode); + + return cfs_get_xattr(&fsi->cfs_ctx, &cino->inode_data, name, value, + size); +} + +static ssize_t cfs_listxattr(struct dentry *dentry, char *names, size_t si= ze) +{ + struct inode *inode =3D d_inode(dentry); + struct cfs_info *fsi =3D inode->i_sb->s_fs_info; + struct cfs_inode *cino =3D CFS_I(inode); + + return cfs_list_xattrs(&fsi->cfs_ctx, &cino->inode_data, names, size); +} + +static const struct 
file_operations cfs_file_operations =3D { + .read_iter =3D cfs_read_iter, + .mmap =3D cfs_mmap, + .fadvise =3D cfs_fadvise, + .fsync =3D noop_fsync, + .splice_read =3D generic_file_splice_read, + .llseek =3D generic_file_llseek, + .get_unmapped_area =3D cfs_mmu_get_unmapped_area, + .release =3D cfs_release_file, + .open =3D cfs_open_file, +}; + +static const struct xattr_handler cfs_xattr_handler =3D { + .prefix =3D "", /* catch all */ + .get =3D cfs_getxattr, +}; + +static const struct xattr_handler *cfs_xattr_handlers[] =3D { + &cfs_xattr_handler, + NULL, +}; + +static const struct inode_operations cfs_file_inode_operations =3D { + .setattr =3D simple_setattr, + .getattr =3D simple_getattr, + + .listxattr =3D cfs_listxattr, +}; + +static int cfs_init_fs_context(struct fs_context *fc) +{ + struct cfs_info *fsi; + + fsi =3D kzalloc(sizeof(*fsi), GFP_KERNEL); + if (!fsi) + return -ENOMEM; + + fc->s_fs_info =3D fsi; + fc->ops =3D &cfs_context_ops; + return 0; +} + +static struct file_system_type cfs_type =3D { + .owner =3D THIS_MODULE, + .name =3D "composefs", + .init_fs_context =3D cfs_init_fs_context, + .parameters =3D cfs_parameters, + .kill_sb =3D kill_anon_super, +}; + +static void cfs_inode_init_once(void *foo) +{ + struct cfs_inode *cino =3D foo; + + inode_init_once(&cino->vfs_inode); +} + +static int __init init_cfs(void) +{ + cfs_inode_cachep =3D kmem_cache_create( + "cfs_inode", sizeof(struct cfs_inode), 0, + (SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT), + cfs_inode_init_once); + if (!cfs_inode_cachep) + return -ENOMEM; + + return register_filesystem(&cfs_type); +} + +static void __exit exit_cfs(void) +{ + unregister_filesystem(&cfs_type); + + /* Ensure all RCU free inodes are safe to be destroyed. 
*/ + rcu_barrier(); + + kmem_cache_destroy(cfs_inode_cachep); +} + +module_init(init_cfs); +module_exit(exit_cfs); --=20 2.38.1 From nobody Fri Sep 19 05:36:00 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1646FC4167D for ; Mon, 28 Nov 2022 11:19:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230453AbiK1LTc (ORCPT ); Mon, 28 Nov 2022 06:19:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37028 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231283AbiK1LSh (ORCPT ); Mon, 28 Nov 2022 06:18:37 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 53587C5D for ; Mon, 28 Nov 2022 03:17:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669634258; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JXn8CbC4v61MSts+Slejy+sOsfCvEIm7Eo3SJJdD8pw=; b=dzeTi8oKtKx0G+M3Aa1z+2tGfB17WtKU8OmxMriPuBGZakNczUr8aq+KHjx1AR6jA4QFSy m3Vs+rZ9BVABeYPXY43Sao6+tc68q/qWJDjqTSWYnTaGT+TF2qriAt4J/W5VasHI1ZQObK ljhG7XEA9lSSWWZbC5P1MH2YTrsTtM4= Received: from mail-lf1-f69.google.com (mail-lf1-f69.google.com [209.85.167.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-653-Crr-KeKANq-jojsRAzx11Q-1; Mon, 28 Nov 2022 06:17:37 -0500 X-MC-Unique: Crr-KeKANq-jojsRAzx11Q-1 Received: by mail-lf1-f69.google.com with SMTP id m25-20020a195219000000b004b40c1d5566so3674736lfb.4 for ; Mon, 28 Nov 2022 03:17:36 -0800 (PST) X-Google-DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JXn8CbC4v61MSts+Slejy+sOsfCvEIm7Eo3SJJdD8pw=; b=HXTAZPyvMMTauFWCPaiEv7MOWvYKllTXrG6dx2/mT+Sk7NDzeDZLABGEaTpltqvpZd qD83MdlZPm0noQjgVncra09mY2MKpiH1NjuxDbliIxlhvZ2x+AtJysurpTzOOcJdAoZ4 huPXH4N9k+91CVuYTkRQIu2RfRiHoklP0FP+Pq5Y+zzt7PFfIM2H0CSHCmrRXOnQgCn6 Tvq/tOsAESvamwvgYJHUGsB6/usnOyn+rQTPQn7PmeJEMECIMOUVRPW69ZU5E1X6qIc9 2OpzMyTNNblodlpF8CyFiTza5dmUZKeIsH6H4CZV5E532KswDdtKvOmFShHiV9ATuAxY gTlw== X-Gm-Message-State: ANoB5pnCdw2Td9nvPmpMdCnftWfY0yroF/LPgWuT5Zaa7CRFBKFOOdLz ulsjtss39SegPn/Z4GjP1Mrw78LAscuvnQ+oQnHcMbo76O6DOll8+BU+V1HcTTj12AErkyIBgrE T79Me9MhW6Ag6cNEKKkODlDro X-Received: by 2002:ac2:41c6:0:b0:4b0:4b08:6873 with SMTP id d6-20020ac241c6000000b004b04b086873mr18214319lfi.329.1669634255634; Mon, 28 Nov 2022 03:17:35 -0800 (PST) X-Google-Smtp-Source: AA0mqf7Oq9xM109ETn0OjdyWli6jQ9pxM8qtPSQ42EmIJgI9D6yzQXcC4Njn7FLZc0QQ5+pWnmivqQ== X-Received: by 2002:ac2:41c6:0:b0:4b0:4b08:6873 with SMTP id d6-20020ac241c6000000b004b04b086873mr18214308lfi.329.1669634255326; Mon, 28 Nov 2022 03:17:35 -0800 (PST) Received: from localhost.localdomain (c-e6a5e255.022-110-73746f36.bbcust.telenor.se. 
[85.226.165.230]) by smtp.googlemail.com with ESMTPSA id g25-20020a2eb0d9000000b0026dce0a5ca9sm1187582ljl.70.2022.11.28.03.17.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 03:17:34 -0800 (PST) From: Alexander Larsson To: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, gscrivan@redhat.com, alexl@redhat.com Subject: [PATCH 5/6] composefs: Add documentation Date: Mon, 28 Nov 2022 12:17:26 +0100 Message-Id: <8a9aefceebe42d36164f3516c173f18189f0d7e7.1669631086.git.alexl@redhat.com> X-Mailer: git-send-email 2.38.1 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This adds documentation about the composefs filesystem and how to use it. Signed-off-by: Alexander Larsson --- Documentation/filesystems/composefs.rst | 162 ++++++++++++++++++++++++ 1 file changed, 162 insertions(+) create mode 100644 Documentation/filesystems/composefs.rst diff --git a/Documentation/filesystems/composefs.rst b/Documentation/filesy= stems/composefs.rst new file mode 100644 index 000000000000..75fbf14aeb33 --- /dev/null +++ b/Documentation/filesystems/composefs.rst @@ -0,0 +1,162 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Composefs Filesystem +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Introduction +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +Composefs is a read-only file system that is backed by regular files +(rather than a block device). It is designed to help easily share +content between different directory trees, such as container images in +a local store or ostree checkouts. In addition it also has support for +integrity validation of file content and directory metadata, in an +efficient way (using fs-verity). + +The filesystem mount source is a binary blob called the descriptor. 
It
+contains all the inode and directory entry data for the entire
+filesystem. However, instead of storing the file content, each regular
+file inode stores a relative path name, and composefs reads the
+file content from a backing file found by looking up that path in a set
+of base directories.
+
+Given such a descriptor called "image.cfs" and a directory with files
+called "/dir" you can mount it like::
+
+  mount -t composefs image.cfs -o basedir=/dir /mnt
+
+Content sharing
+===============
+
+Suppose you have a single basedir where the files are content
+addressed (i.e. named by content digest), and a set of composefs
+descriptors using this basedir. Any file that happens to be shared
+between two images (same content, so same digest) will now only be
+stored once on the disk.
+
+Such sharing is possible even if the metadata for the file in the
+image differs (common reasons for metadata difference are mtime,
+permissions, xattrs, etc). The sharing is also anonymous in the sense
+that you can't tell the difference on the mounted files from a
+non-shared file (for example by looking at the link count for a
+hardlinked file).
+
+In addition, any shared files that are actively in use will share
+page-cache, because the page cache for the file contents will be
+addressed by the backing file in the basedir. This means (for example)
+that shared libraries between images will only be mmapped once across
+all mounts.
+
+Integrity validation
+====================
+
+Composefs uses `fs-verity
+`
+for integrity validation, and extends it by making the validation also
+apply to the directory metadata. This happens on two levels,
+validation of the descriptor and validation of the backing files.
+
+For descriptor validation, the idea is that you enable fs-verity on
+the descriptor file which seals it from changes that would affect the
+directory metadata.
Additionally you can pass a `digest` mount option,
+which composefs verifies against the descriptor's fs-verity
+digest. Such a mount option could be encoded in a trusted source
+(like a signed kernel command line) and be used as a root of trust if
+using composefs for the root filesystem.
+
+For file validation, the descriptor can contain a digest for each
+backing file, and you can enable fs-verity on the backing
+files. Composefs will validate the digest before using the backing
+files. This means any (accidental or malicious) modification of the
+basedir will be detected at the time the file is used.
+
+Expected use-cases
+==================
+
+Container Image Storage
+```````````````````````
+
+Typically a container image is stored as a set of "layer"
+directories, merged into one mount by using overlayfs. The lower
+layers are read-only image content and the upper layer is the
+writable state of a running container. Multiple uses of the same
+layer can be shared this way, but it is hard to share individual
+files between unrelated layers.
+
+Using composefs, we can instead use a shared, content-addressed
+store for all the images in the system, and use a composefs image
+for the read-only image content of each image, pointing into the
+shared store. Then for a running container we use an overlayfs
+with the lower dir being the composefs and the upper dir being
+the writable state.
+
+
+Ostree root filesystem validation
+`````````````````````````````````
+
+Ostree uses a content-addressed on-disk store for file content,
+allowing efficient updates and sharing of content. However, to actually
+use these as a root filesystem it needs to create a real
+"chroot-style" directory, containing hard links into the store. The
+store itself is validated when created, but once the hard-link
+directory is created, nothing validates the directory structure of
+that.
+
+Instead of a chroot we can use composefs.
We create a composefs
+image pointing into the object store, enable fs-verity for everything
+and encode the fs-verity digest of the descriptor in the
+kernel command line. This will allow booting a trusted system where
+all directory metadata and file content is validated lazily at use.
+
+
+Mount options
+=============
+
+`basedir`: A colon separated list of directories to use as a base when resolving relative content paths.
+`verity_check=[0,1,2]`: When to verify backing file fs-verity: 0 == never, 1 == if specified in image, 2 == always and require it in image.
+`digest`: An fs-verity sha256 digest that the descriptor file must match. If set, `verity_check` defaults to 2.
+
+
+Filesystem format
+=================
+
+The descriptor contains three sections: header,
+inodes and variable data. All data in the file is stored in
+little-endian form.
+
+The header starts at the beginning of the file and contains version,
+magic value, offsets to the variable data and the root inode nr.
+
+The inode section starts at a fixed location right after the
+header. It is an array of inode data, where for each inode there is
+first a variable length chunk and then a fixed size chunk. An inode nr
+is the offset in the inode data to the start of the fixed chunk.
+
+The fixed inode chunk starts with a flag that tells what parts of the
+inode are stored in the file (meaning it is only the maximal size that
+is fixed). After that the various inode attributes are serialized in
+order, such as mode, ownership, xattrs, and payload length. The
+payload length attribute gives the size of the variable chunk.
+
+The inode variable chunk contains different things depending on the
+file type. For regular files it is the backing filename. For symlinks
+it is the symlink target. For directories it is a list of references to
+dentries, stored in chunks of maximum 4k.
The dentry chunks themselves
+are stored in the variable data section.
+
+The variable data section is stored after the inode section, and you
+can find it from the offset in the header. It contains dentries and
+xattr data. The xattrs are referred to by offset and size in the
+xattr attribute in the inode data. Each xattr data can be used by many
+inodes in the filesystem. The variable data chunks are all smaller than
+a page (4K) and are padded to not span pages.
+
+Tools
+=====
+
+Tools for composefs can be found at https://github.com/containers/composefs
+
+There is a mkcomposefs tool which can be used to create images on the
+CLI, and a library that applications can use to create composefs
+images.
-- 
2.38.1

From nobody Fri Sep 19 05:36:00 2025
From: Alexander Larsson
To: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, gscrivan@redhat.com, alexl@redhat.com
Subject: [PATCH 6/6] composefs: Add kconfig and build support
Date: Mon, 28 Nov 2022 12:17:39 +0100

This commit adds Makefile and Kconfig for composefs, and updates Makefile
and Kconfig files in the fs directory.

Signed-off-by: Alexander Larsson
---
 fs/Kconfig            |  1 +
 fs/Makefile           |  1 +
 fs/composefs/Kconfig  | 18 ++++++++++++++++++
 fs/composefs/Makefile |  5 +++++
 4 files changed, 25 insertions(+)
 create mode 100644 fs/composefs/Kconfig
 create mode 100644 fs/composefs/Makefile

diff --git a/fs/Kconfig b/fs/Kconfig
index 2685a4d0d353..de8493fc2b1e 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -127,6 +127,7 @@ source "fs/quota/Kconfig"
 source "fs/autofs/Kconfig"
 source "fs/fuse/Kconfig"
 source "fs/overlayfs/Kconfig"
+source "fs/composefs/Kconfig"
 
 menu "Caches"
 
diff --git a/fs/Makefile b/fs/Makefile
index 4dea17840761..d16974e02468 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -137,3 +137,4 @@ obj-$(CONFIG_EFIVAR_FS)		+= efivarfs/
 obj-$(CONFIG_EROFS_FS)		+= erofs/
 obj-$(CONFIG_VBOXSF_FS)		+= vboxsf/
 obj-$(CONFIG_ZONEFS_FS)		+= zonefs/
+obj-$(CONFIG_COMPOSEFS_FS)	+= composefs/
diff --git a/fs/composefs/Kconfig b/fs/composefs/Kconfig
new file mode 100644
index 000000000000..88c5b55380e6
--- /dev/null
+++ b/fs/composefs/Kconfig
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+config COMPOSEFS_FS
+	tristate "Composefs filesystem support"
+	select EXPORTFS
+	help
+	  Composefs is a filesystem that allows combining file content from
+	  existing regular files with a metadata directory structure from
+	  a separate binary file. This is useful to share file content between
+	  many different directory trees, such as in a local container image store.
+
+	  Composefs also allows using fs-verity to validate the content of the
+	  content-files as well as the metadata file which allows dm-verity
+	  like validation with the flexibility of regular files.
+
+	  For more information see Documentation/filesystems/composefs.rst
+
+	  If unsure, say N.
diff --git a/fs/composefs/Makefile b/fs/composefs/Makefile
new file mode 100644
index 000000000000..eac8445e7d25
--- /dev/null
+++ b/fs/composefs/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+obj-$(CONFIG_COMPOSEFS_FS) += composefs.o
+
+composefs-objs += cfs-reader.o cfs.o
-- 
2.38.1
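
A note for readers of the export operations in the cfs.c patch: cfs_encode_fh() stores the 64-bit inode number as two 32-bit words (high word first) followed by the generation, and cfs_fh_to_dentry() reassembles it in the same order. The round trip can be sketched in userspace C roughly as below. This is only an illustration; the helper names pack_fh() and unpack_fh_nodeid() are hypothetical and do not appear in the series.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): split a 64-bit inode
 * number into two 32-bit words, high word first, and append the
 * generation, mirroring what cfs_encode_fh() writes into fh[0..2].
 */
static void pack_fh(uint64_t nodeid, uint32_t generation, uint32_t fh[3])
{
	fh[0] = (uint32_t)(nodeid >> 32);
	fh[1] = (uint32_t)(nodeid & 0xffffffff);
	fh[2] = generation;
}

/*
 * Hypothetical helper (not part of the patch): rebuild the inode number
 * from the first two words, mirroring the decoding in cfs_fh_to_dentry().
 */
static uint64_t unpack_fh_nodeid(const uint32_t fh[3])
{
	uint64_t nodeid = (uint64_t)fh[0] << 32;

	nodeid |= fh[1];
	return nodeid;
}
```

The cast to a 64-bit type before the shift matters: shifting a 32-bit value left by 32 would be undefined in C, which is why the kernel code also widens fid->raw[0] first.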