From: Gao Xiang
To: linux-erofs@lists.ozlabs.org
Cc: LKML, Bo Liu, Gao Xiang
Subject: [PATCH UNTESTED v4] erofs: support DEFLATE decompression by using Intel QAT
Date: Wed, 21 May 2025 18:03:26 +0800
Message-ID: <20250521100326.2867828-1-hsiangkao@linux.alibaba.com>
In-Reply-To: <20250516082634.3801-1-liubo03@inspur.com>
References: <20250516082634.3801-1-liubo03@inspur.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Bo Liu

This patch introduces the use of Intel QAT to offload DEFLATE decompression
in the EROFS filesystem, aiming to improve decompression performance.

A 285MiB dataset is used with the following command to create EROFS images
with different cluster sizes:
     $ mkfs.erofs -zdeflate,level=9 -C16384

Fio is used to test the following read patterns:
     # fio -filename=testfile -bs=4k -rw=read -name=job1
     # fio -filename=testfile -bs=4k -rw=randread -name=job1
     # fio -filename=testfile -bs=4k -rw=randread --io_size=14m -name=job1

Here are some performance numbers for reference:

Processors: Intel(R) Xeon(R) 6766E (144 cores)
Memory:     521 GiB

|-----------|--------------|-----------------|-----------|---------------------|
|           | Cluster size | sequential read | randread  | small randread (5%) |
|-----------|--------------|-----------------|-----------|---------------------|
| Intel QAT | 4096         | 538 MiB/s       | 112 MiB/s | 20.76 MiB/s         |
| Intel QAT | 16384        | 699 MiB/s       | 158 MiB/s | 21.02 MiB/s         |
| Intel QAT | 65536        | 917 MiB/s       | 278 MiB/s | 20.90 MiB/s         |
| Intel QAT | 131072       | 1056 MiB/s      | 351 MiB/s | 23.36 MiB/s         |
| Intel QAT | 262144       | 1145 MiB/s      | 431 MiB/s | 26.66 MiB/s         |
| deflate   | 4096         | 499 MiB/s       | 108 MiB/s | 21.50 MiB/s         |
| deflate   | 16384        | 422 MiB/s       | 125 MiB/s | 18.94 MiB/s         |
| deflate   | 65536        | 452 MiB/s       | 159 MiB/s | 13.02 MiB/s         |
| deflate   | 131072       | 452 MiB/s       | 177 MiB/s | 11.44 MiB/s         |
| deflate   | 262144       | 466 MiB/s       | 194 MiB/s | 10.60 MiB/s         |

Signed-off-by: Bo Liu
Signed-off-by: Gao Xiang
---
Hi Bo,

Please test/refine this version and complete sysfs documentation.

Thanks,
Gao Xiang

 fs/erofs/Kconfig                |  14 +++
 fs/erofs/Makefile               |   1 +
 fs/erofs/compress.h             |  10 ++
 fs/erofs/decompressor_crypto.c  | 184 ++++++++++++++++++++++++++++++++
 fs/erofs/decompressor_deflate.c |  20 +++-
 fs/erofs/sysfs.c                |  35 +++++-
 fs/erofs/zdata.c                |   1 +
 7 files changed, 260 insertions(+), 5 deletions(-)
 create mode 100644 fs/erofs/decompressor_crypto.c

diff --git a/fs/erofs/Kconfig b/fs/erofs/Kconfig
index 8f68ec49ad89..6beeb7063871 100644
--- a/fs/erofs/Kconfig
+++ b/fs/erofs/Kconfig
@@ -144,6 +144,20 @@ config EROFS_FS_ZIP_ZSTD
 
 	  If unsure, say N.
 
+config EROFS_FS_ZIP_ACCEL
+	bool "EROFS hardware decompression support"
+	depends on EROFS_FS_ZIP
+	help
+	  Saying Y here includes hardware accelerator support for reading
+	  EROFS file systems containing compressed data.  It gives better
+	  decompression speed than the software-implemented decompression, and
+	  it costs lower CPU overhead.
+
+	  Hardware accelerator support is an experimental feature for now and
+	  file systems are still readable without selecting this option.
+
+	  If unsure, say N.
+
 config EROFS_FS_ONDEMAND
 	bool "EROFS fscache-based on-demand read support (deprecated)"
 	depends on EROFS_FS
diff --git a/fs/erofs/Makefile b/fs/erofs/Makefile
index 4331d53c7109..549abc424763 100644
--- a/fs/erofs/Makefile
+++ b/fs/erofs/Makefile
@@ -7,5 +7,6 @@ erofs-$(CONFIG_EROFS_FS_ZIP) += decompressor.o zmap.o zdata.o zutil.o
 erofs-$(CONFIG_EROFS_FS_ZIP_LZMA) += decompressor_lzma.o
 erofs-$(CONFIG_EROFS_FS_ZIP_DEFLATE) += decompressor_deflate.o
 erofs-$(CONFIG_EROFS_FS_ZIP_ZSTD) += decompressor_zstd.o
+erofs-$(CONFIG_EROFS_FS_ZIP_ACCEL) += decompressor_crypto.o
 erofs-$(CONFIG_EROFS_FS_BACKED_BY_FILE) += fileio.o
 erofs-$(CONFIG_EROFS_FS_ONDEMAND) += fscache.o
diff --git a/fs/erofs/compress.h b/fs/erofs/compress.h
index 2704d7a592a5..2bea4097e0b4 100644
--- a/fs/erofs/compress.h
+++ b/fs/erofs/compress.h
@@ -76,4 +76,14 @@ int z_erofs_fixup_insize(struct z_erofs_decompress_req *rq, const char *padbuf,
 			 unsigned int padbufsize);
 int __init z_erofs_init_decompressor(void);
 void z_erofs_exit_decompressor(void);
+int z_erofs_crypto_decompress(struct z_erofs_decompress_req *rq,
+			      struct page **pgpl);
+int z_erofs_crypto_enable_engine(const char *name, int len);
+#ifdef CONFIG_EROFS_FS_ZIP_ACCEL
+void z_erofs_crypto_disable_all_engines(void);
+int z_erofs_crypto_show_engines(char *buf, int size, char sep);
+#else
+static inline void z_erofs_crypto_disable_all_engines(void) {}
+static inline int z_erofs_crypto_show_engines(char *buf, int size, char sep) { return 0; }
+#endif
 #endif
diff --git a/fs/erofs/decompressor_crypto.c b/fs/erofs/decompressor_crypto.c
new file mode 100644
index 000000000000..ba0f46f6ef12
--- /dev/null
+++ b/fs/erofs/decompressor_crypto.c
@@ -0,0 +1,184 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include
+#include
+#include "compress.h"
+
+static int __z_erofs_crypto_decompress(struct z_erofs_decompress_req *rq,
+				       struct crypto_acomp *tfm)
+{
+	struct sg_table st_src, st_dst;
+	struct acomp_req *req;
+	struct crypto_wait wait;
+	u8 *headpage;
+	int ret;
+
+	headpage = kmap_local_page(*rq->in);
+	ret = z_erofs_fixup_insize(rq, headpage + rq->pageofs_in,
+				min_t(unsigned int, rq->inputsize,
+				      rq->sb->s_blocksize - rq->pageofs_in));
+	kunmap_local(headpage);
+	if (ret)
+		return ret;
+
+	req = acomp_request_alloc(tfm);
+	if (!req) {
+		erofs_err(rq->sb, "failed to alloc decompress request");
+		return -ENOMEM;
+	}
+
+	ret = sg_alloc_table_from_pages_segment(&st_src, rq->in, rq->inpages,
+			rq->pageofs_in, rq->inputsize, UINT_MAX, GFP_KERNEL);
+	if (ret < 0)
+		goto failed_src_alloc;
+
+	ret = sg_alloc_table_from_pages_segment(&st_dst, rq->out, rq->outpages,
+			rq->pageofs_out, rq->outputsize, UINT_MAX, GFP_KERNEL);
+	if (ret < 0)
+		goto failed_dst_alloc;
+
+	acomp_request_set_params(req, st_src.sgl,
+		st_dst.sgl, rq->inputsize, rq->outputsize);
+
+	crypto_init_wait(&wait);
+	acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+				   crypto_req_done, &wait);
+
+	ret = crypto_wait_req(crypto_acomp_decompress(req), &wait);
+	if (ret) {
+		erofs_err(rq->sb, "failed to decompress %d in[%u, %u] out[%u]",
+			  ret, rq->inputsize, rq->pageofs_in, rq->outputsize);
+		ret = -EIO;
+	}
+
+	sg_free_table(&st_dst);
+failed_dst_alloc:
+	sg_free_table(&st_src);
+failed_src_alloc:
+	acomp_request_free(req);
+	return ret;
+}
+
+struct z_erofs_crypto_engine {
+	char *crypto_name;
+	struct crypto_acomp *tfm;
+};
+
+struct z_erofs_crypto_engine *z_erofs_crypto[Z_EROFS_COMPRESSION_MAX] = {
+	[Z_EROFS_COMPRESSION_LZ4] = (struct z_erofs_crypto_engine[]) {
+		{},
+	},
+	[Z_EROFS_COMPRESSION_LZMA] = (struct z_erofs_crypto_engine[]) {
+		{},
+	},
+	[Z_EROFS_COMPRESSION_DEFLATE] = (struct z_erofs_crypto_engine[]) {
+		{ .crypto_name = "qat_deflate", },
+		{},
+	},
+	[Z_EROFS_COMPRESSION_ZSTD] = (struct z_erofs_crypto_engine[]) {
+		{},
+	},
+};
+
+static DECLARE_RWSEM(z_erofs_crypto_rwsem);
+
+static struct crypto_acomp *z_erofs_crypto_get_engine(int alg)
+{
+	struct z_erofs_crypto_engine *e;
+
+	for (e = z_erofs_crypto[alg]; e->crypto_name; ++e)
+		if (e->tfm)
+			return e->tfm;
+	return NULL;
+}
+
+int z_erofs_crypto_decompress(struct z_erofs_decompress_req *rq,
+			      struct page **pgpl)
+{
+	struct crypto_acomp *tfm;
+	int i, err;
+
+	down_read(&z_erofs_crypto_rwsem);
+	tfm = z_erofs_crypto_get_engine(rq->alg);
+	if (!tfm) {
+		err = -EOPNOTSUPP;
+		goto out;
+	}
+
+	for (i = 0; i < rq->outpages; i++) {
+		struct page *const page = rq->out[i];
+		struct page *victim;
+
+		if (!page) {
+			victim = __erofs_allocpage(pgpl, rq->gfp, true);
+			if (!victim) {
+				err = -ENOMEM;
+				goto out;
+			}
+			set_page_private(victim, Z_EROFS_SHORTLIVED_PAGE);
+			rq->out[i] = victim;
+		}
+	}
+	err = __z_erofs_crypto_decompress(rq, tfm);
+out:
+	up_read(&z_erofs_crypto_rwsem);
+	return err;
+}
+
+int z_erofs_crypto_enable_engine(const char *name, int len)
+{
+	struct z_erofs_crypto_engine *e;
+	struct crypto_acomp *tfm;
+	int alg;
+
+	down_write(&z_erofs_crypto_rwsem);
+	for (alg = 0; alg < Z_EROFS_COMPRESSION_MAX; ++alg) {
+		for (e = z_erofs_crypto[alg]; e->crypto_name; ++e) {
+			if (!strncmp(name, e->crypto_name, len)) {
+				if (e->tfm)
+					break;
+				tfm = crypto_alloc_acomp(e->crypto_name, 0, 0);
+				if (IS_ERR(tfm)) {
+					up_write(&z_erofs_crypto_rwsem);
+					return -EOPNOTSUPP;
+				}
+				e->tfm = tfm;
+				break;
+			}
+		}
+	}
+	up_write(&z_erofs_crypto_rwsem);
+	return 0;
+}
+
+void z_erofs_crypto_disable_all_engines(void)
+{
+	struct z_erofs_crypto_engine *e;
+	int alg;
+
+	down_write(&z_erofs_crypto_rwsem);
+	for (alg = 0; alg < Z_EROFS_COMPRESSION_MAX; ++alg) {
+		for (e = z_erofs_crypto[alg]; e->crypto_name; ++e) {
+			if (!e->tfm)
+				continue;
+			crypto_free_acomp(e->tfm);
+			e->tfm = NULL;
+		}
+	}
+	up_write(&z_erofs_crypto_rwsem);
+}
+
+int z_erofs_crypto_show_engines(char *buf, int size, char sep)
+{
+	struct z_erofs_crypto_engine *e;
+	int alg, len = 0;
+
+	for (alg = 0; alg < Z_EROFS_COMPRESSION_MAX; ++alg) {
+		for (e = z_erofs_crypto[alg]; e->crypto_name; ++e) {
+			if (!e->tfm)
+				continue;
+			len += scnprintf(buf + len, size - len, "%s%c",
+					 e->crypto_name, sep);
+		}
+	}
+	return len;
+}
diff --git a/fs/erofs/decompressor_deflate.c b/fs/erofs/decompressor_deflate.c
index c6908a487054..6909b2d529c7 100644
--- a/fs/erofs/decompressor_deflate.c
+++ b/fs/erofs/decompressor_deflate.c
@@ -97,8 +97,8 @@ static int z_erofs_load_deflate_config(struct super_block *sb,
 	return -ENOMEM;
 }
 
-static int z_erofs_deflate_decompress(struct z_erofs_decompress_req *rq,
-				      struct page **pgpl)
+static int __z_erofs_deflate_decompress(struct z_erofs_decompress_req *rq,
+					struct page **pgpl)
 {
 	struct super_block *sb = rq->sb;
 	struct z_erofs_stream_dctx dctx = { .rq = rq, .no = -1, .ni = 0 };
@@ -178,6 +178,22 @@ static int z_erofs_deflate_decompress(struct z_erofs_decompress_req *rq,
 	return err;
 }
 
+static int z_erofs_deflate_decompress(struct z_erofs_decompress_req *rq,
+				      struct page **pgpl)
+{
+#ifdef CONFIG_EROFS_FS_ZIP_ACCEL
+	int err;
+
+	if (!rq->partial_decoding) {
+		err = z_erofs_crypto_decompress(rq, pgpl);
+		if (err != -EOPNOTSUPP)
+			return err;
+
+	}
+#endif
+	return __z_erofs_deflate_decompress(rq, pgpl);
+}
+
 const struct z_erofs_decompressor z_erofs_deflate_decomp = {
 	.config = z_erofs_load_deflate_config,
 	.decompress = z_erofs_deflate_decompress,
diff --git a/fs/erofs/sysfs.c b/fs/erofs/sysfs.c
index dad4e6c6c155..4c0ad4b93161 100644
--- a/fs/erofs/sysfs.c
+++ b/fs/erofs/sysfs.c
@@ -7,12 +7,14 @@
 #include
 
 #include "internal.h"
+#include "compress.h"
 
 enum {
 	attr_feature,
 	attr_drop_caches,
 	attr_pointer_ui,
 	attr_pointer_bool,
+	attr_accel,
 };
 
 enum {
@@ -60,14 +62,25 @@ static struct erofs_attr erofs_attr_##_name = { \
 EROFS_ATTR_RW_UI(sync_decompress, erofs_mount_opts);
 EROFS_ATTR_FUNC(drop_caches, 0200);
 #endif
+#ifdef CONFIG_EROFS_FS_ZIP_ACCEL
+EROFS_ATTR_FUNC(accel, 0644);
+#endif
 
-static struct attribute *erofs_attrs[] = {
+static struct attribute *erofs_sb_attrs[] = {
 #ifdef CONFIG_EROFS_FS_ZIP
 	ATTR_LIST(sync_decompress),
 	ATTR_LIST(drop_caches),
 #endif
 	NULL,
 };
+ATTRIBUTE_GROUPS(erofs_sb);
+
+static struct attribute *erofs_attrs[] = {
+#ifdef CONFIG_EROFS_FS_ZIP_ACCEL
+	ATTR_LIST(accel),
+#endif
+	NULL,
+};
 ATTRIBUTE_GROUPS(erofs);
 
 /* Features this copy of erofs supports */
@@ -128,12 +141,14 @@ static ssize_t erofs_attr_show(struct kobject *kobj,
 		if (!ptr)
 			return 0;
 		return sysfs_emit(buf, "%d\n", *(bool *)ptr);
+	case attr_accel:
+		return z_erofs_crypto_show_engines(buf, PAGE_SIZE, '\n');
 	}
 	return 0;
 }
 
 static ssize_t erofs_attr_store(struct kobject *kobj, struct attribute *attr,
-						const char *buf, size_t len)
+				const char *buf, size_t len)
 {
 	struct erofs_sb_info *sbi = container_of(kobj, struct erofs_sb_info,
 						s_kobj);
@@ -181,6 +196,19 @@ static ssize_t erofs_attr_store(struct kobject *kobj, struct attribute *attr,
 		if (t & 1)
 			invalidate_mapping_pages(MNGD_MAPPING(sbi), 0, -1);
 		return len;
+#endif
+#ifdef CONFIG_EROFS_FS_ZIP_ACCEL
+	case attr_accel:
+		buf = skip_spaces(buf);
+		z_erofs_crypto_disable_all_engines();
+		while (*buf) {
+			t = strcspn(buf, "\n");
+			ret = z_erofs_crypto_enable_engine(buf, t);
+			if (ret < 0)
+				return ret;
+			buf += buf[t] == '\n' ? t + 1 : t;
+		}
+		return len;
 #endif
 	}
 	return 0;
@@ -199,12 +227,13 @@ static const struct sysfs_ops erofs_attr_ops = {
 };
 
 static const struct kobj_type erofs_sb_ktype = {
-	.default_groups = erofs_groups,
+	.default_groups = erofs_sb_groups,
 	.sysfs_ops	= &erofs_attr_ops,
 	.release	= erofs_sb_release,
 };
 
 static const struct kobj_type erofs_ktype = {
+	.default_groups = erofs_groups,
 	.sysfs_ops	= &erofs_attr_ops,
 };
 
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index ab61c84d47cd..fe8071844724 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -441,6 +441,7 @@ void z_erofs_exit_subsystem(void)
 	z_erofs_destroy_pcpu_workers();
 	destroy_workqueue(z_erofs_workqueue);
 	z_erofs_destroy_pcluster_pool();
+	z_erofs_crypto_disable_all_engines();
 	z_erofs_exit_decompressor();
 }
 
-- 
2.43.5
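
For reference, the `accel` sysfs attribute in the patch above takes a newline-separated list of engine names: a write first disables every engine and then enables each listed one, and a read prints the enabled engines one per line. The following userspace C sketch models that store/show behaviour so the input format can be tried outside the kernel. It is not kernel code and not part of the patch: `struct engine`, `accel_store()`, `accel_show()` and the `enabled` flag (standing in for a non-NULL `e->tfm`) are hypothetical names, and engine allocation is assumed to always succeed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct z_erofs_crypto_engine. */
struct engine {
	const char *name;	/* crypto API name, e.g. "qat_deflate" */
	int enabled;		/* models a non-NULL e->tfm in the patch */
};

/* A NULL-name entry terminates the table, as in the patch. */
static struct engine engines[] = {
	{ "qat_deflate", 0 },
	{ NULL, 0 },
};

void disable_all_engines(void)
{
	for (struct engine *e = engines; e->name; ++e)
		e->enabled = 0;
}

/* Same strncmp() test as the patch; allocation is assumed to succeed. */
int enable_engine(const char *name, size_t len)
{
	for (struct engine *e = engines; e->name; ++e) {
		if (!strncmp(name, e->name, len)) {
			e->enabled = 1;
			break;
		}
	}
	return 0;
}

/* Models the attr_accel branch of erofs_attr_store(). */
long accel_store(const char *buf, size_t len)
{
	assert(buf);
	while (isspace((unsigned char)*buf))	/* like skip_spaces() */
		++buf;
	disable_all_engines();
	while (*buf) {
		size_t t = strcspn(buf, "\n");	/* length of one name */
		int ret = enable_engine(buf, t);

		if (ret < 0)
			return ret;
		buf += buf[t] == '\n' ? t + 1 : t;
	}
	return (long)len;
}

/* Models z_erofs_crypto_show_engines(): each enabled name followed by sep. */
int accel_show(char *out, int size, char sep)
{
	int len = 0;

	for (struct engine *e = engines; e->name; ++e) {
		int n;

		if (!e->enabled)
			continue;
		n = snprintf(out + len, (size_t)(size - len), "%s%c",
			     e->name, sep);
		if (n < 0 || n >= size - len)
			break;	/* output would be truncated: stop here */
		len += n;
	}
	return len;
}
```

In this model, storing "qat_deflate\n" enables the single known engine and a following show call yields the same string, while storing an empty line leaves every engine disabled. On a kernel with the patch applied, the attribute would presumably appear as /sys/fs/erofs/accel; that path is an assumption drawn from erofs_ktype being the filesystem-level kobject type, since the sysfs documentation is still listed as a to-do.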