From: gao xu
To: Sergey Senozhatsky
CC: Minchan Kim, Jens Axboe, linux-block@vger.kernel.org,
    linux-kernel@vger.kernel.org, Andrew Morton, surenb@google.com,
    zhouxiaolong
Subject: [PATCH v2] zram: Optimize LZ4 dictionary compression performance
Date: Fri, 13 Mar 2026 02:41:14 +0000
Message-ID: <698181478c9c4b10aa21b4a847bdc706@honor.com>

Calling LZ4_loadDict() repeatedly in zram causes significant overhead
due to its internal dictionary pre-processing. This commit introduces
a template stream mechanism that pre-processes the dictionary only
once, when the dictionary is initially set, and then copies this
prepared state for each subsequent compression.

Verification Test Items:

Test Platform: android16-6.12

1. Collect Anonymous Page Dataset

1) Apply the following patch, so that pages are stored uncompressed
   and the exported image contains raw anonymous pages:

   static bool zram_meta_alloc(struct zram *zram, u64 disksize)
   	if (!huge_class_size)
   -		huge_class_size = zs_huge_class_size(zram->mem_pool);
   +		huge_class_size = 0;

2) Install multiple apps and run monkey testing until SwapFree is
   close to 0.

3) Execute the following command to export the data:

   dd if=/dev/block/zram0 of=/data/samples/zram_dump.img bs=4K

2. Train Dictionary

Since LZ4 does not have a dedicated dictionary training tool, the zstd
tool can be used for training [1].
The command is as follows:

zstd --train /data/samples/* --split=4096 --maxdict=64KB -o /vendor/etc/dict_data

3. Test Code

adb shell "dd if=/data/samples/zram_dump.img of=/dev/test_pattern bs=4096 count=131072 conv=fsync"
adb shell "swapoff /dev/block/zram0"
adb shell "echo 1 > /sys/block/zram0/reset"
adb shell "echo lz4 > /sys/block/zram0/comp_algorithm"
adb shell "echo dict=/vendor/etc/dict_data > /sys/block/zram0/algorithm_params"
adb shell "echo 6G > /sys/block/zram0/disksize"
echo "Start Compression"
adb shell "taskset 80 dd if=/dev/test_pattern of=/dev/block/zram0 bs=4096 count=131072 conv=fsync"
echo.
echo "Start Decompression"
adb shell "taskset 80 dd if=/dev/block/zram0 of=/dev/output_result bs=4096 count=131072 conv=fsync"
echo "mm_stat:"
adb shell "cat /sys/block/zram0/mm_stat"
echo.

Note: To ensure stable test results, it is best to lock the CPU
frequency before executing the test.

LZ4 supports dictionaries of up to 64KB. Below are the compression
speeds measured at various dictionary sizes:

dict_size    base       patch
 4KB         156M/s     219M/s
 8KB         136M/s     217M/s
16KB          98M/s     214M/s
32KB          66M/s     225M/s
64KB          38M/s     224M/s

When an LZ4 compression dictionary is enabled, compression speed drops
as the dictionary grows: larger dictionaries result in slower
compression. This patch eliminates the influence of dictionary size on
compression speed, ensuring consistent performance regardless of
dictionary scale.

[1] https://github.com/lz4/lz4?tab=readme-ov-file

Signed-off-by: gao xu
Acked-by: Sergey Senozhatsky
---
v1 -> v2:
https://lore.kernel.org/linux-block/20260311084312.1766036-2-senozhatsky@chromium.org/

Based on Sergey Senozhatsky's suggestions, this version sets the
compression dictionary only during initialization and removes the code
that handled the dictionary changing during device operation.
https://lore.kernel.org/linux-block/ae51966c3cb445e9983230243bb6a5b2@honor.com/
---
 drivers/block/zram/backend_lz4.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/block/zram/backend_lz4.c b/drivers/block/zram/backend_lz4.c
index 04e186614..c449d511b 100644
--- a/drivers/block/zram/backend_lz4.c
+++ b/drivers/block/zram/backend_lz4.c
@@ -14,13 +14,38 @@ struct lz4_ctx {
 
 static void lz4_release_params(struct zcomp_params *params)
 {
+	LZ4_stream_t *dict_stream = params->drv_data;
+
+	params->drv_data = NULL;
+	if (!dict_stream)
+		return;
+
+	kfree(dict_stream);
 }
 
 static int lz4_setup_params(struct zcomp_params *params)
 {
+	LZ4_stream_t *dict_stream;
+	int ret;
+
 	if (params->level == ZCOMP_PARAM_NOT_SET)
 		params->level = LZ4_ACCELERATION_DEFAULT;
 
+	if (!params->dict || !params->dict_sz)
+		return 0;
+
+	dict_stream = kzalloc_obj(*dict_stream, GFP_KERNEL);
+	if (!dict_stream)
+		return -ENOMEM;
+
+	ret = LZ4_loadDict(dict_stream,
+			   params->dict, params->dict_sz);
+	if (ret != params->dict_sz) {
+		kfree(dict_stream);
+		return -EINVAL;
+	}
+	params->drv_data = dict_stream;
+
 	return 0;
 }
 
@@ -79,9 +104,7 @@ static int lz4_compress(struct zcomp_params *params, struct zcomp_ctx *ctx,
 					    zctx->mem);
 	} else {
 		/* Cstrm needs to be reset */
-		ret = LZ4_loadDict(zctx->cstrm, params->dict, params->dict_sz);
-		if (ret != params->dict_sz)
-			return -EINVAL;
+		memcpy(zctx->cstrm, params->drv_data, sizeof(*zctx->cstrm));
 		ret = LZ4_compress_fast_continue(zctx->cstrm, req->src, req->dst,
 						 req->src_len, req->dst_len,
 						 params->level);
--