From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Andrew Morton, Minchan Kim
Cc: linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCHv5 3/7] zram: rework recompress target selection strategy
Date: Tue, 17 Sep 2024 11:09:08 +0900
Message-ID: <20240917021020.883356-4-senozhatsky@chromium.org>
In-Reply-To: <20240917021020.883356-1-senozhatsky@chromium.org>
References: <20240917021020.883356-1-senozhatsky@chromium.org>

Target slot selection for recompression is currently a simple iteration
over zram->table entries (stored pages) from slot 0 to the max slot.
Given that zram->table slots are written in random order and are not
sorted by size, a simple iteration over slots selects suboptimal targets
for recompression.  This is not a problem if we recompress every single
zram->table slot, but we never do that in reality.  In reality we limit
the number of slots we can recompress (via the max_pages parameter), so
proper slot selection becomes very important.

The strategy is quite simple: suppose we have two candidate slots for
recompression, one of size 48 bytes and one of size 2800 bytes, and we
can recompress only one of them.  It certainly makes more sense to pick
the 2800-byte entry, because even if we manage to compress the 48-byte
object even further, the savings are going to be very small, while the
potential savings after a good recompression of the 2800-byte object
are much higher.

This patch reworks slot selection and introduces the strategy described
above: among candidate slots, always select the biggest ones first.

For that the patch introduces a zram_pp_ctl (post-processing) structure
which holds NUM_PP_BUCKETS pp buckets of slots.  Slots are assigned to
a particular bucket based on their size - the larger the slot, the
higher the bucket index.  This, basically, sorts slots by size in
linear time (we still perform just one iteration over zram->table
slots).  When we select a slot for recompression we always look in the
higher pp buckets first (those that hold the largest slots), which
achieves the desired behavior.

TEST
====

A very simple demonstration: zram is configured with zstd, and zstd
with dict as the recompression stream.  A limited (max 4096 pages)
recompression is then performed, with a log of the sizes of the slots
that were recompressed.  You can see that patched zram selects slots
for recompression in a significantly different manner, which leads to
higher memory savings (see column #2 of the mm_stat output).

BASE
----

*** initial state of zram device
/sys/block/zram0/mm_stat
1750994944 504491413 514203648        0 514203648        1        0    34204    34204

*** recompress idle max_pages=4096
/sys/block/zram0/mm_stat
1750994944 504262229 514953216        0 514203648        1        0    34204    34204

Sizes of selected objects for recompression:
... 45 58 24 226 91 40 24 24 24 424 2104 93 2078 2078 2078 959 154 ...

PATCHED
-------

*** initial state of zram device
/sys/block/zram0/mm_stat
1750982656 504492801 514170880        0 514170880        1        0    34204    34204

*** recompress idle max_pages=4096
/sys/block/zram0/mm_stat
1750982656 503716710 517586944        0 514170880        1        0    34204    34204

Sizes of selected objects for recompression:
... 3680 3694 3667 3590 3614 3553 3537 3548 3550 3542 3543 3537 ...

Note that pp-slots are not strictly sorted: sizes within a particular
bucket can vary by up to PP_BUCKET_SIZE_RANGE bytes.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/block/zram/zram_drv.c | 187 +++++++++++++++++++++++++++++-----
 1 file changed, 160 insertions(+), 27 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 37622268104e..6688f70b2140 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -184,6 +184,99 @@ static void zram_accessed(struct zram *zram, u32 index)
 #endif
 }
 
+#ifdef CONFIG_ZRAM_MULTI_COMP
+struct zram_pp_slot {
+	unsigned long index;
+	struct list_head entry;
+};
+
+/*
+ * A post-processing bucket is, essentially, a size class; this defines
+ * the range (in bytes) of pp-slot sizes in a particular bucket.
+ */
+#define PP_BUCKET_SIZE_RANGE	64
+#define NUM_PP_BUCKETS		((PAGE_SIZE / PP_BUCKET_SIZE_RANGE) + 1)
+
+struct zram_pp_ctl {
+	struct list_head pp_buckets[NUM_PP_BUCKETS];
+};
+
+static struct zram_pp_ctl *init_pp_ctl(void)
+{
+	struct zram_pp_ctl *ctl;
+	u32 idx;
+
+	ctl = kmalloc(sizeof(*ctl), GFP_KERNEL);
+	if (!ctl)
+		return NULL;
+
+	for (idx = 0; idx < NUM_PP_BUCKETS; idx++)
+		INIT_LIST_HEAD(&ctl->pp_buckets[idx]);
+	return ctl;
+}
+
+static void release_pp_slot(struct zram *zram, struct zram_pp_slot *pps)
+{
+	list_del_init(&pps->entry);
+
+	zram_slot_lock(zram, pps->index);
+	zram_clear_flag(zram, pps->index, ZRAM_PP_SLOT);
+	zram_slot_unlock(zram, pps->index);
+
+	kfree(pps);
+}
+
+static void release_pp_ctl(struct zram *zram, struct zram_pp_ctl *ctl)
+{
+	u32 idx;
+
+	if (!ctl)
+		return;
+
+	for (idx = 0; idx < NUM_PP_BUCKETS; idx++) {
+		while (!list_empty(&ctl->pp_buckets[idx])) {
+			struct zram_pp_slot *pps;
+
+			pps = list_first_entry(&ctl->pp_buckets[idx],
+					       struct zram_pp_slot,
+					       entry);
+			release_pp_slot(zram, pps);
+		}
+	}
+
+	kfree(ctl);
+}
+
+static void place_pp_slot(struct zram *zram, struct zram_pp_ctl *ctl,
+			  struct zram_pp_slot *pps)
+{
+	u32 idx;
+
+	idx = zram_get_obj_size(zram, pps->index) / PP_BUCKET_SIZE_RANGE;
+	list_add(&pps->entry, &ctl->pp_buckets[idx]);
+
+	zram_set_flag(zram, pps->index, ZRAM_PP_SLOT);
+}
+
+static struct zram_pp_slot *select_pp_slot(struct zram_pp_ctl *ctl)
+{
+	struct zram_pp_slot *pps = NULL;
+	s32 idx = NUM_PP_BUCKETS - 1;
+
+	/* The higher the bucket id, the more optimal the slot post-processing is */
+	while (idx > 0) {
+		pps = list_first_entry_or_null(&ctl->pp_buckets[idx],
					       struct zram_pp_slot,
					       entry);
+		if (pps)
+			break;
+
+		idx--;
+	}
+	return pps;
+}
+#endif
+
 static inline void update_used_max(struct zram *zram,
 				   const unsigned long pages)
 {
@@ -1657,6 +1750,52 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
 }
 
 #ifdef CONFIG_ZRAM_MULTI_COMP
+#define RECOMPRESS_IDLE		(1 << 0)
+#define RECOMPRESS_HUGE		(1 << 1)
+
+static int scan_slots_for_recompress(struct zram *zram, u32 mode,
+				     struct zram_pp_ctl *ctl)
+{
+	unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
+	struct zram_pp_slot *pps = NULL;
+	unsigned long index;
+
+	for (index = 0; index < nr_pages; index++) {
+		if (!pps)
+			pps = kmalloc(sizeof(*pps), GFP_KERNEL);
+		if (!pps)
+			return -ENOMEM;
+
+		INIT_LIST_HEAD(&pps->entry);
+
+		zram_slot_lock(zram, index);
+		if (!zram_allocated(zram, index))
+			goto next;
+
+		if (mode & RECOMPRESS_IDLE &&
+		    !zram_test_flag(zram, index, ZRAM_IDLE))
+			goto next;
+
+		if (mode & RECOMPRESS_HUGE &&
+		    !zram_test_flag(zram, index, ZRAM_HUGE))
+			goto next;
+
+		if (zram_test_flag(zram, index, ZRAM_WB) ||
+		    zram_test_flag(zram, index, ZRAM_SAME) ||
+		    zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
+			goto next;
+
+		pps->index = index;
+		place_pp_slot(zram, ctl, pps);
+		pps = NULL;
+next:
+		zram_slot_unlock(zram, index);
+	}
+
+	kfree(pps);
+	return 0;
+}
+
 /*
  * This function will decompress (unless it's ZRAM_HUGE) the page and then
  * attempt to compress it using provided compression algorithm priority
@@ -1664,7 +1803,7 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec,
  *
  * Corresponding ZRAM slot should be locked.
  */
-static int zram_recompress(struct zram *zram, u32 index, struct page *page,
+static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 			   u64 *num_recomp_pages, u32 threshold, u32 prio,
 			   u32 prio_max)
 {
@@ -1807,20 +1946,17 @@ static int zram_recompress(struct zram *zram, u32 index, struct page *page,
 	return 0;
 }
 
-#define RECOMPRESS_IDLE		(1 << 0)
-#define RECOMPRESS_HUGE		(1 << 1)
-
 static ssize_t recompress_store(struct device *dev,
 				struct device_attribute *attr,
 				const char *buf, size_t len)
 {
 	u32 prio = ZRAM_SECONDARY_COMP, prio_max = ZRAM_MAX_COMPS;
 	struct zram *zram = dev_to_zram(dev);
-	unsigned long nr_pages = zram->disksize >> PAGE_SHIFT;
 	char *args, *param, *val, *algo = NULL;
 	u64 num_recomp_pages = ULLONG_MAX;
+	struct zram_pp_ctl *ctl = NULL;
+	struct zram_pp_slot *pps;
 	u32 mode = 0, threshold = 0;
-	unsigned long index;
 	struct page *page;
 	ssize_t ret;
 
@@ -1922,36 +2058,32 @@ static ssize_t recompress_store(struct device *dev,
 		goto release_init_lock;
 	}
 
+	ctl = init_pp_ctl();
+	if (!ctl) {
+		ret = -ENOMEM;
+		goto release_init_lock;
+	}
+
+	scan_slots_for_recompress(zram, mode, ctl);
+
 	ret = len;
-	for (index = 0; index < nr_pages; index++) {
+	while ((pps = select_pp_slot(ctl))) {
 		int err = 0;
 
 		if (!num_recomp_pages)
 			break;
 
-		zram_slot_lock(zram, index);
-
-		if (!zram_allocated(zram, index))
-			goto next;
-
-		if (mode & RECOMPRESS_IDLE &&
-		    !zram_test_flag(zram, index, ZRAM_IDLE))
+		zram_slot_lock(zram, pps->index);
+		if (!zram_test_flag(zram, pps->index, ZRAM_PP_SLOT))
 			goto next;
 
-		if (mode & RECOMPRESS_HUGE &&
-		    !zram_test_flag(zram, index, ZRAM_HUGE))
-			goto next;
-
-		if (zram_test_flag(zram, index, ZRAM_WB) ||
-		    zram_test_flag(zram, index, ZRAM_UNDER_WB) ||
-		    zram_test_flag(zram, index, ZRAM_SAME) ||
-		    zram_test_flag(zram, index, ZRAM_INCOMPRESSIBLE))
-			goto next;
-
-		err = zram_recompress(zram, index, page, &num_recomp_pages,
-				      threshold, prio, prio_max);
+		err = recompress_slot(zram, pps->index, page,
+				      &num_recomp_pages, threshold,
+				      prio, prio_max);
 next:
-		zram_slot_unlock(zram, index);
+		zram_slot_unlock(zram, pps->index);
+		release_pp_slot(zram, pps);
+
 		if (err) {
 			ret = err;
 			break;
@@ -1963,6 +2095,7 @@ static ssize_t recompress_store(struct device *dev,
 	__free_page(page);
 
 release_init_lock:
+	release_pp_ctl(zram, ctl);
 	atomic_set(&zram->pp_in_progress, 0);
 	up_read(&zram->init_lock);
 	return ret;
-- 
2.46.0.662.g92d0881bb0-goog
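
For reference, a minimal userspace sketch (not part of the patch) of how the
recompression pass in the TEST section can be triggered: it writes a request
to the recompress sysfs attribute.  The device path /sys/block/zram0 and the
"type=idle max_pages=4096" parameter string mirror the test above, but treat
them as assumptions and consult Documentation/admin-guide/blockdev/zram.rst
for the authoritative parameter syntax.

/*
 * Request an idle-slot recompression pass limited to 4096 pages, i.e.
 * the same operation as "recompress idle max_pages=4096" in the TEST
 * section.  Sysfs path and parameter string are assumptions; adjust to
 * match your setup.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	static const char req[] = "type=idle max_pages=4096";
	int fd = open("/sys/block/zram0/recompress", O_WRONLY);

	if (fd < 0) {
		perror("open recompress");
		return 1;
	}
	if (write(fd, req, strlen(req)) != (ssize_t)strlen(req)) {
		perror("write recompress");
		close(fd);
		return 1;
	}
	close(fd);
	return 0;
}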