From nobody Fri Dec 19 20:19:34 2025
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCH 1/7] zram: switch to non-atomic entry locking
Date: Wed, 22 Jan 2025 14:57:39 +0900
Message-ID: <20250122055831.3341175-2-senozhatsky@chromium.org>
In-Reply-To: <20250122055831.3341175-1-senozhatsky@chromium.org>
References: <20250122055831.3341175-1-senozhatsky@chromium.org>

Concurrent modifications of meta table entries are currently handled by a per-entry spin-lock. This has a number of shortcomings.

First, it imposes atomicity requirements on compression backends. zram can call both zcomp_compress() and zcomp_decompress() under the entry spin-lock, which implies that we can use only compression algorithms that don't schedule/sleep/wait during compression and decompression. This, for instance, makes it impossible to use some ASYNC compression algorithm (H/W compression, etc.) implementations.

Second, it can potentially trigger watchdogs. For example, entry re-compression with secondary algorithms is performed under the entry spin-lock. Given that we chain secondary compression algorithms, and that some of them can be configured for best compression ratio (and worst compression speed), zram can stay under the spin-lock for quite some time.

Do not use per-entry spin-locks and instead switch to a bucket rwsem locking scheme. Each rw-lock controls access to a number of zram entries (a bucket).
Each bucket is configured to protect 12 (for the time being) pages of the zram disk, in order to minimize the memory footprint of bucket locks: we cannot afford a mutex or rwsem per entry, which can use hundreds of megabytes on a relatively common setup, yet we still want some degree of parallelism wrt entry access. The bucket lock is taken in write-mode only to modify the zram entry; compression and zsmalloc handle allocation are performed outside of the bucket lock scope. Decompression is performed under the bucket read-lock.

At a glance there doesn't seem to be any immediate performance difference:

time make -j$(nproc) ARCH=x86 all modules  # defconfig, 24-vCPU VM

BEFORE
------
Kernel: arch/x86/boot/bzImage is ready  (#1)
1371.43user 158.84system 1:30.70elapsed 1687%CPU (0avgtext+0avgdata 825620maxresident)k
32504inputs+1768352outputs (259major+51536895minor)pagefaults 0swaps

AFTER
-----
Kernel: arch/x86/boot/bzImage is ready  (#1)
1366.79user 158.82system 1:31.17elapsed 1673%CPU (0avgtext+0avgdata 825676maxresident)k
32504inputs+1768352outputs (263major+51538123minor)pagefaults 0swaps

Signed-off-by: Sergey Senozhatsky
---
 drivers/block/zram/zram_drv.c | 151 ++++++++++++++++++++--------------
 drivers/block/zram/zram_drv.h |   8 +-
 2 files changed, 98 insertions(+), 61 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 9f5020b077c5..7eb7feba3cac 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -58,19 +58,44 @@ static void zram_free_page(struct zram *zram, size_t index);
 static int zram_read_from_zspool(struct zram *zram, struct page *page,
				 u32 index);
 
-static int zram_slot_trylock(struct zram *zram, u32 index)
+static u32 zram_bucket_idx(u32 index)
 {
-	return spin_trylock(&zram->table[index].lock);
+	return index / ZRAM_PAGES_PER_BUCKET_LOCK;
 }
 
-static void zram_slot_lock(struct zram *zram, u32 index)
+static int zram_slot_write_trylock(struct zram *zram, u32 index)
 {
-	spin_lock(&zram->table[index].lock);
+	u32 idx = zram_bucket_idx(index);
+
+	return down_write_trylock(&zram->locks[idx].lock);
+}
+
+static void zram_slot_write_lock(struct zram *zram, u32 index)
+{
+	u32 idx = zram_bucket_idx(index);
+
+	down_write(&zram->locks[idx].lock);
+}
+
+static void zram_slot_write_unlock(struct zram *zram, u32 index)
+{
+	u32 idx = zram_bucket_idx(index);
+
+	up_write(&zram->locks[idx].lock);
+}
+
+static void zram_slot_read_lock(struct zram *zram, u32 index)
+{
+	u32 idx = zram_bucket_idx(index);
+
+	down_read(&zram->locks[idx].lock);
 }
 
-static void zram_slot_unlock(struct zram *zram, u32 index)
+static void zram_slot_read_unlock(struct zram *zram, u32 index)
 {
-	spin_unlock(&zram->table[index].lock);
+	u32 idx = zram_bucket_idx(index);
+
+	up_read(&zram->locks[idx].lock);
 }
 
 static inline bool init_done(struct zram *zram)
@@ -93,7 +118,6 @@ static void zram_set_handle(struct zram *zram, u32 index, unsigned long handle)
 	zram->table[index].handle = handle;
 }
 
-/* flag operations require table entry bit_spin_lock() being held */
 static bool zram_test_flag(struct zram *zram, u32 index,
			enum zram_pageflags flag)
 {
@@ -229,9 +253,9 @@ static void release_pp_slot(struct zram *zram, struct zram_pp_slot *pps)
 {
 	list_del_init(&pps->entry);
 
-	zram_slot_lock(zram, pps->index);
+	zram_slot_write_lock(zram, pps->index);
 	zram_clear_flag(zram, pps->index, ZRAM_PP_SLOT);
-	zram_slot_unlock(zram, pps->index);
+	zram_slot_write_unlock(zram, pps->index);
 
 	kfree(pps);
 }
@@ -394,11 +418,11 @@ static void mark_idle(struct zram *zram, ktime_t cutoff)
	 *
	 * And ZRAM_WB slots simply cannot be ZRAM_IDLE.
*/ - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); if (!zram_allocated(zram, index) || zram_test_flag(zram, index, ZRAM_WB) || zram_test_flag(zram, index, ZRAM_SAME)) { - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); continue; } =20 @@ -410,7 +434,7 @@ static void mark_idle(struct zram *zram, ktime_t cutoff) zram_set_flag(zram, index, ZRAM_IDLE); else zram_clear_flag(zram, index, ZRAM_IDLE); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); } } =20 @@ -709,7 +733,7 @@ static int scan_slots_for_writeback(struct zram *zram, = u32 mode, =20 INIT_LIST_HEAD(&pps->entry); =20 - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); if (!zram_allocated(zram, index)) goto next; =20 @@ -731,7 +755,7 @@ static int scan_slots_for_writeback(struct zram *zram, = u32 mode, place_pp_slot(zram, ctl, pps); pps =3D NULL; next: - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); } =20 kfree(pps); @@ -822,7 +846,7 @@ static ssize_t writeback_store(struct device *dev, } =20 index =3D pps->index; - zram_slot_lock(zram, index); + zram_slot_read_lock(zram, index); /* * scan_slots() sets ZRAM_PP_SLOT and relases slot lock, so * slots can change in the meantime. 
If slots are accessed or @@ -833,7 +857,7 @@ static ssize_t writeback_store(struct device *dev, goto next; if (zram_read_from_zspool(zram, page, index)) goto next; - zram_slot_unlock(zram, index); + zram_slot_read_unlock(zram, index); =20 bio_init(&bio, zram->bdev, &bio_vec, 1, REQ_OP_WRITE | REQ_SYNC); @@ -860,7 +884,7 @@ static ssize_t writeback_store(struct device *dev, } =20 atomic64_inc(&zram->stats.bd_writes); - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); /* * Same as above, we release slot lock during writeback so * slot can change under us: slot_free() or slot_free() and @@ -882,7 +906,7 @@ static ssize_t writeback_store(struct device *dev, zram->bd_wb_limit -=3D 1UL << (PAGE_SHIFT - 12); spin_unlock(&zram->wb_limit_lock); next: - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); release_pp_slot(zram, pps); =20 cond_resched(); @@ -1001,7 +1025,7 @@ static ssize_t read_block_state(struct file *file, ch= ar __user *buf, for (index =3D *ppos; index < nr_pages; index++) { int copied; =20 - zram_slot_lock(zram, index); + zram_slot_read_lock(zram, index); if (!zram_allocated(zram, index)) goto next; =20 @@ -1019,13 +1043,13 @@ static ssize_t read_block_state(struct file *file, = char __user *buf, ZRAM_INCOMPRESSIBLE) ? 
'n' : '.'); =20 if (count <=3D copied) { - zram_slot_unlock(zram, index); + zram_slot_read_unlock(zram, index); break; } written +=3D copied; count -=3D copied; next: - zram_slot_unlock(zram, index); + zram_slot_read_unlock(zram, index); *ppos +=3D 1; } =20 @@ -1451,37 +1475,44 @@ static void zram_meta_free(struct zram *zram, u64 d= isksize) zs_destroy_pool(zram->mem_pool); vfree(zram->table); zram->table =3D NULL; + vfree(zram->locks); + zram->locks =3D NULL; } =20 static bool zram_meta_alloc(struct zram *zram, u64 disksize) { - size_t num_pages, index; + size_t num_ents, index; =20 - num_pages =3D disksize >> PAGE_SHIFT; - zram->table =3D vzalloc(array_size(num_pages, sizeof(*zram->table))); + num_ents =3D disksize >> PAGE_SHIFT; + zram->table =3D vzalloc(array_size(num_ents, sizeof(*zram->table))); if (!zram->table) - return false; + goto error; + + num_ents /=3D ZRAM_PAGES_PER_BUCKET_LOCK; + zram->locks =3D vzalloc(array_size(num_ents, sizeof(*zram->locks))); + if (!zram->locks) + goto error; =20 zram->mem_pool =3D zs_create_pool(zram->disk->disk_name); - if (!zram->mem_pool) { - vfree(zram->table); - zram->table =3D NULL; - return false; - } + if (!zram->mem_pool) + goto error; =20 if (!huge_class_size) huge_class_size =3D zs_huge_class_size(zram->mem_pool); =20 - for (index =3D 0; index < num_pages; index++) - spin_lock_init(&zram->table[index].lock); + for (index =3D 0; index < num_ents; index++) + init_rwsem(&zram->locks[index].lock); + return true; + +error: + vfree(zram->table); + zram->table =3D NULL; + vfree(zram->locks); + zram->locks =3D NULL; + return false; } =20 -/* - * To protect concurrent access to the same index entry, - * caller should hold this table index entry's bit_spinlock to - * indicate this index entry is accessing. 
- */ static void zram_free_page(struct zram *zram, size_t index) { unsigned long handle; @@ -1602,17 +1633,17 @@ static int zram_read_page(struct zram *zram, struct= page *page, u32 index, { int ret; =20 - zram_slot_lock(zram, index); + zram_slot_read_lock(zram, index); if (!zram_test_flag(zram, index, ZRAM_WB)) { /* Slot should be locked through out the function call */ ret =3D zram_read_from_zspool(zram, page, index); - zram_slot_unlock(zram, index); + zram_slot_read_unlock(zram, index); } else { /* * The slot should be unlocked before reading from the backing * device. */ - zram_slot_unlock(zram, index); + zram_slot_read_unlock(zram, index); =20 ret =3D read_from_bdev(zram, page, zram_get_handle(zram, index), parent); @@ -1655,10 +1686,10 @@ static int zram_bvec_read(struct zram *zram, struct= bio_vec *bvec, static int write_same_filled_page(struct zram *zram, unsigned long fill, u32 index) { - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); zram_set_flag(zram, index, ZRAM_SAME); zram_set_handle(zram, index, fill); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); =20 atomic64_inc(&zram->stats.same_pages); atomic64_inc(&zram->stats.pages_stored); @@ -1693,11 +1724,11 @@ static int write_incompressible_page(struct zram *z= ram, struct page *page, kunmap_local(src); zs_unmap_object(zram->mem_pool, handle); =20 - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); zram_set_flag(zram, index, ZRAM_HUGE); zram_set_handle(zram, index, handle); zram_set_obj_size(zram, index, PAGE_SIZE); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); =20 atomic64_add(PAGE_SIZE, &zram->stats.compr_data_size); atomic64_inc(&zram->stats.huge_pages); @@ -1718,9 +1749,9 @@ static int zram_write_page(struct zram *zram, struct = page *page, u32 index) bool same_filled; =20 /* First, free memory allocated to this slot (if any) */ - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); zram_free_page(zram, 
index); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); =20 mem =3D kmap_local_page(page); same_filled =3D page_same_filled(mem, &element); @@ -1790,10 +1821,10 @@ static int zram_write_page(struct zram *zram, struc= t page *page, u32 index) zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]); zs_unmap_object(zram->mem_pool, handle); =20 - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); zram_set_handle(zram, index, handle); zram_set_obj_size(zram, index, comp_len); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); =20 /* Update stats */ atomic64_inc(&zram->stats.pages_stored); @@ -1850,7 +1881,7 @@ static int scan_slots_for_recompress(struct zram *zra= m, u32 mode, =20 INIT_LIST_HEAD(&pps->entry); =20 - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); if (!zram_allocated(zram, index)) goto next; =20 @@ -1871,7 +1902,7 @@ static int scan_slots_for_recompress(struct zram *zra= m, u32 mode, place_pp_slot(zram, ctl, pps); pps =3D NULL; next: - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); } =20 kfree(pps); @@ -2162,7 +2193,7 @@ static ssize_t recompress_store(struct device *dev, if (!num_recomp_pages) break; =20 - zram_slot_lock(zram, pps->index); + zram_slot_write_lock(zram, pps->index); if (!zram_test_flag(zram, pps->index, ZRAM_PP_SLOT)) goto next; =20 @@ -2170,7 +2201,7 @@ static ssize_t recompress_store(struct device *dev, &num_recomp_pages, threshold, prio, prio_max); next: - zram_slot_unlock(zram, pps->index); + zram_slot_write_unlock(zram, pps->index); release_pp_slot(zram, pps); =20 if (err) { @@ -2217,9 +2248,9 @@ static void zram_bio_discard(struct zram *zram, struc= t bio *bio) } =20 while (n >=3D PAGE_SIZE) { - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); zram_free_page(zram, index); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); atomic64_inc(&zram->stats.notify_free); index++; n -=3D PAGE_SIZE; @@ -2248,9 
+2279,9 @@ static void zram_bio_read(struct zram *zram, struct b= io *bio) } flush_dcache_page(bv.bv_page); =20 - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); zram_accessed(zram, index); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); =20 bio_advance_iter_single(bio, &iter, bv.bv_len); } while (iter.bi_size); @@ -2278,9 +2309,9 @@ static void zram_bio_write(struct zram *zram, struct = bio *bio) break; } =20 - zram_slot_lock(zram, index); + zram_slot_write_lock(zram, index); zram_accessed(zram, index); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); =20 bio_advance_iter_single(bio, &iter, bv.bv_len); } while (iter.bi_size); @@ -2321,13 +2352,13 @@ static void zram_slot_free_notify(struct block_devi= ce *bdev, zram =3D bdev->bd_disk->private_data; =20 atomic64_inc(&zram->stats.notify_free); - if (!zram_slot_trylock(zram, index)) { + if (!zram_slot_write_trylock(zram, index)) { atomic64_inc(&zram->stats.miss_free); return; } =20 zram_free_page(zram, index); - zram_slot_unlock(zram, index); + zram_slot_write_unlock(zram, index); } =20 static void zram_comp_params_reset(struct zram *zram) diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h index db78d7c01b9a..b272ede404b0 100644 --- a/drivers/block/zram/zram_drv.h +++ b/drivers/block/zram/zram_drv.h @@ -28,6 +28,7 @@ #define ZRAM_SECTOR_PER_LOGICAL_BLOCK \ (1 << (ZRAM_LOGICAL_BLOCK_SHIFT - SECTOR_SHIFT)) =20 +#define ZRAM_PAGES_PER_BUCKET_LOCK 12 =20 /* * ZRAM is mainly used for memory efficiency so we want to keep memory @@ -63,13 +64,17 @@ enum zram_pageflags { /* Allocated for each disk page */ struct zram_table_entry { unsigned long handle; + /* u32 suffice for flags + u32 padding */ unsigned int flags; - spinlock_t lock; #ifdef CONFIG_ZRAM_TRACK_ENTRY_ACTIME ktime_t ac_time; #endif }; =20 +struct zram_bucket_lock { + struct rw_semaphore lock; +}; + struct zram_stats { atomic64_t compr_data_size; /* compressed size of pages 
stored */
 	atomic64_t failed_reads;	/* can happen when memory is too low */
@@ -101,6 +106,7 @@ struct zram_stats {
 
 struct zram {
 	struct zram_table_entry *table;
+	struct zram_bucket_lock *locks;
 	struct zs_pool *mem_pool;
 	struct zcomp *comps[ZRAM_MAX_COMPS];
 	struct zcomp_params params[ZRAM_MAX_COMPS];
-- 
2.48.0.rc2.279.g1de40edade-goog

From nobody Fri Dec 19 20:19:34 2025
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCH 2/7] zram: do not use per-CPU compression streams
Date: Wed, 22 Jan 2025 14:57:40 +0900
Message-ID: <20250122055831.3341175-3-senozhatsky@chromium.org>
In-Reply-To: <20250122055831.3341175-1-senozhatsky@chromium.org>
References: <20250122055831.3341175-1-senozhatsky@chromium.org>

Similarly to the per-entry spin-lock, per-CPU compression streams also have a number of shortcomings.

First, per-CPU stream access has to be done from a non-preemptible (atomic) section, which imposes the same atomicity requirements on compression backends as the entry spin-lock does and makes it impossible to use algorithms that can schedule/wait/sleep during compression and decompression.

Second, per-CPU streams noticeably increase the memory usage (actually more like wastage) of secondary compression streams. The problem is that secondary compression streams are allocated per-CPU, just like the primary streams are, yet we never use more than one secondary stream at a time, because recompression is a single-threaded action.
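The replacement the patch describes is an on-demand pool: keep a list of idle streams, allocate new ones up to a cap, and make callers sleep for an idle stream once the cap is hit. A minimal userspace sketch of such a pool, with a pthread mutex/condvar standing in for the kernel spinlock/waitqueue (all names and the cap value are illustrative, not the kernel code):

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_STREAMS 4	/* stand-in for num_online_cpus() */

struct strm {
	struct strm *next;
	void *workmem;	/* per-stream scratch buffer (2 pages in zram) */
};

static struct strm *idle_list;
static int avail_strm;
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pool_wait = PTHREAD_COND_INITIALIZER;

static struct strm *strm_get(void)
{
	struct strm *s;

	pthread_mutex_lock(&pool_lock);
	for (;;) {
		if (idle_list) {
			/* reuse an idle stream */
			s = idle_list;
			idle_list = s->next;
			pthread_mutex_unlock(&pool_lock);
			return s;
		}
		if (avail_strm < MAX_STREAMS) {
			/* under the cap: allocate a new stream on demand */
			avail_strm++;
			pthread_mutex_unlock(&pool_lock);
			s = calloc(1, sizeof(*s));
			s->workmem = malloc(2 * 4096);
			return s;
		}
		/* cap reached: sleep until someone puts a stream back */
		pthread_cond_wait(&pool_wait, &pool_lock);
	}
}

static void strm_put(struct strm *s)
{
	pthread_mutex_lock(&pool_lock);
	s->next = idle_list;
	idle_list = s;
	pthread_mutex_unlock(&pool_lock);
	pthread_cond_signal(&pool_wait);
}
```

Because strm_get() can block, callers must be allowed to sleep, which is exactly why the entry locking had to become non-atomic first.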
This means that the remaining num_online_cpu() - 1 streams are allocated for nothing, and this is per priority list (we can have several secondary compression algorithms). Depending on the algorithm this may lead to significant memory wastage; in addition, each stream also carries a workmem buffer (2 physical pages).

Instead of per-CPU streams, maintain a list of idle compression streams and allocate new streams on-demand (something that we used to do many years ago), so that zram read() and write() become non-atomic and the requirements on the compression algorithm implementation are eased. This also means that we now need only one secondary stream per priority list.

Signed-off-by: Sergey Senozhatsky
---
 drivers/block/zram/zcomp.c    | 162 +++++++++++++++++++---------------
 drivers/block/zram/zcomp.h    |  17 ++--
 drivers/block/zram/zram_drv.c |  29 +++---
 include/linux/cpuhotplug.h    |   1 -
 4 files changed, 108 insertions(+), 101 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index bb514403e305..5d8298fc2616 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -43,31 +43,40 @@ static const struct zcomp_ops *backends[] = {
 	NULL
 };
 
-static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
+static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *strm)
 {
-	comp->ops->destroy_ctx(&zstrm->ctx);
-	vfree(zstrm->buffer);
-	zstrm->buffer = NULL;
+	comp->ops->destroy_ctx(&strm->ctx);
+	vfree(strm->buffer);
+	kfree(strm);
 }
 
-static int zcomp_strm_init(struct zcomp *comp, struct zcomp_strm *zstrm)
+static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 {
+	struct zcomp_strm *strm;
 	int ret;
 
-	ret = comp->ops->create_ctx(comp->params, &zstrm->ctx);
-	if (ret)
-		return ret;
+	strm = kzalloc(sizeof(*strm), GFP_KERNEL);
+	if (!strm)
+		return NULL;
+
+	INIT_LIST_HEAD(&strm->entry);
+
+	ret = comp->ops->create_ctx(comp->params, &strm->ctx);
+	if (ret) {
+		kfree(strm);
+		return NULL;
+ } =20 /* - * allocate 2 pages. 1 for compressed data, plus 1 extra for the - * case when compressed size is larger than the original one + * allocate 2 pages. 1 for compressed data, plus 1 extra in case if + * compressed data is larger than the original one. */ - zstrm->buffer =3D vzalloc(2 * PAGE_SIZE); - if (!zstrm->buffer) { - zcomp_strm_free(comp, zstrm); - return -ENOMEM; + strm->buffer =3D vzalloc(2 * PAGE_SIZE); + if (!strm->buffer) { + zcomp_strm_free(comp, strm); + return NULL; } - return 0; + return strm; } =20 static const struct zcomp_ops *lookup_backend_ops(const char *comp) @@ -109,13 +118,59 @@ ssize_t zcomp_available_show(const char *comp, char *= buf) =20 struct zcomp_strm *zcomp_stream_get(struct zcomp *comp) { - local_lock(&comp->stream->lock); - return this_cpu_ptr(comp->stream); + struct zcomp_strm *strm; + + might_sleep(); + + while (1) { + spin_lock(&comp->strm_lock); + if (!list_empty(&comp->idle_strm)) { + strm =3D list_first_entry(&comp->idle_strm, + struct zcomp_strm, + entry); + list_del(&strm->entry); + spin_unlock(&comp->strm_lock); + return strm; + } + + /* cannot allocate new stream, wait for an idle one */ + if (comp->avail_strm >=3D num_online_cpus()) { + spin_unlock(&comp->strm_lock); + wait_event(comp->strm_wait, + !list_empty(&comp->idle_strm)); + continue; + } + + /* allocate new stream */ + comp->avail_strm++; + spin_unlock(&comp->strm_lock); + + strm =3D zcomp_strm_alloc(comp); + if (strm) + break; + + spin_lock(&comp->strm_lock); + comp->avail_strm--; + spin_unlock(&comp->strm_lock); + wait_event(comp->strm_wait, !list_empty(&comp->idle_strm)); + } + + return strm; } =20 -void zcomp_stream_put(struct zcomp *comp) +void zcomp_stream_put(struct zcomp *comp, struct zcomp_strm *strm) { - local_unlock(&comp->stream->lock); + spin_lock(&comp->strm_lock); + if (comp->avail_strm <=3D num_online_cpus()) { + list_add(&strm->entry, &comp->idle_strm); + spin_unlock(&comp->strm_lock); + wake_up(&comp->strm_wait); + return; + } + + 
	comp->avail_strm--;
+	spin_unlock(&comp->strm_lock);
+	zcomp_strm_free(comp, strm);
 }
 
 int zcomp_compress(struct zcomp *comp, struct zcomp_strm *zstrm,
@@ -148,61 +203,19 @@ int zcomp_decompress(struct zcomp *comp, struct zcomp_strm *zstrm,
 	return comp->ops->decompress(comp->params, &zstrm->ctx, &req);
 }
 
-int zcomp_cpu_up_prepare(unsigned int cpu, struct hlist_node *node)
-{
-	struct zcomp *comp = hlist_entry(node, struct zcomp, node);
-	struct zcomp_strm *zstrm;
-	int ret;
-
-	zstrm = per_cpu_ptr(comp->stream, cpu);
-	local_lock_init(&zstrm->lock);
-
-	ret = zcomp_strm_init(comp, zstrm);
-	if (ret)
-		pr_err("Can't allocate a compression stream\n");
-	return ret;
-}
-
-int zcomp_cpu_dead(unsigned int cpu, struct hlist_node *node)
-{
-	struct zcomp *comp = hlist_entry(node, struct zcomp, node);
-	struct zcomp_strm *zstrm;
-
-	zstrm = per_cpu_ptr(comp->stream, cpu);
-	zcomp_strm_free(comp, zstrm);
-	return 0;
-}
-
-static int zcomp_init(struct zcomp *comp, struct zcomp_params *params)
-{
-	int ret;
-
-	comp->stream = alloc_percpu(struct zcomp_strm);
-	if (!comp->stream)
-		return -ENOMEM;
-
-	comp->params = params;
-	ret = comp->ops->setup_params(comp->params);
-	if (ret)
-		goto cleanup;
-
-	ret = cpuhp_state_add_instance(CPUHP_ZCOMP_PREPARE, &comp->node);
-	if (ret < 0)
-		goto cleanup;
-
-	return 0;
-
-cleanup:
-	comp->ops->release_params(comp->params);
-	free_percpu(comp->stream);
-	return ret;
-}
-
 void zcomp_destroy(struct zcomp *comp)
 {
-	cpuhp_state_remove_instance(CPUHP_ZCOMP_PREPARE, &comp->node);
+	struct zcomp_strm *strm;
+
+	while (!list_empty(&comp->idle_strm)) {
+		strm = list_first_entry(&comp->idle_strm,
+					struct zcomp_strm,
+					entry);
+		list_del(&strm->entry);
+		zcomp_strm_free(comp, strm);
+	}
+
 	comp->ops->release_params(comp->params);
-	free_percpu(comp->stream);
 	kfree(comp);
 }
 
@@ -229,7 +242,12 @@ struct zcomp *zcomp_create(const char *alg, struct zcomp_params *params)
 		return ERR_PTR(-EINVAL);
 	}
 
-	error = zcomp_init(comp, params);
+	INIT_LIST_HEAD(&comp->idle_strm);
+	init_waitqueue_head(&comp->strm_wait);
+	spin_lock_init(&comp->strm_lock);
+
+	comp->params = params;
+	error = comp->ops->setup_params(comp->params);
 	if (error) {
 		kfree(comp);
 		return ERR_PTR(error);
diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
index ad5762813842..62330829db3f 100644
--- a/drivers/block/zram/zcomp.h
+++ b/drivers/block/zram/zcomp.h
@@ -3,10 +3,10 @@
 #ifndef _ZCOMP_H_
 #define _ZCOMP_H_
 
-#include
-
 #define ZCOMP_PARAM_NO_LEVEL	INT_MIN
 
+#include
+
 /*
  * Immutable driver (backend) parameters. The driver may attach private
  * data to it (e.g. driver representation of the dictionary, etc.).
@@ -31,7 +31,7 @@ struct zcomp_ctx {
 };
 
 struct zcomp_strm {
-	local_lock_t lock;
+	struct list_head entry;
 	/* compression buffer */
 	void *buffer;
 	struct zcomp_ctx ctx;
@@ -60,16 +60,15 @@ struct zcomp_ops {
 	const char *name;
 };
 
-/* dynamic per-device compression frontend */
 struct zcomp {
-	struct zcomp_strm __percpu *stream;
+	struct list_head idle_strm;
+	spinlock_t strm_lock;
+	u32 avail_strm;
+	wait_queue_head_t strm_wait;
 	const struct zcomp_ops *ops;
 	struct zcomp_params *params;
-	struct hlist_node node;
 };
 
-int zcomp_cpu_up_prepare(unsigned int cpu, struct hlist_node *node);
-int zcomp_cpu_dead(unsigned int cpu, struct hlist_node *node);
 ssize_t zcomp_available_show(const char *comp, char *buf);
 bool zcomp_available_algorithm(const char *comp);
 
@@ -77,7 +76,7 @@ struct zcomp *zcomp_create(const char *alg, struct zcomp_params *params);
 void zcomp_destroy(struct zcomp *comp);
 
 struct zcomp_strm *zcomp_stream_get(struct zcomp *comp);
-void zcomp_stream_put(struct zcomp *comp);
+void zcomp_stream_put(struct zcomp *comp, struct zcomp_strm *strm);
 
 int zcomp_compress(struct zcomp *comp, struct zcomp_strm *zstrm,
 		   const void *src, unsigned int *dst_len);
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 7eb7feba3cac..b217c29448ce 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -31,7 +31,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 
@@ -1606,7 +1605,7 @@ static int read_compressed_page(struct zram *zram, struct page *page, u32 index)
 	ret = zcomp_decompress(zram->comps[prio], zstrm, src, size, dst);
 	kunmap_local(dst);
 	zs_unmap_object(zram->mem_pool, handle);
-	zcomp_stream_put(zram->comps[prio]);
+	zcomp_stream_put(zram->comps[prio], zstrm);
 
 	return ret;
 }
@@ -1767,14 +1766,14 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 	kunmap_local(mem);
 
 	if (unlikely(ret)) {
-		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
+		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP], zstrm);
 		pr_err("Compression failed! err=%d\n", ret);
 		zs_free(zram->mem_pool, handle);
 		return ret;
 	}
 
 	if (comp_len >= huge_class_size) {
-		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
+		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP], zstrm);
 		return write_incompressible_page(zram, page, index);
 	}
 
@@ -1798,7 +1797,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 			   __GFP_HIGHMEM |
 			   __GFP_MOVABLE);
 	if (IS_ERR_VALUE(handle)) {
-		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
+		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP], zstrm);
 		atomic64_inc(&zram->stats.writestall);
 		handle = zs_malloc(zram->mem_pool, comp_len,
 				   GFP_NOIO | __GFP_HIGHMEM |
@@ -1810,7 +1809,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 	}
 
 	if (!zram_can_store_page(zram)) {
-		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
+		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP], zstrm);
 		zs_free(zram->mem_pool, handle);
 		return -ENOMEM;
 	}
@@ -1818,7 +1817,7 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 	dst = zs_map_object(zram->mem_pool, handle, ZS_MM_WO);
 
 	memcpy(dst, zstrm->buffer, comp_len);
-	zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP]);
+	zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP], zstrm);
 	zs_unmap_object(zram->mem_pool, handle);
 
 	zram_slot_write_lock(zram, index);
@@ -1977,7 +1976,7 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 	kunmap_local(src);
 
 	if (ret) {
-		zcomp_stream_put(zram->comps[prio]);
+		zcomp_stream_put(zram->comps[prio], zstrm);
 		return ret;
 	}
 
@@ -1987,7 +1986,7 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 	/* Continue until we make progress */
 	if (class_index_new >= class_index_old ||
 	    (threshold && comp_len_new >= threshold)) {
-		zcomp_stream_put(zram->comps[prio]);
+		zcomp_stream_put(zram->comps[prio], zstrm);
 		continue;
 	}
 
@@ -2045,13 +2044,13 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 			       __GFP_HIGHMEM |
 			       __GFP_MOVABLE);
 	if (IS_ERR_VALUE(handle_new)) {
-		zcomp_stream_put(zram->comps[prio]);
+		zcomp_stream_put(zram->comps[prio], zstrm);
 		return PTR_ERR((void *)handle_new);
 	}
 
 	dst = zs_map_object(zram->mem_pool, handle_new, ZS_MM_WO);
 	memcpy(dst, zstrm->buffer, comp_len_new);
-	zcomp_stream_put(zram->comps[prio]);
+	zcomp_stream_put(zram->comps[prio], zstrm);
 
 	zs_unmap_object(zram->mem_pool, handle_new);
 
@@ -2799,7 +2798,6 @@ static void destroy_devices(void)
 	zram_debugfs_destroy();
 	idr_destroy(&zram_index_idr);
 	unregister_blkdev(zram_major, "zram");
-	cpuhp_remove_multi_state(CPUHP_ZCOMP_PREPARE);
 }
 
 static int __init zram_init(void)
@@ -2809,15 +2807,9 @@ static int __init zram_init(void)
 
 	BUILD_BUG_ON(__NR_ZRAM_PAGEFLAGS > sizeof(zram_te.flags) * 8);
 
-	ret = cpuhp_setup_state_multi(CPUHP_ZCOMP_PREPARE, "block/zram:prepare",
-				      zcomp_cpu_up_prepare, zcomp_cpu_dead);
-	if (ret < 0)
-		return ret;
-
 	ret = class_register(&zram_control_class);
 	if (ret) {
 		pr_err("Unable to register zram-control class\n");
-		cpuhp_remove_multi_state(CPUHP_ZCOMP_PREPARE);
 		return ret;
 	}
 
@@ -2826,7 +2818,6 @@ static int __init zram_init(void)
 	if (zram_major <= 0) {
 		pr_err("Unable to get major number\n");
 		class_unregister(&zram_control_class);
-		cpuhp_remove_multi_state(CPUHP_ZCOMP_PREPARE);
 		return -EBUSY;
 	}
 
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 6cc5e484547c..092ace7db8ee 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -119,7 +119,6 @@ enum cpuhp_state {
 	CPUHP_MM_ZS_PREPARE,
 	CPUHP_MM_ZSWP_POOL_PREPARE,
 	CPUHP_KVM_PPC_BOOK3S_PREPARE,
-	CPUHP_ZCOMP_PREPARE,
 	CPUHP_TIMERS_PREPARE,
 	CPUHP_TMIGR_PREPARE,
 	CPUHP_MIPS_SOC_PREPARE,
-- 
2.48.0.rc2.279.g1de40edade-goog
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCH 3/7] zram: remove two-staged handle allocation
Date: Wed, 22 Jan 2025 14:57:41 +0900
Message-ID: <20250122055831.3341175-4-senozhatsky@chromium.org>

Previously, zram write() was atomic, which required us to pass
__GFP_KSWAPD_RECLAIM to the zsmalloc handle allocation on the fast path
and to attempt a slow-path allocation (with recompression) when the
fast path failed. Since write() is not atomic anymore, we can permit
direct reclaim during handle allocation, remove the fast allocation
path and also drop the recompression path (which should reduce
CPU/battery usage).
Signed-off-by: Sergey Senozhatsky
---
 drivers/block/zram/zram_drv.c | 39 ++++++-----------------------------
 1 file changed, 6 insertions(+), 33 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index b217c29448ce..8029e0fe864a 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1740,11 +1740,11 @@ static int write_incompressible_page(struct zram *zram, struct page *page,
 static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 {
 	int ret = 0;
-	unsigned long handle = -ENOMEM;
-	unsigned int comp_len = 0;
+	unsigned long handle;
+	unsigned int comp_len;
 	void *dst, *mem;
 	struct zcomp_strm *zstrm;
-	unsigned long element = 0;
+	unsigned long element;
 	bool same_filled;
 
 	/* First, free memory allocated to this slot (if any) */
@@ -1758,7 +1758,6 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 	if (same_filled)
 		return write_same_filled_page(zram, element, index);
 
-compress_again:
 	zstrm = zcomp_stream_get(zram->comps[ZRAM_PRIMARY_COMP]);
 	mem = kmap_local_page(page);
 	ret = zcomp_compress(zram->comps[ZRAM_PRIMARY_COMP], zstrm,
@@ -1777,36 +1776,10 @@ static int zram_write_page(struct zram *zram, struct page *page, u32 index)
 		return write_incompressible_page(zram, page, index);
 	}
 
-	/*
-	 * handle allocation has 2 paths:
-	 * a) fast path is executed with preemption disabled (for
-	 * per-cpu streams) and has __GFP_DIRECT_RECLAIM bit clear,
-	 * since we can't sleep;
-	 * b) slow path enables preemption and attempts to allocate
-	 * the page with __GFP_DIRECT_RECLAIM bit set. we have to
-	 * put per-cpu compression stream and, thus, to re-do
-	 * the compression once handle is allocated.
-	 *
-	 * if we have a 'non-null' handle here then we are coming
-	 * from the slow path and handle has already been allocated.
-	 */
+	handle = zs_malloc(zram->mem_pool, comp_len,
+			   GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
 	if (IS_ERR_VALUE(handle))
-		handle = zs_malloc(zram->mem_pool, comp_len,
-				   __GFP_KSWAPD_RECLAIM |
-				   __GFP_NOWARN |
-				   __GFP_HIGHMEM |
-				   __GFP_MOVABLE);
-	if (IS_ERR_VALUE(handle)) {
-		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP], zstrm);
-		atomic64_inc(&zram->stats.writestall);
-		handle = zs_malloc(zram->mem_pool, comp_len,
-				   GFP_NOIO | __GFP_HIGHMEM |
-				   __GFP_MOVABLE);
-		if (IS_ERR_VALUE(handle))
-			return PTR_ERR((void *)handle);
-
-		goto compress_again;
-	}
+		return PTR_ERR((void *)handle);
 
 	if (!zram_can_store_page(zram)) {
 		zcomp_stream_put(zram->comps[ZRAM_PRIMARY_COMP], zstrm);
-- 
2.48.0.rc2.279.g1de40edade-goog
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCH 4/7] zram: permit reclaim in zstd custom allocator
Date: Wed, 22 Jan 2025 14:57:42 +0900
Message-ID: <20250122055831.3341175-5-senozhatsky@chromium.org>

When configured with pre-trained compression/decompression dictionary
support, zstd requires a custom memory allocator, which it calls
internally from its compression()/decompression() routines. This was a
tad problematic, because it meant allocating from atomic context
(either under an entry spin-lock, a per-CPU local-lock, or both).
Now, with non-atomic zram write(), those limitations are relaxed and we
can allow direct and indirect reclaim during allocations.

The tricky part is the zram read() path, which is still atomic in one
particular case (read_compressed_page()), due to zsmalloc's handling of
object mapping. However, in zram one has to write() something before it
can be read(), and write() is when zstd allocates its required internal
state memory; the write() path is non-atomic. Because of this
write()-time allocation, zstd should, in theory, never call its
allocator from the atomic read() path. Keep the non-preemptible branch
just in case zstd does allocate memory from read(), but WARN_ON_ONCE()
if that happens.

Signed-off-by: Sergey Senozhatsky
---
 drivers/block/zram/backend_zstd.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/block/zram/backend_zstd.c b/drivers/block/zram/backend_zstd.c
index 1184c0036f44..53431251ea62 100644
--- a/drivers/block/zram/backend_zstd.c
+++ b/drivers/block/zram/backend_zstd.c
@@ -24,19 +24,14 @@ struct zstd_params {
 /*
  * For C/D dictionaries we need to provide zstd with zstd_custom_mem,
  * which zstd uses internally to allocate/free memory when needed.
- *
- * This means that allocator.customAlloc() can be called from zcomp_compress()
- * under local-lock (per-CPU compression stream), in which case we must use
- * GFP_ATOMIC.
- *
- * Another complication here is that we can be configured as a swap device.
  */
 static void *zstd_custom_alloc(void *opaque, size_t size)
 {
-	if (!preemptible())
+	/* Technically this should not happen */
+	if (WARN_ON_ONCE(!preemptible()))
 		return kvzalloc(size, GFP_ATOMIC);
 
-	return kvzalloc(size, __GFP_KSWAPD_RECLAIM | __GFP_NOWARN);
+	return kvzalloc(size, GFP_NOIO | __GFP_NOWARN);
 }
 
 static void zstd_custom_free(void *opaque, void *address)
-- 
2.48.0.rc2.279.g1de40edade-goog
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCH 5/7] zram: permit reclaim in recompression handle allocation
Date: Wed, 22 Jan 2025 14:57:43 +0900
Message-ID: <20250122055831.3341175-6-senozhatsky@chromium.org>

The recompression path can now permit direct reclaim during new
zs_handle allocation, because it is no longer atomic.
Signed-off-by: Sergey Senozhatsky
---
 drivers/block/zram/zram_drv.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 8029e0fe864a..faccf9923391 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -2005,17 +2005,11 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 		return 0;
 
 	/*
-	 * No direct reclaim (slow path) for handle allocation and no
-	 * re-compression attempt (unlike in zram_write_bvec()) since
-	 * we already have stored that object in zsmalloc. If we cannot
-	 * alloc memory for recompressed object then we bail out and
-	 * simply keep the old (existing) object in zsmalloc.
+	 * If we cannot alloc memory for recompressed object then we bail out
+	 * and simply keep the old (existing) object in zsmalloc.
	 */
 	handle_new = zs_malloc(zram->mem_pool, comp_len_new,
-			       __GFP_KSWAPD_RECLAIM |
-			       __GFP_NOWARN |
-			       __GFP_HIGHMEM |
-			       __GFP_MOVABLE);
+			       GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
 	if (IS_ERR_VALUE(handle_new)) {
 		zcomp_stream_put(zram->comps[prio], zstrm);
 		return PTR_ERR((void *)handle_new);
-- 
2.48.0.rc2.279.g1de40edade-goog
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: [PATCH 6/7] zram: remove writestall zram_stats member
Date: Wed, 22 Jan 2025 14:57:44 +0900
Message-ID: <20250122055831.3341175-7-senozhatsky@chromium.org>
There is no zsmalloc handle allocation slow path now, so a writestall
is no longer possible. Remove it from zram_stats.

Signed-off-by: Sergey Senozhatsky
---
 drivers/block/zram/zram_drv.c | 3 +--
 drivers/block/zram/zram_drv.h | 1 -
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index faccf9923391..d516f968321e 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1443,9 +1443,8 @@ static ssize_t debug_stat_show(struct device *dev,
 
 	down_read(&zram->init_lock);
 	ret = scnprintf(buf, PAGE_SIZE,
-			"version: %d\n%8llu %8llu\n",
+			"version: %d\n0 %8llu\n",
 			version,
-			(u64)atomic64_read(&zram->stats.writestall),
 			(u64)atomic64_read(&zram->stats.miss_free));
 	up_read(&zram->init_lock);
 
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index b272ede404b0..4f707dabed12 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -85,7 +85,6 @@ struct zram_stats {
 	atomic64_t huge_pages_since;	/* no. of huge pages since zram set up */
 	atomic64_t pages_stored;	/* no. of pages currently stored */
 	atomic_long_t max_used_pages;	/* no. of maximum pages stored */
-	atomic64_t writestall;	/* no. of write slow paths */
 	atomic64_t miss_free;	/* no. of missed free */
 #ifdef CONFIG_ZRAM_WRITEBACK
 	atomic64_t bd_count;	/* no. of pages in backing device */
-- 
2.48.0.rc2.279.g1de40edade-goog
t=1737525578; x=1738130378; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=URQGVN2RLRBAm/FHMFqf5kTEz8apzib8vA7ECF9oUaE=; b=PRQB/RDnnbHgaFCAwWAg/8afO9FYX4tBYi3DkE0mI+BUsQO7nOBpEI8rIXTpoLkMM+ 4SoHUWC27XVX6LAE+mVHAot2YXTmbtHUguucyNa72M0ugUywID9DLXa19vru/P5tA2SZ 5B3eq/uipfpDU4t6OsaGfJhXEbGrhG/ptDIpc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1737525578; x=1738130378; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=URQGVN2RLRBAm/FHMFqf5kTEz8apzib8vA7ECF9oUaE=; b=cScwkc+FCuDo0sfOQdtbJVgGE72uTZxZ8MzgcTPyiNnWACs18GCuO9QhJFiJ5TAkG6 XWLhpTzE6wUyCqyYN8qLrh2JsDYVkveeXNsP2a/aBJxEdxhU8g1cHBrs9IXrefua6FB2 VxGjC/9Y/6ArwTDDjXA/+/rhSqqTR3OBzWQ506PEtokDX3HvjKVqNVY5D3AFx4uUNKZm dtk84HihnTTMkM+uapWJPbzZ1VtqNtrECWQgGcNtcou4Vyczwmw6QgGaIPGYjgPBxSyW Hbr6hHrlBtgTL5qyIvftTVaXbsZQvCihcO2K7OV1SzzY5dVJeIRTrY98/HAFkk3kMqK7 3OZw== X-Forwarded-Encrypted: i=1; AJvYcCXIPBVkaAWVb8wCris5nOj1R4vjK3P4eeb/0cnp6O7h/wfy0RwGBOip4JemtdxJENX7t2qkBpPZCyV+foQ=@vger.kernel.org X-Gm-Message-State: AOJu0Yyho545g/xY1otBKlUpHRPPx0SZBy8XtnKDGUj2FI0JqcIHxDLJ UW6R4wXOHF9bA4OahohhUpj1cIgF4y5AtGrr8NlZ1dTrZ1Vhal7bjcRvj0sNPA== X-Gm-Gg: ASbGncu4L1jsuQb6UrLwOJLojfzT7ZB8nJp2Qf0A8MQkNrkxVzX6U2q1jeWhmLWJZ55 aA4msRkXFHjqn5CdndP3nEdQIC8nD3+275aNNE/IHb+oIjTm4b7CjKTNspBI+1F5VDT48MvtvWw zyv9/QTaze2dmTG04g4utlv66HQ7orncNOgiGfeoH6mkWXFxkGu0M1J2rvsAQgl5I/4tRXlypkx GdGOvIkcY3F0wcY+oUaVoCBjdOoRNkY+NenzwRABCzBPEOQFkgsAC/05fhLqEVDO8YuLUpu X-Google-Smtp-Source: AGHT+IHfDORXl8892ZPFv7Yg9s3aa+pqcjz8wYjoVNgZaPfPzukdMK7LQbRYaeoP/wmNeG24p8MYWA== X-Received: by 2002:a17:902:d2c5:b0:216:779a:d5f3 with SMTP id d9443c01a7336-21c353edbd7mr334668355ad.14.1737525578444; Tue, 21 Jan 2025 21:59:38 -0800 (PST) Received: from localhost 
([2401:fa00:8f:203:2902:8f0f:12b3:c251]) by smtp.gmail.com with UTF8SMTPSA id d9443c01a7336-21c2d3ac3e2sm86558565ad.139.2025.01.21.21.59.36 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 21 Jan 2025 21:59:38 -0800 (PST) From: Sergey Senozhatsky To: Andrew Morton , Minchan Kim Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Sergey Senozhatsky Subject: [PATCH 7/7] zram: unlock slot bucket during recompression Date: Wed, 22 Jan 2025 14:57:45 +0900 Message-ID: <20250122055831.3341175-8-senozhatsky@chromium.org> X-Mailer: git-send-email 2.48.0.rc2.279.g1de40edade-goog In-Reply-To: <20250122055831.3341175-1-senozhatsky@chromium.org> References: <20250122055831.3341175-1-senozhatsky@chromium.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" As of now recompress_slot() is called under slot bucket write-lock, which is suboptimal as it blocks access to a huge number of entries. The good news is that recompression, like writeback, makes a local copy of slot data (we need to decompress it anyway) before post-processing so we can unlock slot bucket once we have that local copy. Unlock the bucket write-lock before recompression loop (secondary algorithms can be tried out one by one, in order of priority) and re-acquire it right after the loop. There is one more potentially costly operation recompress_slot() does - new zs_handle allocation, which can schedule(). Release the bucket write-lock before zsmalloc allocation and grab it again after the allocation. In both cases, once the bucket lock is re-acquired we examine slot's ZRAM_PP_SLOT flag to make sure that the slot has not been modified by a concurrent operation. 
Signed-off-by: Sergey Senozhatsky
---
 drivers/block/zram/zram_drv.c | 53 +++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 15 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d516f968321e..0413438e4500 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1925,6 +1925,14 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 		zram_clear_flag(zram, index, ZRAM_IDLE);

 	class_index_old = zs_lookup_class_index(zram->mem_pool, comp_len_old);
+
+	/*
+	 * Set prio to one past current slot's compression prio, so that
+	 * we automatically skip lower priority algorithms.
+	 */
+	prio = zram_get_priority(zram, index) + 1;
+	/* Slot data copied out - unlock its bucket */
+	zram_slot_write_unlock(zram, index);
 	/*
 	 * Iterate the secondary comp algorithms list (in order of priority)
 	 * and try to recompress the page.
@@ -1933,13 +1941,6 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 		if (!zram->comps[prio])
 			continue;

-		/*
-		 * Skip if the object is already re-compressed with a higher
-		 * priority algorithm (or same algorithm).
-		 */
-		if (prio <= zram_get_priority(zram, index))
-			continue;
-
 		num_recomps++;
 		zstrm = zcomp_stream_get(zram->comps[prio]);
 		src = kmap_local_page(page);
@@ -1947,10 +1948,8 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 				   src, &comp_len_new);
 		kunmap_local(src);

-		if (ret) {
-			zcomp_stream_put(zram->comps[prio], zstrm);
-			return ret;
-		}
+		if (ret)
+			break;

 		class_index_new = zs_lookup_class_index(zram->mem_pool,
							comp_len_new);
@@ -1966,6 +1965,19 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 		break;
 	}

+	zram_slot_write_lock(zram, index);
+	/* Compression error */
+	if (ret) {
+		zcomp_stream_put(zram->comps[prio], zstrm);
+		return ret;
+	}
+
+	/* Slot has been modified concurrently */
+	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT)) {
+		zcomp_stream_put(zram->comps[prio], zstrm);
+		return 0;
+	}
+
 	/*
 	 * We did not try to recompress, e.g. when we have only one
 	 * secondary algorithm and the page is already recompressed
@@ -2003,17 +2015,28 @@ static int recompress_slot(struct zram *zram, u32 index, struct page *page,
 	if (threshold && comp_len_new >= threshold)
 		return 0;

-	/*
-	 * If we cannot alloc memory for recompressed object then we bail out
-	 * and simply keep the old (existing) object in zsmalloc.
-	 */
+	/* zsmalloc handle allocation can schedule, unlock slot's bucket */
+	zram_slot_write_unlock(zram, index);
 	handle_new = zs_malloc(zram->mem_pool, comp_len_new,
 			       GFP_NOIO | __GFP_HIGHMEM | __GFP_MOVABLE);
+	zram_slot_write_lock(zram, index);
+
+	/*
+	 * If we couldn't allocate memory for recompressed object then bail
+	 * out and simply keep the old (existing) object in mempool.
+	 */
 	if (IS_ERR_VALUE(handle_new)) {
 		zcomp_stream_put(zram->comps[prio], zstrm);
 		return PTR_ERR((void *)handle_new);
 	}

+	/* Slot has been modified concurrently */
+	if (!zram_test_flag(zram, index, ZRAM_PP_SLOT)) {
+		zcomp_stream_put(zram->comps[prio], zstrm);
+		zs_free(zram->mem_pool, handle_new);
+		return 0;
+	}
+
 	dst = zs_map_object(zram->mem_pool, handle_new, ZS_MM_WO);
 	memcpy(dst, zstrm->buffer, comp_len_new);
 	zcomp_stream_put(zram->comps[prio], zstrm);
-- 
2.48.0.rc2.279.g1de40edade-goog