From nobody Fri Oct 10 09:18:07 2025
From: Kuan-Wei Chiu
To: akpm@linux-foundation.org, colyli@kernel.org, kent.overstreet@linux.dev, robertpang@google.com
Cc: linux-kernel@vger.kernel.org, linux-bcache@vger.kernel.org, jserv@ccns.ncku.edu.tw, Kuan-Wei Chiu, stable@vger.kernel.org
Subject: [PATCH v2 1/3] Revert "bcache: update min_heap_callbacks to use default builtin swap"
Date: Sun, 15 Jun 2025 04:23:51 +0800
Message-Id: <20250614202353.1632957-2-visitorckw@gmail.com>
In-Reply-To: <20250614202353.1632957-1-visitorckw@gmail.com>
References: <20250614202353.1632957-1-visitorckw@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This reverts commit 3d8a9a1c35227c3f1b0bd132c9f0a80dbda07b65.

Although removing the custom swap function simplified the code, this
change is part of a broader migration to the generic min_heap API that
introduced significant performance regressions in bcache. As reported
by Robert, bcache now suffers from latency spikes, with P100 (max)
latency increasing from 600 ms to 2.4 seconds every 5 minutes. These
regressions degrade bcache's effectiveness as a low-latency cache
layer and lead to frequent timeouts and application stalls in
production environments.
This revert is part of a series of changes to restore previous
performance by undoing the min_heap transition.

Link: https://lore.kernel.org/lkml/CAJhEC05+0S69z+3+FB2Cd0hD+pCRyWTKLEOsc8BOmH73p1m+KQ@mail.gmail.com
Fixes: 866898efbb25 ("bcache: remove heap-related macros and switch to generic min_heap")
Fixes: 92a8b224b833 ("lib/min_heap: introduce non-inline versions of min heap API functions")
Reported-by: Robert Pang
Closes: https://lore.kernel.org/linux-bcache/CAJhEC06F_AtrPgw2-7CvCqZgeStgCtitbD-ryuPpXQA-JG5XXw@mail.gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Kuan-Wei Chiu
Acked-by: Coly Li
---
 drivers/md/bcache/alloc.c    | 11 +++++++++--
 drivers/md/bcache/bset.c     | 14 +++++++++++---
 drivers/md/bcache/extents.c  | 10 +++++++++-
 drivers/md/bcache/movinggc.c | 10 +++++++++-
 4 files changed, 38 insertions(+), 7 deletions(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index 8998e61efa40..da50f6661bae 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -189,16 +189,23 @@ static inline bool new_bucket_min_cmp(const void *l, const void *r, void *args)
 	return new_bucket_prio(ca, *lhs) < new_bucket_prio(ca, *rhs);
 }
 
+static inline void new_bucket_swap(void *l, void *r, void __always_unused *args)
+{
+	struct bucket **lhs = l, **rhs = r;
+
+	swap(*lhs, *rhs);
+}
+
 static void invalidate_buckets_lru(struct cache *ca)
 {
 	struct bucket *b;
 	const struct min_heap_callbacks bucket_max_cmp_callback = {
 		.less = new_bucket_max_cmp,
-		.swp = NULL,
+		.swp = new_bucket_swap,
 	};
 	const struct min_heap_callbacks bucket_min_cmp_callback = {
 		.less = new_bucket_min_cmp,
-		.swp = NULL,
+		.swp = new_bucket_swap,
 	};
 
 	ca->heap.nr = 0;
diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c
index 68258a16e125..bd97d8626887 100644
--- a/drivers/md/bcache/bset.c
+++ b/drivers/md/bcache/bset.c
@@ -1093,6 +1093,14 @@ static inline bool new_btree_iter_cmp(const void *l, const void *r, void __always_unused *args)
 	return bkey_cmp(_l->k, _r->k) <= 0;
 }
 
+static inline void new_btree_iter_swap(void *iter1, void *iter2, void __always_unused *args)
+{
+	struct btree_iter_set *_iter1 = iter1;
+	struct btree_iter_set *_iter2 = iter2;
+
+	swap(*_iter1, *_iter2);
+}
+
 static inline bool btree_iter_end(struct btree_iter *iter)
 {
 	return !iter->heap.nr;
@@ -1103,7 +1111,7 @@ void bch_btree_iter_push(struct btree_iter *iter, struct bkey *k,
 {
 	const struct min_heap_callbacks callbacks = {
 		.less = new_btree_iter_cmp,
-		.swp = NULL,
+		.swp = new_btree_iter_swap,
 	};
 
 	if (k != end)
@@ -1149,7 +1157,7 @@ static inline struct bkey *__bch_btree_iter_next(struct btree_iter *iter,
 	struct bkey *ret = NULL;
 	const struct min_heap_callbacks callbacks = {
 		.less = cmp,
-		.swp = NULL,
+		.swp = new_btree_iter_swap,
 	};
 
 	if (!btree_iter_end(iter)) {
@@ -1223,7 +1231,7 @@ static void btree_mergesort(struct btree_keys *b, struct bset *out,
 		: bch_ptr_invalid;
 	const struct min_heap_callbacks callbacks = {
 		.less = b->ops->sort_cmp,
-		.swp = NULL,
+		.swp = new_btree_iter_swap,
 	};
 
 	/* Heapify the iterator, using our comparison function */
diff --git a/drivers/md/bcache/extents.c b/drivers/md/bcache/extents.c
index 4b84fda1530a..a7221e5dbe81 100644
--- a/drivers/md/bcache/extents.c
+++ b/drivers/md/bcache/extents.c
@@ -266,12 +266,20 @@ static bool new_bch_extent_sort_cmp(const void *l, const void *r, void __always_unused *args)
 	return !(c ? c > 0 : _l->k < _r->k);
 }
 
+static inline void new_btree_iter_swap(void *iter1, void *iter2, void __always_unused *args)
+{
+	struct btree_iter_set *_iter1 = iter1;
+	struct btree_iter_set *_iter2 = iter2;
+
+	swap(*_iter1, *_iter2);
+}
+
 static struct bkey *bch_extent_sort_fixup(struct btree_iter *iter,
 					  struct bkey *tmp)
 {
 	const struct min_heap_callbacks callbacks = {
 		.less = new_bch_extent_sort_cmp,
-		.swp = NULL,
+		.swp = new_btree_iter_swap,
 	};
 	while (iter->heap.nr > 1) {
 		struct btree_iter_set *top = iter->heap.data, *i = top + 1;
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index 45ca134cbf02..d6c73dd8eb2b 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -190,6 +190,14 @@ static bool new_bucket_cmp(const void *l, const void *r, void __always_unused *args)
 	return GC_SECTORS_USED(*_l) >= GC_SECTORS_USED(*_r);
 }
 
+static void new_bucket_swap(void *l, void *r, void __always_unused *args)
+{
+	struct bucket **_l = l;
+	struct bucket **_r = r;
+
+	swap(*_l, *_r);
+}
+
 static unsigned int bucket_heap_top(struct cache *ca)
 {
 	struct bucket *b;
@@ -204,7 +212,7 @@ void bch_moving_gc(struct cache_set *c)
 	unsigned long sectors_to_move, reserve_sectors;
 	const struct min_heap_callbacks callbacks = {
 		.less = new_bucket_cmp,
-		.swp = NULL,
+		.swp = new_bucket_swap,
 	};
 
 	if (!c->copy_gc_enabled)
-- 
2.34.1

From nobody Fri Oct 10 09:18:07 2025
From: Kuan-Wei Chiu
To: akpm@linux-foundation.org, colyli@kernel.org, kent.overstreet@linux.dev, robertpang@google.com
Cc: linux-kernel@vger.kernel.org, linux-bcache@vger.kernel.org, jserv@ccns.ncku.edu.tw, Kuan-Wei Chiu, stable@vger.kernel.org
Subject: [PATCH v2 2/3] Revert "bcache: remove heap-related macros and switch to generic min_heap"
Date: Sun, 15 Jun 2025 04:23:52 +0800
Message-Id: <20250614202353.1632957-3-visitorckw@gmail.com>
In-Reply-To: <20250614202353.1632957-1-visitorckw@gmail.com>
References: <20250614202353.1632957-1-visitorckw@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This reverts commit 866898efbb25bb44fd42848318e46db9e785973a.

The generic bottom-up min_heap implementation causes performance
regression in invalidate_buckets_lru(), a hot path in bcache. Before
the cache is fully populated, new_bucket_prio() often returns zero,
leading to many equal comparisons. In such cases, bottom-up sift_down
performs up to 2 * log2(n) comparisons, while the original top-down
approach completes with just O(1) comparisons, resulting in a
measurable performance gap.

The performance degradation is further worsened by the non-inlined
min_heap API functions introduced in commit 92a8b224b833
("lib/min_heap: introduce non-inline versions of min heap API
functions"), adding function call overhead to this critical path.

As reported by Robert, bcache now suffers from latency spikes, with
P100 (max) latency increasing from 600 ms to 2.4 seconds every 5
minutes.
These regressions degrade bcache's effectiveness as a low-latency cache layer and lead to frequent timeouts and application stalls in production environments. This revert aims to restore bcache's original low-latency behavior. Link: https://lore.kernel.org/lkml/CAJhEC05+0S69z+3+FB2Cd0hD+pCRyWTKLEOsc8B= OmH73p1m+KQ@mail.gmail.com Fixes: 866898efbb25 ("bcache: remove heap-related macros and switch to gene= ric min_heap") Fixes: 92a8b224b833 ("lib/min_heap: introduce non-inline versions of min he= ap API functions") Reported-by: Robert Pang Closes: https://lore.kernel.org/linux-bcache/CAJhEC06F_AtrPgw2-7CvCqZgeStgC= titbD-ryuPpXQA-JG5XXw@mail.gmail.com Cc: stable@vger.kernel.org Signed-off-by: Kuan-Wei Chiu Acked-by: Coly Li --- drivers/md/bcache/alloc.c | 64 +++++------------- drivers/md/bcache/bcache.h | 2 +- drivers/md/bcache/bset.c | 124 ++++++++++++---------------------- drivers/md/bcache/bset.h | 40 ++++++----- drivers/md/bcache/btree.c | 69 ++++++++----------- drivers/md/bcache/extents.c | 53 ++++++--------- drivers/md/bcache/movinggc.c | 41 +++-------- drivers/md/bcache/super.c | 3 +- drivers/md/bcache/sysfs.c | 4 +- drivers/md/bcache/util.h | 67 +++++++++++++++++- drivers/md/bcache/writeback.c | 13 ++-- 11 files changed, 217 insertions(+), 263 deletions(-) diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c index da50f6661bae..48ce750bf70a 100644 --- a/drivers/md/bcache/alloc.c +++ b/drivers/md/bcache/alloc.c @@ -164,68 +164,40 @@ static void bch_invalidate_one_bucket(struct cache *c= a, struct bucket *b) * prio is worth 1/8th of what INITIAL_PRIO is worth. 
*/ =20 -static inline unsigned int new_bucket_prio(struct cache *ca, struct bucket= *b) -{ - unsigned int min_prio =3D (INITIAL_PRIO - ca->set->min_prio) / 8; - - return (b->prio - ca->set->min_prio + min_prio) * GC_SECTORS_USED(b); -} - -static inline bool new_bucket_max_cmp(const void *l, const void *r, void *= args) -{ - struct bucket **lhs =3D (struct bucket **)l; - struct bucket **rhs =3D (struct bucket **)r; - struct cache *ca =3D args; - - return new_bucket_prio(ca, *lhs) > new_bucket_prio(ca, *rhs); -} - -static inline bool new_bucket_min_cmp(const void *l, const void *r, void *= args) -{ - struct bucket **lhs =3D (struct bucket **)l; - struct bucket **rhs =3D (struct bucket **)r; - struct cache *ca =3D args; - - return new_bucket_prio(ca, *lhs) < new_bucket_prio(ca, *rhs); -} - -static inline void new_bucket_swap(void *l, void *r, void __always_unused = *args) -{ - struct bucket **lhs =3D l, **rhs =3D r; +#define bucket_prio(b) \ +({ \ + unsigned int min_prio =3D (INITIAL_PRIO - ca->set->min_prio) / 8; \ + \ + (b->prio - ca->set->min_prio + min_prio) * GC_SECTORS_USED(b); \ +}) =20 - swap(*lhs, *rhs); -} +#define bucket_max_cmp(l, r) (bucket_prio(l) < bucket_prio(r)) +#define bucket_min_cmp(l, r) (bucket_prio(l) > bucket_prio(r)) =20 static void invalidate_buckets_lru(struct cache *ca) { struct bucket *b; - const struct min_heap_callbacks bucket_max_cmp_callback =3D { - .less =3D new_bucket_max_cmp, - .swp =3D new_bucket_swap, - }; - const struct min_heap_callbacks bucket_min_cmp_callback =3D { - .less =3D new_bucket_min_cmp, - .swp =3D new_bucket_swap, - }; + ssize_t i; =20 - ca->heap.nr =3D 0; + ca->heap.used =3D 0; =20 for_each_bucket(b, ca) { if (!bch_can_invalidate_bucket(ca, b)) continue; =20 - if (!min_heap_full(&ca->heap)) - min_heap_push(&ca->heap, &b, &bucket_max_cmp_callback, ca); - else if (!new_bucket_max_cmp(&b, min_heap_peek(&ca->heap), ca)) { + if (!heap_full(&ca->heap)) + heap_add(&ca->heap, b, bucket_max_cmp); + else if (bucket_max_cmp(b, 
heap_peek(&ca->heap))) { ca->heap.data[0] =3D b; - min_heap_sift_down(&ca->heap, 0, &bucket_max_cmp_callback, ca); + heap_sift(&ca->heap, 0, bucket_max_cmp); } } =20 - min_heapify_all(&ca->heap, &bucket_min_cmp_callback, ca); + for (i =3D ca->heap.used / 2 - 1; i >=3D 0; --i) + heap_sift(&ca->heap, i, bucket_min_cmp); =20 while (!fifo_full(&ca->free_inc)) { - if (!ca->heap.nr) { + if (!heap_pop(&ca->heap, b, bucket_min_cmp)) { /* * We don't want to be calling invalidate_buckets() * multiple times when it can't do anything @@ -234,8 +206,6 @@ static void invalidate_buckets_lru(struct cache *ca) wake_up_gc(ca->set); return; } - b =3D min_heap_peek(&ca->heap)[0]; - min_heap_pop(&ca->heap, &bucket_min_cmp_callback, ca); =20 bch_invalidate_one_bucket(ca, b); } diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h index 785b0d9008fa..1d33e40d26ea 100644 --- a/drivers/md/bcache/bcache.h +++ b/drivers/md/bcache/bcache.h @@ -458,7 +458,7 @@ struct cache { /* Allocation stuff: */ struct bucket *buckets; =20 - DEFINE_MIN_HEAP(struct bucket *, cache_heap) heap; + DECLARE_HEAP(struct bucket *, heap); =20 /* * If nonzero, we know we aren't going to find any buckets to invalidate diff --git a/drivers/md/bcache/bset.c b/drivers/md/bcache/bset.c index bd97d8626887..463eb13bd0b2 100644 --- a/drivers/md/bcache/bset.c +++ b/drivers/md/bcache/bset.c @@ -54,11 +54,9 @@ void bch_dump_bucket(struct btree_keys *b) int __bch_count_data(struct btree_keys *b) { unsigned int ret =3D 0; - struct btree_iter iter; + struct btree_iter_stack iter; struct bkey *k; =20 - min_heap_init(&iter.heap, NULL, MAX_BSETS); - if (b->ops->is_extents) for_each_key(b, k, &iter) ret +=3D KEY_SIZE(k); @@ -69,11 +67,9 @@ void __bch_check_keys(struct btree_keys *b, const char *= fmt, ...) 
{ va_list args; struct bkey *k, *p =3D NULL; - struct btree_iter iter; + struct btree_iter_stack iter; const char *err; =20 - min_heap_init(&iter.heap, NULL, MAX_BSETS); - for_each_key(b, k, &iter) { if (b->ops->is_extents) { err =3D "Keys out of order"; @@ -114,9 +110,9 @@ void __bch_check_keys(struct btree_keys *b, const char = *fmt, ...) =20 static void bch_btree_iter_next_check(struct btree_iter *iter) { - struct bkey *k =3D iter->heap.data->k, *next =3D bkey_next(k); + struct bkey *k =3D iter->data->k, *next =3D bkey_next(k); =20 - if (next < iter->heap.data->end && + if (next < iter->data->end && bkey_cmp(k, iter->b->ops->is_extents ? &START_KEY(next) : next) > 0) { bch_dump_bucket(iter->b); @@ -883,14 +879,12 @@ unsigned int bch_btree_insert_key(struct btree_keys *= b, struct bkey *k, unsigned int status =3D BTREE_INSERT_STATUS_NO_INSERT; struct bset *i =3D bset_tree_last(b)->data; struct bkey *m, *prev =3D NULL; - struct btree_iter iter; + struct btree_iter_stack iter; struct bkey preceding_key_on_stack =3D ZERO_KEY; struct bkey *preceding_key_p =3D &preceding_key_on_stack; =20 BUG_ON(b->ops->is_extents && !KEY_SIZE(k)); =20 - min_heap_init(&iter.heap, NULL, MAX_BSETS); - /* * If k has preceding key, preceding_key_p will be set to address * of k's preceding key; otherwise preceding_key_p will be set @@ -901,9 +895,9 @@ unsigned int bch_btree_insert_key(struct btree_keys *b,= struct bkey *k, else preceding_key(k, &preceding_key_p); =20 - m =3D bch_btree_iter_init(b, &iter, preceding_key_p); + m =3D bch_btree_iter_stack_init(b, &iter, preceding_key_p); =20 - if (b->ops->insert_fixup(b, k, &iter, replace_key)) + if (b->ops->insert_fixup(b, k, &iter.iter, replace_key)) return status; =20 status =3D BTREE_INSERT_STATUS_INSERT; @@ -1083,102 +1077,79 @@ struct bkey *__bch_bset_search(struct btree_keys *= b, struct bset_tree *t, =20 /* Btree iterator */ =20 -typedef bool (new_btree_iter_cmp_fn)(const void *, const void *, void *); - -static inline bool 
new_btree_iter_cmp(const void *l, const void *r, void _= _always_unused *args) -{ - const struct btree_iter_set *_l =3D l; - const struct btree_iter_set *_r =3D r; - - return bkey_cmp(_l->k, _r->k) <=3D 0; -} +typedef bool (btree_iter_cmp_fn)(struct btree_iter_set, + struct btree_iter_set); =20 -static inline void new_btree_iter_swap(void *iter1, void *iter2, void __al= ways_unused *args) +static inline bool btree_iter_cmp(struct btree_iter_set l, + struct btree_iter_set r) { - struct btree_iter_set *_iter1 =3D iter1; - struct btree_iter_set *_iter2 =3D iter2; - - swap(*_iter1, *_iter2); + return bkey_cmp(l.k, r.k) > 0; } =20 static inline bool btree_iter_end(struct btree_iter *iter) { - return !iter->heap.nr; + return !iter->used; } =20 void bch_btree_iter_push(struct btree_iter *iter, struct bkey *k, struct bkey *end) { - const struct min_heap_callbacks callbacks =3D { - .less =3D new_btree_iter_cmp, - .swp =3D new_btree_iter_swap, - }; - if (k !=3D end) - BUG_ON(!min_heap_push(&iter->heap, - &((struct btree_iter_set) { k, end }), - &callbacks, - NULL)); + BUG_ON(!heap_add(iter, + ((struct btree_iter_set) { k, end }), + btree_iter_cmp)); } =20 -static struct bkey *__bch_btree_iter_init(struct btree_keys *b, - struct btree_iter *iter, - struct bkey *search, - struct bset_tree *start) +static struct bkey *__bch_btree_iter_stack_init(struct btree_keys *b, + struct btree_iter_stack *iter, + struct bkey *search, + struct bset_tree *start) { struct bkey *ret =3D NULL; =20 - iter->heap.size =3D ARRAY_SIZE(iter->heap.preallocated); - iter->heap.nr =3D 0; + iter->iter.size =3D ARRAY_SIZE(iter->stack_data); + iter->iter.used =3D 0; =20 #ifdef CONFIG_BCACHE_DEBUG - iter->b =3D b; + iter->iter.b =3D b; #endif =20 for (; start <=3D bset_tree_last(b); start++) { ret =3D bch_bset_search(b, start, search); - bch_btree_iter_push(iter, ret, bset_bkey_last(start->data)); + bch_btree_iter_push(&iter->iter, ret, bset_bkey_last(start->data)); } =20 return ret; } =20 -struct bkey 
*bch_btree_iter_init(struct btree_keys *b, - struct btree_iter *iter, +struct bkey *bch_btree_iter_stack_init(struct btree_keys *b, + struct btree_iter_stack *iter, struct bkey *search) { - return __bch_btree_iter_init(b, iter, search, b->set); + return __bch_btree_iter_stack_init(b, iter, search, b->set); } =20 static inline struct bkey *__bch_btree_iter_next(struct btree_iter *iter, - new_btree_iter_cmp_fn *cmp) + btree_iter_cmp_fn *cmp) { struct btree_iter_set b __maybe_unused; struct bkey *ret =3D NULL; - const struct min_heap_callbacks callbacks =3D { - .less =3D cmp, - .swp =3D new_btree_iter_swap, - }; =20 if (!btree_iter_end(iter)) { bch_btree_iter_next_check(iter); =20 - ret =3D iter->heap.data->k; - iter->heap.data->k =3D bkey_next(iter->heap.data->k); + ret =3D iter->data->k; + iter->data->k =3D bkey_next(iter->data->k); =20 - if (iter->heap.data->k > iter->heap.data->end) { + if (iter->data->k > iter->data->end) { WARN_ONCE(1, "bset was corrupt!\n"); - iter->heap.data->k =3D iter->heap.data->end; + iter->data->k =3D iter->data->end; } =20 - if (iter->heap.data->k =3D=3D iter->heap.data->end) { - if (iter->heap.nr) { - b =3D min_heap_peek(&iter->heap)[0]; - min_heap_pop(&iter->heap, &callbacks, NULL); - } - } + if (iter->data->k =3D=3D iter->data->end) + heap_pop(iter, b, cmp); else - min_heap_sift_down(&iter->heap, 0, &callbacks, NULL); + heap_sift(iter, 0, cmp); } =20 return ret; @@ -1186,7 +1157,7 @@ static inline struct bkey *__bch_btree_iter_next(stru= ct btree_iter *iter, =20 struct bkey *bch_btree_iter_next(struct btree_iter *iter) { - return __bch_btree_iter_next(iter, new_btree_iter_cmp); + return __bch_btree_iter_next(iter, btree_iter_cmp); =20 } =20 @@ -1224,18 +1195,16 @@ static void btree_mergesort(struct btree_keys *b, s= truct bset *out, struct btree_iter *iter, bool fixup, bool remove_stale) { + int i; struct bkey *k, *last =3D NULL; BKEY_PADDED(k) tmp; bool (*bad)(struct btree_keys *, const struct bkey *) =3D remove_stale ? 
bch_ptr_bad : bch_ptr_invalid; - const struct min_heap_callbacks callbacks =3D { - .less =3D b->ops->sort_cmp, - .swp =3D new_btree_iter_swap, - }; =20 /* Heapify the iterator, using our comparison function */ - min_heapify_all(&iter->heap, &callbacks, NULL); + for (i =3D iter->used / 2 - 1; i >=3D 0; --i) + heap_sift(iter, i, b->ops->sort_cmp); =20 while (!btree_iter_end(iter)) { if (b->ops->sort_fixup && fixup) @@ -1324,11 +1293,10 @@ void bch_btree_sort_partial(struct btree_keys *b, u= nsigned int start, struct bset_sort_state *state) { size_t order =3D b->page_order, keys =3D 0; - struct btree_iter iter; + struct btree_iter_stack iter; int oldsize =3D bch_count_data(b); =20 - min_heap_init(&iter.heap, NULL, MAX_BSETS); - __bch_btree_iter_init(b, &iter, NULL, &b->set[start]); + __bch_btree_iter_stack_init(b, &iter, NULL, &b->set[start]); =20 if (start) { unsigned int i; @@ -1339,7 +1307,7 @@ void bch_btree_sort_partial(struct btree_keys *b, uns= igned int start, order =3D get_order(__set_bytes(b->set->data, keys)); } =20 - __btree_sort(b, &iter, start, order, false, state); + __btree_sort(b, &iter.iter, start, order, false, state); =20 EBUG_ON(oldsize >=3D 0 && bch_count_data(b) !=3D oldsize); } @@ -1355,13 +1323,11 @@ void bch_btree_sort_into(struct btree_keys *b, stru= ct btree_keys *new, struct bset_sort_state *state) { uint64_t start_time =3D local_clock(); - struct btree_iter iter; - - min_heap_init(&iter.heap, NULL, MAX_BSETS); + struct btree_iter_stack iter; =20 - bch_btree_iter_init(b, &iter, NULL); + bch_btree_iter_stack_init(b, &iter, NULL); =20 - btree_mergesort(b, new->set->data, &iter, false, true); + btree_mergesort(b, new->set->data, &iter.iter, false, true); =20 bch_time_stats_update(&state->time, start_time); =20 diff --git a/drivers/md/bcache/bset.h b/drivers/md/bcache/bset.h index f79441acd4c1..011f6062c4c0 100644 --- a/drivers/md/bcache/bset.h +++ b/drivers/md/bcache/bset.h @@ -187,9 +187,8 @@ struct bset_tree { }; =20 struct btree_keys_ops { 
- bool (*sort_cmp)(const void *l, - const void *r, - void *args); + bool (*sort_cmp)(struct btree_iter_set l, + struct btree_iter_set r); struct bkey *(*sort_fixup)(struct btree_iter *iter, struct bkey *tmp); bool (*insert_fixup)(struct btree_keys *b, @@ -313,17 +312,23 @@ enum { BTREE_INSERT_STATUS_FRONT_MERGE, }; =20 -struct btree_iter_set { - struct bkey *k, *end; -}; - /* Btree key iteration */ =20 struct btree_iter { + size_t size, used; #ifdef CONFIG_BCACHE_DEBUG struct btree_keys *b; #endif - MIN_HEAP_PREALLOCATED(struct btree_iter_set, btree_iter_heap, MAX_BSETS) = heap; + struct btree_iter_set { + struct bkey *k, *end; + } data[]; +}; + +/* Fixed-size btree_iter that can be allocated on the stack */ + +struct btree_iter_stack { + struct btree_iter iter; + struct btree_iter_set stack_data[MAX_BSETS]; }; =20 typedef bool (*ptr_filter_fn)(struct btree_keys *b, const struct bkey *k); @@ -335,9 +340,9 @@ struct bkey *bch_btree_iter_next_filter(struct btree_it= er *iter, =20 void bch_btree_iter_push(struct btree_iter *iter, struct bkey *k, struct bkey *end); -struct bkey *bch_btree_iter_init(struct btree_keys *b, - struct btree_iter *iter, - struct bkey *search); +struct bkey *bch_btree_iter_stack_init(struct btree_keys *b, + struct btree_iter_stack *iter, + struct bkey *search); =20 struct bkey *__bch_bset_search(struct btree_keys *b, struct bset_tree *t, const struct bkey *search); @@ -352,13 +357,14 @@ static inline struct bkey *bch_bset_search(struct btr= ee_keys *b, return search ? 
__bch_bset_search(b, t, search) : t->data->start; } =20 -#define for_each_key_filter(b, k, iter, filter) \ - for (bch_btree_iter_init((b), (iter), NULL); \ - ((k) =3D bch_btree_iter_next_filter((iter), (b), filter));) +#define for_each_key_filter(b, k, stack_iter, filter) = \ + for (bch_btree_iter_stack_init((b), (stack_iter), NULL); \ + ((k) =3D bch_btree_iter_next_filter(&((stack_iter)->iter), (b), \ + filter));) =20 -#define for_each_key(b, k, iter) \ - for (bch_btree_iter_init((b), (iter), NULL); \ - ((k) =3D bch_btree_iter_next(iter));) +#define for_each_key(b, k, stack_iter) \ + for (bch_btree_iter_stack_init((b), (stack_iter), NULL); \ + ((k) =3D bch_btree_iter_next(&((stack_iter)->iter)));) =20 /* Sorting */ =20 diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c index 1d0100677357..210b59007d98 100644 --- a/drivers/md/bcache/btree.c +++ b/drivers/md/bcache/btree.c @@ -148,19 +148,19 @@ void bch_btree_node_read_done(struct btree *b) { const char *err =3D "bad btree header"; struct bset *i =3D btree_bset_first(b); - struct btree_iter iter; + struct btree_iter *iter; =20 /* * c->fill_iter can allocate an iterator with more memory space * than static MAX_BSETS. * See the comment arount cache_set->fill_iter. 
	 */
-	iter.heap.data = mempool_alloc(&b->c->fill_iter, GFP_NOIO);
-	iter.heap.size = b->c->cache->sb.bucket_size / b->c->cache->sb.block_size;
-	iter.heap.nr = 0;
+	iter = mempool_alloc(&b->c->fill_iter, GFP_NOIO);
+	iter->size = b->c->cache->sb.bucket_size / b->c->cache->sb.block_size;
+	iter->used = 0;

 #ifdef CONFIG_BCACHE_DEBUG
-	iter.b = &b->keys;
+	iter->b = &b->keys;
 #endif

 	if (!i->seq)
@@ -198,7 +198,7 @@ void bch_btree_node_read_done(struct btree *b)
 		if (i != b->keys.set[0].data && !i->keys)
 			goto err;

-		bch_btree_iter_push(&iter, i->start, bset_bkey_last(i));
+		bch_btree_iter_push(iter, i->start, bset_bkey_last(i));

 		b->written += set_blocks(i, block_bytes(b->c->cache));
 	}
@@ -210,7 +210,7 @@ void bch_btree_node_read_done(struct btree *b)
 		if (i->seq == b->keys.set[0].data->seq)
 			goto err;

-	bch_btree_sort_and_fix_extents(&b->keys, &iter, &b->c->sort);
+	bch_btree_sort_and_fix_extents(&b->keys, iter, &b->c->sort);

 	i = b->keys.set[0].data;
 	err = "short btree key";
@@ -222,7 +222,7 @@ void bch_btree_node_read_done(struct btree *b)
 		bch_bset_init_next(&b->keys, write_block(b),
				   bset_magic(&b->c->cache->sb));
 out:
-	mempool_free(iter.heap.data, &b->c->fill_iter);
+	mempool_free(iter, &b->c->fill_iter);
 	return;
 err:
 	set_btree_node_io_error(b);
@@ -1306,11 +1306,9 @@ static bool btree_gc_mark_node(struct btree *b, struct gc_stat *gc)
 	uint8_t stale = 0;
 	unsigned int keys = 0, good_keys = 0;
 	struct bkey *k;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;
 	struct bset_tree *t;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-
 	gc->nodes++;

 	for_each_key_filter(&b->keys, k, &iter, bch_ptr_invalid) {
@@ -1569,11 +1567,9 @@ static int btree_gc_rewrite_node(struct btree *b, struct btree_op *op,
 static unsigned int btree_gc_count_keys(struct btree *b)
 {
 	struct bkey *k;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;
 	unsigned int ret = 0;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-
 	for_each_key_filter(&b->keys, k, &iter, bch_ptr_bad)
 		ret += bkey_u64s(k);

@@ -1612,18 +1608,18 @@ static int btree_gc_recurse(struct btree *b, struct btree_op *op,
 	int ret = 0;
 	bool should_rewrite;
 	struct bkey *k;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;
 	struct gc_merge_info r[GC_MERGE_NODES];
 	struct gc_merge_info *i, *last = r + ARRAY_SIZE(r) - 1;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-	bch_btree_iter_init(&b->keys, &iter, &b->c->gc_done);
+	bch_btree_iter_stack_init(&b->keys, &iter, &b->c->gc_done);

 	for (i = r; i < r + ARRAY_SIZE(r); i++)
 		i->b = ERR_PTR(-EINTR);

 	while (1) {
-		k = bch_btree_iter_next_filter(&iter, &b->keys, bch_ptr_bad);
+		k = bch_btree_iter_next_filter(&iter.iter, &b->keys,
+					       bch_ptr_bad);
 		if (k) {
 			r->b = bch_btree_node_get(b->c, op, k, b->level - 1,
						  true, b);
@@ -1918,9 +1914,7 @@ static int bch_btree_check_recurse(struct btree *b, struct btree_op *op)
 {
 	int ret = 0;
 	struct bkey *k, *p = NULL;
-	struct btree_iter iter;
-
-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
+	struct btree_iter_stack iter;

 	for_each_key_filter(&b->keys, k, &iter, bch_ptr_invalid)
 		bch_initial_mark_key(b->c, b->level, k);
@@ -1928,10 +1922,10 @@ static int bch_btree_check_recurse(struct btree *b, struct btree_op *op)
 	bch_initial_mark_key(b->c, b->level + 1, &b->key);

 	if (b->level) {
-		bch_btree_iter_init(&b->keys, &iter, NULL);
+		bch_btree_iter_stack_init(&b->keys, &iter, NULL);

 		do {
-			k = bch_btree_iter_next_filter(&iter, &b->keys,
+			k = bch_btree_iter_next_filter(&iter.iter, &b->keys,
						       bch_ptr_bad);
 			if (k) {
 				btree_node_prefetch(b, k);
@@ -1959,7 +1953,7 @@ static int bch_btree_check_thread(void *arg)
 	struct btree_check_info *info = arg;
 	struct btree_check_state *check_state = info->state;
 	struct cache_set *c = check_state->c;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;
 	struct bkey *k, *p;
 	int cur_idx, prev_idx, skip_nr;

@@ -1967,11 +1961,9 @@ static int bch_btree_check_thread(void *arg)
 	cur_idx = prev_idx = 0;
 	ret = 0;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-
 	/* root node keys are checked before thread created */
-	bch_btree_iter_init(&c->root->keys, &iter, NULL);
-	k = bch_btree_iter_next_filter(&iter, &c->root->keys, bch_ptr_bad);
+	bch_btree_iter_stack_init(&c->root->keys, &iter, NULL);
+	k = bch_btree_iter_next_filter(&iter.iter, &c->root->keys, bch_ptr_bad);
 	BUG_ON(!k);

 	p = k;
@@ -1989,7 +1981,7 @@ static int bch_btree_check_thread(void *arg)
 			skip_nr = cur_idx - prev_idx;

 			while (skip_nr) {
-				k = bch_btree_iter_next_filter(&iter,
+				k = bch_btree_iter_next_filter(&iter.iter,
							       &c->root->keys,
							       bch_ptr_bad);
 				if (k)
@@ -2062,11 +2054,9 @@ int bch_btree_check(struct cache_set *c)
 	int ret = 0;
 	int i;
 	struct bkey *k = NULL;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;
 	struct btree_check_state check_state;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-
 	/* check and mark root node keys */
 	for_each_key_filter(&c->root->keys, k, &iter, bch_ptr_invalid)
 		bch_initial_mark_key(c, c->root->level, k);
@@ -2560,12 +2550,11 @@ static int bch_btree_map_nodes_recurse(struct btree *b, struct btree_op *op,

 	if (b->level) {
 		struct bkey *k;
-		struct btree_iter iter;
+		struct btree_iter_stack iter;

-		min_heap_init(&iter.heap, NULL, MAX_BSETS);
-		bch_btree_iter_init(&b->keys, &iter, from);
+		bch_btree_iter_stack_init(&b->keys, &iter, from);

-		while ((k = bch_btree_iter_next_filter(&iter, &b->keys,
+		while ((k = bch_btree_iter_next_filter(&iter.iter, &b->keys,
						       bch_ptr_bad))) {
 			ret = bcache_btree(map_nodes_recurse, k, b,
					   op, from, fn, flags);
@@ -2594,12 +2583,12 @@ int bch_btree_map_keys_recurse(struct btree *b, struct btree_op *op,
 {
 	int ret = MAP_CONTINUE;
 	struct bkey *k;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-	bch_btree_iter_init(&b->keys, &iter, from);
+	bch_btree_iter_stack_init(&b->keys, &iter, from);

-	while ((k = bch_btree_iter_next_filter(&iter, &b->keys, bch_ptr_bad))) {
+	while ((k = bch_btree_iter_next_filter(&iter.iter, &b->keys,
+					       bch_ptr_bad))) {
 		ret = !b->level
 			? fn(op, b, k)
 			: bcache_btree(map_keys_recurse, k,
diff --git a/drivers/md/bcache/extents.c b/drivers/md/bcache/extents.c
index a7221e5dbe81..d626ffcbecb9 100644
--- a/drivers/md/bcache/extents.c
+++ b/drivers/md/bcache/extents.c
@@ -33,16 +33,15 @@ static void sort_key_next(struct btree_iter *iter,
 		i->k = bkey_next(i->k);

 	if (i->k == i->end)
-		*i = iter->heap.data[--iter->heap.nr];
+		*i = iter->data[--iter->used];
 }

-static bool new_bch_key_sort_cmp(const void *l, const void *r, void *args)
+static bool bch_key_sort_cmp(struct btree_iter_set l,
+			     struct btree_iter_set r)
 {
-	struct btree_iter_set *_l = (struct btree_iter_set *)l;
-	struct btree_iter_set *_r = (struct btree_iter_set *)r;
-	int64_t c = bkey_cmp(_l->k, _r->k);
+	int64_t c = bkey_cmp(l.k, r.k);

-	return !(c ? c > 0 : _l->k < _r->k);
+	return c ? c > 0 : l.k < r.k;
 }

 static bool __ptr_invalid(struct cache_set *c, const struct bkey *k)
@@ -239,7 +238,7 @@ static bool bch_btree_ptr_insert_fixup(struct btree_keys *bk,
 }

 const struct btree_keys_ops bch_btree_keys_ops = {
-	.sort_cmp	= new_bch_key_sort_cmp,
+	.sort_cmp	= bch_key_sort_cmp,
 	.insert_fixup	= bch_btree_ptr_insert_fixup,
 	.key_invalid	= bch_btree_ptr_invalid,
 	.key_bad	= bch_btree_ptr_bad,
@@ -256,36 +255,22 @@ const struct btree_keys_ops bch_btree_keys_ops = {
  * Necessary for btree_sort_fixup() - if there are multiple keys that compare
  * equal in different sets, we have to process them newest to oldest.
  */
-
-static bool new_bch_extent_sort_cmp(const void *l, const void *r, void __always_unused *args)
-{
-	struct btree_iter_set *_l = (struct btree_iter_set *)l;
-	struct btree_iter_set *_r = (struct btree_iter_set *)r;
-	int64_t c = bkey_cmp(&START_KEY(_l->k), &START_KEY(_r->k));
-
-	return !(c ? c > 0 : _l->k < _r->k);
-}
-
-static inline void new_btree_iter_swap(void *iter1, void *iter2, void __always_unused *args)
+static bool bch_extent_sort_cmp(struct btree_iter_set l,
+				struct btree_iter_set r)
 {
-	struct btree_iter_set *_iter1 = iter1;
-	struct btree_iter_set *_iter2 = iter2;
+	int64_t c = bkey_cmp(&START_KEY(l.k), &START_KEY(r.k));

-	swap(*_iter1, *_iter2);
+	return c ? c > 0 : l.k < r.k;
 }

 static struct bkey *bch_extent_sort_fixup(struct btree_iter *iter,
					  struct bkey *tmp)
 {
-	const struct min_heap_callbacks callbacks = {
-		.less = new_bch_extent_sort_cmp,
-		.swp = new_btree_iter_swap,
-	};
-	while (iter->heap.nr > 1) {
-		struct btree_iter_set *top = iter->heap.data, *i = top + 1;
-
-		if (iter->heap.nr > 2 &&
-		    !new_bch_extent_sort_cmp(&i[0], &i[1], NULL))
+	while (iter->used > 1) {
+		struct btree_iter_set *top = iter->data, *i = top + 1;
+
+		if (iter->used > 2 &&
+		    bch_extent_sort_cmp(i[0], i[1]))
 			i++;

 		if (bkey_cmp(top->k, &START_KEY(i->k)) <= 0)
@@ -293,7 +278,7 @@ static struct bkey *bch_extent_sort_fixup(struct btree_iter *iter,

 		if (!KEY_SIZE(i->k)) {
 			sort_key_next(iter, i);
-			min_heap_sift_down(&iter->heap, i - top, &callbacks, NULL);
+			heap_sift(iter, i - top, bch_extent_sort_cmp);
 			continue;
 		}

@@ -303,7 +288,7 @@ static struct bkey *bch_extent_sort_fixup(struct btree_iter *iter,
 			else
 				bch_cut_front(top->k, i->k);

-			min_heap_sift_down(&iter->heap, i - top, &callbacks, NULL);
+			heap_sift(iter, i - top, bch_extent_sort_cmp);
 		} else {
 			/* can't happen because of comparison func */
 			BUG_ON(!bkey_cmp(&START_KEY(top->k), &START_KEY(i->k)));
@@ -313,7 +298,7 @@ static struct bkey *bch_extent_sort_fixup(struct btree_iter *iter,

 			bch_cut_back(&START_KEY(i->k), tmp);
 			bch_cut_front(i->k, top->k);
-			min_heap_sift_down(&iter->heap, 0, &callbacks, NULL);
+			heap_sift(iter, 0, bch_extent_sort_cmp);

 			return tmp;
 		} else {
@@ -633,7 +618,7 @@ static bool bch_extent_merge(struct btree_keys *bk,
 }

 const struct btree_keys_ops bch_extent_keys_ops = {
-	.sort_cmp	= new_bch_extent_sort_cmp,
+	.sort_cmp	= bch_extent_sort_cmp,
 	.sort_fixup	= bch_extent_sort_fixup,
 	.insert_fixup	= bch_extent_insert_fixup,
 	.key_invalid	= bch_extent_invalid,
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index d6c73dd8eb2b..26a6a535ec32 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -182,27 +182,16 @@ err:		if (!IS_ERR_OR_NULL(w->private))
 		closure_sync(&cl);
 }

-static bool new_bucket_cmp(const void *l, const void *r, void __always_unused *args)
+static bool bucket_cmp(struct bucket *l, struct bucket *r)
 {
-	struct bucket **_l = (struct bucket **)l;
-	struct bucket **_r = (struct bucket **)r;
-
-	return GC_SECTORS_USED(*_l) >= GC_SECTORS_USED(*_r);
-}
-
-static void new_bucket_swap(void *l, void *r, void __always_unused *args)
-{
-	struct bucket **_l = l;
-	struct bucket **_r = r;
-
-	swap(*_l, *_r);
+	return GC_SECTORS_USED(l) < GC_SECTORS_USED(r);
 }

 static unsigned int bucket_heap_top(struct cache *ca)
 {
 	struct bucket *b;

-	return (b = min_heap_peek(&ca->heap)[0]) ? GC_SECTORS_USED(b) : 0;
+	return (b = heap_peek(&ca->heap)) ? GC_SECTORS_USED(b) : 0;
 }

 void bch_moving_gc(struct cache_set *c)
@@ -210,10 +199,6 @@ void bch_moving_gc(struct cache_set *c)
 	struct cache *ca = c->cache;
 	struct bucket *b;
 	unsigned long sectors_to_move, reserve_sectors;
-	const struct min_heap_callbacks callbacks = {
-		.less = new_bucket_cmp,
-		.swp = new_bucket_swap,
-	};

 	if (!c->copy_gc_enabled)
 		return;
@@ -224,7 +209,7 @@ void bch_moving_gc(struct cache_set *c)
 	reserve_sectors = ca->sb.bucket_size *
			     fifo_used(&ca->free[RESERVE_MOVINGGC]);

-	ca->heap.nr = 0;
+	ca->heap.used = 0;

 	for_each_bucket(b, ca) {
 		if (GC_MARK(b) == GC_MARK_METADATA ||
@@ -233,31 +218,25 @@ void bch_moving_gc(struct cache_set *c)
		    atomic_read(&b->pin))
 			continue;

-		if (!min_heap_full(&ca->heap)) {
+		if (!heap_full(&ca->heap)) {
 			sectors_to_move += GC_SECTORS_USED(b);
-			min_heap_push(&ca->heap, &b, &callbacks, NULL);
-		} else if (!new_bucket_cmp(&b, min_heap_peek(&ca->heap), ca)) {
+			heap_add(&ca->heap, b, bucket_cmp);
+		} else if (bucket_cmp(b, heap_peek(&ca->heap))) {
 			sectors_to_move -= bucket_heap_top(ca);
 			sectors_to_move += GC_SECTORS_USED(b);

 			ca->heap.data[0] = b;
-			min_heap_sift_down(&ca->heap, 0, &callbacks, NULL);
+			heap_sift(&ca->heap, 0, bucket_cmp);
 		}
 	}

 	while (sectors_to_move > reserve_sectors) {
-		if (ca->heap.nr) {
-			b = min_heap_peek(&ca->heap)[0];
-			min_heap_pop(&ca->heap, &callbacks, NULL);
-		}
+		heap_pop(&ca->heap, b, bucket_cmp);
 		sectors_to_move -= GC_SECTORS_USED(b);
 	}

-	while (ca->heap.nr) {
-		b = min_heap_peek(&ca->heap)[0];
-		min_heap_pop(&ca->heap, &callbacks, NULL);
+	while (heap_pop(&ca->heap, b, bucket_cmp))
 		SET_GC_MOVE(b, 1);
-	}

 	mutex_unlock(&c->bucket_lock);

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 1efb768b2890..2ea490b9d370 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1912,7 +1912,8 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	INIT_LIST_HEAD(&c->btree_cache_freed);
 	INIT_LIST_HEAD(&c->data_buckets);

-	iter_size = ((meta_bucket_pages(sb) * PAGE_SECTORS) / sb->block_size) *
+	iter_size = sizeof(struct btree_iter) +
+		    ((meta_bucket_pages(sb) * PAGE_SECTORS) / sb->block_size) *
		    sizeof(struct btree_iter_set);

 	c->devices = kcalloc(c->nr_uuids, sizeof(void *), GFP_KERNEL);
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index e8f696cb58c0..826b14cae4e5 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -660,9 +660,7 @@ static unsigned int bch_root_usage(struct cache_set *c)
 	unsigned int bytes = 0;
 	struct bkey *k;
 	struct btree *b;
-	struct btree_iter iter;
-
-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
+	struct btree_iter_stack iter;

 	goto lock_root;

diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index 539454d8e2d0..f61ab1bada6c 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -9,7 +9,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -31,10 +30,16 @@ struct closure;

 #endif

+#define DECLARE_HEAP(type, name)					\
+	struct {							\
+		size_t size, used;					\
+		type *data;						\
+	} name
+
 #define init_heap(heap, _size, gfp)					\
 ({									\
	size_t _bytes;							\
-	(heap)->nr = 0;							\
+	(heap)->used = 0;						\
	(heap)->size = (_size);						\
	_bytes = (heap)->size * sizeof(*(heap)->data);			\
	(heap)->data = kvmalloc(_bytes, (gfp) & GFP_KERNEL);		\
@@ -47,6 +52,64 @@ do {								\
	(heap)->data = NULL;						\
 } while (0)

+#define heap_swap(h, i, j)	swap((h)->data[i], (h)->data[j])
+
+#define heap_sift(h, i, cmp)						\
+do {									\
+	size_t _r, _j = i;						\
+									\
+	for (; _j * 2 + 1 < (h)->used; _j = _r) {			\
+		_r = _j * 2 + 1;					\
+		if (_r + 1 < (h)->used &&				\
+		    cmp((h)->data[_r], (h)->data[_r + 1]))		\
+			_r++;						\
+									\
+		if (cmp((h)->data[_r], (h)->data[_j]))			\
+			break;						\
+		heap_swap(h, _r, _j);					\
+	}								\
+} while (0)
+
+#define heap_sift_down(h, i, cmp)					\
+do {									\
+	while (i) {							\
+		size_t p = (i - 1) / 2;					\
+		if (cmp((h)->data[i], (h)->data[p]))			\
+			break;						\
+		heap_swap(h, i, p);					\
+		i = p;							\
+	}								\
+} while (0)
+
+#define heap_add(h, d, cmp)						\
+({									\
+	bool _r = !heap_full(h);					\
+	if (_r) {							\
+		size_t _i = (h)->used++;				\
+		(h)->data[_i] = d;					\
+									\
+		heap_sift_down(h, _i, cmp);				\
+		heap_sift(h, _i, cmp);					\
+	}								\
+	_r;								\
+})
+
+#define heap_pop(h, d, cmp)						\
+({									\
+	bool _r = (h)->used;						\
+	if (_r) {							\
+		(d) = (h)->data[0];					\
+		(h)->used--;						\
+		heap_swap(h, 0, (h)->used);				\
+		heap_sift(h, 0, cmp);					\
+	}								\
+	_r;								\
+})
+
+#define heap_peek(h)	((h)->used ? (h)->data[0] : NULL)
+
+#define heap_full(h)	((h)->used == (h)->size)
+
 #define DECLARE_FIFO(type, name)					\
	struct {							\
		size_t front, back, size, mask;				\
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 453efbbdc8ee..302e75f1fc4b 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -908,16 +908,15 @@ static int bch_dirty_init_thread(void *arg)
 	struct dirty_init_thrd_info *info = arg;
 	struct bch_dirty_init_state *state = info->state;
 	struct cache_set *c = state->c;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;
 	struct bkey *k, *p;
 	int cur_idx, prev_idx, skip_nr;

 	k = p = NULL;
 	prev_idx = 0;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-	bch_btree_iter_init(&c->root->keys, &iter, NULL);
-	k = bch_btree_iter_next_filter(&iter, &c->root->keys, bch_ptr_bad);
+	bch_btree_iter_stack_init(&c->root->keys, &iter, NULL);
+	k = bch_btree_iter_next_filter(&iter.iter, &c->root->keys, bch_ptr_bad);
 	BUG_ON(!k);

 	p = k;
@@ -931,7 +930,7 @@ static int bch_dirty_init_thread(void *arg)
 			skip_nr = cur_idx - prev_idx;

 			while (skip_nr) {
-				k = bch_btree_iter_next_filter(&iter,
+				k = bch_btree_iter_next_filter(&iter.iter,
							       &c->root->keys,
							       bch_ptr_bad);
 				if (k)
@@ -980,13 +979,11 @@ void bch_sectors_dirty_init(struct bcache_device *d)
 	int i;
 	struct btree *b = NULL;
 	struct bkey *k = NULL;
-	struct btree_iter iter;
+	struct btree_iter_stack iter;
 	struct sectors_dirty_init op;
 	struct cache_set *c = d->c;
 	struct bch_dirty_init_state state;

-	min_heap_init(&iter.heap, NULL, MAX_BSETS);
-
 retry_lock:
 	b = c->root;
 	rw_lock(0, b, b->level);
-- 
2.34.1

From nobody Fri Oct 10 09:18:07 2025
From: Kuan-Wei Chiu
To: akpm@linux-foundation.org, colyli@kernel.org, kent.overstreet@linux.dev, robertpang@google.com
Cc: linux-kernel@vger.kernel.org, linux-bcache@vger.kernel.org, jserv@ccns.ncku.edu.tw, Kuan-Wei Chiu, stable@vger.kernel.org
Subject: [PATCH v2 3/3] bcache: Remove unnecessary select MIN_HEAP
Date: Sun, 15 Jun 2025 04:23:53 +0800
Message-Id: <20250614202353.1632957-4-visitorckw@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250614202353.1632957-1-visitorckw@gmail.com>
References: <20250614202353.1632957-1-visitorckw@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

After reverting the transition to the generic min heap library, bcache
no longer depends on MIN_HEAP. The select entry can be removed to
reduce code size and shrink the kernel's attack surface.

This change effectively reverts the bcache-related part of commit
92a8b224b833 ("lib/min_heap: introduce non-inline versions of min heap
API functions").

This is part of a series of changes to address a performance regression
caused by the use of the generic min_heap implementation.
As reported by Robert, bcache now suffers from latency spikes, with
P100 (max) latency increasing from 600 ms to 2.4 seconds every 5
minutes. These regressions degrade bcache's effectiveness as a
low-latency cache layer and lead to frequent timeouts and application
stalls in production environments.

Link: https://lore.kernel.org/lkml/CAJhEC05+0S69z+3+FB2Cd0hD+pCRyWTKLEOsc8BOmH73p1m+KQ@mail.gmail.com
Fixes: 866898efbb25 ("bcache: remove heap-related macros and switch to generic min_heap")
Fixes: 92a8b224b833 ("lib/min_heap: introduce non-inline versions of min heap API functions")
Reported-by: Robert Pang
Closes: https://lore.kernel.org/linux-bcache/CAJhEC06F_AtrPgw2-7CvCqZgeStgCtitbD-ryuPpXQA-JG5XXw@mail.gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Kuan-Wei Chiu
Acked-by: Coly Li
---
 drivers/md/bcache/Kconfig | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/md/bcache/Kconfig b/drivers/md/bcache/Kconfig
index d4697e79d5a3..b2d10063d35f 100644
--- a/drivers/md/bcache/Kconfig
+++ b/drivers/md/bcache/Kconfig
@@ -5,7 +5,6 @@ config BCACHE
	select BLOCK_HOLDER_DEPRECATED if SYSFS
	select CRC64
	select CLOSURES
-	select MIN_HEAP
	help
	  Allows a block device to be used as cache for other devices; uses
	  a btree for indexing and the layout is optimized for SSDs.
-- 
2.34.1