Documentation/core-api/min_heap.rst | 20 +++++ drivers/md/bcache/alloc.c | 15 ++-- include/linux/min_heap.h | 131 +++++++++++++++++++++++----- lib/min_heap.c | 23 +++-- 4 files changed, 154 insertions(+), 35 deletions(-)
This patch series introduces equality-aware variants of the min heap API that use a top-down heapify strategy to improve performance when many elements are equal under the comparison function. It also updates the documentation accordingly and modifies bcache to use the new APIs to fix a performance regression caused by the switch to the generic min heap library. In particular, invalidate_buckets_lru() in bcache suffered from increased comparison overhead due to the bottom-up strategy introduced in commit 866898efbb25 ("bcache: remove heap-related macros and switch to generic min_heap"). The regression is addressed by switching to the equality-aware variants and using the inline versions to avoid function call overhead in this hot path. Cc: stable@vger.kernel.org --- To avoid duplicated effort and expedite resolution, Robert kindly agreed that I should submit my already-completed series instead. Many thanks to him for his cooperation and support. Kuan-Wei Chiu (8): lib min_heap: Add equal-elements-aware sift_down variant lib min_heap: Add typedef for sift_down function pointer lib min_heap: add eqaware variant of min_heapify_all() lib min_heap: add eqaware variant of min_heap_pop() lib min_heap: add eqaware variant of min_heap_pop_push() lib min_heap: add eqaware variant of min_heap_del() Documentation/core-api: min_heap: Document _eqaware variants of min-heap APIs bcache: Fix the tail IO latency regression by using equality-aware min heap API Documentation/core-api/min_heap.rst | 20 +++++ drivers/md/bcache/alloc.c | 15 ++-- include/linux/min_heap.h | 131 +++++++++++++++++++++++----- lib/min_heap.c | 23 +++-- 4 files changed, 154 insertions(+), 35 deletions(-) -- 2.34.1
On Wed, 11 Jun 2025 05:55:08 +0800 Kuan-Wei Chiu <visitorckw@gmail.com> wrote: > This patch series introduces equality-aware variants of the min heap > API that use a top-down heapify strategy to improve performance when > many elements are equal under the comparison function. It also updates > the documentation accordingly and modifies bcache to use the new APIs > to fix a performance regression caused by the switch to the generic min > heap library. > > In particular, invalidate_buckets_lru() in bcache suffered from > increased comparison overhead due to the bottom-up strategy introduced > in commit 866898efbb25 ("bcache: remove heap-related macros and switch > to generic min_heap"). The regression is addressed by switching to the > equality-aware variants and using the inline versions to avoid function > call overhead in this hot path. > > Cc: stable@vger.kernel.org To justify a -stable backport this performance regression would need to have a pretty significant impact upon real-world userspace. Especially as the patchset is large. Unfortunately the changelog provides no indication of the magnitude of the userspace impact. Please tell us this, in detail. Also, if we are to address this regression in -stable kernels then reverting 866898efbb25 is an obvious way - it is far far safer. So please also tell us why the proposed patchset is a better way for us to go. (Also, each patch should have a fixes:866898efbb25 to help direct the backporting efforts) I'll add the patches to mm.git to get you some testing but from what I'm presently seeing the -stable backporting would be unwise.
Hi Andrew, On Wed, Jun 11, 2025 at 06:48:17PM -0700, Andrew Morton wrote: > On Wed, 11 Jun 2025 05:55:08 +0800 Kuan-Wei Chiu <visitorckw@gmail.com> wrote: > > > This patch series introduces equality-aware variants of the min heap > > API that use a top-down heapify strategy to improve performance when > > many elements are equal under the comparison function. It also updates > > the documentation accordingly and modifies bcache to use the new APIs > > to fix a performance regression caused by the switch to the generic min > > heap library. > > > > In particular, invalidate_buckets_lru() in bcache suffered from > > increased comparison overhead due to the bottom-up strategy introduced > > in commit 866898efbb25 ("bcache: remove heap-related macros and switch > > to generic min_heap"). The regression is addressed by switching to the > > equality-aware variants and using the inline versions to avoid function > > call overhead in this hot path. > > > > Cc: stable@vger.kernel.org > > To justify a -stable backport this performance regression would need to > have a pretty significant impact upon real-world userspace. Especially > as the patchset is large. > > Unfortunately the changelog provides no indication of the magnitude of > the userspace impact. Please tell us this, in detail. > I'll work with Robert to provide a more detailed explanation of the real-world impact on userspace. > Also, if we are to address this regression in -stable kernels then > reverting 866898efbb25 is an obvious way - it is far far safer. So > please also tell us why the proposed patchset is a better way for us to > go. > I agree that reverting 866898efbb25 is a much safer and smaller change for backporting. In fact, I previously raised the discussion of whether we should revert the commit or instead introduce an equality-aware API and use it. The bcache maintainer preferred the latter, and I also believe that it is a more forward-looking approach. Given that bcache has run into this issue, it's likely that other users with similar use cases may encounter it as well. We wouldn't want those users to continue relying on the current default heapify behavior. So, although reverting may be more suitable for stable in isolation, adding an equality-aware API could better serve a broader set of use cases going forward. > (Also, each patch should have a fixes:866898efbb25 to help direct the > backporting efforts) > Ack. Will do. > > I'll add the patches to mm.git to get you some testing but from what > I'm presently seeing the -stable backporting would be unwise. Thanks! Regards, Kuan-Wei
Hi Andrew Bcache is designed to boost the I/O performance of slower storage (HDDs, network-attached storage) by leveraging fast SSDs as a block cache. This functionality is critical in significantly reducing I/O latency. Therefore, any notable increase in bcache's latency severely diminishes its value. For instance, our tests show a P100 (max) latency spike from 600 ms to 2.4 seconds every 5 minutes due to this regression. In real-world environments, this increase will cause frequent timeouts and stalls in end-user applications that rely on bcache's latency improvements, highlighting the urgent need to address this issue. Best regards Robert Pang On Fri, Jun 13, 2025 at 3:16 PM Kuan-Wei Chiu <visitorckw@gmail.com> wrote: > > Hi Andrew, > > On Wed, Jun 11, 2025 at 06:48:17PM -0700, Andrew Morton wrote: > > On Wed, 11 Jun 2025 05:55:08 +0800 Kuan-Wei Chiu <visitorckw@gmail.com> wrote: > > > > > This patch series introduces equality-aware variants of the min heap > > > API that use a top-down heapify strategy to improve performance when > > > many elements are equal under the comparison function. It also updates > > > the documentation accordingly and modifies bcache to use the new APIs > > > to fix a performance regression caused by the switch to the generic min > > > heap library. > > > > > > In particular, invalidate_buckets_lru() in bcache suffered from > > > increased comparison overhead due to the bottom-up strategy introduced > > > in commit 866898efbb25 ("bcache: remove heap-related macros and switch > > > to generic min_heap"). The regression is addressed by switching to the > > > equality-aware variants and using the inline versions to avoid function > > > call overhead in this hot path. > > > > > > Cc: stable@vger.kernel.org > > > > To justify a -stable backport this performance regression would need to > > have a pretty significant impact upon real-world userspace. Especially > > as the patchset is large. > > > > Unfortunately the changelog provides no indication of the magnitude of > > the userspace impact. Please tell us this, in detail. > > > I'll work with Robert to provide a more detailed explanation of the > real-world impact on userspace. > > > Also, if we are to address this regression in -stable kernels then > > reverting 866898efbb25 is an obvious way - it is far far safer. So > > please also tell us why the proposed patchset is a better way for us to > > go. > > > I agree that reverting 866898efbb25 is a much safer and smaller change > for backporting. In fact, I previously raised the discussion of whether > we should revert the commit or instead introduce an equality-aware API > and use it. The bcache maintainer preferred the latter, and I also > believe that it is a more forward-looking approach. Given that bcache > has run into this issue, it's likely that other users with similar use > cases may encounter it as well. We wouldn't want those users to > continue relying on the current default heapify behavior. So, although > reverting may be more suitable for stable in isolation, adding an > equality-aware API could better serve a broader set of use cases going > forward. > > > (Also, each patch should have a fixes:866898efbb25 to help direct the > > backporting efforts) > > > Ack. Will do. > > > > > I'll add the patches to mm.git to get you some testing but from what > > I'm presently seeing the -stable backporting would be unwise. > > Thanks! > > Regards, > Kuan-Wei
On Fri, 13 Jun 2025 23:26:33 +0900 Robert Pang <robertpang@google.com> wrote: > Hi Andrew > > Bcache is designed to boost the I/O performance of slower storage > (HDDs, network-attached storage) by leveraging fast SSDs as a block > cache. This functionality is critical in significantly reducing I/O > latency. Therefore, any notable increase in bcache's latency severely > diminishes its value. For instance, our tests show a P100 (max) > latency spike from 600 ms to 2.4 seconds every 5 minutes due to this > regression. In real-world environments, this increase will cause > frequent timeouts and stalls in end-user applications that rely on > bcache's latency improvements, highlighting the urgent need to address > this issue. Great, thanks. Let's please incorporate this into the v2 changelogging. > > > Also, if we are to address this regression in -stable kernels then > > > reverting 866898efbb25 is an obvious way - it is far far safer. So > > > please also tell us why the proposed patchset is a better way for us to > > > go. > > > > > I agree that reverting 866898efbb25 is a much safer and smaller change > > for backporting. In fact, I previously raised the discussion of whether > > we should revert the commit or instead introduce an equality-aware API > > and use it. The bcache maintainer preferred the latter, and I also > > believe that it is a more forward-looking approach. Given that bcache > > has run into this issue, it's likely that other users with similar use > > cases may encounter it as well. We wouldn't want those users to > > continue relying on the current default heapify behavior. So, although > > reverting may be more suitable for stable in isolation, adding an > > equality-aware API could better serve a broader set of use cases going > > forward. "much safer and smaller" is very desirable for backporting, please. After all, 866898efbb25 didn't really fix anything and reverting that takes us back to a known-to-work implementation. I of course have no problem making the changes in this patchset for "going forward"! So if agreeable, please prepare a patch which reverts 866898efbb25. Robert's words above are a great basis for that patch's description.
Hi Andrew, On Fri, Jun 13, 2025 at 11:04:15AM -0700, Andrew Morton wrote: > On Fri, 13 Jun 2025 23:26:33 +0900 Robert Pang <robertpang@google.com> wrote: > > > Hi Andrew > > > > Bcache is designed to boost the I/O performance of slower storage > > (HDDs, network-attached storage) by leveraging fast SSDs as a block > > cache. This functionality is critical in significantly reducing I/O > > latency. Therefore, any notable increase in bcache's latency severely > > diminishes its value. For instance, our tests show a P100 (max) > > latency spike from 600 ms to 2.4 seconds every 5 minutes due to this > > regression. In real-world environments, this increase will cause > > frequent timeouts and stalls in end-user applications that rely on > > bcache's latency improvements, highlighting the urgent need to address > > this issue. > > Great, thanks. Let's please incorporate this into the v2 changelogging. > > > > > Also, if we are to address this regression in -stable kernels then > > > > reverting 866898efbb25 is an obvious way - it is far far safer. So > > > > please also tell us why the proposed patchset is a better way for us to > > > > go. > > > > > > > I agree that reverting 866898efbb25 is a much safer and smaller change > > > for backporting. In fact, I previously raised the discussion of whether > > > we should revert the commit or instead introduce an equality-aware API > > > and use it. The bcache maintainer preferred the latter, and I also > > > believe that it is a more forward-looking approach. Given that bcache > > > has run into this issue, it's likely that other users with similar use > > > cases may encounter it as well. We wouldn't want those users to > > > continue relying on the current default heapify behavior. So, although > > > reverting may be more suitable for stable in isolation, adding an > > > equality-aware API could better serve a broader set of use cases going > > > forward. > > "much safer and smaller" is very desirable for backporting, please. > After all, 866898efbb25 didn't really fix anything and reverting that > takes us back to a known-to-work implementation. > > I of course have no problem making the changes in this patchset for > "going forward"! > > So if agreeable, please prepare a patch which reverts 866898efbb25. > Robert's words above are a great basis for that patch's description. > Sure, I'll prepare a revert patch to address the issue and plan to submit it for backporting to -stable. However, I'd like to confirm whether the following patch series structure would be appropriate: - Patch 1: Revert 866898efbb25 and CC it to stable - Patch 2–8: Introduce the new equality-aware heap API - Patch 9: Revert Patch 1 and switch bcache to use the new API In this case, we would only backport Patch 1 to stable. Alternatively, would you prefer we simply revert 866898efbb25 without introducing and using the new API in the same series? Regards, Kuan-Wei
On Sat, 14 Jun 2025 07:19:51 +0800 Kuan-Wei Chiu <visitorckw@gmail.com> wrote: > Sure, I'll prepare a revert patch to address the issue and plan to > submit it for backporting to -stable. > > However, I'd like to confirm whether the following patch series > structure would be appropriate: > > - Patch 1: Revert 866898efbb25 and CC it to stable > - Patch 2–8: Introduce the new equality-aware heap API > - Patch 9: Revert Patch 1 and switch bcache to use the new API > > In this case, we would only backport Patch 1 to stable. > > Alternatively, would you prefer we simply revert 866898efbb25 without > introducing and using the new API in the same series? Yes, just the revert for now, please. Let's not make that dependent on new development.
On Wed, 11 Jun 2025 18:48:17 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > Unfortunately the changelog provides no indication of the magnitude of > the userspace impact. Please tell us this, in detail. OK I found details in the 8th patch's Closes: link. That was tricky ;) It's impressively detailed but I'm still struggling to understand how much pain this regression causes users in real life. So some sort of reader-friendly executive summary would be great, please.
© 2016 - 2025 Red Hat, Inc.