[PATCH v3 0/2] Improve the performance of bitmap_find_next_zero_area_off()

Yi Sun posted 2 patches 1 week, 4 days ago
include/linux/find.h | 33 +++++++++++++++++++++++++++++++++
lib/bitmap.c         |  2 +-
lib/find_bit.c       | 22 ++++++++++++++++++++++
3 files changed, 56 insertions(+), 1 deletion(-)
[PATCH v3 0/2] Improve the performance of bitmap_find_next_zero_area_off()
Posted by Yi Sun 1 week, 4 days ago
Based on Michał Nazarewicz's suggestion,
code optimization was performed on PATCH v1.

Replacing find_next_bit() with find_last_bit_from()
can improve performance by an average of 50%.
The test results can be viewed in PATCH v1.


This section compares the performance of PATCH v2 and PATCH v3.
Test results show that PATCH v3 performs slightly better
than PATCH v2 in most cases.
When the number of 'goto again' loops is large,
PATCH v3's advantage becomes more apparent.


Test result:
	cnt	again_cnt	v2_time(ns)	v3_time(ns)	time_ratio
test1	8	9		230		242		-5.2%
test2	8	1		75		76		around 0%

test1	8	329		4452		4242		4.7%
test2	8	1		46		47		around 0%

test1	32	10414		139015		132700		4.5%
test2	32	1		47		47		around 0%

test1	128	2570		34163		32711		4.3%
test2	128	1		46		46		around 0%

test1	1024	321		4293		4098		4.5%
test2	1024	6		126		122		3.2%

test1	4096	81		1087		1046		3.8%
test2	4096	92		1656		1570		5.2%

Test result explanation:
@test1: The bitmap is filled with random numbers,
so the bitmap is very messy.
@test2: Sparse bitmap.

@cnt: The expected number of consecutive clear bits.

@again_cnt: The number of 'goto again'.

@v2_time(ns): The total time consumed by
bitmap_find_next_zero_area_off() when
using PATCH v2.
@v3_time(ns): The total time consumed by
bitmap_find_next_zero_area_off() when
using PATCH v3.
@time_ratio = (v2_time - v3_time) / v2_time.

---
v2: https://lore.kernel.org/all/20260514035644.4118050-1-yi.sun@unisoc.com
- Do not introduce find_last_bit_from().

v1: https://lore.kernel.org/all/20260512040659.2992142-1-yi.sun@unisoc.com


Yi Sun (2):
  lib: bitmap: add find_last_bit_from() and _find_last_bit_from()
  lib: bitmap: reduce the number of goto again in
    bitmap_find_next_zero_area_off()

 include/linux/find.h | 33 +++++++++++++++++++++++++++++++++
 lib/bitmap.c         |  2 +-
 lib/find_bit.c       | 22 ++++++++++++++++++++++
 3 files changed, 56 insertions(+), 1 deletion(-)

-- 
2.34.1

Re: [PATCH v3 0/2] Improve the performance of bitmap_find_next_zero_area_off()
Posted by Yury Norov 1 week, 4 days ago
You submitted v2 and v3 within the same day. This is not how things
are working. So, should I ignore v2 now? Please allow your reviewers
at least a few days before submitting any follow ups.

On Thu, May 14, 2026 at 05:06:05PM +0800, Yi Sun wrote:
> Based on Michał Nazarewicz's suggestion,
> code optimization was performed on PATCH v1.
> 
> Replacing find_next_bit() with find_last_bit_from()
> can improve performance by an average of 50%.
> The test results can be viewed in PATCH v1.
> 
> 
> This section compares the performance of PATCH v2 and PATCH v3.
> Test results show that PATCH v3 performs slightly better
> than PATCH v2 in most cases.
> When the number of 'goto again' loops is large,
> PATCH v3's advantage becomes more apparent.
> 
> 
> Test result:
> 	cnt	again_cnt	v2_time(ns)	v3_time(ns)	time_ratio
> test1	8	9		230		242		-5.2%
> test2	8	1		75		76		around 0%
> 
> test1	8	329		4452		4242		4.7%
> test2	8	1		46		47		around 0%
> 
> test1	32	10414		139015		132700		4.5%
> test2	32	1		47		47		around 0%
> 
> test1	128	2570		34163		32711		4.3%
> test2	128	1		46		46		around 0%
> 
> test1	1024	321		4293		4098		4.5%
> test2	1024	6		126		122		3.2%
> 
> test1	4096	81		1087		1046		3.8%
> test2	4096	92		1656		1570		5.2%
> 
> Test result explanation:
> @test1: The bitmap is filled with random numbers,
> so the bitmap is very messy.
> @test2: Sparse bitmap.
> 
> @cnt: The expected number of consecutive clear bits.
> 
> @again_cnt: The number of 'goto again'.
> 
> @v2_time(ns): The total time consumed by
> bitmap_find_next_zero_area_off() when
> using PATCH v2.
> @v3_time(ns): The total time consumed by
> bitmap_find_next_zero_area_off() when
> using PATCH v3.
> @time_ratio = (v2_time - v3_time) / v2_time.

In v1 discussion, I asked you to turn your testing code into the
addition to lib/find_bit_benchmark. Any reason to ignore that?

Whatever you end up, it should not look how it looks now. I want
to be able to compile the find_bit_benchmark, then run it before
applying your series, and after that, and just compare 2 numbers
for dense and 2 numbers for sparse bitmaps.

Refer to e3783c805db29c8 as an example of how the tests look.

> ---
> v2: https://lore.kernel.org/all/20260514035644.4118050-1-yi.sun@unisoc.com
> - Do not introduce find_last_bit_from().
> 
> v1: https://lore.kernel.org/all/20260512040659.2992142-1-yi.sun@unisoc.com
> 
> 
> Yi Sun (2):
>   lib: bitmap: add find_last_bit_from() and _find_last_bit_from()
>   lib: bitmap: reduce the number of goto again in
>     bitmap_find_next_zero_area_off()
> 
>  include/linux/find.h | 33 +++++++++++++++++++++++++++++++++
>  lib/bitmap.c         |  2 +-
>  lib/find_bit.c       | 22 ++++++++++++++++++++++
>  3 files changed, 56 insertions(+), 1 deletion(-)
> 
> -- 
> 2.34.1