[PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation

Kuan-Wei Chiu posted 2 patches 1 year, 11 months ago
[PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Kuan-Wei Chiu 1 year, 11 months ago
Instead of popping only the maximum element from the heap during each
iteration, we now pop the two largest elements at once. Although this
introduces an additional comparison to determine the second largest
element, it enables a reduction in the height of the tree by one during
the heapify operations starting from root's left/right child. This
reduction in tree height by one leads to a decrease of one comparison
and one swap.

This optimization results in saving approximately 0.5 * n swaps without
increasing the number of comparisons. Additionally, the heap size
during heapify is now one less than the original size, offering a
chance for further reduction in comparisons and swaps.

Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
---

The following experimental data is based on the array generated using
get_random_u32().

| N     | swaps (old) | swaps (new) | comparisons (old) | comparisons (new) |
|-------|-------------|-------------|-------------------|-------------------|
| 1000  | 9054        | 8569        | 10328             | 10320             |
| 2000  | 20137       | 19182       | 22634             | 22587             |
| 3000  | 32062       | 30623       | 35833             | 35752             |
| 4000  | 44274       | 42282       | 49332             | 49306             |
| 5000  | 57195       | 54676       | 63300             | 63294             |
| 6000  | 70205       | 67202       | 77599             | 77557             |
| 7000  | 83276       | 79831       | 92113             | 92032             |
| 8000  | 96630       | 92678       | 106635            | 106617            |
| 9000  | 110349      | 105883      | 121505            | 121404            |
| 10000 | 124165      | 119202      | 136628            | 136617            |

 lib/sort.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/lib/sort.c b/lib/sort.c
index fe4efd4a1410..a0509088f82a 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -215,6 +215,7 @@ void sort_r(void *base, size_t num, size_t size,
 	/* pre-scale counters for performance */
 	size_t n = num * size, a = (num/2) * size;
 	const unsigned int lsbit = size & -size;  /* Used to find parent */
+	size_t shift = 0;
 
 	if (!a)		/* num < 2 || size == 0 */
 		return;
@@ -242,12 +243,21 @@ void sort_r(void *base, size_t num, size_t size,
 	for (;;) {
 		size_t b, c, d;
 
-		if (a)			/* Building heap: sift down --a */
-			a -= size;
-		else if (n -= size)	/* Sorting: Extract root to --n */
+		if (a)			/* Building heap: sift down a */
+			a -= size << shift;
+		else if (n > 3 * size) { /* Sorting: Extract two largest elements */
+			n -= size;
 			do_swap(base, base + n, size, swap_func, priv);
-		else			/* Sort complete */
+			shift = do_cmp(base + size, base + 2 * size, cmp_func, priv) <= 0;
+			a = size << shift;
+			n -= size;
+			do_swap(base + a, base + n, size, swap_func, priv);
+		} else if (n > size) {	/* Sorting: Extract root */
+			n -= size;
+			do_swap(base, base + n, size, swap_func, priv);
+		} else	{		/* Sort complete */
 			break;
+		}
 
 		/*
 		 * Sift element at "a" down into heap.  This is the
-- 
2.25.1
Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Julian Sikorski 1 year, 6 months ago
Hello,

it appears that this patch has caused suspend-to-idle regression:

https://gitlab.freedesktop.org/drm/amd/-/issues/3436

In brief, my laptop fails to suspend completely with the following error 
in the log:

Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach 
deepest state

Power consumption remains high enough that my battery has already 
unexpectedly drained twice before I noticed something was off.
I am not entirely sure how changes to sorting function can influence 
suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits 
the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.

Best regards,
Juliam
Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Luke Jones 11 months, 2 weeks ago
On Thu, 2024-06-20 at 17:36 +0200, Julian Sikorski wrote:
> Hello,
> 
> it appears that this patch has caused suspend-to-idle regression:
> 
> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
> 

Another regression from this has been reported here
https://bugzilla.kernel.org/show_bug.cgi?id=219158

Regards,
Luke.
Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Kuan-Wei Chiu 11 months, 2 weeks ago
(+Cc sound developers)

Hi Luke,

On Wed, Jan 15, 2025 at 04:27:52PM +1300, Luke Jones wrote:
> On Thu, 2024-06-20 at 17:36 +0200, Julian Sikorski wrote:
> > Hello,
> > 
> > it appears that this patch has caused suspend-to-idle regression:
> > 
> > https://gitlab.freedesktop.org/drm/amd/-/issues/3436
> > 
> 
> Another regression from this has been reported here
> https://bugzilla.kernel.org/show_bug.cgi?id=219158
>
Thank you for reporting this regression!

From a quick look, this seems to be caused by yet another broken
compare function. In sound/pci/hda/hda_auto_parser.c, the
compare_input_type() function can lead to sorting issues when both
is_headset_mic and is_headphone_mic are true for a and b. This can
result in a situation where both a < b and b < a hold true, violating
the antisymmetry and transitivity required by sort().

Additionally, the comments about "swap" and "don't swap" seem to make
incorrect assumptions about how sort() works. Regardless of this
optimization patch, sort() may swap a and b without comparing them.

Could you help test the following code? If it works, I'll submit it as
an official patch.

Regards,
Kuan-Wei

diff --git a/sound/pci/hda/hda_auto_parser.c b/sound/pci/hda/hda_auto_parser.c
index 84393f4f429d..5502ec09b584 100644
--- a/sound/pci/hda/hda_auto_parser.c
+++ b/sound/pci/hda/hda_auto_parser.c
@@ -73,6 +73,8 @@ static int compare_input_type(const void *ap, const void *bp)
 		return (int)(a->type - b->type);

 	/* If has both hs_mic and hp_mic, pick the hs_mic ahead of hp_mic. */
+	if (a->is_headset_mic && b->is_headset_mic && a->is_headphone_mic && b->is_headphone_mic)
+		return (int)(b->has_boost_on_pin - a->has_boost_on_pin);
 	if (a->is_headset_mic && b->is_headphone_mic)
 		return -1; /* don't swap */
 	else if (a->is_headphone_mic && b->is_headset_mic)
Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Linux regression tracking (Thorsten Leemhuis) 1 year, 6 months ago
On 20.06.24 17:36, Julian Sikorski wrote:
> 
> it appears that this patch has caused suspend-to-idle regression:
> 
> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
> 
> In brief, my laptop fails to suspend completely with the following error
> in the log:
> 
> Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
> deepest state
> 
> Power consumption remains high enough that my battery has already
> unexpectedly drained twice before I noticed something was off.
> I am not entirely sure how changes to sorting function can influence
> suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
> the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.

Andrew, could you maybe help out here or point us in the direction of
somewhat that might be able to help? I'm asking, as from a quick search
on lore it seems Kuan-Wei Chiu has not posted anything since nearly four
weeks and thus also did not reply to Julian's regression report.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke
Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Kuan-Wei Chiu 1 year, 6 months ago
On Fri, Jun 28, 2024 at 05:15:15PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 20.06.24 17:36, Julian Sikorski wrote:
> > 
> > it appears that this patch has caused suspend-to-idle regression:
> > 
> > https://gitlab.freedesktop.org/drm/amd/-/issues/3436
> > 
> > In brief, my laptop fails to suspend completely with the following error
> > in the log:
> > 
> > Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
> > deepest state
> > 
> > Power consumption remains high enough that my battery has already
> > unexpectedly drained twice before I noticed something was off.
> > I am not entirely sure how changes to sorting function can influence
> > suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
> > the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
> 
> Andrew, could you maybe help out here or point us in the direction of
> somewhat that might be able to help? I'm asking, as from a quick search
> on lore it seems Kuan-Wei Chiu has not posted anything since nearly four
> weeks and thus also did not reply to Julian's regression report.
> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
> 
> #regzbot poke

Sorry for the late reply.

I apologize for the regression caused by the patch. I am not familiar
with the power management domain. I would guess there might be some
side effects in the compare or swap functions that I was not aware of.

While reviewing the sort calls that could potentially cause the error,
I noticed that the compare function used in the sort call within
drivers/acpi/processor_idle.c might not satisfy the transitive
relation. Although I'm not sure if this is the root cause of the
problem, specifically, if there are two valid acpi_processor_cx
elements A and B, and an invalid element C, there could be a scenario
where A < B but simultaneously A = C and B = C. However, if I
understand correctly, this issue should have existed before this patch
but might not have been triggered previously. My patch might have
exposed this issue by changing the order of comparisons and swaps.

Regards,
Kuan-Wei
Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Linux regression tracking (Thorsten Leemhuis) 1 year, 5 months ago
On 28.06.24 19:10, Kuan-Wei Chiu wrote:
> On Fri, Jun 28, 2024 at 05:15:15PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 20.06.24 17:36, Julian Sikorski wrote:
>>>
>>> it appears that this patch has caused suspend-to-idle regression:
>>>
>>> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
>>>
>>> In brief, my laptop fails to suspend completely with the following error
>>> in the log:
>>>
>>> Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach
>>> deepest state
>>>
>>> Power consumption remains high enough that my battery has already
>>> unexpectedly drained twice before I noticed something was off.
>>> I am not entirely sure how changes to sorting function can influence
>>> suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits
>>> the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
> [...]
> I apologize for the regression caused by the patch. I am not familiar
> with the power management domain. I would guess there might be some
> side effects in the compare or swap functions that I was not aware of.
> 
> While reviewing the sort calls that could potentially cause the error,
> I noticed that the compare function used in the sort call within
> drivers/acpi/processor_idle.c might not satisfy the transitive
> relation. Although I'm not sure if this is the root cause of the
> problem, specifically, if there are two valid acpi_processor_cx
> elements A and B, and an invalid element C, there could be a scenario
> where A < B but simultaneously A = C and B = C.

Let's bring in Rafael, he might be able to help us out here.

> However, if I
> understand correctly, this issue should have existed before this patch
> but might not have been triggered previously. My patch might have
> exposed this issue by changing the order of comparisons and swaps.

Yeah, these things happen. But when it comes to the Linux kernel's "no
regressions" rule that does not matter much; what matters more is which
change exposed the problem.

Ciao, Thorsten
Re: [PATCH 2/2] lib/sort: Optimize heapsort with double-pop variation
Posted by Mario Limonciello 1 year, 6 months ago
On 6/20/2024 10:36, Julian Sikorski wrote:
> Hello,
> 
> it appears that this patch has caused suspend-to-idle regression:
> 
> https://gitlab.freedesktop.org/drm/amd/-/issues/3436
> 
> In brief, my laptop fails to suspend completely with the following error 
> in the log:
> 
> Jun 18 12:42:20 kernel: amd_pmc AMDI0005:00: Last suspend didn't reach 
> deepest state
> 
> Power consumption remains high enough that my battery has already 
> unexpectedly drained twice before I noticed something was off.
> I am not entirely sure how changes to sorting function can influence 
> suspend, but it is what it is. 6.9.5 as shipped by Fedora 40 exhibits 
> the issue, 6.9.5 as shipped by Fedora with the patch reverted does not.
> 
> Best regards,
> Juliam
> 

+regressions M/L

#regzbot ^introduced: 0e02ca29a563
#regzbot from: Kuan-Wei Chiu <visitorckw@gmail.com>
#regzbot monitor: https://gitlab.freedesktop.org/drm/amd/-/issues/3436