[PATCH 1/2] ACPI: nsrepair2: Replace O(n²) bubble sort with O(n log n) sort_r()

Nick Huang posted 2 patches 6 days, 4 hours ago
[PATCH 1/2] ACPI: nsrepair2: Replace O(n²) bubble sort with O(n log n) sort_r()
Posted by Nick Huang 6 days, 4 hours ago
   Replace the O(n²) bubble sort implementation in acpi_ns_sort_list()
   with the kernel's sort_r() function which uses heapsort, providing
   O(n log n) time complexity.

   This improves performance for large ACPI package lists while also
   reducing code complexity by leveraging the existing kernel sort API.

Signed-off-by: Nick Huang <sef1548@gmail.com>
---
 drivers/acpi/acpica/nsrepair2.c | 87 +++++++++++++++++++++++----------
 1 file changed, 62 insertions(+), 25 deletions(-)

diff --git a/drivers/acpi/acpica/nsrepair2.c b/drivers/acpi/acpica/nsrepair2.c
index 8dbb870f4..a39ef59fe 100644
--- a/drivers/acpi/acpica/nsrepair2.c
+++ b/drivers/acpi/acpica/nsrepair2.c
@@ -9,6 +9,7 @@
  *****************************************************************************/
 
 #include <acpi/acpi.h>
+#include <linux/sort.h>
 #include "accommon.h"
 #include "acnamesp.h"
 
@@ -84,6 +85,14 @@ acpi_ns_check_sorted_list(struct acpi_evaluate_info *info,
 static void
 acpi_ns_remove_element(union acpi_operand_object *obj_desc, u32 index);
 
+/* Context structure for sort comparison function */
+struct acpi_sort_context {
+	u32 sort_index;
+	u8 sort_direction;
+};
+
+static int acpi_ns_sort_cmp(const void *a, const void *b, const void *priv);
+
 static void
 acpi_ns_sort_list(union acpi_operand_object **elements,
 		  u32 count, u32 index, u8 sort_direction);
@@ -851,6 +860,52 @@ acpi_ns_check_sorted_list(struct acpi_evaluate_info *info,
 	return (AE_OK);
 }
 
+/******************************************************************************
+ *
+ * FUNCTION:    acpi_ns_sort_cmp
+ *
+ * PARAMETERS:  a               - First element to compare
+ *              b               - Second element to compare
+ *              priv            - Pointer to sort context (acpi_sort_context)
+ *
+ * RETURN:      -1, 0, or 1 depending on sort order
+ *
+ * DESCRIPTION: Comparison function for sort_r() API. Compares the integer
+ *              values at the specified index within package elements.
+ *
+ *****************************************************************************/
+
+static int acpi_ns_sort_cmp(const void *a, const void *b, const void *priv)
+{
+	union acpi_operand_object *obj_a = *(union acpi_operand_object **)a;
+	union acpi_operand_object *obj_b = *(union acpi_operand_object **)b;
+	const struct acpi_sort_context *ctx = priv;
+	union acpi_operand_object *value_a;
+	union acpi_operand_object *value_b;
+	u64 a_val;
+	u64 b_val;
+
+	value_a = obj_a->package.elements[ctx->sort_index];
+	value_b = obj_b->package.elements[ctx->sort_index];
+
+	a_val = value_a->integer.value;
+	b_val = value_b->integer.value;
+
+	if (ctx->sort_direction == ACPI_SORT_ASCENDING) {
+		if (a_val < b_val)
+			return -1;
+		if (a_val > b_val)
+			return 1;
+	} else {
+		if (a_val > b_val)
+			return -1;
+		if (a_val < b_val)
+			return 1;
+	}
+
+	return 0;
+}
+
 /******************************************************************************
  *
  * FUNCTION:    acpi_ns_sort_list
@@ -873,31 +928,13 @@ static void
 acpi_ns_sort_list(union acpi_operand_object **elements,
 		  u32 count, u32 index, u8 sort_direction)
 {
-	union acpi_operand_object *obj_desc1;
-	union acpi_operand_object *obj_desc2;
-	union acpi_operand_object *temp_obj;
-	u32 i;
-	u32 j;
-
-	/* Simple bubble sort */
-
-	for (i = 1; i < count; i++) {
-		for (j = (count - 1); j >= i; j--) {
-			obj_desc1 = elements[j - 1]->package.elements[index];
-			obj_desc2 = elements[j]->package.elements[index];
-
-			if (((sort_direction == ACPI_SORT_ASCENDING) &&
-			     (obj_desc1->integer.value >
-			      obj_desc2->integer.value))
-			    || ((sort_direction == ACPI_SORT_DESCENDING)
-				&& (obj_desc1->integer.value <
-				    obj_desc2->integer.value))) {
-				temp_obj = elements[j - 1];
-				elements[j - 1] = elements[j];
-				elements[j] = temp_obj;
-			}
-		}
-	}
+	struct acpi_sort_context ctx;
+
+	ctx.sort_index = index;
+	ctx.sort_direction = sort_direction;
+
+	sort_r(elements, count, sizeof(union acpi_operand_object *),
+	       acpi_ns_sort_cmp, NULL, &ctx);
 }
 
 /******************************************************************************
-- 
2.43.0

Re: [PATCH 1/2] ACPI: nsrepair2: Replace O(n²)bubble sort with O(n log n) sort_r()
Posted by David Laight 5 days, 18 hours ago
On Sun,  1 Feb 2026 13:03:33 +0000
Nick Huang <sef1548@gmail.com> wrote:

>    Replace the O(n²) bubble sort implementation in acpi_ns_sort_list()
>    with the kernel's sort_r() function which uses heapsort, providing
>    O(n log n) time complexity.
> 
>    This improves performance for large ACPI package lists while also
>    reducing code complexity by leveraging the existing kernel sort API.

What is the break even size?
While the heapsort is O(n long n) it is also more complicated.
There is also the cost of the function call - especially with all the
mitigations that distro kernels are likely to enable.

For large datasets the d-cache locality of both sorts is particularly horrid.
It is almost certainly better to allocate an array of index:value pairs
and sort that.
For very big datasets you want to sort small sections (that fit in the
d-cache) and then use merge sorts (also O(n log n)) to combine them.
(Yes - this is how you sort data with 3 mag-tape drives....)

Oh, in any case, write separate functions for ascending/descending.

	David

> 
> Signed-off-by: Nick Huang <sef1548@gmail.com>
> ---
>  drivers/acpi/acpica/nsrepair2.c | 87 +++++++++++++++++++++++----------
>  1 file changed, 62 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/acpi/acpica/nsrepair2.c b/drivers/acpi/acpica/nsrepair2.c
> index 8dbb870f4..a39ef59fe 100644
> --- a/drivers/acpi/acpica/nsrepair2.c
> +++ b/drivers/acpi/acpica/nsrepair2.c
> @@ -9,6 +9,7 @@
>   *****************************************************************************/
>  
>  #include <acpi/acpi.h>
> +#include <linux/sort.h>
>  #include "accommon.h"
>  #include "acnamesp.h"
>  
> @@ -84,6 +85,14 @@ acpi_ns_check_sorted_list(struct acpi_evaluate_info *info,
>  static void
>  acpi_ns_remove_element(union acpi_operand_object *obj_desc, u32 index);
>  
> +/* Context structure for sort comparison function */
> +struct acpi_sort_context {
> +	u32 sort_index;
> +	u8 sort_direction;
> +};
> +
> +static int acpi_ns_sort_cmp(const void *a, const void *b, const void *priv);
> +
>  static void
>  acpi_ns_sort_list(union acpi_operand_object **elements,
>  		  u32 count, u32 index, u8 sort_direction);
> @@ -851,6 +860,52 @@ acpi_ns_check_sorted_list(struct acpi_evaluate_info *info,
>  	return (AE_OK);
>  }
>  
> +/******************************************************************************
> + *
> + * FUNCTION:    acpi_ns_sort_cmp
> + *
> + * PARAMETERS:  a               - First element to compare
> + *              b               - Second element to compare
> + *              priv            - Pointer to sort context (acpi_sort_context)
> + *
> + * RETURN:      -1, 0, or 1 depending on sort order
> + *
> + * DESCRIPTION: Comparison function for sort_r() API. Compares the integer
> + *              values at the specified index within package elements.
> + *
> + *****************************************************************************/
> +
> +static int acpi_ns_sort_cmp(const void *a, const void *b, const void *priv)
> +{
> +	union acpi_operand_object *obj_a = *(union acpi_operand_object **)a;
> +	union acpi_operand_object *obj_b = *(union acpi_operand_object **)b;
> +	const struct acpi_sort_context *ctx = priv;
> +	union acpi_operand_object *value_a;
> +	union acpi_operand_object *value_b;
> +	u64 a_val;
> +	u64 b_val;
> +
> +	value_a = obj_a->package.elements[ctx->sort_index];
> +	value_b = obj_b->package.elements[ctx->sort_index];
> +
> +	a_val = value_a->integer.value;
> +	b_val = value_b->integer.value;
> +
> +	if (ctx->sort_direction == ACPI_SORT_ASCENDING) {
> +		if (a_val < b_val)
> +			return -1;
> +		if (a_val > b_val)
> +			return 1;
> +	} else {
> +		if (a_val > b_val)
> +			return -1;
> +		if (a_val < b_val)
> +			return 1;
> +	}
> +
> +	return 0;
> +}
> +
>  /******************************************************************************
>   *
>   * FUNCTION:    acpi_ns_sort_list
> @@ -873,31 +928,13 @@ static void
>  acpi_ns_sort_list(union acpi_operand_object **elements,
>  		  u32 count, u32 index, u8 sort_direction)
>  {
> -	union acpi_operand_object *obj_desc1;
> -	union acpi_operand_object *obj_desc2;
> -	union acpi_operand_object *temp_obj;
> -	u32 i;
> -	u32 j;
> -
> -	/* Simple bubble sort */
> -
> -	for (i = 1; i < count; i++) {
> -		for (j = (count - 1); j >= i; j--) {
> -			obj_desc1 = elements[j - 1]->package.elements[index];
> -			obj_desc2 = elements[j]->package.elements[index];
> -
> -			if (((sort_direction == ACPI_SORT_ASCENDING) &&
> -			     (obj_desc1->integer.value >
> -			      obj_desc2->integer.value))
> -			    || ((sort_direction == ACPI_SORT_DESCENDING)
> -				&& (obj_desc1->integer.value <
> -				    obj_desc2->integer.value))) {
> -				temp_obj = elements[j - 1];
> -				elements[j - 1] = elements[j];
> -				elements[j] = temp_obj;
> -			}
> -		}
> -	}
> +	struct acpi_sort_context ctx;
> +
> +	ctx.sort_index = index;
> +	ctx.sort_direction = sort_direction;
> +
> +	sort_r(elements, count, sizeof(union acpi_operand_object *),
> +	       acpi_ns_sort_cmp, NULL, &ctx);
>  }
>  
>  /******************************************************************************
Re: [PATCH 1/2] ACPI: nsrepair2: Replace O(n²) bubble sort with O(n log n) sort_r()
Posted by Nick Huang 4 days, 6 hours ago
David Laight <david.laight.linux@gmail.com> 於 2026年2月2日週一 上午6:48寫道:
>
> On Sun,  1 Feb 2026 13:03:33 +0000
> Nick Huang <sef1548@gmail.com> wrote:
>
> >    Replace the O(n²) bubble sort implementation in acpi_ns_sort_list()
> >    with the kernel's sort_r() function which uses heapsort, providing
> >    O(n log n) time complexity.
> >
> >    This improves performance for large ACPI package lists while also
> >    reducing code complexity by leveraging the existing kernel sort API.
>
> What is the break even size?
> While the heapsort is O(n long n) it is also more complicated.
> There is also the cost of the function call - especially with all the
> mitigations that distro kernels are likely to enable.
>
> For large datasets the d-cache locality of both sorts is particularly horrid.
> It is almost certainly better to allocate an array of index:value pairs
> and sort that.
> For very big datasets you want to sort small sections (that fit in the
> d-cache) and then use merge sorts (also O(n log n)) to combine them.
> (Yes - this is how you sort data with 3 mag-tape drives....)
>
> Oh, in any case, write separate functions for ascending/descending.
>
>         David
>
Hi David

  Thank you for your reply and the detailed feedback.


   I ran KUnit benchmarks to investigate your questions.


   === Break-Even Point ===


   I compared bubble sort vs heapsort across various array sizes:


     N  | Bubble(ns) | Heap(ns) | Faster

    ----|------------|----------|--------

      2 |         61 |       99 | bubble

      3 |         63 |      115 | bubble

      4 |         78 |      163 | bubble

      5 |         96 |      215 | bubble

      6 |        119 |      260 | bubble

      8 |        177 |      388 | bubble

     10 |        257 |      539 | bubble

     12 |        415 |      721 | bubble

     16 |        726 |     1044 | bubble

     20 |       1106 |     1484 | bubble

     32 |       2854 |     3091 | bubble


   Bubble sort is faster for all tested sizes. The break-even point was not

   reached within N=32, which covers all realistic ACPI use cases:


     - _ALR: 2-10 entries (ambient light response)

     - _CST: 2-8 entries (C-states)

     - _PSS: 5-20 entries (P-states)

     - _TSS: 2-16 entries (T-states)


   As you noted, heapsort has additional overhead from function calls and

   mitigations that outweigh its O(n log n) advantage at small N.


   === Separate Ascending/Descending Functions ===


   I also tested combined vs separate sort functions as you suggested:


     N  | Combined(ns) | Separate(ns) | Saved

    ----|--------------|--------------|------

      4 |           84 |           75 |  11%

      8 |          179 |          175 |   3%

     16 |          737 |          806 |  -9%


   The results are mixed. Separate functions show marginal improvement at

   small N, but the combined function performs better at N=16, possibly

   due to instruction cache effects.


   === Conclusion ===


   Given these results, replacing bubble sort with heapsort would likely

   degrade performance for typical ACPI workloads. The existing bubble

   sort implementation appears to be the right choice for this use case.



-- 
Regards,
Nick Huang