[v3] selinux: speed up avc_search_node() with large number of avc nodes

[PATCH v3 2/2] selinux: improve bucket distribution uniformity of avc_hash()

Posted by Hongru Zhang 5 days, 17 hours ago

From: Hongru Zhang <zhanghongru@xiaomi.com>

Under heavy stress testing (on an 8-core system sustaining over 50,000
authentication events per second), sample once per second and take the
mean of 1800 samples:

1. Bucket utilization rate and length of longest chain
+--------------------------+-----------------------------------------+
|                          | bucket utilization rate / longest chain |
|                          +--------------------+--------------------+
|                          |      no-patch      |     with-patch     |
+--------------------------+--------------------+--------------------+
|  512 nodes,  512 buckets |      52.5%/7.5     |     58.2%/6.2      |
+--------------------------+--------------------+--------------------+
| 1024 nodes,  512 buckets |      68.9%/12.1    |     82.4%/8.9      |
+--------------------------+--------------------+--------------------+
| 2048 nodes,  512 buckets |      83.7%/19.4    |     94.8%/15.2     |
+--------------------------+--------------------+--------------------+
| 8192 nodes, 8192 buckets |      49.5%/11.4    |     61.9%/6.6      |
+--------------------------+--------------------+--------------------+

2. avc_search_node latency (total latency of hash operation and table
lookup)
+--------------------------+-----------------------------------------+
|                          |   latency of function avc_search_node   |
|                          +--------------------+--------------------+
|                          |      no-patch      |     with-patch     |
+--------------------------+--------------------+--------------------+
|  512 nodes,  512 buckets |        87ns        |        79ns        |
+--------------------------+--------------------+--------------------+
| 1024 nodes,  512 buckets |        97ns        |        91ns        |
+--------------------------+--------------------+--------------------+
| 2048 nodes,  512 buckets |       118ns        |       110ns        |
+--------------------------+--------------------+--------------------+
| 8192 nodes, 8192 buckets |       106ns        |        94ns        |
+--------------------------+--------------------+--------------------+

Although the multiplication in the new hash algorithm has higher overhead
than the bitwise operations in the original algorithm, the data shows
that the new algorithm achieves better distribution, reducing average
lookup time. Consequently, the total latency of hashing and table lookup
is lower than before.

Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>
---
 security/selinux/avc.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/security/selinux/avc.c b/security/selinux/avc.c
index 7a7f88012865..fc631d1097bc 100644
--- a/security/selinux/avc.c
+++ b/security/selinux/avc.c
@@ -146,9 +146,24 @@ static struct kmem_cache *avc_xperms_data_cachep __ro_after_init;
 static struct kmem_cache *avc_xperms_decision_cachep __ro_after_init;
 static struct kmem_cache *avc_xperms_cachep __ro_after_init;
 
+/*
+ * Advantages of this hash design:
+ *     - Minimized collisions: Different inputs won't produce similar
+ *       contributions
+ *     - Bit diffusion: Each constant effectively scrambles input bits
+ *     - Mathematical guarantee: These constants are theoretically analyzed
+ *       and empirically validated
+ *     - Complementarity: Three constants complement each other at the
+ *       binary level
+ */
+#define C1 0x9E3779B9	/* 2^32 multiplied by Golden Ratio, classic constant
+			 * for Knuth's multiplicative hashing
+			 */
+#define C2 0x85EBCA77	/* Large prime-like properties */
+#define C3 0xC2B2AE35	/* Large prime-like properties, MurmurHash3 constant */
 static inline u32 avc_hash(u32 ssid, u32 tsid, u16 tclass)
 {
-	return (ssid ^ (tsid<<2) ^ (tclass<<4)) & (avc_cache_slots - 1);
+	return (ssid * C1 + tsid * C2 + tclass * C3) & (avc_cache_slots - 1);
 }
 
 /**
-- 
2.43.0

Re: [PATCH v3 2/2] selinux: improve bucket distribution uniformity of avc_hash()

Posted by Stephen Smalley 5 days, 11 hours ago

On Fri, Sep 26, 2025 at 2:23 AM Hongru Zhang <zhanghongru06@gmail.com> wrote:
>
> From: Hongru Zhang <zhanghongru@xiaomi.com>
>
> Under heavy stress testing (on an 8-core system sustaining over 50,000
> authentication events per second), sample once per second and take the
> mean of 1800 samples:
>
> 1. Bucket utilization rate and length of longest chain
> +--------------------------+-----------------------------------------+
> |                          | bucket utilization rate / longest chain |
> |                          +--------------------+--------------------+
> |                          |      no-patch      |     with-patch     |
> +--------------------------+--------------------+--------------------+
> |  512 nodes,  512 buckets |      52.5%/7.5     |     58.2%/6.2      |
> +--------------------------+--------------------+--------------------+
> | 1024 nodes,  512 buckets |      68.9%/12.1    |     82.4%/8.9      |
> +--------------------------+--------------------+--------------------+
> | 2048 nodes,  512 buckets |      83.7%/19.4    |     94.8%/15.2     |
> +--------------------------+--------------------+--------------------+
> | 8192 nodes, 8192 buckets |      49.5%/11.4    |     61.9%/6.6      |
> +--------------------------+--------------------+--------------------+
>
> 2. avc_search_node latency (total latency of hash operation and table
> lookup)
> +--------------------------+-----------------------------------------+
> |                          |   latency of function avc_search_node   |
> |                          +--------------------+--------------------+
> |                          |      no-patch      |     with-patch     |
> +--------------------------+--------------------+--------------------+
> |  512 nodes,  512 buckets |        87ns        |        79ns        |
> +--------------------------+--------------------+--------------------+
> | 1024 nodes,  512 buckets |        97ns        |        91ns        |
> +--------------------------+--------------------+--------------------+
> | 2048 nodes,  512 buckets |       118ns        |       110ns        |
> +--------------------------+--------------------+--------------------+
> | 8192 nodes, 8192 buckets |       106ns        |        94ns        |
> +--------------------------+--------------------+--------------------+
>
> Although the multiplication in the new hash algorithm has higher overhead
> than the bitwise operations in the original algorithm, the data shows
> that the new algorithm achieves better distribution, reducing average
> lookup time. Consequently, the total latency of hashing and table lookup
> is lower than before.
>
> Signed-off-by: Hongru Zhang <zhanghongru@xiaomi.com>

One minor nit below but you can wait and see what Paul says.
Otherwise,
Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>

> ---
>  security/selinux/avc.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/security/selinux/avc.c b/security/selinux/avc.c
> index 7a7f88012865..fc631d1097bc 100644
> --- a/security/selinux/avc.c
> +++ b/security/selinux/avc.c
> @@ -146,9 +146,24 @@ static struct kmem_cache *avc_xperms_data_cachep __ro_after_init;
>  static struct kmem_cache *avc_xperms_decision_cachep __ro_after_init;
>  static struct kmem_cache *avc_xperms_cachep __ro_after_init;
>
> +/*
> + * Advantages of this hash design:
> + *     - Minimized collisions: Different inputs won't produce similar
> + *       contributions
> + *     - Bit diffusion: Each constant effectively scrambles input bits
> + *     - Mathematical guarantee: These constants are theoretically analyzed
> + *       and empirically validated
> + *     - Complementarity: Three constants complement each other at the
> + *       binary level
> + */
> +#define C1 0x9E3779B9  /* 2^32 multiplied by Golden Ratio, classic constant
> +                        * for Knuth's multiplicative hashing
> +                        */

Personally, I would put this comment on lines above the #define rather
than alongside it since it is a multi-line comment. But you can wait
and see what Paul prefers.

> +#define C2 0x85EBCA77  /* Large prime-like properties */
> +#define C3 0xC2B2AE35  /* Large prime-like properties, MurmurHash3 constant */
>  static inline u32 avc_hash(u32 ssid, u32 tsid, u16 tclass)
>  {
> -       return (ssid ^ (tsid<<2) ^ (tclass<<4)) & (avc_cache_slots - 1);
> +       return (ssid * C1 + tsid * C2 + tclass * C3) & (avc_cache_slots - 1);
>  }
>
>  /**
> --
> 2.43.0
>

[PATCH v3 1/2] selinux: Make avc cache slot size configurable during boot
[PATCH v3 2/2] selinux: improve bucket distribution uniformity of avc_hash()