From nobody Wed Dec 17 18:01:32 2025 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AAE0713E88C for ; Sun, 15 Dec 2024 23:06:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304015; cv=none; b=MR0kasJl8kJbWYDrJmtPPaRQarBN+JPlXXvfRX262Yf4oMSLr0vEHCkBFHJy0g156HYDwTK6ww5oPmIvBAUEfqJhRUm+xVq4qS8hGSL6NK78l+87J4MMurg9nJ1d/loDyxVYdEUHXFrR3j/FOW32eWG39pswc3LNXtsOBvIHYgU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304015; c=relaxed/simple; bh=InvQp8csFucSAYHvhD3g6l+nVFgce1XtjBnaPhEq20I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HMZLG53ayR2VxQsHEz4YJRX+WYELjTy5Haq+UFeZxlO6txnIpUMfuDTZbyO78UreDXZ8Y2f9sCvYsqpGyjzjLb6LXLKeT1k29PIljLbd2WDY39hQHti4aMcaO+I2t7AR0KftxuuadIYm4pCx5l3/us+YI/bh6thmg117i5DjCW0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=oUlENTn1; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=PSxVCp5D; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="oUlENTn1"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="PSxVCp5D" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; 
s=2020; t=1734304006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gQw1ajNgbVfr/Cy7eKebvTHzkVUlLunUhj0hFbpNeGU=; b=oUlENTn11fS+Ps+HAxDLNvmDnSMpjmWsEOcSjenuBt1BP2lYbWKEFFRi2btesLeYGbOzii iM/xVZjI1JTyw/QIm8wt4QZTryaYfdDT7qz0t3lT7bffqAABh6WF4rOydSTul46Wvds3TR Ur+hqfZA6KklIRjx+wTEN6VwUSjlugeC63uwUNVxWHlbmQA2Ypy5Ww1MQbrxwhzX3P83ri 5HagoamJpit1fVj/1I5L5AdLYj73LC/AyYQVNZOYbz1MjOvCrxCC2Juv+3aIP1sCLERVI/ 6LZluqtMed4+kvLQ2ftRSCN0wb9603A2/tV8faL+NSFYBaNt3F8rJFgguAftTg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gQw1ajNgbVfr/Cy7eKebvTHzkVUlLunUhj0hFbpNeGU=; b=PSxVCp5Dqex9WilRqkXsfn+9tBzZK+wSjJvvwwhM6sCxqhKm9Y7vmuYXwtUM/Jk8H69VCK KTDDTGISl0SdU1AA==
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 01/14] futex: Create helper function to initialize a hash slot.
Date: Mon, 16 Dec 2024 00:00:05 +0100
Message-ID: <20241215230642.104118-2-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Factor out the futex_hash_bucket initialisation into a helper function.
The helper function will be used in a follow-up patch implementing
process private hash buckets.

Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/futex/core.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index ebdd76b4ecbba..d1d3c7b358b23 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -1124,6 +1124,13 @@ void futex_exit_release(struct task_struct *tsk)
 	futex_cleanup_end(tsk, FUTEX_STATE_DEAD);
 }
 
+static void futex_hash_bucket_init(struct futex_hash_bucket *fhb)
+{
+	atomic_set(&fhb->waiters, 0);
+	plist_head_init(&fhb->chain);
+	spin_lock_init(&fhb->lock);
+}
+
 static int __init futex_init(void)
 {
 	unsigned int futex_shift;
@@ -1141,11 +1148,8 @@ static int __init futex_init(void)
 					       futex_hashsize, futex_hashsize);
 	futex_hashsize = 1UL << futex_shift;
 
-	for (i = 0; i < futex_hashsize; i++) {
-		atomic_set(&futex_queues[i].waiters, 0);
-		plist_head_init(&futex_queues[i].chain);
-		spin_lock_init(&futex_queues[i].lock);
-	}
+	for (i = 0; i < futex_hashsize; i++)
+		futex_hash_bucket_init(&futex_queues[i]);
 
 	return 0;
 }
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F26E1BDA89 for ; Sun, 15 Dec 2024 23:06:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304016; cv=none; b=R4G9+Upc///tRhnqD26lRSkNjfiXFWEDOg0IeG+c4urzfJV78CuFIESlc6tErQsJX3ziux8NjNbdhnxAXbP3f0Z8ish2Em7gvn8kgaQ+g5lQ/iuzWjeD3mYs2OtF1NlAZxZ6pHhoT0YMQ8/qki/XPmCbF1Evzo42DXiM0al+804= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304016; c=relaxed/simple; bh=VFUo8fLP7fYv3J8uw9e2MvlXdsUe9QNKIk91xG+cLjw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=VvkOLo/GzdWJKyizsFKoVArbFsDKUXaXmNn6JYQhzBdcX14X4t4W9yhxh5tekzArtRkBN7iDA3LZE7UmvEpUzWsG4Rpt6aa5jPi4ldf4k0nibVzlpc6+RTpW9QbFd+o7UrYKHznnMyy/jeM17zuUFB9cuT7yuEnE34UWup0o9oY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=lHB8c7uH; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=q4bx57LW; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="lHB8c7uH"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="q4bx57LW" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+PNdyDDnxj0+A1kbyMNndI39hLrigmWKLKP/EyyYtKE=; b=lHB8c7uHUQ39Pqg9/gB0McTC7jq3MrMISzOnhtvAsBOhIbjia7jnWSWtvJ4iduNRg3I0l9 caRmhBkj5St//ekbPKn+mENKFkB8nFlnODvZhpYeVtsmafQtLHeZqY8WoMRqTCxqMpnqZA kJFq5axyhdIEE1OAMWYwo4bUYqVPYh3bC1CbJ5/lJqC5CqCsZ9ASUcnOeHY/clxPl5X71D sRdQvuqDyEJUzwJHixKrWMTT/3FPOYdPIghfFSC5Iu+LKYg7tewkl2GVec7Pp7pgGcXgC1 EfLhe/pbtsPmAgojHT/J8mwD+1jqbNOlpqVp2yCqTSUTzL62NvPve7RzeDNkRw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=+PNdyDDnxj0+A1kbyMNndI39hLrigmWKLKP/EyyYtKE=; b=q4bx57LWMcnQieDZmzF8AKXZl6PoHqB3FkGj5zSbcilqRxFGmZO//F8oB4D9P6sZLwb4wY wbmEl3dYQrIL2zAg==
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 02/14] futex: Add basic infrastructure for local task local hash.
Date: Mon, 16 Dec 2024 00:00:06 +0100
Message-ID: <20241215230642.104118-3-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The futex hashmap is system wide and shared by random tasks. Each slot
is hashed based on its address and VMA. Due to randomized VMAs (and
memory allocations) the same logical lock (pointer) can end up in a
different hash bucket on each invocation of the application. This in
turn means that different applications may share a hash bucket on the
first invocation but not on the second, and it is not always clear which
applications will be involved. This can result in high latencies when
acquiring the futex_hash_bucket::lock, especially if the lock owner is
limited to a CPU and cannot be effectively PI boosted.

Introduce a task local hash map. The hashmap can be allocated via
	prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, 0)

The `0' argument allocates a default number of 16 slots; a higher number
can be specified if desired. The current upper limit is 131072.
The allocated hashmap is used by all threads within a process.
A thread can check if the private map has been allocated via
	prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS);

which returns the current number of slots.

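Editorial sketch, not part of the patch: the slot-count handling in this
patch (0 selects the default of 16, the value is clamped to the range
2..131072 and rounded down to a power of two, and the stored mask is
slots - 1) can be modelled in user space roughly as below. The names
futex_slots_normalize(), futex_slots_to_mask() and round_down_pow2() are
hypothetical helpers invented for this illustration; the kernel code uses
is_power_of_2() and rounddown_pow_of_two() inside futex_hash_allocate().

```c
/* User-space model only; constants are taken from the patch text. */
#define DEMO_FUTEX_DEFAULT_SLOTS	16u
#define DEMO_FUTEX_MIN_SLOTS		2u
#define DEMO_FUTEX_MAX_SLOTS		131072u

/* Hypothetical helper: largest power of two that is <= n, for n >= 1. */
unsigned int round_down_pow2(unsigned int n)
{
	unsigned int p = 1;

	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Hypothetical helper mirroring the order of checks in the patch:
 * default for 0, clamp to [2, 131072], then round down to a power of two.
 */
unsigned int futex_slots_normalize(unsigned int requested)
{
	if (requested == 0)
		requested = DEMO_FUTEX_DEFAULT_SLOTS;
	if (requested < DEMO_FUTEX_MIN_SLOTS)
		requested = DEMO_FUTEX_MIN_SLOTS;
	if (requested > DEMO_FUTEX_MAX_SLOTS)
		requested = DEMO_FUTEX_MAX_SLOTS;
	return round_down_pow2(requested);
}

/* With a power-of-two slot count, "hash & mask" selects a valid slot. */
unsigned int futex_slots_to_mask(unsigned int slots)
{
	return slots - 1;
}
```

Under these assumptions a request for 100 slots would yield 64 slots and
a mask of 63, which is also the value PR_FUTEX_HASH_GET_SLOTS would be
expected to report (mask + 1).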
Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/futex.h      | 20 ++++++++
 include/linux/mm_types.h   |  5 ++
 include/uapi/linux/prctl.h |  5 ++
 kernel/fork.c              |  2 +
 kernel/futex/core.c        | 99 ++++++++++++++++++++++++++++++++++++--
 kernel/sys.c               |  4 ++
 6 files changed, 132 insertions(+), 3 deletions(-)

diff --git a/include/linux/futex.h b/include/linux/futex.h
index b70df27d7e85c..943828db52234 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -77,6 +77,15 @@ void futex_exec_release(struct task_struct *tsk);
 
 long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 	      u32 __user *uaddr2, u32 val2, u32 val3);
+int futex_hash_prctl(unsigned long arg2, unsigned long arg3);
+int futex_hash_allocate_default(void);
+void futex_hash_free(struct mm_struct *mm);
+
+static inline void futex_mm_init(struct mm_struct *mm)
+{
+	mm->futex_hash_bucket = NULL;
+}
+
 #else
 static inline void futex_init_task(struct task_struct *tsk) { }
 static inline void futex_exit_recursive(struct task_struct *tsk) { }
@@ -88,6 +97,17 @@ static inline long do_futex(u32 __user *uaddr, int op, u32 val,
 {
 	return -EINVAL;
 }
+static inline int futex_hash_prctl(unsigned long arg2, unsigned long arg3)
+{
+	return -EINVAL;
+}
+static inline int futex_hash_allocate_default(void)
+{
+	return 0;
+}
+static inline void futex_hash_free(struct mm_struct *mm) { }
+static inline void futex_mm_init(struct mm_struct *mm) { }
+
 #endif
 
 #endif
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7361a8f3ab68e..2337a2e481fd0 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -30,6 +30,7 @@
 #define INIT_PASID	0
 
 struct address_space;
+struct futex_hash_bucket;
 struct mem_cgroup;
 
 /*
@@ -902,6 +903,10 @@ struct mm_struct {
 		int mm_lock_seq;
 #endif
 
+#ifdef CONFIG_FUTEX
+		unsigned int			futex_hash_mask;
+		struct futex_hash_bucket	*futex_hash_bucket;
+#endif
 
 		unsigned long hiwater_rss; /* High-watermark of RSS usage */
 		unsigned long hiwater_vm;  /* High-water virtual memory usage */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 5c6080680cb27..55b843644c51a 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -353,4 +353,9 @@ struct prctl_mm_map {
  */
 #define PR_LOCK_SHADOW_STACK_STATUS      76
 
+/* FUTEX hash management */
+#define PR_FUTEX_HASH			77
+# define PR_FUTEX_HASH_SET_SLOTS	1
+# define PR_FUTEX_HASH_GET_SLOTS	2
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 1450b461d196a..cda8886f3a1d7 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1284,6 +1284,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	RCU_INIT_POINTER(mm->exe_file, NULL);
 	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
+	futex_mm_init(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !defined(CONFIG_SPLIT_PMD_PTLOCKS)
 	mm->pmd_huge_pte = NULL;
 #endif
@@ -1361,6 +1362,7 @@ static inline void __mmput(struct mm_struct *mm)
 	if (mm->binfmt)
 		module_put(mm->binfmt->module);
 	lru_gen_del_mm(mm);
+	futex_hash_free(mm);
 	mmdrop(mm);
 }
 
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index d1d3c7b358b23..b87bd27b73707 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 
 #include "futex.h"
 #include "../locking/rtmutex_common.h"
@@ -107,18 +108,40 @@ late_initcall(fail_futex_debugfs);
 
 #endif /* CONFIG_FAIL_FUTEX */
 
+static inline bool futex_key_is_private(union futex_key *key)
+{
+	/*
+	 * Relies on get_futex_key() to set either bit for shared
+	 * futexes -- see comment with union futex_key.
+	 */
+	return !(key->both.offset & (FUT_OFF_INODE | FUT_OFF_MMSHARED));
+}
+
 /**
  * futex_hash - Return the hash bucket in the global hash
  * @key:	Pointer to the futex key for which the hash is calculated
  *
  * We hash on the keys returned from get_futex_key (see below) and return the
- * corresponding hash bucket in the global hash.
+ * corresponding hash bucket in the global hash. If the FUTEX is private and
+ * a local hash table is provided then this one is used.
  */
 struct futex_hash_bucket *futex_hash(union futex_key *key)
 {
-	u32 hash = jhash2((u32 *)key, offsetof(typeof(*key), both.offset) / 4,
-			  key->both.offset);
+	struct futex_hash_bucket *fhb;
+	u32 hash;
 
+	fhb = current->mm->futex_hash_bucket;
+	if (fhb && futex_key_is_private(key)) {
+		u32 hash_mask = current->mm->futex_hash_mask;
+
+		hash = jhash2((u32 *)key,
+			      offsetof(typeof(*key), both.offset) / 4,
+			      key->both.offset);
+		return &fhb[hash & hash_mask];
+	}
+	hash = jhash2((u32 *)key,
+		      offsetof(typeof(*key), both.offset) / 4,
+		      key->both.offset);
 	return &futex_queues[hash & (futex_hashsize - 1)];
 }
 
@@ -1131,6 +1154,76 @@ static void futex_hash_bucket_init(struct futex_hash_bucket *fhb)
 	spin_lock_init(&fhb->lock);
 }
 
+void futex_hash_free(struct mm_struct *mm)
+{
+	kvfree(mm->futex_hash_bucket);
+}
+
+static int futex_hash_allocate(unsigned int hash_slots)
+{
+	struct futex_hash_bucket *fhb;
+	int i;
+
+	if (current->mm->futex_hash_bucket)
+		return -EALREADY;
+
+	if (!thread_group_leader(current))
+		return -EINVAL;
+
+	if (hash_slots == 0)
+		hash_slots = 16;
+	if (hash_slots < 2)
+		hash_slots = 2;
+	if (hash_slots > 131072)
+		hash_slots = 131072;
+	if (!is_power_of_2(hash_slots))
+		hash_slots = rounddown_pow_of_two(hash_slots);
+
+	fhb = kvmalloc_array(hash_slots, sizeof(struct futex_hash_bucket), GFP_KERNEL_ACCOUNT);
+	if (!fhb)
+		return -ENOMEM;
+
+	current->mm->futex_hash_mask = hash_slots - 1;
+
+	for (i = 0; i < hash_slots; i++)
+		futex_hash_bucket_init(&fhb[i]);
+
+	current->mm->futex_hash_bucket = fhb;
+	return 0;
+}
+
+int futex_hash_allocate_default(void)
+{
+	return futex_hash_allocate(0);
+}
+
+static int futex_hash_get_slots(void)
+{
+	if (current->mm->futex_hash_bucket)
+		return current->mm->futex_hash_mask + 1;
+	return 0;
+}
+
+int futex_hash_prctl(unsigned long arg2, unsigned long arg3)
+{
+	int ret;
+
+	switch (arg2) {
+	case PR_FUTEX_HASH_SET_SLOTS:
+		ret = futex_hash_allocate(arg3);
+		break;
+
+	case PR_FUTEX_HASH_GET_SLOTS:
+		ret = futex_hash_get_slots();
+		break;
+
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	return ret;
+}
+
 static int __init futex_init(void)
 {
 	unsigned int futex_shift;
diff --git a/kernel/sys.c b/kernel/sys.c
index c4c701c6f0b4d..d8081f1d07d11 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -52,6 +52,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -2809,6 +2810,9 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			return -EINVAL;
 		error = arch_lock_shadow_stack_status(me, arg2);
 		break;
+	case PR_FUTEX_HASH:
+		error = futex_hash_prctl(arg2, arg3);
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F2C01BF804 for ; Sun, 15 Dec 2024 23:06:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304016; cv=none; b=qAGIMa0y3fIQ1OTWpaB6+6VwzovqsT9thNP4Qf41S1VKTJ9tNA58qRo1hlolAX91x5gmNTlO76r4geOUPqCSbyLOjtsXMsFMFVN/CKQ5JMR/rdGYisXXBmZp3SctrXJ5omFkY90la95Zvr/bUXNL3ilt/oA66ab8JgoVk8ohkf0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304016; c=relaxed/simple; bh=iuTiLj2z21pkOsFUc4CUzne/vJKYqz8+5pclDz3yn3E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Cuj3EzI0r5vQltjPd3aAsWZKeB5zUMDK28P7V8kXHERjrqRjA+aQ7SlA+U4oO5lRwDDnb+nHd89wEioM78Bq+C2UlcRzr6UkbuM3/+a32GjL0fDyJjYAWQGwE6ce6+SSA7pA/oGkiH4zSzB4mzdr1EwmZwGUKWhJwGOD0KM4iqo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass 
(p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=Tjx25s4p; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=rpcPPMDp; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="Tjx25s4p"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="rpcPPMDp" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=t3raVT5zrCpWLkmYLc1TieRa21nw1jX0UrDlRecHYYg=; b=Tjx25s4pSvw7czbpFUfhH4zrJyy3bGMikuvVPCwsWWjLdc5eWcx7xntTy/IJdbYc4LNGoi GzxbrOvEXzSdG8fK0t+F7b1V5+mZFYsghINS+cQkNiXsrqa2hwIE9iMVLnUMl89f1h6KEA 9TtCg9vxJQnrTn29TF9HFG9XlSFu55tPirNcKt6fizBuC7gP0bS6wO68I3DIdZ9P55tK6A +wAGTirRd9xapmK1eCE0BnvWUZmauHHHJL8Ibz/miMUcMI89WaG7TepRnijycFSX+XVulV N9WEfWOAu2OVggUDS4vUVa38YcN9VpNpE/9kYL+Zh3DQ433hWRH9iTrl/nQZlg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=t3raVT5zrCpWLkmYLc1TieRa21nw1jX0UrDlRecHYYg=; b=rpcPPMDp/IrAueoLPdBlJm1yybzjzAFea7UH4jn/ETmCU978NSyAPYkS/wpk1KnBdla8XR xbt3pjOQb5EJjvAA== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart 
, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 03/14] futex: Allow automatic allocation of process wide futex hash.
Date: Mon, 16 Dec 2024 00:00:07 +0100
Message-ID: <20241215230642.104118-4-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Allocate a default futex hash if a task forks its first thread.

Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/futex.h | 12 ++++++++++++
 kernel/fork.c         | 24 ++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/include/linux/futex.h b/include/linux/futex.h
index 943828db52234..bad377c30de5e 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -86,6 +86,13 @@ static inline void futex_mm_init(struct mm_struct *mm)
 	mm->futex_hash_bucket = NULL;
 }
 
+static inline bool futex_hash_requires_allocation(void)
+{
+	if (current->mm->futex_hash_bucket)
+		return false;
+	return true;
+}
+
 #else
 static inline void futex_init_task(struct task_struct *tsk) { }
 static inline void futex_exit_recursive(struct task_struct *tsk) { }
@@ -108,6 +115,11 @@ static inline int futex_hash_allocate_default(void)
 static inline void futex_hash_free(struct mm_struct *mm) { }
 static inline void futex_mm_init(struct mm_struct *mm) { }
 
+static inline bool futex_hash_requires_allocation(void)
+{
+	return false;
+}
+
 #endif
 
 #endif
diff --git a/kernel/fork.c b/kernel/fork.c
index cda8886f3a1d7..e34bb2a107a9d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2130,6 +2130,15 @@ static void rv_task_fork(struct task_struct *p)
 #define rv_task_fork(p) do {} while (0)
 #endif
 
+static bool need_futex_hash_allocate_default(u64 clone_flags)
+{
+	if ((clone_flags & (CLONE_THREAD | CLONE_VM)) != (CLONE_THREAD | CLONE_VM))
+		return false;
+	if (!thread_group_empty(current))
+		return false;
+	return futex_hash_requires_allocation();
+}
+
 /*
  * This creates a new process as a copy of the old one,
  * but does not actually start it yet.
@@ -2507,6 +2516,21 @@ __latent_entropy struct task_struct *copy_process(
 	if (retval)
 		goto bad_fork_cancel_cgroup;
 
+	/*
+	 * Allocate a default futex hash for the user process once the first
+	 * thread spawns.
+	 */
+	if (need_futex_hash_allocate_default(clone_flags)) {
+		retval = futex_hash_allocate_default();
+		if (retval)
+			goto bad_fork_core_free;
+		/*
+		 * If we fail beyond this point we don't free the allocated
+		 * futex hash map. We assume that another thread will be created
+		 * and will make use of it. The hash map will be freed once the
+		 * main thread terminates.
+		 */
+	}
 	/*
 	 * From this point on we must avoid any synchronous user-space
 	 * communication until we take the tasklist-lock. In particular, we do
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F20A1885A5 for ; Sun, 15 Dec 2024 23:06:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304016; cv=none; b=YQMhXxO9oznLjYtEl/mNlwAFUsxl7N3V3T3y5NLMS4qAj4XQZtI/1sYKwIKJB+JIahnpehtRauEL7Ls1bIuEWgpNrcMtvNwz9R5c5idpzvLOR6jKGXHjGWY23UUSFaJ+vjQSDb+CfQ1RY6UN2e59wcAwdRqLWdHsKfupBph51bU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304016; c=relaxed/simple; bh=KHeSjPd0zRCAGUtxGFlcWOPrmx9y+ukDcIFpzj3jkJU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YdphbcoZP6xBF0NDQndhO/l9CGCu/Jh0woBjqwiQZhsxM7u337sJrvV1ECN1CQh23BZ4JHyIt0NAJ8Znc0FhpmhDw5oLLNqkWOulOt+YVQ6iHjyF+82tZDVV7JGxvO6XwrS6L4Ya1DFJRJ9JyHZxGiIIdydc50j2QHtbpc9A07I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=Wj1RtPGw; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=ROwnjMX9; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="Wj1RtPGw"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="ROwnjMX9" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XOfEVIxyuzb94jSGu1mkbITIF88QTRaDLmkNO646qMs=; b=Wj1RtPGwGqK4iS882X2utHk7Xonbp1mBivUjy0D1YuCupWOe/9vnnbBqZJCOap/LzbNHUk TyDSggtUpL5IVLMO6RjPOvn3/g/kqQHETd+yvdMGvQK6dEDbdh8L0I9UyWb8SytEhXbUlp LqrTGnFqYe1R9TuI5YVgEUduV3+Sc1HJWLyJl+nHxQf/KOUgu+hEa92D2oeCIfNQ+05WFa Qyiqt8sW2PbOlK7jmSUcKJqro7XRU6LH4kKOX6kRw0QVnNjAwX242Y4kwfs/HWwCCYxc8D fBT1tLwiZGJtpfhj/SnUF/81fbibzp1GrdzbTKLd13rxYdTq0v38+wT527AXRg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XOfEVIxyuzb94jSGu1mkbITIF88QTRaDLmkNO646qMs=; b=ROwnjMX9Pjj3paSZUe1UCai39B5ThLmwtZ6/SrGAB75Vpz+807rKiO18QToxYefKGfymJi n6YQfDa9AKSAgnAg==
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 04/14] futex: Hash only the address for private futexes.
Date: Mon, 16 Dec 2024 00:00:08 +0100
Message-ID: <20241215230642.104118-5-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

futex_hash() passes the whole futex_key to jhash2. The first two members
are passed as the first argument and the offset as the "initial value".

For private futexes, the mm part is always the same and it is used only
within the process. By excluding the mm part from the hash, we reduce
the length passed to jhash2 from 4 (16 / 4) to 2 (8 / 4). This avoids
the __jhash_mix() part of jhash.

The resulting code is smaller and, based on testing, this variant
performs as well as the original or slightly better.

Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/futex/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index b87bd27b73707..583a0149d62c9 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -134,8 +134,8 @@ struct futex_hash_bucket *futex_hash(union futex_key *key)
 	if (fhb && futex_key_is_private(key)) {
 		u32 hash_mask = current->mm->futex_hash_mask;
 
-		hash = jhash2((u32 *)key,
-			      offsetof(typeof(*key), both.offset) / 4,
+		hash = jhash2((void *)&key->private.address,
+			      sizeof(key->private.address) / 4,
 			      key->both.offset);
 		return &fhb[hash & hash_mask];
 	}
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA7671C54BF for ; Sun, 15 Dec 2024 23:06:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304018; cv=none; b=hOm//SmmmJiuPKKTq2O/9IIPQm8EgLiaswT5wu8L16rEPXEmhsqrr37/ToTQccs2WakTM+LPfTnAUfjy/TdglgjlJEiHphaKkzGF9DQK+y7Z28YTpAwX1h2ZeDlZ7MrPTm/hueP655d5GeGmxb6PuQpaNFAYUOixDh1G6qnXxwk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304018; c=relaxed/simple; bh=XCZnUcVRTg/OdBNdYd0llHPmAfBHZiYKdMr5bvCqgZs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=c7rHyD6C2XustAh+IeavmnEpaKjOQrPMCC+v5UDPb/xWYrRl6nqpVsnOSxyLDcgw6aZNKEl0C/E1FQLpHLS83ot++6vW+5Vi0AwL7C6H+KF8CZQtR17wb+qTOQlHi9m1TAfWvAZZFT43HRKjaTibkmxWWkZEN89FAc1/MYqx78E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=zyYGNDBS; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=fiRxPRgi; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="zyYGNDBS"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="fiRxPRgi" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FjMoQW4IUAh/X7O25zMu1DmUj0sv7Y/29SSuFX54hl0=; b=zyYGNDBS6MzYtimsNxXaj3pKa/3MHj/OtFxs8T2zMbQeGC3XS0zMziBarTx61dLQZ7mzyA 499z23StgdVeDvXidER2SDApQ0mrdd8dHhcvemX6UwlO6W4mrc4N+bMi5hUNI0gMBpfLDW y5J1Tyc0PCHpXqITVN6Yj69VhB6u6ZDMmGcHPPig6b0BdW7tGLEQ39blCW1GT/tmihHy/x jM3B2BS8MhpjKnHNk32KRuqUJGCY12yxFzx5O358OGNI3k1azYHXlYOq3Qy6g7u1hwojFY qFzaLrODOWJCaNYVXUDMQbDWsWETSbqGzrGnvp3rkHV+QsFRNeDqHuFOKeixkg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=FjMoQW4IUAh/X7O25zMu1DmUj0sv7Y/29SSuFX54hl0=; b=fiRxPRgi2xx9o6KwA997m5S8tC3VBNjFoGe/rjzjs2yiwpaUPhfv9W1uL3nf+g9fchTIC+ T2zm6Ix3re5aIxAw==
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 05/14] futex: Move private hashing into its own function.
Date: Mon, 16 Dec 2024 00:00:09 +0100
Message-ID: <20241215230642.104118-6-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The hashing of private futexes is slightly different and will be needed
again when moving a futex_q entry to a different hash bucket after a
resize.

Move the private hashing into its own function.

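Editorial sketch, not part of the patch: the private hashing that this
patch factors out can be modelled in user space as below. Only the
address words are hashed, the offset serves as the seed, and the bucket
is selected with "hash & mask". struct demo_key, demo_hash_words() and
demo_private_bucket() are hypothetical names, a 64-bit address is
assumed, and demo_hash_words() is a simple FNV-style stand-in mixer, NOT
the kernel's jhash2().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the private part of union futex_key. */
struct demo_key {
	uint64_t address;	/* assumed 64-bit user address */
	uint32_t offset;	/* used as the hash seed */
};

/* Stand-in mixer over 32-bit words; NOT jhash2(). */
uint32_t demo_hash_words(const uint32_t *words, size_t n, uint32_t seed)
{
	uint32_t h = seed ^ 2166136261u;
	size_t i;

	for (i = 0; i < n; i++) {
		h ^= words[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * Returns the bucket index for a private key. hash_mask is expected to
 * be "slots - 1" with slots being a power of two.
 */
uint32_t demo_private_bucket(const struct demo_key *key, uint32_t hash_mask)
{
	/* Two 32-bit words, matching a length of 8 / 4 = 2. */
	uint32_t words[2] = {
		(uint32_t)key->address,
		(uint32_t)(key->address >> 32),
	};

	return demo_hash_words(words, 2, key->offset) & hash_mask;
}
```

Because the index depends only on the key and the mask, the same helper
can be reused to find an entry's new bucket after the table has been
resized, which is the reuse the commit message refers to.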
Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/futex/core.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 583a0149d62c9..907b76590df16 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -117,6 +117,18 @@ static inline bool futex_key_is_private(union futex_key *key)
 	return !(key->both.offset & (FUT_OFF_INODE | FUT_OFF_MMSHARED));
 }
 
+static struct futex_hash_bucket *futex_hash_private(union futex_key *key,
+						    struct futex_hash_bucket *fhb,
+						    u32 hash_mask)
+{
+	u32 hash;
+
+	hash = jhash2((void *)&key->private.address,
+		      sizeof(key->private.address) / 4,
+		      key->both.offset);
+	return &fhb[hash & hash_mask];
+}
+
 /**
  * futex_hash - Return the hash bucket in the global hash
  * @key:	Pointer to the futex key for which the hash is calculated
@@ -131,14 +143,9 @@ struct futex_hash_bucket *futex_hash(union futex_key *key)
 	u32 hash;
 
 	fhb = current->mm->futex_hash_bucket;
-	if (fhb && futex_key_is_private(key)) {
-		u32 hash_mask = current->mm->futex_hash_mask;
+	if (fhb && futex_key_is_private(key))
+		return futex_hash_private(key, fhb, current->mm->futex_hash_mask);
 
-		hash = jhash2((void *)&key->private.address,
-			      sizeof(key->private.address) / 4,
-			      key->both.offset);
-		return &fhb[hash & hash_mask];
-	}
 	hash = jhash2((u32 *)key,
 		      offsetof(typeof(*key), both.offset) / 4,
 		      key->both.offset);
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA8211C7B62 for ; Sun, 15 Dec 2024 23:06:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304019; cv=none; 
b=Xp6WMmZIAXDyNgNZwPesok90890dLOtxIM1tzJIbW6rk2Wh7WrCJlktYrGbDA/IHOvKG2e7wRcsU5acQ5gL1+0LOlddXf7sOLxAZoLKEu0p2ni44Ns0YBcXR7ou8hJUEztTLUvxXW4Vlum8TC5n/CLyxLzKpGtXytl3Hz5O7eKg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304019; c=relaxed/simple; bh=4OCLSK0IAckDHRlP+nH4B2ka9CDlgPnOHY28BujTCRI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=REgnhAQQWk62c0Lri4NtH1HRDAC7FBNxpkazweYQaTbDkCcJ9u45qXJi7MCjckEysPsMTFaRaPBw31vgaJXhO22A/r0P2kHT4Fd7gd1FPfKC9lJWWvhIB5CJOBY5lkQkqe0g4eOu1NjMcrY2kSFCY8BTUeCrokIJ3CqgqoPX0Co= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=AbE7dw2b; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=MEvf4JhB; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="AbE7dw2b"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="MEvf4JhB" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304007; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=etNnTYYSTjAmrwSwk7AJnd4JiQiKqSMx0MbJH27pRgc=; b=AbE7dw2b5IjDgNp8WZSo2n8vPHV9AgBFCfpowmQRrmPZzD2/qfY4O54567UGB8jVQ+Vhop D5FCDgjwG4FhGnIFhXosWKdzF4edy2VCY3PVfCRRdBQa2oWZWJYN02JQHkNeOiJngGSUi4 5893Ne5GquqCabYWAjMohXcBkCgIpDlkuMf4rPVZAcUdjiTzbDRL4TX+0IH8XQNYha54U9 
FNlO+F73xNNcl+kjlLcVB61PF8k6omzmc8RiDud45Cr2Cu7+8lTgW/jSyK5eLqfZK73jHr hIcGoniZzH3aII/89mgWpjrsr83+XfibuB1n+R8M9zce/7xahvm5Hy+BIvaxKg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304008; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=etNnTYYSTjAmrwSwk7AJnd4JiQiKqSMx0MbJH27pRgc=; b=MEvf4JhB3pbkICNP769Fv6VrqrdbGMMxGbQ73/6OvHowYs+PVVgG0u2x4sUUNlrOzyk2Pz DPxfVYZSrqvmgtDg== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej Siewior Subject: [PATCH v5 06/14] futex: Add helper which include the put of a hb after end of operation. Date: Mon, 16 Dec 2024 00:00:10 +0100 Message-ID: <20241215230642.104118-7-bigeasy@linutronix.de> In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de> References: <20241215230642.104118-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" With the planned scheme for resizing the hb, a reference will be acquired in futex_hash() and dropped after the hb is unlocked. Once the reference is dropped, the hb must not be used because it can disappear after a resize. To prepare for the integration, rename - futex_hb_unlock() to futex_hb_unlock_put() - futex_queue() to futex_queue_put() - futex_q_unlock() to futex_q_unlock_put() - double_unlock_hb() to double_unlock_hb_put() which additionally include a call to futex_hash_put(), currently an empty stub. Introduce futex_hb_unlock_put() which is the unlock plus the reference drop. 
Move futex_hb_waiters_dec() before the reference drop, if needed before the unlock. Update comments referring to the functions accordingly. Signed-off-by: Sebastian Andrzej Siewior --- io_uring/futex.c | 2 +- kernel/futex/core.c | 12 ++++++++---- kernel/futex/futex.h | 31 ++++++++++++++++++++----------- kernel/futex/pi.c | 19 ++++++++++--------- kernel/futex/requeue.c | 15 ++++++++------- kernel/futex/waitwake.c | 23 ++++++++++++----------- 6 files changed, 59 insertions(+), 43 deletions(-) diff --git a/io_uring/futex.c b/io_uring/futex.c index e29662f039e1a..67246438da228 100644 --- a/io_uring/futex.c +++ b/io_uring/futex.c @@ -349,7 +349,7 @@ int io_futex_wait(struct io_kiocb *req, unsigned int is= sue_flags) hlist_add_head(&req->hash_node, &ctx->futex_list); io_ring_submit_unlock(ctx, issue_flags); =20 - futex_queue(&ifd->q, hb); + futex_queue_put(&ifd->q, hb); return IOU_ISSUE_SKIP_COMPLETE; } =20 diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 907b76590df16..3cfdd4c02f261 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -152,6 +152,9 @@ struct futex_hash_bucket *futex_hash(union futex_key *k= ey) return &futex_queues[hash & (futex_hashsize - 1)]; } =20 +void futex_hash_put(struct futex_hash_bucket *hb) +{ +} =20 /** * futex_setup_timer - set up the sleeping hrtimer. @@ -543,8 +546,8 @@ struct futex_hash_bucket *futex_q_lock(struct futex_q *= q) * Increment the counter before taking the lock so that * a potential waker won't miss a to-be-slept task that is * waiting for the spinlock. This is safe as all futex_q_lock() - * users end up calling futex_queue(). Similarly, for housekeeping, - * decrement the counter at futex_q_unlock() when some error has + * users end up calling futex_queue_put(). Similarly, for housekeeping, + * decrement the counter at futex_q_unlock_put() when some error has * occurred and we don't end up adding the task to the list. 
*/ futex_hb_waiters_inc(hb); /* implies smp_mb(); (A) */ @@ -555,11 +558,12 @@ struct futex_hash_bucket *futex_q_lock(struct futex_q= *q) return hb; } =20 -void futex_q_unlock(struct futex_hash_bucket *hb) +void futex_q_unlock_put(struct futex_hash_bucket *hb) __releases(&hb->lock) { spin_unlock(&hb->lock); futex_hb_waiters_dec(hb); + futex_hash_put(hb); } =20 void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb) @@ -586,7 +590,7 @@ void __futex_queue(struct futex_q *q, struct futex_hash= _bucket *hb) * @q: The futex_q to unqueue * * The q->lock_ptr must not be held by the caller. A call to futex_unqueue= () must - * be paired with exactly one earlier call to futex_queue(). + * be paired with exactly one earlier call to futex_queue_put(). * * Return: * - 1 - if the futex_q was still queued (and we removed unqueued it); diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 618ce1fe870e9..5793546a48ebf 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -202,6 +202,7 @@ futex_setup_timer(ktime_t *time, struct hrtimer_sleeper= *timeout, int flags, u64 range_ns); =20 extern struct futex_hash_bucket *futex_hash(union futex_key *key); +extern void futex_hash_put(struct futex_hash_bucket *hb); =20 /** * futex_match - Check whether two futex keys are equal @@ -288,23 +289,29 @@ extern void __futex_unqueue(struct futex_q *q); extern void __futex_queue(struct futex_q *q, struct futex_hash_bucket *hb); extern int futex_unqueue(struct futex_q *q); =20 +static inline void futex_hb_unlock_put(struct futex_hash_bucket *hb) +{ + spin_unlock(&hb->lock); + futex_hash_put(hb); +} + /** - * futex_queue() - Enqueue the futex_q on the futex_hash_bucket + * futex_queue_put() - Enqueue the futex_q on the futex_hash_bucket * @q: The futex_q to enqueue * @hb: The destination hash bucket * - * The hb->lock must be held by the caller, and is released here. A call to - * futex_queue() is typically paired with exactly one call to futex_unqueu= e(). 
The - * exceptions involve the PI related operations, which may use futex_unque= ue_pi() - * or nothing if the unqueue is done as part of the wake process and the u= nqueue - * state is implicit in the state of woken task (see futex_wait_requeue_pi= () for - * an example). + * The hb->lock must be held by the caller, and is released here and the r= eference + * on the hb is dropped. A call to futex_queue_put() is typically paired = with + * exactly one call to futex_unqueue(). The exceptions involve the PI rela= ted + * operations, which may use futex_unqueue_pi() or nothing if the unqueue = is + * done as part of the wake process and the unqueue state is implicit in t= he + * state of woken task (see futex_wait_requeue_pi() for an example). */ -static inline void futex_queue(struct futex_q *q, struct futex_hash_bucket= *hb) +static inline void futex_queue_put(struct futex_q *q, struct futex_hash_bu= cket *hb) __releases(&hb->lock) { __futex_queue(q, hb); - spin_unlock(&hb->lock); + futex_hb_unlock_put(hb); } =20 extern void futex_unqueue_pi(struct futex_q *q); @@ -350,7 +357,7 @@ static inline int futex_hb_waiters_pending(struct futex= _hash_bucket *hb) } =20 extern struct futex_hash_bucket *futex_q_lock(struct futex_q *q); -extern void futex_q_unlock(struct futex_hash_bucket *hb); +extern void futex_q_unlock_put(struct futex_hash_bucket *hb); =20 =20 extern int futex_lock_pi_atomic(u32 __user *uaddr, struct futex_hash_bucke= t *hb, @@ -380,11 +387,13 @@ double_lock_hb(struct futex_hash_bucket *hb1, struct = futex_hash_bucket *hb2) } =20 static inline void -double_unlock_hb(struct futex_hash_bucket *hb1, struct futex_hash_bucket *= hb2) +double_unlock_hb_put(struct futex_hash_bucket *hb1, struct futex_hash_buck= et *hb2) { spin_unlock(&hb1->lock); if (hb1 !=3D hb2) spin_unlock(&hb2->lock); + futex_hash_put(hb1); + futex_hash_put(hb2); } =20 /* syscalls */ diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c index d62cca5ed8f4c..8561f94f21ed9 100644 --- 
a/kernel/futex/pi.c +++ b/kernel/futex/pi.c @@ -217,9 +217,9 @@ static int attach_to_pi_state(u32 __user *uaddr, u32 uv= al, /* * We get here with hb->lock held, and having found a * futex_top_waiter(). This means that futex_lock_pi() of said futex_q - * has dropped the hb->lock in between futex_queue() and futex_unqueue_pi= (), - * which in turn means that futex_lock_pi() still has a reference on - * our pi_state. + * has dropped the hb->lock in between futex_queue_put() and + * futex_unqueue_pi(), which in turn means that futex_lock_pi() still + * has a reference on our pi_state. * * The waiter holding a reference on @pi_state also protects against * the unlocked put_pi_state() in futex_unlock_pi(), futex_lock_pi() @@ -963,7 +963,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags= , ktime_t *time, int tryl * exit to complete. * - EAGAIN: The user space value changed. */ - futex_q_unlock(hb); + futex_q_unlock_put(hb); /* * Handle the case where the owner is in the middle of * exiting. Wait for the exit to complete otherwise @@ -1086,7 +1086,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int fla= gs, ktime_t *time, int tryl goto out; =20 out_unlock_put_key: - futex_q_unlock(hb); + futex_q_unlock_put(hb); =20 out: if (to) { @@ -1096,7 +1096,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int fla= gs, ktime_t *time, int tryl return ret !=3D -EINTR ? ret : -ERESTARTNOINTR; =20 uaddr_faulted: - futex_q_unlock(hb); + futex_q_unlock_put(hb); =20 ret =3D fault_in_user_writeable(uaddr); if (ret) @@ -1196,7 +1196,7 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int f= lags) } =20 get_pi_state(pi_state); - spin_unlock(&hb->lock); + futex_hb_unlock_put(hb); =20 /* drops pi_state->pi_mutex.wait_lock */ ret =3D wake_futex_pi(uaddr, uval, pi_state, rt_waiter); @@ -1235,7 +1235,8 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int f= lags) * owner. 
*/ if ((ret =3D futex_cmpxchg_value_locked(&curval, uaddr, uval, 0))) { - spin_unlock(&hb->lock); + futex_hb_unlock_put(hb); + switch (ret) { case -EFAULT: goto pi_faulted; @@ -1255,7 +1256,7 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int f= lags) ret =3D (curval =3D=3D uval) ? 0 : -EAGAIN; =20 out_unlock: - spin_unlock(&hb->lock); + futex_hb_unlock_put(hb); return ret; =20 pi_retry: diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c index b47bb764b3520..80e99a498de28 100644 --- a/kernel/futex/requeue.c +++ b/kernel/futex/requeue.c @@ -58,7 +58,7 @@ enum { }; =20 const struct futex_q futex_q_init =3D { - /* list gets initialized in futex_queue()*/ + /* list gets initialized in futex_queue_put()*/ .wake =3D futex_wake_mark, .key =3D FUTEX_KEY_INIT, .bitset =3D FUTEX_BITSET_MATCH_ANY, @@ -456,8 +456,8 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, ret =3D futex_get_value_locked(&curval, uaddr1); =20 if (unlikely(ret)) { - double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); + double_unlock_hb_put(hb1, hb2); =20 ret =3D get_user(curval, uaddr1); if (ret) @@ -542,8 +542,9 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, * waiter::requeue_state is correct. */ case -EFAULT: - double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); + double_unlock_hb_put(hb1, hb2); + ret =3D fault_in_user_writeable(uaddr2); if (!ret) goto retry; @@ -556,8 +557,8 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, * exit to complete. * - EAGAIN: The user space value changed. */ - double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); + double_unlock_hb_put(hb1, hb2); /* * Handle the case where the owner is in the middle of * exiting. 
Wait for the exit to complete otherwise @@ -674,9 +675,9 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, put_pi_state(pi_state); =20 out_unlock: - double_unlock_hb(hb1, hb2); - wake_up_q(&wake_q); futex_hb_waiters_dec(hb2); + double_unlock_hb_put(hb1, hb2); + wake_up_q(&wake_q); return ret ? ret : task_count; } =20 @@ -814,7 +815,7 @@ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned i= nt flags, * shared futexes. We need to compare the keys: */ if (futex_match(&q.key, &key2)) { - futex_q_unlock(hb); + futex_q_unlock_put(hb); ret =3D -EINVAL; goto out; } diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index 3a10375d95218..fdb9fcaaf9fba 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -195,7 +195,7 @@ int futex_wake(u32 __user *uaddr, unsigned int flags, i= nt nr_wake, u32 bitset) } } =20 - spin_unlock(&hb->lock); + futex_hb_unlock_put(hb); wake_up_q(&wake_q); return ret; } @@ -274,7 +274,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, double_lock_hb(hb1, hb2); op_ret =3D futex_atomic_op_inuser(op, uaddr2); if (unlikely(op_ret < 0)) { - double_unlock_hb(hb1, hb2); + double_unlock_hb_put(hb1, hb2); =20 if (!IS_ENABLED(CONFIG_MMU) || unlikely(op_ret !=3D -EFAULT && op_ret !=3D -EAGAIN)) { @@ -327,7 +327,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, } =20 out_unlock: - double_unlock_hb(hb1, hb2); + double_unlock_hb_put(hb1, hb2); wake_up_q(&wake_q); return ret; } @@ -335,7 +335,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, static long futex_wait_restart(struct restart_block *restart); =20 /** - * futex_wait_queue() - futex_queue() and wait for wakeup, timeout, or sig= nal + * futex_wait_queue() - futex_queue_put() and wait for wakeup, timeout, or= signal * @hb: the futex hash bucket, must be locked by the caller * @q: the futex_q to queue up on * @timeout: the prepared hrtimer_sleeper, or null for no 
timeout @@ -346,11 +346,11 @@ void futex_wait_queue(struct futex_hash_bucket *hb, s= truct futex_q *q, /* * The task state is guaranteed to be set before another task can * wake it. set_current_state() is implemented using smp_store_mb() and - * futex_queue() calls spin_unlock() upon completion, both serializing + * futex_queue_put() calls spin_unlock() upon completion, both serializing * access to the hash list and forcing another memory barrier. */ set_current_state(TASK_INTERRUPTIBLE|TASK_FREEZABLE); - futex_queue(q, hb); + futex_queue_put(q, hb); =20 /* Arm the timer */ if (timeout) @@ -461,11 +461,12 @@ int futex_wait_multiple_setup(struct futex_vector *vs= , int count, int *woken) * next futex. Queue each futex at this moment so hb can * be unlocked. */ - futex_queue(q, hb); + futex_queue_put(q, hb); continue; } =20 - futex_q_unlock(hb); + futex_q_unlock_put(hb); + __set_current_state(TASK_RUNNING); =20 /* @@ -624,7 +625,7 @@ int futex_wait_setup(u32 __user *uaddr, u32 val, unsign= ed int flags, ret =3D futex_get_value_locked(&uval, uaddr); =20 if (ret) { - futex_q_unlock(*hb); + futex_q_unlock_put(*hb); =20 ret =3D get_user(uval, uaddr); if (ret) @@ -637,7 +638,7 @@ int futex_wait_setup(u32 __user *uaddr, u32 val, unsign= ed int flags, } =20 if (uval !=3D val) { - futex_q_unlock(*hb); + futex_q_unlock_put(*hb); ret =3D -EWOULDBLOCK; } =20 @@ -665,7 +666,7 @@ int __futex_wait(u32 __user *uaddr, unsigned int flags,= u32 val, if (ret) return ret; =20 - /* futex_queue and wait for wakeup, timeout, or a signal. */ + /* futex_queue_put and wait for wakeup, timeout, or a signal. */ futex_wait_queue(hb, &q, to); =20 /* If we were woken (and unqueued), we succeeded, whatever. 
*/ --=20 2.45.2 From nobody Wed Dec 17 18:01:32 2025 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA7BF1C5799 for ; Sun, 15 Dec 2024 23:06:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304018; cv=none; b=d0V1e+MHDAg7oySHl5E6Goj+/9rZx/yqNLAozUNLaTvoQBcf9OtgkFl4QiYDC6bFF4cW9bvxbAXtSXhVMssDad7wBFILx0vuJW1zHNQW+FgyOIU9y91KhG0B6iBsdEa/Ir+d93+VI0RMh7vgCdyJ+SXZEZuBOrY8FXl4h+WhKQs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304018; c=relaxed/simple; bh=JzPYQE+P5+fF624i5y2tKmR1xSJxMzO6QO9QQ2LID5g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sX+3Xd2ybTPvhez5d/Olxh/g1qE4vXT/xthO71TvBi/BCvhUysU2qAY8mNzjVh89hYuDpy+78WEcgppP9XH0CixHGGuPzt9yg5KPe85figu2oVDVw1VDTUbH986jhmsW1gIGzCnJE863Zi/UF8jvxr/vrU2VRs7PS5nErWriFTU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=SxP7NIlj; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=8RfJP0Oj; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="SxP7NIlj"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="8RfJP0Oj" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=linutronix.de; s=2020; t=1734304008; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kK8t4exBPPa9fF54xJFbju/ZW1xJhzX3gtrTdYSijx0=; b=SxP7NIlj5sDEYxlUWp3HgN4P8cfFLfdgRxuYoEuzE2LpDhYNfbhocampRVQBcwDzE5WaMN 1pCegYt1M+CB74GMe68dgI4GSO2XTo23v4ayUd8Y5frLg4etMlx8NlRkFfW1Whn9hyAH1x Pu6y3q1OHtGtOVKxcuhqRKjCYVIcqhtHIqWmdxWO52jhXmdkuEJ/1ytOE7IMCHaJ+DZAYK wyLv6DKMRnnfhW3I+MkXfQefhR+equBntx5+vSRb+e+gbtlNEpX/y1/w1gg6SYExWG1V83 v2GDmwbmz+AIXvP/tZRg2y6I55qNSR+j3p3ybTFOlCTjw0PhGICxrY+XZU9VFg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304008; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kK8t4exBPPa9fF54xJFbju/ZW1xJhzX3gtrTdYSijx0=; b=8RfJP0OjUYYDhsK6VoGMjifRMrOD5YNA7easdUv3xpFklp0xM21ynZYpYcHkSy8hnrAJlD Pmn9trwu5xIitwBg== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej Siewior Subject: [PATCH v5 07/14] futex: Move the retry_private label. Date: Mon, 16 Dec 2024 00:00:11 +0100 Message-ID: <20241215230642.104118-8-bigeasy@linutronix.de> In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de> References: <20241215230642.104118-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The retry_private label in futex_requeue() and futex_wake_op() is jumped to after the lock is dropped in a retry operation. This assumes that the hb does not need to be hashed again. 
If hb is resized then the hb can change if the reference is dropped. Move the retry_private label before the hashing operation. Signed-off-by: Sebastian Andrzej Siewior --- kernel/futex/requeue.c | 2 +- kernel/futex/waitwake.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c index 80e99a498de28..0395740ce5e71 100644 --- a/kernel/futex/requeue.c +++ b/kernel/futex/requeue.c @@ -443,10 +443,10 @@ int futex_requeue(u32 __user *uaddr1, unsigned int fl= ags1, if (requeue_pi && futex_match(&key1, &key2)) return -EINVAL; =20 +retry_private: hb1 =3D futex_hash(&key1); hb2 =3D futex_hash(&key2); =20 -retry_private: futex_hb_waiters_inc(hb2); double_lock_hb(hb1, hb2); =20 diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index fdb9fcaaf9fba..ec73a6ea7462a 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -267,10 +267,10 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int fl= ags, u32 __user *uaddr2, if (unlikely(ret !=3D 0)) return ret; =20 +retry_private: hb1 =3D futex_hash(&key1); hb2 =3D futex_hash(&key2); =20 -retry_private: double_lock_hb(hb1, hb2); op_ret =3D futex_atomic_op_inuser(op, uaddr2); if (unlikely(op_ret < 0)) { --=20 2.45.2 From nobody Wed Dec 17 18:01:32 2025 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA8741C7B8D for ; Sun, 15 Dec 2024 23:06:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304018; cv=none; b=LEngKFQI6OSJAuAoySiHaOTEblpGQZRjESClgrC/UEwGEy6nLpf7t9t6bzrins7UwxnBSvfWcFUMNeqV2UxHG9g4kxo++Xz5UcSC643mlMSDUkmEMDaNu7dl49OkaBEVKqtOACUT+XKjV99NjMwLIwoB6RfLpaUVBqgW3Ib8OV0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1734304018; c=relaxed/simple; bh=i9MijDvVOWfckqUfUJqCPM813s6GxsIU5dJt1BnSsbk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=aI23QNv1JfqTqcH6Qhi5pb9ZggbRVrEgLL1LQqKrUhVJNYdKQbi4o8VcfsvO5VG1IGuonBxVSM5isgbid5r27HWE9vMj3F8p8pIFPxnjM7CVCIAszqxhLjIMXzBefOVeVbgEugfDIJaX4OmkthxvdMZCxcs8wNP75EF68gxk8cE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=CjCv2qE6; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=7Ab+4GIV; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="CjCv2qE6"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="7Ab+4GIV" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304008; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=521YN3lEQj8wnb4LQ1VCL1pUFnvlLLrWT5SguDVioRo=; b=CjCv2qE6jzYuNwZ2KGp4sNWJTkxQSNF3l4Y1la5rGLKlGwMj0q8P7qNuTE5WT3HuuiJ4U/ vq6FJDtzC4eiRAA4z4NWLtqE4mrfFi07hYYb9md5Txo1E9k/SBVIglwK0ON0JO5EZ+Kea+ YV0R7+iXweytboatFLG4dcsvd6Uogq1xqIAPD6LkFPL5cXNsEdsOOQoiEgslnolftxGZRI 9Y6qKDQS1rUG5RFD6gfk0tI592dkr+hbuU/wqZ/fHx2rN48zQ8IS+zqsCI6TqBs6AoXrMo LfkvY7tsADjEcvs96NltgF6q0dv+vpzdhYdoZBJ5poT3vhxIRcg9MTrku5lO3Q== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304008; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=521YN3lEQj8wnb4LQ1VCL1pUFnvlLLrWT5SguDVioRo=; b=7Ab+4GIVOvafesQaNqjZwJMBAVtUmpqvskyB4G9L6IjEdcdrqG2NoBxsq/J0tkzKJEUUSv vVdxzRpVmSAP2oCw== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej Siewior Subject: [PATCH v5 08/14] futex: Introduce futex_get_locked_hb(). Date: Mon, 16 Dec 2024 00:00:12 +0100 Message-ID: <20241215230642.104118-9-bigeasy@linutronix.de> In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de> References: <20241215230642.104118-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" futex_lock_pi() and __fixup_pi_state_owner() acquire the futex_q::lock_ptr without holding a reference assuming the previously obtained hb and the assigned lock_ptr are still valid. This isn't the case once the hb can be resized and becomes invalid after the reference drop. Introduce futex_get_locked_hb() to lock the hb recorded in futex_q::lock_ptr. The lock pointer is read in a RCU section to ensure that it does not go away if the hb has been replaced and the old pointer has been observed. After locking the pointer needs to be compared to check if it changed. If so then the hb has been replaced and the user has been moved to the new one and lock_ptr has been updated. The lock operation needs to be redone in this case. Once the lock_ptr is the same, we can return the futex_hash_bucket it belongs to as the hb for the caller locked. 
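The lock-then-recheck step described above can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the kernel implementation: struct bucket, struct waiter, lock_current_bucket() and move_waiter() are hypothetical names, pthread mutexes stand in for spinlocks, and buckets are assumed never to be freed, which is the job the RCU read-side section does in the kernel.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for futex_hash_bucket and futex_q. */
struct bucket {
	pthread_mutex_t lock;
};

struct waiter {
	/* Lock of the bucket the waiter is currently queued on. */
	_Atomic(pthread_mutex_t *) lock_ptr;
};

/*
 * Lock whichever bucket the waiter is queued on right now. A concurrent
 * move may update lock_ptr between our read and our lock, so compare
 * again after locking and retry on a mismatch.
 */
static pthread_mutex_t *lock_current_bucket(struct waiter *w)
{
	for (;;) {
		pthread_mutex_t *seen = atomic_load(&w->lock_ptr);

		pthread_mutex_lock(seen);
		if (seen == atomic_load(&w->lock_ptr))
			return seen;	/* unchanged: return with lock held */
		pthread_mutex_unlock(seen);
	}
}

/* Move the waiter to another bucket; caller holds the old bucket's lock. */
static void move_waiter(struct waiter *w, struct bucket *to)
{
	atomic_store(&w->lock_ptr, &to->lock);
}
```

The caller gets the lock back in the held state and must unlock it itself. Since lock_ptr is only changed while the old lock is held, a successful comparison means the waiter cannot be moved until the caller unlocks.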
This is important because we don't own a reference, so the hb is only valid as long as we hold the lock. If the hb is resized, this (old) hb remains valid while we hold its lock because all of its users must be moved to the new hb first, so the task performing the resize will block on the lock. Signed-off-by: Sebastian Andrzej Siewior --- kernel/futex/core.c | 27 +++++++++++++++++++++++++++ kernel/futex/futex.h | 2 +- kernel/futex/pi.c | 9 +++++++-- kernel/futex/requeue.c | 8 +++++--- 4 files changed, 40 insertions(+), 6 deletions(-) diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 3cfdd4c02f261..6bccf48cdb049 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -639,6 +639,33 @@ int futex_unqueue(struct futex_q *q) return ret; } =20 +struct futex_hash_bucket *futex_get_locked_hb(struct futex_q *q) +{ + struct futex_hash_bucket *hb; + spinlock_t *lock_ptr; + + /* + * See futex_unqueue() why lock_ptr can change. + */ + guard(rcu)(); +retry: + lock_ptr =3D READ_ONCE(q->lock_ptr); + spin_lock(lock_ptr); + + if (unlikely(lock_ptr !=3D q->lock_ptr)) { + spin_unlock(lock_ptr); + goto retry; + } + + hb =3D container_of(lock_ptr, struct futex_hash_bucket, lock); + /* + * We don't acquire a reference on the hb because we don't get it + * if a resize is in progress and we got the old hb->lock before the + * other task got it which meant to move us to the new hb. + */ + return hb; +} + /* * PI futexes can not be requeued and must remove themselves from the hash * bucket. The hash bucket lock (i.e. lock_ptr) is held. 
diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 5793546a48ebf..143bf1523fa4a 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -196,7 +196,7 @@ enum futex_access { =20 extern int get_futex_key(u32 __user *uaddr, unsigned int flags, union fute= x_key *key, enum futex_access rw); - +extern struct futex_hash_bucket *futex_get_locked_hb(struct futex_q *q); extern struct hrtimer_sleeper * futex_setup_timer(ktime_t *time, struct hrtimer_sleeper *timeout, int flags, u64 range_ns); diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c index 8561f94f21ed9..506ba1ad8ff23 100644 --- a/kernel/futex/pi.c +++ b/kernel/futex/pi.c @@ -806,7 +806,7 @@ static int __fixup_pi_state_owner(u32 __user *uaddr, st= ruct futex_q *q, break; } =20 - spin_lock(q->lock_ptr); + futex_get_locked_hb(q); raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock); =20 /* @@ -922,6 +922,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags= , ktime_t *time, int tryl struct rt_mutex_waiter rt_waiter; struct futex_hash_bucket *hb; struct futex_q q =3D futex_q_init; + bool no_block_fp =3D false; DEFINE_WAKE_Q(wake_q); int res, ret; =20 @@ -988,6 +989,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags= , ktime_t *time, int tryl ret =3D rt_mutex_futex_trylock(&q.pi_state->pi_mutex); /* Fixup the trylock return value: */ ret =3D ret ? 0 : -EWOULDBLOCK; + no_block_fp =3D true; goto no_block; } =20 @@ -1024,6 +1026,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int fla= gs, ktime_t *time, int tryl raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock); wake_up_q(&wake_q); preempt_enable(); + futex_hash_put(hb); =20 if (ret) { if (ret =3D=3D 1) @@ -1063,7 +1066,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int fla= gs, ktime_t *time, int tryl * spinlock/rtlock (which might enqueue its own rt_waiter) and fix up * the */ - spin_lock(q.lock_ptr); + hb =3D futex_get_locked_hb(&q); /* * Waiter is unqueued. 
*/ @@ -1083,6 +1086,8 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int fla= gs, ktime_t *time, int tryl =20 futex_unqueue_pi(&q); spin_unlock(q.lock_ptr); + if (no_block_fp) + futex_hash_put(hb); goto out; =20 out_unlock_put_key: diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c index 0395740ce5e71..1f3ac76ce1229 100644 --- a/kernel/futex/requeue.c +++ b/kernel/futex/requeue.c @@ -826,15 +826,17 @@ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned= int flags, switch (futex_requeue_pi_wakeup_sync(&q)) { case Q_REQUEUE_PI_IGNORE: /* The waiter is still on uaddr1 */ - spin_lock(&hb->lock); + hb =3D futex_get_locked_hb(&q); + ret =3D handle_early_requeue_pi_wakeup(hb, &q, to); spin_unlock(&hb->lock); + break; =20 case Q_REQUEUE_PI_LOCKED: /* The requeue acquired the lock */ if (q.pi_state && (q.pi_state->owner !=3D current)) { - spin_lock(q.lock_ptr); + futex_get_locked_hb(&q); ret =3D fixup_pi_owner(uaddr2, &q, true); /* * Drop the reference to the pi state which the @@ -861,7 +863,7 @@ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned i= nt flags, if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter)) ret =3D 0; =20 - spin_lock(q.lock_ptr); + futex_get_locked_hb(&q); debug_rt_mutex_free_waiter(&rt_waiter); /* * Fixup the pi_state owner and possibly acquire the lock if we --=20 2.45.2 From nobody Wed Dec 17 18:01:32 2025 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AFACF1CCEF8 for ; Sun, 15 Dec 2024 23:06:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304020; cv=none; 
b=t/xgU6vl1O5dkaP1f3ko9mTcwYnpV0SWJECqRVOydS14LVTUraxZqV+3q2fhSA7uK04xWYjq0mE4LSpN3UpCH/1VL/JiMswQHnLLCHaRcbcAqgnswO7Xz3p216lXqMLaX3+h/8X3IneJ4Toz4G5ZCxFp6nflxSXzNUgFI+vwyhw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304020; c=relaxed/simple; bh=kwdf7FnpnKp/razSaMtYYvLUagOr6onBiGOeyXzLbl0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=J2OkIop7YeUvobf2rAgYtaTlpca60Hnu9RQnPErR5BwUVBC5ciYjON6XT1b5I745iz0bY2t2+qFxvHqnx/CvHxZqZHvPBbwEzOWmPe9iDQKpxLeCgefHZMqf26uleb8/m0yjYXzTvKt/oOzkpIG0EFx4aTpJ9/Q1NfiKJZdNkhQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=eW9OsYBg; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=Ly1z6IbL; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="eW9OsYBg"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="Ly1z6IbL" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xHrceZ449XC5Ya6yQJk3mw9ylGHggZCqe4BUBnkBmN8=; b=eW9OsYBgogh7XXnVKiPp7Fr/IL06bNel+KTypJkmMX6IJ10iteymw5sgL+Qop3eKhFXnV9 GSM/n71xs3HnJkcH7pV4h3zz3mOGhRxVoO+BApKVlCMyAtSaLmuSFXiP9Ju3AAJkxmz4NC HzEhE2EC6P1GYfsS9OZTQfHt2VBQA+FkWcUomqORNvLJrlCrGYeMz4bdk+asBxil2qc5ll 
250vPHALD68qkHFQ7BogS4QqBsCjHn04OgOPuGAqpromYDFm6/oG11L1pwwbB+FFVpTtbp oqhvW+s6c837GqsZBVPosMvWmyGURovoaet7MRqO6scO1wWa6SHG4aE3lT8LEA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xHrceZ449XC5Ya6yQJk3mw9ylGHggZCqe4BUBnkBmN8=; b=Ly1z6IbLoo42+z98vqUV83M/M52QBFSsQE7HsQ/J/N6ItTJESeEiw2u9NYXipeyGfUVpVQ SIfzaaRn3nbhgwAg== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej Siewior Subject: [PATCH v5 09/14] futex: Allow to re-allocate the private hash bucket. Date: Mon, 16 Dec 2024 00:00:13 +0100 Message-ID: <20241215230642.104118-10-bigeasy@linutronix.de> In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de> References: <20241215230642.104118-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The mm_struct::futex_hash_lock guards the futex_hash_bucket assignment/ replacement. The futex_hash_allocate()/ PR_FUTEX_HASH_SET_SLOTS operation can now be invoked at runtime and resize an already existing internal private futex_hash_bucket to another size. The reallocation is based on an idea by Thomas Gleixner: The initial allocation of struct futex_hash_bucket_private set the reference count to one. Every user acquires a reference on the hb before using it and drops it after it enqueued itself on the hash bucket. This means that there is no reference held while the task is scheduled out while waiting for the wake up. 
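The reference scheme above, together with the deferred replacement described in the next paragraph, can be modelled roughly as follows. This is a minimal userspace sketch, not the kernel code: it is single-threaded, leaves out mm_struct::futex_hash_lock, RCU and the rehashing of waiters, and uses C11 atomics in place of rcuref_t. The names hb_table, mm_model, table_get(), table_put(), resize() and user_put() are hypothetical and only serve the illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of futex_hash_bucket_private (illustration only). */
struct hb_table {
	atomic_int users;		/* starts at 1: the initial reference */
	unsigned int hash_mask;
};

/* Hypothetical model of the two pointers kept in mm_struct. */
struct mm_model {
	struct hb_table *cur;		/* models mm_struct::futex_hash_bucket */
	struct hb_table *pending;	/* models mm_struct::futex_hash_new */
};

/* Acquire a reference; fails once the count has dropped to zero. */
static bool table_get(struct hb_table *t)
{
	int old = atomic_load(&t->users);

	while (old > 0) {
		/* On failure 'old' is reloaded and the loop retries. */
		if (atomic_compare_exchange_weak(&t->users, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if it was the last one. */
static bool table_put(struct hb_table *t)
{
	return atomic_fetch_sub(&t->users, 1) == 1;
}

/*
 * Resize: drop the initial reference. If nobody else holds one, replace
 * immediately (the kernel would requeue waiters and RCU-free the old
 * table here). Otherwise park the new table for the last user.
 */
static bool resize(struct mm_model *mm, struct hb_table *new_table)
{
	if (table_put(mm->cur)) {
		mm->cur = new_table;
		return true;
	}
	mm->pending = new_table;
	return false;
}

/* A user is done with its bucket; the last one performs the replacement. */
static void user_put(struct mm_model *mm, struct hb_table *t)
{
	if (table_put(t) && mm->pending) {
		mm->cur = mm->pending;
		mm->pending = NULL;
	}
}
```

In this model a resize() call that races with an active user returns false and leaves the new table parked; the later user_put() from that user publishes it, after which table_get() on the old table fails.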
The resize allocates a new futex_hash_bucket_private and drops the initial reference under the mm_struct::futex_hash_lock. If the reference drop results in destruction of the object then users currently queued on the hash bucket(s) will be requeued on the new hash bucket. At the end, mm_struct::futex_hash_bucket is updated, the old pointer is RCU-freed, and the mutex is dropped. If the reference drop does not result in destruction of the object then the new pointer is saved as mm_struct::futex_hash_new. In this case the replacement is deferred to the user that drops the last reference. All new users that fail to acquire a reference block on mm_struct::futex_hash_lock and attempt to perform the replacement. This scheme keeps the requirement that during a lock/unlock operation all waiters block on the same futex_hash_bucket::lock. Signed-off-by: Sebastian Andrzej Siewior --- include/linux/futex.h | 3 +- include/linux/mm_types.h | 7 +- kernel/futex/core.c | 243 +++++++++++++++++++++++++++++++++++---- kernel/futex/futex.h | 1 + kernel/futex/requeue.c | 5 + kernel/futex/waitwake.c | 4 +- 6 files changed, 237 insertions(+), 26 deletions(-) diff --git a/include/linux/futex.h b/include/linux/futex.h index bad377c30de5e..3ced01a9c5218 100644 --- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -83,7 +83,8 @@ void futex_hash_free(struct mm_struct *mm); =20 static inline void futex_mm_init(struct mm_struct *mm) { - mm->futex_hash_bucket =3D NULL; + rcu_assign_pointer(mm->futex_hash_bucket, NULL); + mutex_init(&mm->futex_hash_lock); } =20 static inline bool futex_hash_requires_allocation(void) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 2337a2e481fd0..62fe872b381f8 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -30,7 +30,7 @@ #define INIT_PASID 0 =20 struct address_space; -struct futex_hash_bucket; +struct futex_hash_bucket_private; struct mem_cgroup; =20 /* @@ -904,8 +904,9 @@ struct mm_struct { #endif =20 #ifdef
CONFIG_FUTEX - unsigned int futex_hash_mask; - struct futex_hash_bucket *futex_hash_bucket; + struct mutex futex_hash_lock; + struct futex_hash_bucket_private __rcu *futex_hash_bucket; + struct futex_hash_bucket_private *futex_hash_new; #endif =20 unsigned long hiwater_rss; /* High-watermark of RSS usage */ diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 6bccf48cdb049..f80ae39f2a83a 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -40,6 +40,7 @@ #include #include #include +#include =20 #include "futex.h" #include "../locking/rtmutex_common.h" @@ -56,6 +57,12 @@ static struct { #define futex_queues (__futex_data.queues) #define futex_hashsize (__futex_data.hashsize) =20 +struct futex_hash_bucket_private { + rcuref_t users; + unsigned int hash_mask; + struct rcu_head rcu; + struct futex_hash_bucket queues[]; +}; =20 /* * Fault injections for futexes. @@ -129,6 +136,148 @@ static struct futex_hash_bucket *futex_hash_private(u= nion futex_key *key, return &fhb[hash & hash_mask]; } =20 +static void futex_rehash_current_users(struct futex_hash_bucket_private *o= ld, + struct futex_hash_bucket_private *new) +{ + struct futex_hash_bucket *hb_old, *hb_new; + unsigned int slots =3D old->hash_mask + 1; + u32 hash_mask =3D new->hash_mask; + unsigned int i; + + for (i =3D 0; i < slots; i++) { + struct futex_q *this, *tmp; + + hb_old =3D &old->queues[i]; + + spin_lock(&hb_old->lock); + plist_for_each_entry_safe(this, tmp, &hb_old->chain, list) { + + plist_del(&this->list, &hb_old->chain); + futex_hb_waiters_dec(hb_old); + + WARN_ON_ONCE(this->lock_ptr !=3D &hb_old->lock); + + hb_new =3D futex_hash_private(&this->key, new->queues, hash_mask); + futex_hb_waiters_inc(hb_new); + /* + * The new pointer isn't published yet but an already + * moved user can unqueue itself so locking is needed. 
+ */ + spin_lock_nested(&hb_new->lock, SINGLE_DEPTH_NESTING); + plist_add(&this->list, &hb_new->chain); + this->lock_ptr =3D &hb_new->lock; + spin_unlock(&hb_new->lock); + } + spin_unlock(&hb_old->lock); + } +} + +struct futex_assign_cleanup { + struct futex_hash_bucket_private *rcu; + struct futex_hash_bucket_private *free; + struct futex_hash_bucket_private *free2; +}; + +static void __futex_assign_new_hb(struct futex_hash_bucket_private *hb_p_n= ew, + struct mm_struct *mm, + struct futex_assign_cleanup *fu_cleanup) +{ + struct futex_hash_bucket_private *hb_p; + bool drop_last_ref =3D hb_p_new !=3D NULL; + + /* + * If the supplied hb_p is NULL then we must have one in mm. We might + * have both. The pointer with the larger amount of slots is + * considered. If we are late, we have none and someone else did the + * work while we blocked on the lock. + */ + if (mm->futex_hash_new) { + if (hb_p_new) { + if (mm->futex_hash_new->hash_mask <=3D hb_p_new->hash_mask) { + fu_cleanup->free =3D mm->futex_hash_new; + } else { + fu_cleanup->free =3D hb_p_new; + hb_p_new =3D mm->futex_hash_new; + } + } else { + hb_p_new =3D mm->futex_hash_new; + } + mm->futex_hash_new =3D NULL; + } + + /* Someone was quicker, the current mask is valid */ + if (!hb_p_new) + return; + + hb_p =3D rcu_dereference_check(mm->futex_hash_bucket, + lockdep_is_held(&mm->futex_hash_lock)); + if (hb_p) { + if (hb_p->hash_mask >=3D hb_p_new->hash_mask) { + /* It was increased again while we were waiting */ + fu_cleanup->free2 =3D hb_p_new; + return; + } + + if (drop_last_ref && !rcuref_put(&hb_p->users)) { + /* We are not the last user, let the last one continue */ + mm->futex_hash_new =3D hb_p_new; + return; + } + + futex_rehash_current_users(hb_p, hb_p_new); + fu_cleanup->rcu =3D hb_p; + } + rcu_assign_pointer(mm->futex_hash_bucket, hb_p_new); +} + +static void futex_assign_cleanup(struct futex_assign_cleanup *fu_cleanup) +{ + kvfree(fu_cleanup->free); + kvfree(fu_cleanup->free2); + 
kvfree_rcu(fu_cleanup->rcu, rcu); +} + +static void futex_assign_new_hb(struct futex_hash_bucket_private *hb_p_new) +{ + struct futex_assign_cleanup fu_cleanup =3D {}; + struct mm_struct *mm =3D current->mm; + + scoped_guard(mutex, &mm->futex_hash_lock) + __futex_assign_new_hb(hb_p_new, mm, &fu_cleanup); + futex_assign_cleanup(&fu_cleanup); +} + +static struct futex_hash_bucket_private *futex_get_private_hb(union futex_= key *key) +{ + struct mm_struct *mm =3D current->mm; + + if (!futex_key_is_private(key)) + return NULL; + /* + * Ideally we don't loop. If there is a replacement in progress + * then a new HB is already prepared. We fail to obtain a + * reference only after the last user returned its reference. + * In that case futex_assign_new_hb() blocks on futex_hash_bucket + * and we either have to perform the replacement or wait + * while someone else is doing the job. Either way, after we + * return we can acquire a reference on the new hash bucket + * (unless it is replaced again).
+ */ +again: + scoped_guard(rcu) { + struct futex_hash_bucket_private *hb_p; + + hb_p =3D rcu_dereference(mm->futex_hash_bucket); + if (!hb_p) + return NULL; + + if (rcuref_get(&hb_p->users)) + return hb_p; + } + futex_assign_new_hb(NULL); + goto again; +} + /** * futex_hash - Return the hash bucket in the global hash * @key: Pointer to the futex key for which the hash is calculated @@ -139,12 +288,12 @@ static struct futex_hash_bucket *futex_hash_private(u= nion futex_key *key, */ struct futex_hash_bucket *futex_hash(union futex_key *key) { - struct futex_hash_bucket *fhb; + struct futex_hash_bucket_private *hb_p =3D NULL; u32 hash; =20 - fhb =3D current->mm->futex_hash_bucket; - if (fhb && futex_key_is_private(key)) - return futex_hash_private(key, fhb, current->mm->futex_hash_mask); + hb_p =3D futex_get_private_hb(key); + if (hb_p) + return futex_hash_private(key, hb_p->queues, hb_p->hash_mask); =20 hash =3D jhash2((u32 *)key, offsetof(typeof(*key), both.offset) / 4, @@ -154,6 +303,17 @@ struct futex_hash_bucket *futex_hash(union futex_key *= key) =20 void futex_hash_put(struct futex_hash_bucket *hb) { + struct futex_hash_bucket_private *hb_p; + + if (hb->hb_slot =3D=3D 0) + return; + hb_p =3D container_of(hb, struct futex_hash_bucket_private, + queues[hb->hb_slot - 1]); + + if (!rcuref_put(&hb_p->users)) + return; + + futex_assign_new_hb(NULL); } =20 /** @@ -601,6 +761,8 @@ int futex_unqueue(struct futex_q *q) spinlock_t *lock_ptr; int ret =3D 0; =20 + /* RCU so lock_ptr is not going away during locking. */ + guard(rcu)(); /* In the common case we don't take the spinlock, which is nice. 
*/ retry: /* @@ -1008,10 +1170,27 @@ static void compat_exit_robust_list(struct task_str= uct *curr) static void exit_pi_state_list(struct task_struct *curr) { struct list_head *next, *head =3D &curr->pi_state_list; + struct futex_hash_bucket_private *hb_p; struct futex_pi_state *pi_state; struct futex_hash_bucket *hb; union futex_key key =3D FUTEX_KEY_INIT; =20 + /* + * Lock the futex_hash_bucket to ensure that the hb remains unchanged. + * This is important so we can invoke futex_hash() under the pi_lock. + */ + guard(mutex)(&curr->mm->futex_hash_lock); + hb_p =3D rcu_dereference_check(curr->mm->futex_hash_bucket, + lockdep_is_held(&curr->mm->futex_hash_lock)); + if (hb_p) { + if (rcuref_read(&hb_p->users) =3D=3D 0) { + struct futex_assign_cleanup fu_cleanup =3D {}; + + __futex_assign_new_hb(NULL, curr->mm, &fu_cleanup); + futex_assign_cleanup(&fu_cleanup); + } + } + /* * We are a ZOMBIE and nobody can enqueue itself on * pi_state_list anymore, but we have to be careful @@ -1037,6 +1216,7 @@ static void exit_pi_state_list(struct task_struct *cu= rr) if (!refcount_inc_not_zero(&pi_state->refcount)) { raw_spin_unlock_irq(&curr->pi_lock); cpu_relax(); + futex_hash_put(hb); raw_spin_lock_irq(&curr->pi_lock); continue; } @@ -1053,6 +1233,7 @@ static void exit_pi_state_list(struct task_struct *cu= rr) /* retain curr->pi_lock for the loop invariant */ raw_spin_unlock(&pi_state->pi_mutex.wait_lock); spin_unlock(&hb->lock); + futex_hash_put(hb); put_pi_state(pi_state); continue; } @@ -1065,6 +1246,7 @@ static void exit_pi_state_list(struct task_struct *cu= rr) raw_spin_unlock(&curr->pi_lock); raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock); spin_unlock(&hb->lock); + futex_hash_put(hb); =20 rt_mutex_futex_unlock(&pi_state->pi_mutex); put_pi_state(pi_state); @@ -1185,8 +1367,9 @@ void futex_exit_release(struct task_struct *tsk) futex_cleanup_end(tsk, FUTEX_STATE_DEAD); } =20 -static void futex_hash_bucket_init(struct futex_hash_bucket *fhb) +static void 
futex_hash_bucket_init(struct futex_hash_bucket *fhb, unsigned= int slot) { + fhb->hb_slot =3D slot; atomic_set(&fhb->waiters, 0); plist_head_init(&fhb->chain); spin_lock_init(&fhb->lock); @@ -1194,20 +1377,25 @@ static void futex_hash_bucket_init(struct futex_has= h_bucket *fhb) =20 void futex_hash_free(struct mm_struct *mm) { - kvfree(mm->futex_hash_bucket); + struct futex_hash_bucket_private *hb_p; + + /* We are the last one and we hold the initial reference */ + hb_p =3D rcu_dereference_check(mm->futex_hash_bucket, true); + if (!hb_p) + return; + + if (WARN_ON(!rcuref_put(&hb_p->users))) + return; + + kvfree(hb_p); } =20 static int futex_hash_allocate(unsigned int hash_slots) { - struct futex_hash_bucket *fhb; + struct futex_hash_bucket_private *hb_p; + size_t alloc_size; int i; =20 - if (current->mm->futex_hash_bucket) - return -EALREADY; - - if (!thread_group_leader(current)) - return -EINVAL; - if (hash_slots =3D=3D 0) hash_slots =3D 16; if (hash_slots < 2) @@ -1217,16 +1405,25 @@ static int futex_hash_allocate(unsigned int hash_sl= ots) if (!is_power_of_2(hash_slots)) hash_slots =3D rounddown_pow_of_two(hash_slots); =20 - fhb =3D kvmalloc_array(hash_slots, sizeof(struct futex_hash_bucket), GFP_= KERNEL_ACCOUNT); - if (!fhb) + if (unlikely(check_mul_overflow(hash_slots, sizeof(struct futex_hash_buck= et), + &alloc_size))) return -ENOMEM; =20 - current->mm->futex_hash_mask =3D hash_slots - 1; + if (unlikely(check_add_overflow(alloc_size, sizeof(struct futex_hash_buck= et_private), + &alloc_size))) + return -ENOMEM; + + hb_p =3D kvmalloc(alloc_size, GFP_KERNEL_ACCOUNT); + if (!hb_p) + return -ENOMEM; + + rcuref_init(&hb_p->users, 1); + hb_p->hash_mask =3D hash_slots - 1; =20 for (i =3D 0; i < hash_slots; i++) - futex_hash_bucket_init(&fhb[i]); + futex_hash_bucket_init(&hb_p->queues[i], i + 1); =20 - current->mm->futex_hash_bucket =3D fhb; + futex_assign_new_hb(hb_p); return 0; } =20 @@ -1237,8 +1434,12 @@ int futex_hash_allocate_default(void) =20 static int 
futex_hash_get_slots(void) { - if (current->mm->futex_hash_bucket) - return current->mm->futex_hash_mask + 1; + struct futex_hash_bucket_private *hb_p; + + guard(rcu)(); + hb_p =3D rcu_dereference(current->mm->futex_hash_bucket); + if (hb_p) + return hb_p->hash_mask + 1; return 0; } =20 @@ -1280,7 +1481,7 @@ static int __init futex_init(void) futex_hashsize =3D 1UL << futex_shift; =20 for (i =3D 0; i < futex_hashsize; i++) - futex_hash_bucket_init(&futex_queues[i]); + futex_hash_bucket_init(&futex_queues[i], 0); =20 return 0; } diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 143bf1523fa4a..7de1117c2eab0 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -115,6 +115,7 @@ static inline bool should_fail_futex(bool fshared) */ struct futex_hash_bucket { atomic_t waiters; + unsigned int hb_slot; spinlock_t lock; struct plist_head chain; } ____cacheline_aligned_in_smp; diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c index 1f3ac76ce1229..684f4eff20854 100644 --- a/kernel/futex/requeue.c +++ b/kernel/futex/requeue.c @@ -87,6 +87,11 @@ void requeue_futex(struct futex_q *q, struct futex_hash_= bucket *hb1, futex_hb_waiters_inc(hb2); plist_add(&q->list, &hb2->chain); q->lock_ptr =3D &hb2->lock; + /* + * hb1 and hb2 belong to the same futex_hash_bucket_private + * because if we managed to get a reference on hb1 then it can't be + * replaced. Therefore we avoid put(hb1)+get(hb2) here.
+ */ } q->key =3D *key2; } diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index ec73a6ea7462a..4dc71ff8911fd 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -173,8 +173,10 @@ int futex_wake(u32 __user *uaddr, unsigned int flags, = int nr_wake, u32 bitset) hb =3D futex_hash(&key); =20 /* Make sure we really have tasks to wakeup */ - if (!futex_hb_waiters_pending(hb)) + if (!futex_hb_waiters_pending(hb)) { + futex_hash_put(hb); return ret; + } =20 spin_lock(&hb->lock); =20 --=20 2.45.2 From nobody Wed Dec 17 18:01:32 2025 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E2E61CBE9D for ; Sun, 15 Dec 2024 23:06:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304019; cv=none; b=lYYUZUhImfj+dDzyx6aj5rng0HwvJywyIFoTlWvbjCOGFykGOppc9XQ5OXRzquPDOailGuSQQG2npoYyop6hHxoVSZQ6iQmQNU+e5DTaFernzDITM3UQ/L6TWAT3jeSOR/H9k6a+H2tf2Y064KMfQh4a8CweXB13IBMISNwxE8Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304019; c=relaxed/simple; bh=ApSE6kj/TaY7TtI1krjzNXsHoRPLQpQTs1ZcO8nxEt0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pLhr8BL8p86hetCSWrJUdLihZG7kHpvEUpRenblIlf7BMajBzooALuPMe7K9NK34JGBaBmwscQvmj+2789nKjEwlqNOTvSrR7tL5pqcfBrto71rTu4uSbTNas9ILaij0O76HFca5LJCH8TVgBuU07+/4+91jFFOm0EjWKhXt28c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=ZzVI4xUJ; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=fncCmHO0; arc=none smtp.client-ip=193.142.43.55 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="ZzVI4xUJ"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="fncCmHO0" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vNK33miYFEQI02d8SciwtswjvFmdUE/Kvrm/kWYmMWM=; b=ZzVI4xUJQ08bQ83t4/lFpCm5Y+lSZWVY5jgqgT/WBxltaHIru4Ztz5CiPhwFUkXPbx+BFP 0A9rLQO3l7JoDFMsE83PjxspgmUpOJy3WczkKp5GSnigLZjHxaVBV2lELCpTBF18O5m/wm Gde6oNdd0veOLxjM5q2XwbrmskL3kmNB7jwHTgcIg2Mf083Tx4KFbzuKl46jmj/HJ4K40Q MSxq4F/X9Jp9pDLGi3bxWYrhFN3E4e0XV4VxpQ3C+K4GxpsSgq6QfR3aqyuL/Xno0DL20v 3U8ITkfVtwZh/AIctNqvFW7+BNNicQJ29fwFGxlRA55WK8L8afdLCT6kTjPF1g== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vNK33miYFEQI02d8SciwtswjvFmdUE/Kvrm/kWYmMWM=; b=fncCmHO0LKncLW+nKp0jzCgF7IeRw0PhIF4NsuwWG2BI+AVgrpD8RAVOZYo1MzmV8HZX4+ 3V3b9JPqK27HlUCw== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej Siewior Subject: [PATCH v5 10/14] futex: Resize futex hash table based on number of threads. 
Date: Mon, 16 Dec 2024 00:00:14 +0100 Message-ID: <20241215230642.104118-11-bigeasy@linutronix.de> In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de> References: <20241215230642.104118-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Automatically size hash bucket based on the number of threads. The logic tries to allocate between 16 and futex_hashsize (the default for the system wide hash bucket) and uses 4 * number-of-threads. Signed-off-by: Sebastian Andrzej Siewior --- include/linux/futex.h | 12 ------------ kernel/fork.c | 4 +--- kernel/futex/core.c | 28 +++++++++++++++++++++++++--- 3 files changed, 26 insertions(+), 18 deletions(-) diff --git a/include/linux/futex.h b/include/linux/futex.h index 3ced01a9c5218..403b54526a081 100644 --- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -87,13 +87,6 @@ static inline void futex_mm_init(struct mm_struct *mm) mutex_init(&mm->futex_hash_lock); } =20 -static inline bool futex_hash_requires_allocation(void) -{ - if (current->mm->futex_hash_bucket) - return false; - return true; -} - #else static inline void futex_init_task(struct task_struct *tsk) { } static inline void futex_exit_recursive(struct task_struct *tsk) { } @@ -116,11 +109,6 @@ static inline int futex_hash_allocate_default(void) static inline void futex_hash_free(struct mm_struct *mm) { } static inline void futex_mm_init(struct mm_struct *mm) { } =20 -static inline bool futex_hash_requires_allocation(void) -{ - return false; -} - #endif =20 #endif diff --git a/kernel/fork.c b/kernel/fork.c index e34bb2a107a9d..35ec9958707c5 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2134,9 +2134,7 @@ static bool need_futex_hash_allocate_default(u64 clon= e_flags) { if ((clone_flags & (CLONE_THREAD | CLONE_VM)) !=3D (CLONE_THREAD | CLONE_= VM)) return false; - if 
(!thread_group_empty(current)) - return false; - return futex_hash_requires_allocation(); + return true; } =20 /* diff --git a/kernel/futex/core.c b/kernel/futex/core.c index f80ae39f2a83a..15e319239c282 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -64,6 +64,8 @@ struct futex_hash_bucket_private { struct futex_hash_bucket queues[]; }; =20 +static unsigned int futex_default_max_buckets; + /* * Fault injections for futexes. */ @@ -1400,8 +1402,8 @@ static int futex_hash_allocate(unsigned int hash_slot= s) hash_slots =3D 16; if (hash_slots < 2) hash_slots =3D 2; - if (hash_slots > 131072) - hash_slots =3D 131072; + if (hash_slots > futex_default_max_buckets) + hash_slots =3D futex_default_max_buckets; if (!is_power_of_2(hash_slots)) hash_slots =3D rounddown_pow_of_two(hash_slots); =20 @@ -1429,7 +1431,26 @@ static int futex_hash_allocate(unsigned int hash_slo= ts) =20 int futex_hash_allocate_default(void) { - return futex_hash_allocate(0); + unsigned int threads, buckets, current_buckets =3D 0; + struct futex_hash_bucket_private *hb_p; + + if (!current->mm) + return 0; + + scoped_guard(rcu) { + threads =3D get_nr_threads(current); + hb_p =3D rcu_dereference(current->mm->futex_hash_bucket); + if (hb_p) + current_buckets =3D hb_p->hash_mask + 1; + } + + buckets =3D roundup_pow_of_two(4 * threads); + buckets =3D max(buckets, 16); + buckets =3D min(buckets, futex_default_max_buckets); + if (current_buckets >=3D buckets) + return 0; + + return futex_hash_allocate(buckets); } =20 static int futex_hash_get_slots(void) @@ -1473,6 +1494,7 @@ static int __init futex_init(void) #else futex_hashsize =3D roundup_pow_of_two(256 * num_possible_cpus()); #endif + futex_default_max_buckets =3D futex_hashsize; =20 futex_queues =3D alloc_large_system_hash("futex", sizeof(*futex_queues), futex_hashsize, 0, 0, --=20 2.45.2 From nobody Wed Dec 17 18:01:32 2025 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B2F261CD1E0 for ; Sun, 15 Dec 2024 23:06:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304019; cv=none; b=HyhIha0qB7jIgXFNpnRNG/7KT0dMSAIhj1im8kM7DfZhyvEArI+M2dYePEb4SqgyWbFGv7u+pR4F0tgls1xUiT4KrWgLSXRhhn2GYaPQB+/RU4NwjO2l6StinxuwYGlV4JALA7ffJQjDiwV/9QC+jCG5hv0EMcJH4JwXQO5zt6c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734304019; c=relaxed/simple; bh=tGQ90tQFNBs0LioAZHXxhJi+dnJqsSoi4itaba/oBcU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=l8Pp4WjcTHZ6AfdkOcvYhZW0GIGPNeYKUilJxdksP0p6Wmp7aqmmr2gDrEbJoR6ifAOKo8IheQl8ENMj07Tkj5plMOvDnkeU6qyTXsjw5W6Pwu0ScSpUJe1RFxuRl/+MLKm1UmLaNK8+p6PWCf5gNv6mYiU0Yif+HT+f9FJZfNc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=BDH1N5Nn; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=D7qox8Em; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="BDH1N5Nn"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="D7qox8Em" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1734304009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RXQJmqhHPrSasMZSAKq9wWCexASoz+ENk/0L1qCztPU=; b=BDH1N5Nns2jn+rHBKPB5smoQ6OBUA2MgYwPnWmNkZ3Vn/Bzi1xO1tTVHsGz6/m0FKjqYHc lL25xsOzScS3A8GMsZyF8exvJs/O+ElW66sgkOktGcyzc/7fbNzw6FRJZnwViS01zIpudY ksu59SW/WHx2LOTTtUe/AhQfBjNeNKyUr44cf2ibjwlDkBm9+4dgOYFMP1OJD7cWELDDR9 tdAbD/3X3pUpdZdEUt9AkAQN+61ZyvVB+B1+Hg61j/Y5gEL8WVB27oYmtEAtnp4e+yLTh7 usyvnRau7yD67fP6YQGea1T7AdO59gmRxX1MOAwiZVrzDrHW9dx6pK/lH9/89w== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1734304009; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RXQJmqhHPrSasMZSAKq9wWCexASoz+ENk/0L1qCztPU=; b=D7qox8EmHs5aNj1+QLs6sWdwAHF2xnivF3dNcl7YSH9sBv4mcTv30OSDXrZpNHK3xVs/zE 6WVZyYWgilGfdzBw== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej Siewior Subject: [PATCH v5 11/14] futex: Use a hashmask instead of hashsize. Date: Mon, 16 Dec 2024 00:00:15 +0100 Message-ID: <20241215230642.104118-12-bigeasy@linutronix.de> In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de> References: <20241215230642.104118-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The global hash uses futex_hashsize to store the number of hash buckets that were allocated during system boot. On each futex_hash() invocation one is subtracted from this number to get the mask. This can be optimized by storing the mask directly, avoiding the subtraction on each futex_hash() invocation.
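As a small illustration of the point above, the two lookups below select the same slot whenever the table size is a power of two. This is only a sketch; the helper names index_from_size() and index_from_mask() are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Size-based lookup: the mask is recomputed on every call. */
static inline uint32_t index_from_size(uint32_t hash, uint32_t size)
{
	/* Only valid when size is a power of two. */
	return hash & (size - 1);
}

/* Mask-based lookup: the subtraction was done once, at init time. */
static inline uint32_t index_from_mask(uint32_t hash, uint32_t mask)
{
	return hash & mask;
}
```

With a table of 256 slots the stored mask is 255, and both helpers map a given hash value to the same index.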
Rename futex_hashsize to futex_hashmask and save the mask of the allocated hash map. Signed-off-by: Sebastian Andrzej Siewior --- kernel/futex/core.c | 24 ++++++++++++------------ 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 15e319239c282..b237154d67df0 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -52,10 +52,10 @@ */ static struct { struct futex_hash_bucket *queues; - unsigned long hashsize; + unsigned long hashmask; } __futex_data __read_mostly __aligned(2*sizeof(long)); #define futex_queues (__futex_data.queues) -#define futex_hashsize (__futex_data.hashsize) +#define futex_hashmask (__futex_data.hashmask) =20 struct futex_hash_bucket_private { rcuref_t users; @@ -300,7 +300,7 @@ struct futex_hash_bucket *futex_hash(union futex_key *k= ey) hash =3D jhash2((u32 *)key, offsetof(typeof(*key), both.offset) / 4, key->both.offset); - return &futex_queues[hash & (futex_hashsize - 1)]; + return &futex_queues[hash & futex_hashmask]; } =20 void futex_hash_put(struct futex_hash_bucket *hb) @@ -1486,25 +1486,25 @@ int futex_hash_prctl(unsigned long arg2, unsigned l= ong arg3) =20 static int __init futex_init(void) { + unsigned long i, hashsize; unsigned int futex_shift; - unsigned long i; =20 #ifdef CONFIG_BASE_SMALL - futex_hashsize =3D 16; + hashsize =3D 16; #else - futex_hashsize =3D roundup_pow_of_two(256 * num_possible_cpus()); + hashsize =3D roundup_pow_of_two(256 * num_possible_cpus()); #endif - futex_default_max_buckets =3D futex_hashsize; + futex_default_max_buckets =3D hashsize; =20 futex_queues =3D alloc_large_system_hash("futex", sizeof(*futex_queues), - futex_hashsize, 0, 0, + hashsize, 0, 0, &futex_shift, NULL, - futex_hashsize, futex_hashsize); - futex_hashsize =3D 1UL << futex_shift; + hashsize, hashsize); + hashsize =3D 1UL << futex_shift; =20 - for (i =3D 0; i < futex_hashsize; i++) + for (i =3D 0; i < hashsize; i++) futex_hash_bucket_init(&futex_queues[i], 0); - + 
 	futex_hashmask = hashsize - 1;
 	return 0;
 }
 core_initcall(futex_init);
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 12/14] tools/perf: Add the prctl(PR_FUTEX_HASH,…) to futex-hash.
Date: Mon, 16 Dec 2024 00:00:16 +0100
Message-ID: <20241215230642.104118-13-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain;
 charset="utf-8"

Wire up PR_FUTEX_HASH to futex-hash. Use the `-b' argument to specify
the number of buckets. Read it back and show during invocation.

Signed-off-by: Sebastian Andrzej Siewior
---
 tools/perf/bench/futex-hash.c | 19 +++++++++++++++++--
 tools/perf/bench/futex.h      |  1 +
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index b472eded521b1..e24e987ae213e 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include <sys/prctl.h>
 
 #include "../util/mutex.h"
 #include "../util/stat.h"
@@ -53,6 +54,7 @@ static struct bench_futex_parameters params = {
 };
 
 static const struct option options[] = {
+	OPT_UINTEGER('b', "buckets", &params.nbuckets, "Task local futex buckets to allocate"),
 	OPT_UINTEGER('t', "threads", &params.nthreads, "Specify amount of threads"),
 	OPT_UINTEGER('r', "runtime", &params.runtime, "Specify runtime (in seconds)"),
 	OPT_UINTEGER('f', "futexes", &params.nfutexes, "Specify amount of futexes per threads"),
@@ -120,6 +122,10 @@ static void print_summary(void)
 	       (int)bench__runtime.tv_sec);
 }
 
+#define PR_FUTEX_HASH 77
+# define PR_FUTEX_HASH_SET_SLOTS 1
+# define PR_FUTEX_HASH_GET_SLOTS 2
+
 int bench_futex_hash(int argc, const char **argv)
 {
 	int ret = 0;
@@ -131,6 +137,7 @@ int bench_futex_hash(int argc, const char **argv)
 	struct perf_cpu_map *cpu;
 	int nrcpus;
 	size_t size;
+	int num_buckets;
 
 	argc = parse_options(argc, argv, options, bench_futex_hash_usage, 0);
 	if (argc) {
@@ -147,6 +154,14 @@ int bench_futex_hash(int argc, const char **argv)
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
+	ret = prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, params.nbuckets);
+	if (ret) {
+		printf("Allocation of %u hash buckets failed: %d/%m\n",
+		       params.nbuckets, ret);
+		goto errmem;
+	}
+	num_buckets = prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS);
+
 	if (params.mlockall) {
 		if (mlockall(MCL_CURRENT | MCL_FUTURE))
 			err(EXIT_FAILURE, "mlockall");
@@ -162,8 +177,8 @@ int bench_futex_hash(int argc, const char **argv)
 	if (!params.fshared)
 		futex_flag = FUTEX_PRIVATE_FLAG;
 
-	printf("Run summary [PID %d]: %d threads, each operating on %d [%s] futexes for %d secs.\n\n",
-	       getpid(), params.nthreads, params.nfutexes, params.fshared ? "shared":"private", params.runtime);
+	printf("Run summary [PID %d]: %d threads, hash slots: %d each operating on %d [%s] futexes for %d secs.\n\n",
+	       getpid(), params.nthreads, num_buckets, params.nfutexes, params.fshared ? "shared":"private", params.runtime);
 
 	init_stats(&throughput_stats);
 	mutex_init(&thread_lock);
diff --git a/tools/perf/bench/futex.h b/tools/perf/bench/futex.h
index ebdc2b032afc1..abc353c63a9a4 100644
--- a/tools/perf/bench/futex.h
+++ b/tools/perf/bench/futex.h
@@ -20,6 +20,7 @@ struct bench_futex_parameters {
 	bool multi; /* lock-pi */
 	bool pi; /* requeue-pi */
 	bool broadcast; /* requeue */
+	unsigned int nbuckets;
 	unsigned int runtime; /* seconds*/
 	unsigned int nthreads;
 	unsigned int nfutexes;
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 13/14] tools/perf: Use the current affinity for CPU pinning in futex-hash.
Date: Mon, 16 Dec 2024 00:00:17 +0100
Message-ID: <20241215230642.104118-14-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In order to simplify NUMA local testing, let futex-hash use the current
affinity mask and pin the individual threads based on that mask.

Signed-off-by: Sebastian Andrzej Siewior
---
 tools/perf/bench/futex-hash.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index e24e987ae213e..216b0d1301ffc 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -126,10 +126,24 @@ static void print_summary(void)
 # define PR_FUTEX_HASH_SET_SLOTS 1
 # define PR_FUTEX_HASH_GET_SLOTS 2
 
+static unsigned int get_cpu_bit(cpu_set_t *set, size_t set_size, unsigned int r_cpu)
+{
+	unsigned int cpu = 0;
+
+	do {
+		if (CPU_ISSET_S(cpu, set_size, set)) {
+			if (!r_cpu)
+				return cpu;
+			r_cpu--;
+		}
+		cpu++;
+	} while (1);
+}
+
 int bench_futex_hash(int argc, const char **argv)
 {
 	int ret = 0;
-	cpu_set_t *cpuset;
+	cpu_set_t *cpuset, cpuset_;
 	struct sigaction act;
 	unsigned int i;
 	pthread_attr_t thread_attr;
@@ -167,8 +181,12 @@ int bench_futex_hash(int argc, const char **argv)
 			err(EXIT_FAILURE, "mlockall");
 	}
 
+	ret = pthread_getaffinity_np(pthread_self(), sizeof(cpuset_), &cpuset_);
+	BUG_ON(ret);
+	nrcpus = CPU_COUNT(&cpuset_);
+
 	if (!params.nthreads) /* default to the number of CPUs */
-		params.nthreads = perf_cpu_map__nr(cpu);
+		params.nthreads = nrcpus;
 
 	worker = calloc(params.nthreads, sizeof(*worker));
 	if (!worker)
@@ -189,10 +207,9 @@ int bench_futex_hash(int argc, const char **argv)
 	pthread_attr_init(&thread_attr);
 	gettimeofday(&bench__start, NULL);
 
-	nrcpus = cpu__max_cpu().cpu;
-	cpuset = CPU_ALLOC(nrcpus);
+	cpuset = CPU_ALLOC(4096);
 	BUG_ON(!cpuset);
-	size = CPU_ALLOC_SIZE(nrcpus);
+	size = CPU_ALLOC_SIZE(4096);
 
 	for (i = 0; i < params.nthreads; i++) {
 		worker[i].tid = i;
@@ -202,7 +219,8 @@ int bench_futex_hash(int argc, const char **argv)
 
 		CPU_ZERO_S(size, cpuset);
 
-		CPU_SET_S(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, size, cpuset);
+		CPU_SET_S(get_cpu_bit(&cpuset_, sizeof(cpuset_), i % nrcpus), size, cpuset);
+
 		ret = pthread_attr_setaffinity_np(&thread_attr, size, cpuset);
 		if (ret) {
 			CPU_FREE(cpuset);
-- 
2.45.2

From nobody Wed Dec 17 18:01:32 2025
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [PATCH v5 14/14] tools/perf: Allocate futex locks on the local CPU-node.
Date: Mon, 16 Dec 2024 00:00:18 +0100
Message-ID: <20241215230642.104118-15-bigeasy@linutronix.de>
In-Reply-To: <20241215230642.104118-1-bigeasy@linutronix.de>
References: <20241215230642.104118-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Sebastian Andrzej Siewior
---
 tools/perf/bench/futex-hash.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 216b0d1301ffc..4c7c6677463f8 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -122,6 +122,8 @@ static void print_summary(void)
 	       (int)bench__runtime.tv_sec);
 }
 
+#include <numa.h>
+
 #define PR_FUTEX_HASH 77
 # define PR_FUTEX_HASH_SET_SLOTS 1
 # define PR_FUTEX_HASH_GET_SLOTS 2
@@ -212,14 +214,19 @@ int bench_futex_hash(int argc, const char **argv)
 	size = CPU_ALLOC_SIZE(4096);
 
 	for (i = 0; i < params.nthreads; i++) {
+		unsigned int cpu_num;
 		worker[i].tid = i;
-		worker[i].futex = calloc(params.nfutexes, sizeof(*worker[i].futex));
-		if (!worker[i].futex)
-			goto errmem;
 
 		CPU_ZERO_S(size, cpuset);
+		cpu_num = get_cpu_bit(&cpuset_, sizeof(cpuset_), i % nrcpus);
+		//worker[i].futex = calloc(params.nfutexes, sizeof(*worker[i].futex));
 
-		CPU_SET_S(get_cpu_bit(&cpuset_, sizeof(cpuset_), i % nrcpus), size, cpuset);
+		worker[i].futex = numa_alloc_onnode(params.nfutexes * sizeof(*worker[i].futex),
+						    numa_node_of_cpu(cpu_num));
+		if (worker[i].futex == MAP_FAILED || worker[i].futex == NULL)
+			goto errmem;
+
+		CPU_SET_S(cpu_num, size, cpuset);
 
 		ret = pthread_attr_setaffinity_np(&thread_attr, size, cpuset);
 		if (ret) {
@@ -271,7 +278,7 @@ int bench_futex_hash(int argc, const char **argv)
 				       &worker[i].futex[params.nfutexes-1], t);
 		}
 
-		zfree(&worker[i].futex);
+		numa_free(worker[i].futex, params.nfutexes * sizeof(*worker[i].futex));
 	}
 
 	print_summary();
-- 
2.45.2
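
Note on 13/14: the worker-to-CPU mapping (take the current affinity mask,
then pin worker i to the (i % nrcpus)-th allowed CPU) can be tried outside
of perf. What follows is only a minimal sketch under stated assumptions,
not code from this series: nth_cpu_in_set() and cpu_for_worker() are
hypothetical names, sched_getaffinity() stands in for
pthread_getaffinity_np(), and the fixed-size cpu_set_t limits it to
CPU_SETSIZE CPUs. The lookup is bounded and returns -1 when the mask has
fewer set bits than requested.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper (not in the series): return the CPU number of the
 * n-th set bit in @set, counting from 0, or -1 if @set holds fewer bits.
 */
static int nth_cpu_in_set(const cpu_set_t *set, unsigned int n)
{
	for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, set))
			continue;
		if (n == 0)
			return cpu;
		n--;
	}
	return -1;
}

/*
 * Hypothetical helper: map a worker index round-robin onto the CPUs the
 * calling task is currently allowed to run on. Returns -1 on error.
 */
static int cpu_for_worker(unsigned int worker)
{
	cpu_set_t allowed;
	int nrcpus;

	if (sched_getaffinity(0, sizeof(allowed), &allowed))
		return -1;

	nrcpus = CPU_COUNT(&allowed);
	assert(nrcpus > 0);	/* a running task has at least one allowed CPU */

	return nth_cpu_in_set(&allowed, worker % nrcpus);
}
```

Run under e.g. "taskset -c 2,5", cpu_for_worker() would be expected to
alternate between the two allowed CPUs for successive worker indices.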