From nobody Fri Nov 22 17:05:56 2024
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli,
 Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long,
 Sebastian Andrzej Siewior
Subject: [RFC PATCH v3 1/9] futex: Create helper function to initialize a hash slot.
Date: Fri, 15 Nov 2024 17:58:42 +0100
Message-ID: <20241115172035.795842-2-bigeasy@linutronix.de>
In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de>
References: <20241115172035.795842-1-bigeasy@linutronix.de>

Factor out the futex_hash_bucket initialisation into a helper function.
The helper function will be used in a follow-up patch.

Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/futex/core.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 136768ae26375..de6d7f71961eb 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -1146,6 +1146,13 @@ void futex_exit_release(struct task_struct *tsk)
 	futex_cleanup_end(tsk, FUTEX_STATE_DEAD);
 }
 
+static void futex_hash_bucket_init(struct futex_hash_bucket *fhb)
+{
+	atomic_set(&fhb->waiters, 0);
+	plist_head_init(&fhb->chain);
+	spin_lock_init(&fhb->lock);
+}
+
 static int __init futex_init(void)
 {
 	unsigned int futex_shift;
@@ -1163,11 +1170,8 @@ static int __init futex_init(void)
 					       futex_hashsize, futex_hashsize);
 	futex_hashsize = 1UL << futex_shift;
 
-	for (i = 0; i < futex_hashsize; i++) {
-		atomic_set(&futex_queues[i].waiters, 0);
-		plist_head_init(&futex_queues[i].chain);
-		spin_lock_init(&futex_queues[i].lock);
-	}
+	for (i = 0; i < futex_hashsize; i++)
+		futex_hash_bucket_init(&futex_queues[i]);
 
 	return 0;
 }
-- 
2.45.2

From nobody Fri Nov 22 17:05:56 2024
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli,
 Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long,
 Sebastian Andrzej Siewior
Subject: [RFC PATCH v3 2/9] futex: Add basic infrastructure for local task local hash.
Date: Fri, 15 Nov 2024 17:58:43 +0100
Message-ID: <20241115172035.795842-3-bigeasy@linutronix.de>
In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de>
References: <20241115172035.795842-1-bigeasy@linutronix.de>

The futex hashmap is system wide and shared by random tasks. Each slot
is hashed based on its address and VMA. Due to randomized VMAs the same
logical lock (pointer) can end up in a different hash bucket on each
invocation of the application. This in turn means that different
applications may share a hash bucket on each invocation and it is not
always clear which applications will be involved. This can result in
high latencies to acquire the futex_hash_bucket::lock, especially if the
lock owner is limited to a CPU and cannot be effectively PI boosted.

Introduce a task local hash map. The hashmap can be allocated via
	prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, 0)

The `0' argument allocates a default number of 4 slots, a higher number
can be specified if desired. The current upper limit is 128.
The allocated hashmap is used by all threads within a process.
A thread can check if the private map has been allocated via
	prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS);

which returns the current number of slots.

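As a rough illustration of the interface described above, user space
could wrap the two prctl() calls as sketched below. The numeric values
are copied from the prctl.h hunk of this RFC and are an assumption: they
are not part of any released UAPI header and may change. The wrapper
names futex_hash_set_slots(), futex_hash_get_slots() and
futex_hash_report() are hypothetical helpers, not part of the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * Assumption: values taken from this RFC's include/uapi/linux/prctl.h hunk.
 * They are only used if the installed headers do not define them.
 */
#ifndef PR_FUTEX_HASH
#define PR_FUTEX_HASH			74
#define PR_FUTEX_HASH_SET_SLOTS		1
#define PR_FUTEX_HASH_GET_SLOTS		2
#endif

/*
 * Hypothetical wrapper: request a process-private futex hash.
 * Returns 0 on success, -1 with errno set otherwise. Going by the diff in
 * this patch, the kernel side reports EALREADY if a private hash already
 * exists and EINVAL if the caller is not the thread group leader.
 */
int futex_hash_set_slots(unsigned long slots)
{
	return prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, slots, 0, 0);
}

/*
 * Hypothetical wrapper: query the private hash.
 * Returns the slot count, 0 if no private hash is allocated,
 * or -1 with errno set if the request is rejected.
 */
int futex_hash_get_slots(void)
{
	return prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS, 0, 0, 0);
}

/* Print the current state without assuming the kernel supports the prctl. */
void futex_hash_report(void)
{
	int slots = futex_hash_get_slots();

	if (slots < 0)
		printf("PR_FUTEX_HASH not usable here: %s\n", strerror(errno));
	else
		printf("private futex hash slots: %d\n", slots);
}
```

On a kernel without this series the calls are expected to fail with -1
and errno set, which futex_hash_report() prints instead of a slot count.
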
Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/futex.h      | 22 +++++++++
 include/linux/mm_types.h   |  3 ++
 include/uapi/linux/prctl.h |  5 ++
 kernel/fork.c              |  2 +
 kernel/futex/core.c        | 98 ++++++++++++++++++++++++++++++++++++--
 kernel/sys.c               |  4 ++
 6 files changed, 131 insertions(+), 3 deletions(-)

diff --git a/include/linux/futex.h b/include/linux/futex.h
index b70df27d7e85c..61e81b866d34e 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -77,6 +77,16 @@ void futex_exec_release(struct task_struct *tsk);
 
 long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
 	      u32 __user *uaddr2, u32 val2, u32 val3);
+int futex_hash_prctl(unsigned long arg2, unsigned long arg3,
+		     unsigned long arg4, unsigned long arg5);
+int futex_hash_allocate_default(void);
+void futex_hash_free(struct mm_struct *mm);
+
+static inline void futex_mm_init(struct mm_struct *mm)
+{
+	mm->futex_hash_bucket = NULL;
+}
+
 #else
 static inline void futex_init_task(struct task_struct *tsk) { }
 static inline void futex_exit_recursive(struct task_struct *tsk) { }
@@ -88,6 +98,18 @@ static inline long do_futex(u32 __user *uaddr, int op, u32 val,
 {
 	return -EINVAL;
 }
+static inline int futex_hash_prctl(unsigned long arg2, unsigned long arg3,
+				   unsigned long arg4, unsigned long arg5)
+{
+	return -EINVAL;
+}
+static inline int futex_hash_allocate_default(void)
+{
+	return 0;
+}
+static inline void futex_hash_free(struct mm_struct *mm) { }
+static inline void futex_mm_init(struct mm_struct *mm) { }
+
 #endif
 
 #endif
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e3bdf8e38bca..2d25be28fa35f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -30,6 +30,7 @@
 #define INIT_PASID	0
 
 struct address_space;
+struct futex_hash_bucket;
 struct mem_cgroup;
 
 /*
@@ -898,6 +899,8 @@ struct mm_struct {
 		int mm_lock_seq;
 #endif
 
+		unsigned int			futex_hash_mask;
+		struct futex_hash_bucket	*futex_hash_bucket;
 
 		unsigned long hiwater_rss; /* High-watermark of RSS usage */
 		unsigned long hiwater_vm;  /* High-water virtual memory usage */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 35791791a879b..2f45e2d291fe4 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -328,4 +328,9 @@ struct prctl_mm_map {
 # define PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC	0x10 /* Clear the aspect on exec */
 # define PR_PPC_DEXCR_CTRL_MASK		0x1f
 
+/* FUTEX hash management */
+#define PR_FUTEX_HASH			74
+# define PR_FUTEX_HASH_SET_SLOTS	1
+# define PR_FUTEX_HASH_GET_SLOTS	2
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index 22f43721d031d..a83cf4d87ae57 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1279,6 +1279,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	RCU_INIT_POINTER(mm->exe_file, NULL);
 	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
+	futex_mm_init(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !defined(CONFIG_SPLIT_PMD_PTLOCKS)
 	mm->pmd_huge_pte = NULL;
 #endif
@@ -1356,6 +1357,7 @@ static inline void __mmput(struct mm_struct *mm)
 	if (mm->binfmt)
 		module_put(mm->binfmt->module);
 	lru_gen_del_mm(mm);
+	futex_hash_free(mm);
 	mmdrop(mm);
 }
 
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index de6d7f71961eb..2f5087fde57ef 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 
 #include "futex.h"
 #include "../locking/rtmutex_common.h"
@@ -107,18 +108,40 @@ late_initcall(fail_futex_debugfs);
 
 #endif /* CONFIG_FAIL_FUTEX */
 
+static inline bool futex_key_is_private(union futex_key *key)
+{
+	/*
+	 * Relies on get_futex_key() to set either bit for shared
+	 * futexes -- see comment with union futex_key.
+	 */
+	return !(key->both.offset & (FUT_OFF_INODE | FUT_OFF_MMSHARED));
+}
+
 /**
  * futex_hash - Return the hash bucket in the global hash
  * @key:	Pointer to the futex key for which the hash is calculated
  *
  * We hash on the keys returned from get_futex_key (see below) and return the
- * corresponding hash bucket in the global hash.
+ * corresponding hash bucket in the global hash. If the FUTEX is private and
+ * a local hash table is provided then this one is used.
  */
 struct futex_hash_bucket *futex_hash(union futex_key *key)
 {
-	u32 hash = jhash2((u32 *)key, offsetof(typeof(*key), both.offset) / 4,
-			  key->both.offset);
+	struct futex_hash_bucket *fhb;
+	u32 hash;
 
+	fhb = current->mm->futex_hash_bucket;
+	if (fhb && futex_key_is_private(key)) {
+		u32 hash_mask = current->mm->futex_hash_mask;
+
+		hash = jhash2((u32 *)key,
+			      offsetof(typeof(*key), both.offset) / 4,
+			      key->both.offset);
+		return &fhb[hash & hash_mask];
+	}
+	hash = jhash2((u32 *)key,
+		      offsetof(typeof(*key), both.offset) / 4,
+		      key->both.offset);
 	return &futex_queues[hash & (futex_hashsize - 1)];
 }
 
@@ -1153,6 +1176,75 @@ static void futex_hash_bucket_init(struct futex_hash_bucket *fhb)
 	spin_lock_init(&fhb->lock);
 }
 
+void futex_hash_free(struct mm_struct *mm)
+{
+	kvfree(mm->futex_hash_bucket);
+}
+
+static int futex_hash_allocate(unsigned int hash_slots)
+{
+	struct futex_hash_bucket *fhb;
+	int i;
+
+	if (current->mm->futex_hash_bucket)
+		return -EALREADY;
+
+	if (!thread_group_leader(current))
+		return -EINVAL;
+
+	if (hash_slots < 2)
+		hash_slots = 2;
+	if (hash_slots > 131072)
+		hash_slots = 131072;
+	if (!is_power_of_2(hash_slots))
+		hash_slots = rounddown_pow_of_two(hash_slots);
+
+	fhb = kvmalloc_array(hash_slots, sizeof(struct futex_hash_bucket), GFP_KERNEL_ACCOUNT);
+	if (!fhb)
+		return -ENOMEM;
+
+	current->mm->futex_hash_mask = hash_slots - 1;
+
+	for (i = 0; i < hash_slots; i++)
+		futex_hash_bucket_init(&fhb[i]);
+
+	current->mm->futex_hash_bucket = fhb;
+	return 0;
+}
+
+int futex_hash_allocate_default(void)
+{
+	return futex_hash_allocate(16);
+}
+
+static int futex_hash_get_slots(void)
+{
+	if (current->mm->futex_hash_bucket)
+		return current->mm->futex_hash_mask + 1;
+	return 0;
+}
+
+int futex_hash_prctl(unsigned long arg2, unsigned long arg3,
+		     unsigned long arg4, unsigned long arg5)
+{
+	int ret;
+
+	switch (arg2) {
+	case PR_FUTEX_HASH_SET_SLOTS:
+		ret = futex_hash_allocate(arg3);
+		break;
+
+	case PR_FUTEX_HASH_GET_SLOTS:
+		ret = futex_hash_get_slots();
+		break;
+
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	return ret;
+}
+
 static int __init futex_init(void)
 {
 	unsigned int futex_shift;
diff --git a/kernel/sys.c b/kernel/sys.c
index 4da31f28fda81..0dcbb8ce9f19d 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -52,6 +52,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -2784,6 +2785,9 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_RISCV_SET_ICACHE_FLUSH_CTX:
 		error = RISCV_SET_ICACHE_FLUSH_CTX(arg2, arg3);
 		break;
+	case PR_FUTEX_HASH:
+		error = futex_hash_prctl(arg2, arg3, arg4, arg5);
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
2.45.2

From nobody Fri Nov 22 17:05:56 2024
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli,
 Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long,
 Sebastian Andrzej Siewior
Subject: [RFC PATCH v3 3/9] futex: Allow automatic allocation of process wide futex hash.
Date: Fri, 15 Nov 2024 17:58:44 +0100
Message-ID: <20241115172035.795842-4-bigeasy@linutronix.de>
In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de>
References: <20241115172035.795842-1-bigeasy@linutronix.de>

Allocate a default futex hash if a task forks its first thread.

Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/fork.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index a83cf4d87ae57..2929e236a3801 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2109,6 +2109,17 @@ static void rv_task_fork(struct task_struct *p)
 #define rv_task_fork(p) do {} while (0)
 #endif
 
+static bool need_futex_hash_allocate_default(u64 clone_flags)
+{
+	if ((clone_flags & (CLONE_THREAD | CLONE_VM)) != (CLONE_THREAD | CLONE_VM))
+		return false;
+	if (!thread_group_empty(current))
+		return false;
+	if (current->mm->futex_hash_bucket)
+		return false;
+	return true;
+}
+
 /*
  * This creates a new process as a copy of the old one,
  * but does not actually start it yet.
@@ -2486,6 +2497,21 @@ __latent_entropy struct task_struct *copy_process(
 	if (retval)
 		goto bad_fork_cancel_cgroup;
 
+	/*
+	 * Allocate a default futex hash for the user process once the first
+	 * thread spawns.
+	 */
+	if (need_futex_hash_allocate_default(clone_flags)) {
+		retval = futex_hash_allocate_default();
+		if (retval)
+			goto bad_fork_core_free;
+		/*
+		 * If we fail beyond this point we don't free the allocated
+		 * futex hash map. We assume that another thread will be created
+		 * and make use of it. The hash map will be freed once the main
+		 * thread terminates.
+		 */
+	}
 	/*
 	 * From this point on we must avoid any synchronous user-space
 	 * communication until we take the tasklist-lock. In particular, we do
-- 
2.45.2

From nobody Fri Nov 22 17:05:56 2024
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli,
 Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long,
 Sebastian Andrzej Siewior
Subject: [RFC PATCH v3 4/9] futex: Hash only the address for private futexes.
Date: Fri, 15 Nov 2024 17:58:45 +0100
Message-ID: <20241115172035.795842-5-bigeasy@linutronix.de>
In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de>
References: <20241115172035.795842-1-bigeasy@linutronix.de>

futex_hash() passes the whole futex_key to jhash2. The first two members
are passed as the first argument and the offset as the "initial value".

For private futexes, the mm part is always the same and it is used only
within the process. By excluding the mm part from the hash, we reduce
the length passed to jhash2 from 4 (16 / 4) to 2 (8 / 4). This avoids
the __jhash_mix() part of jhash.

The resulting code is smaller and, based on testing, this variant
performs as well as the original or slightly better.

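The word counts quoted above can be checked with a small user-space
sketch. The union below is a simplified stand-in modelled on the
kernel's union futex_key (an assumption; the real definition is not
shown in this patch), and the helper names are hypothetical. jhash2()
takes its length in 32-bit words, hence the division by 4.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified stand-in for the kernel's union futex_key (assumed layout):
 * a 64-bit slot for the mm pointer, the address, then the offset.
 */
union futex_key_sketch {
	struct {
		uint64_t	mm;		/* stands in for the mm pointer slot */
		unsigned long	address;
		unsigned int	offset;
	} priv;
	struct {
		uint64_t	ptr;
		unsigned long	word;
		unsigned int	offset;
	} both;
};

/* Length in u32 words when everything before both.offset is hashed. */
size_t words_full_key(void)
{
	return offsetof(union futex_key_sketch, both.offset) / 4;
}

/* Length in u32 words when only the private address is hashed. */
size_t words_private_address(void)
{
	return sizeof(((union futex_key_sketch *)0)->priv.address) / 4;
}
```

On an LP64 build (for example x86-64 Linux) this gives 4 words for the
full key prefix and 2 words for the address alone. On a 32-bit build
both numbers are smaller because unsigned long is 4 bytes there.
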
Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/futex/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 2f5087fde57ef..5b66b6e52aeb5 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -134,8 +134,8 @@ struct futex_hash_bucket *futex_hash(union futex_key *key)
 	if (fhb && futex_key_is_private(key)) {
 		u32 hash_mask = current->mm->futex_hash_mask;
 
-		hash = jhash2((u32 *)key,
-			      offsetof(typeof(*key), both.offset) / 4,
+		hash = jhash2((void *)&key->private.address,
+			      sizeof(key->private.address) / 4,
 			      key->both.offset);
 		return &fhb[hash & hash_mask];
 	}
-- 
2.45.2

From nobody Fri Nov 22 17:05:56 2024
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli,
 Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long,
 Sebastian Andrzej Siewior
Subject: [RFC PATCH v3 5/9] futex: Track the futex hash bucket.
Date: Fri, 15 Nov 2024 17:58:46 +0100
Message-ID: <20241115172035.795842-6-bigeasy@linutronix.de>
In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de>
References: <20241115172035.795842-1-bigeasy@linutronix.de>

Add futex_hash_get/put() to keep the assigned hash_bucket around while a
futex operation is performed. Have RCU lifetime guarantee for
futex_hash_bucket_private.

This should have the right amount of gets/puts so that the private
hash bucket is released on exit.

Signed-off-by: Sebastian Andrzej Siewior
---
 include/linux/futex.h    |   2 +-
 include/linux/mm_types.h |   5 +-
 kernel/futex/core.c      | 103 +++++++++++++++++++++++++++++++++------
 kernel/futex/futex.h     |   8 +++
 kernel/futex/pi.c        |   7 +++
 kernel/futex/requeue.c   |  17 +++++++
 kernel/futex/waitwake.c  |  17 ++++++-
 7 files changed, 138 insertions(+), 21 deletions(-)

diff --git a/include/linux/futex.h b/include/linux/futex.h
index 61e81b866d34e..359fc24eb37ff 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -84,7 +84,7 @@ void futex_hash_free(struct mm_struct *mm);
 
 static inline void futex_mm_init(struct mm_struct *mm)
 {
-	mm->futex_hash_bucket = NULL;
+	rcu_assign_pointer(mm->futex_hash_bucket, NULL);
 }
 
 #else
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 2d25be28fa35f..057ad1de59ca0 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -30,7 +30,7 @@
 #define INIT_PASID	0
 
 struct address_space;
-struct futex_hash_bucket;
+struct futex_hash_bucket_private;
 struct mem_cgroup;
 
 /*
@@ -899,8 +899,7 @@ struct mm_struct {
 		int mm_lock_seq;
 #endif
 
-		unsigned int			futex_hash_mask;
-		struct futex_hash_bucket	*futex_hash_bucket;
+		struct futex_hash_bucket_private	__rcu *futex_hash_bucket;
 
 		unsigned long hiwater_rss; /* High-watermark of RSS usage */
 		unsigned long hiwater_vm;  /* High-water virtual memory usage */
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 5b66b6e52aeb5..cff5652a29917 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 
 #include "futex.h"
 #include "../locking/rtmutex_common.h"
@@ -56,6 +57,12 @@ static struct {
 #define futex_queues   (__futex_data.queues)
 #define futex_hashsize (__futex_data.hashsize)
 
+struct futex_hash_bucket_private {
+	rcuref_t	users;
+	unsigned int	hash_mask;
+	struct rcu_head rcu;
+	struct futex_hash_bucket queues[];
+};
 
 /*
  * Fault injections for futexes.
@@ -127,17 +134,24 @@ static inline bool futex_key_is_private(union futex_key *key)
  */
 struct futex_hash_bucket *futex_hash(union futex_key *key)
 {
-	struct futex_hash_bucket *fhb;
+	struct futex_hash_bucket_private *hb_p = NULL;
 	u32 hash;
 
-	fhb = current->mm->futex_hash_bucket;
-	if (fhb && futex_key_is_private(key)) {
-		u32 hash_mask = current->mm->futex_hash_mask;
+	if (futex_key_is_private(key)) {
+		guard(rcu)();
+
+		do {
+			hb_p = rcu_dereference(current->mm->futex_hash_bucket);
+		} while (hb_p && !rcuref_get(&hb_p->users));
+	}
+
+	if (hb_p) {
+		u32 hash_mask = hb_p->hash_mask;
 
 		hash = jhash2((void *)&key->private.address,
 			      sizeof(key->private.address) / 4,
 			      key->both.offset);
-		return &fhb[hash & hash_mask];
+		return &hb_p->queues[hash & hash_mask];
 	}
 	hash = jhash2((u32 *)key,
 		      offsetof(typeof(*key), both.offset) / 4,
@@ -145,6 +159,35 @@ struct futex_hash_bucket *futex_hash(union futex_key *key)
 	return &futex_queues[hash & (futex_hashsize - 1)];
 }
 
+static void futex_hash_priv_put(struct futex_hash_bucket_private *hb_p)
+{
+	if (rcuref_put(&hb_p->users))
+		kvfree_rcu(hb_p, rcu);
+}
+
+void futex_hash_put(struct futex_hash_bucket *hb)
+{
+	struct futex_hash_bucket_private *hb_p;
+
+	if (hb->hb_slot == 0)
+		return;
+	hb_p = container_of(hb, struct futex_hash_bucket_private,
+			    queues[hb->hb_slot - 1]);
+	futex_hash_priv_put(hb_p);
+}
+
+void futex_hash_get(struct futex_hash_bucket *hb)
+{
+	struct futex_hash_bucket_private *hb_p;
+
+	if (hb->hb_slot == 0)
+		return;
+
+	hb_p = container_of(hb, struct futex_hash_bucket_private,
+			    queues[hb->hb_slot - 1]);
+	/* The ref needs to be owned by the caller so this can't fail */
+	WARN_ON_ONCE(!rcuref_get(&hb_p->users));
+}
 
 /**
  * futex_setup_timer - set up the sleeping hrtimer.
@@ -621,7 +664,10 @@ int futex_unqueue(struct futex_q *q)
 	 */
 	lock_ptr = READ_ONCE(q->lock_ptr);
 	if (lock_ptr != NULL) {
+		struct futex_hash_bucket *hb;
+
 		spin_lock(lock_ptr);
+		hb = futex_hb_from_futex_q(q);
 		/*
 		 * q->lock_ptr can change between reading it and
 		 * spin_lock(), causing us to take the wrong lock. This
@@ -644,6 +690,7 @@ int futex_unqueue(struct futex_q *q)
 		BUG_ON(q->pi_state);
 
 		spin_unlock(lock_ptr);
+		futex_hash_put(hb);
 		ret = 1;
 	}
 
@@ -1021,6 +1068,7 @@ static void exit_pi_state_list(struct task_struct *curr)
 		if (!refcount_inc_not_zero(&pi_state->refcount)) {
 			raw_spin_unlock_irq(&curr->pi_lock);
 			cpu_relax();
+			futex_hash_put(hb);
 			raw_spin_lock_irq(&curr->pi_lock);
 			continue;
 		}
@@ -1037,6 +1085,7 @@ static void exit_pi_state_list(struct task_struct *curr)
 			/* retain curr->pi_lock for the loop invariant */
 			raw_spin_unlock(&pi_state->pi_mutex.wait_lock);
 			spin_unlock(&hb->lock);
+			futex_hash_put(hb);
 			put_pi_state(pi_state);
 			continue;
 		}
@@ -1049,6 +1098,7 @@ static void exit_pi_state_list(struct task_struct *curr)
 		raw_spin_unlock(&curr->pi_lock);
 		raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
 		spin_unlock(&hb->lock);
+		futex_hash_put(hb);
 
 		rt_mutex_futex_unlock(&pi_state->pi_mutex);
 		put_pi_state(pi_state);
@@ -1178,12 +1228,20 @@ static void futex_hash_bucket_init(struct futex_hash_bucket *fhb)
 
 void futex_hash_free(struct mm_struct *mm)
 {
-	kvfree(mm->futex_hash_bucket);
+	struct futex_hash_bucket_private *hb_p;
+
+ /* own a reference */ + hb_p =3D rcu_dereference_check(mm->futex_hash_bucket, true); + if (!hb_p) + return; + WARN_ON(rcuref_read(&hb_p->users) !=3D 1); + futex_hash_priv_put(hb_p); } =20 static int futex_hash_allocate(unsigned int hash_slots) { - struct futex_hash_bucket *fhb; + struct futex_hash_bucket_private *hb_p; + size_t alloc_size; int i; =20 if (current->mm->futex_hash_bucket) @@ -1199,16 +1257,27 @@ static int futex_hash_allocate(unsigned int hash_sl= ots) if (!is_power_of_2(hash_slots)) hash_slots =3D rounddown_pow_of_two(hash_slots); =20 - fhb =3D kvmalloc_array(hash_slots, sizeof(struct futex_hash_bucket), GFP_= KERNEL_ACCOUNT); - if (!fhb) + if (unlikely(check_mul_overflow(hash_slots, sizeof(struct futex_hash_buck= et), + &alloc_size))) return -ENOMEM; =20 - current->mm->futex_hash_mask =3D hash_slots - 1; + if (unlikely(check_add_overflow(alloc_size, sizeof(struct futex_hash_buck= et_private), + &alloc_size))) + return -ENOMEM; =20 - for (i =3D 0; i < hash_slots; i++) - futex_hash_bucket_init(&fhb[i]); + hb_p =3D kvmalloc(alloc_size, GFP_KERNEL_ACCOUNT); + if (!hb_p) + return -ENOMEM; =20 - current->mm->futex_hash_bucket =3D fhb; + rcuref_init(&hb_p->users, 1); + hb_p->hash_mask =3D hash_slots - 1; + + for (i =3D 0; i < hash_slots; i++) { + futex_hash_bucket_init(&hb_p->queues[i]); + hb_p->queues[i].hb_slot =3D i + 1; + } + + rcu_assign_pointer(current->mm->futex_hash_bucket, hb_p); return 0; } =20 @@ -1219,8 +1288,12 @@ int futex_hash_allocate_default(void) =20 static int futex_hash_get_slots(void) { - if (current->mm->futex_hash_bucket) - return current->mm->futex_hash_mask + 1; + struct futex_hash_bucket_private *hb_p; + + guard(rcu)(); + hb_p =3D rcu_dereference(current->mm->futex_hash_bucket); + if (hb_p) + return hb_p->hash_mask + 1; return 0; } =20 diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 8b195d06f4e8e..c6d59949766d2 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -114,6 +114,7 @@ static inline bool 
should_fail_futex(bool fshared) */ struct futex_hash_bucket { atomic_t waiters; + unsigned int hb_slot; spinlock_t lock; struct plist_head chain; } ____cacheline_aligned_in_smp; @@ -201,6 +202,13 @@ futex_setup_timer(ktime_t *time, struct hrtimer_sleepe= r *timeout, int flags, u64 range_ns); =20 extern struct futex_hash_bucket *futex_hash(union futex_key *key); +extern void futex_hash_put(struct futex_hash_bucket *hb); +extern void futex_hash_get(struct futex_hash_bucket *hb); + +static inline struct futex_hash_bucket *futex_hb_from_futex_q(struct futex= _q *q) +{ + return container_of(q->lock_ptr, struct futex_hash_bucket, lock); +} =20 /** * futex_match - Check whether two futex keys are equal diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c index 5722467f27379..399ac712f1fd6 100644 --- a/kernel/futex/pi.c +++ b/kernel/futex/pi.c @@ -963,6 +963,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags= , ktime_t *time, int tryl * - EAGAIN: The user space value changed. */ futex_q_unlock(hb); + futex_hash_put(hb); /* * Handle the case where the owner is in the middle of * exiting. 
Wait for the exit to complete otherwise @@ -1079,10 +1080,12 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int f= lags, ktime_t *time, int tryl =20 futex_unqueue_pi(&q); spin_unlock(q.lock_ptr); + futex_hash_put(hb); goto out; =20 out_unlock_put_key: futex_q_unlock(hb); + futex_hash_put(hb); =20 out: if (to) { @@ -1093,6 +1096,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int fla= gs, ktime_t *time, int tryl =20 uaddr_faulted: futex_q_unlock(hb); + futex_hash_put(hb); =20 ret =3D fault_in_user_writeable(uaddr); if (ret) @@ -1193,6 +1197,7 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int f= lags) =20 get_pi_state(pi_state); spin_unlock(&hb->lock); + futex_hash_put(hb); =20 /* drops pi_state->pi_mutex.wait_lock */ ret =3D wake_futex_pi(uaddr, uval, pi_state, rt_waiter); @@ -1232,6 +1237,7 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int f= lags) */ if ((ret =3D futex_cmpxchg_value_locked(&curval, uaddr, uval, 0))) { spin_unlock(&hb->lock); + futex_hash_put(hb); switch (ret) { case -EFAULT: goto pi_faulted; @@ -1252,6 +1258,7 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int f= lags) =20 out_unlock: spin_unlock(&hb->lock); + futex_hash_put(hb); return ret; =20 pi_retry: diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c index b47bb764b3520..d271e0fd146a5 100644 --- a/kernel/futex/requeue.c +++ b/kernel/futex/requeue.c @@ -87,6 +87,8 @@ void requeue_futex(struct futex_q *q, struct futex_hash_b= ucket *hb1, futex_hb_waiters_inc(hb2); plist_add(&q->list, &hb2->chain); q->lock_ptr =3D &hb2->lock; + futex_hash_put(hb1); + futex_hash_get(hb2); } q->key =3D *key2; } @@ -233,6 +235,7 @@ void requeue_pi_wake_futex(struct futex_q *q, union fut= ex_key *key, q->rt_waiter =3D NULL; =20 q->lock_ptr =3D &hb->lock; + futex_hash_get(hb); =20 /* Signal locked state to the waiter */ futex_requeue_pi_complete(q, 1); @@ -327,6 +330,7 @@ futex_proxy_trylock_atomic(u32 __user *pifutex, struct = futex_hash_bucket *hb1, * consistent and the waiter can 
return to user space * immediately after the wakeup. */ + futex_hash_put(hb1); requeue_pi_wake_futex(top_waiter, key2, hb2); } else if (ret < 0) { /* Rewind top_waiter::requeue_state */ @@ -458,6 +462,8 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, if (unlikely(ret)) { double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); + futex_hash_put(hb1); + futex_hash_put(hb2); =20 ret =3D get_user(curval, uaddr1); if (ret) @@ -544,6 +550,8 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, case -EFAULT: double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); + futex_hash_put(hb1); + futex_hash_put(hb2); ret =3D fault_in_user_writeable(uaddr2); if (!ret) goto retry; @@ -558,6 +566,8 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, */ double_unlock_hb(hb1, hb2); futex_hb_waiters_dec(hb2); + futex_hash_put(hb1); + futex_hash_put(hb2); /* * Handle the case where the owner is in the middle of * exiting. Wait for the exit to complete otherwise @@ -677,6 +687,8 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s1, double_unlock_hb(hb1, hb2); wake_up_q(&wake_q); futex_hb_waiters_dec(hb2); + futex_hash_put(hb1); + futex_hash_put(hb2); return ret ? ret : task_count; } =20 @@ -815,6 +827,7 @@ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned i= nt flags, */ if (futex_match(&q.key, &key2)) { futex_q_unlock(hb); + futex_hash_put(hb); ret =3D -EINVAL; goto out; } @@ -828,6 +841,8 @@ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned i= nt flags, spin_lock(&hb->lock); ret =3D handle_early_requeue_pi_wakeup(hb, &q, to); spin_unlock(&hb->lock); + /* XXX */ + futex_hash_put(hb); break; =20 case Q_REQUEUE_PI_LOCKED: @@ -847,6 +862,7 @@ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned i= nt flags, */ ret =3D ret < 0 ? 
ret : 0; } + futex_hash_put(futex_hb_from_futex_q(&q)); break; =20 case Q_REQUEUE_PI_DONE: @@ -876,6 +892,7 @@ int futex_wait_requeue_pi(u32 __user *uaddr, unsigned i= nt flags, =20 futex_unqueue_pi(&q); spin_unlock(q.lock_ptr); + futex_hash_put(futex_hb_from_futex_q(&q)); =20 if (ret =3D=3D -EINTR) { /* diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index 3a10375d95218..628340920b7aa 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -113,6 +113,8 @@ bool __futex_wake_mark(struct futex_q *q) return false; =20 __futex_unqueue(q); + /* Waiters reference */ + futex_hash_put(futex_hb_from_futex_q(q)); /* * The waiting task can free the futex_q as soon as q->lock_ptr =3D NULL * is written, without taking any locks. This is possible in the event @@ -173,8 +175,10 @@ int futex_wake(u32 __user *uaddr, unsigned int flags, = int nr_wake, u32 bitset) hb =3D futex_hash(&key); =20 /* Make sure we really have tasks to wakeup */ - if (!futex_hb_waiters_pending(hb)) + if (!futex_hb_waiters_pending(hb)) { + futex_hash_put(hb); return ret; + } =20 spin_lock(&hb->lock); =20 @@ -196,6 +200,7 @@ int futex_wake(u32 __user *uaddr, unsigned int flags, i= nt nr_wake, u32 bitset) } =20 spin_unlock(&hb->lock); + futex_hash_put(hb); wake_up_q(&wake_q); return ret; } @@ -275,6 +280,8 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, op_ret =3D futex_atomic_op_inuser(op, uaddr2); if (unlikely(op_ret < 0)) { double_unlock_hb(hb1, hb2); + futex_hash_put(hb1); + futex_hash_put(hb2); =20 if (!IS_ENABLED(CONFIG_MMU) || unlikely(op_ret !=3D -EFAULT && op_ret !=3D -EAGAIN)) { @@ -329,6 +336,8 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, out_unlock: double_unlock_hb(hb1, hb2); wake_up_q(&wake_q); + futex_hash_put(hb1); + futex_hash_put(hb2); return ret; } =20 @@ -387,7 +396,7 @@ int futex_unqueue_multiple(struct futex_vector *v, int = count) { int ret =3D -1, i; =20 - for (i =3D 0; i < count; i++) 
{ + for (i =3D 0; i < count; i++) { // if (!futex_unqueue(&v[i].q)) ret =3D i; } @@ -466,6 +475,8 @@ int futex_wait_multiple_setup(struct futex_vector *vs, = int count, int *woken) } =20 futex_q_unlock(hb); + futex_hash_put(hb); + __set_current_state(TASK_RUNNING); =20 /* @@ -625,6 +636,7 @@ int futex_wait_setup(u32 __user *uaddr, u32 val, unsign= ed int flags, =20 if (ret) { futex_q_unlock(*hb); + futex_hash_put(*hb); =20 ret =3D get_user(uval, uaddr); if (ret) @@ -638,6 +650,7 @@ int futex_wait_setup(u32 __user *uaddr, u32 val, unsign= ed int flags, =20 if (uval !=3D val) { futex_q_unlock(*hb); + futex_hash_put(*hb); ret =3D -EWOULDBLOCK; } =20 --=20 2.45.2 From nobody Fri Nov 22 17:05:56 2024 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 97B431D8A12 for ; Fri, 15 Nov 2024 17:20:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731691249; cv=none; b=AAUQhDwi0bJRhNxtSP5M9mBoatGzhF6uQoSEGWoJ082/eHIykTjzUdXjQ2ZYjuFfe1rOJxC9mOtaMT4H/kvIu7x3TdnjDdRCWa2JMLtsaSThNhbW/TQrKWM8wtoctPWQxewl8RIhDYZDh0iqHVcCDETEL7QlRZ4jFr9LcCg9oL0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731691249; c=relaxed/simple; bh=KHZ338Al8IJJw1ZC1QdqBLGaCPHTdlMrAcWX2HKh9jo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XwkfRKvyl7U+5BaCHEnQxERG7o6bg9ki8OjO8oGh/S3U2cdgg1IKfUVofgtYP9e7W1nD5JGFTVFuDpjxn2mr23Hml/BJyOTX9TNPVcOc2145AVnycd7WDlvY3OtB/YCOPdHsFOKfapHfVjEmvkc0rbSaeSHRHnvqDouNyuMuQJQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de 
header.b=kaSpaicU; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=j+gApzTn; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="kaSpaicU"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="j+gApzTn" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1731691244; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L7xbsHSY0yw52MGuaTTTzZKF0el2JDMLD+w9bMwrA7Y=; b=kaSpaicUhLLTqaF+gl0ZWHhlHUy0bzr+1MECxzk9DPlyNhBew1lMTpu2nKx8YLFJvoHc2s 12DhIk113xVbldLwbFz3FhAFdqt9NdqR2+V3O3Nn01lR4Vm1Wi/PhMPunhBSEDo1z5A64U MpIpLtVKwyxxDPFzaawXoHUsuO3gleD4tf32yiDsp8fZDU+l2sSTYFxJrfGE627UlvDCzk sPN5dTJuCMgWfvUpxafwvaWbS75iFZ4OFvCsN5M+UqEBcWeZFyknS9wcQHJJ/0luLMAXnQ NtvmIZ55vIJ96jI9Ns0KnmPEgPCtUsTM2sPWMX1wBVaehcVY1bfcqcd5J5C0hA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1731691244; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L7xbsHSY0yw52MGuaTTTzZKF0el2JDMLD+w9bMwrA7Y=; b=j+gApzTnfB25ML3ndORcAmz/N1PLTz1F6rct2lvEcqxdUgcu6VCMe04ZPJutT6ZCT2eEm2 lmvbcn4hCJTRYcCA== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej Siewior Subject: 
[RFC PATCH v3 6/9] futex: Allow to re-allocate the private hash bucket. Date: Fri, 15 Nov 2024 17:58:47 +0100 Message-ID: <20241115172035.795842-7-bigeasy@linutronix.de> In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de> References: <20241115172035.795842-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The mm_struct::futex_hash_lock guards the futex_hash_bucket assignment/replacement. The futex_hash_allocate()/PR_FUTEX_HASH_SET_SLOTS operation can now be invoked at runtime and resize the internal private futex_hash_bucket to another size. The idea is to use the recently introduced ref counting to keep a valid HB around. On resize/replacement the new HB is assigned and all users currently queued on the old HB get poked so they can requeue themselves. This has only been tested with FUTEX_LOCK_PI. Signed-off-by: Sebastian Andrzej Siewior --- include/linux/futex.h | 1 + include/linux/mm_types.h | 1 + kernel/futex/core.c | 64 ++++++++++++++++++++++++++++----- kernel/futex/futex.h | 1 + kernel/futex/pi.c | 25 +++++++++++++ kernel/locking/rtmutex.c | 26 ++++++++++++++ kernel/locking/rtmutex_common.h | 2 ++ 7 files changed, 111 insertions(+), 9 deletions(-) diff --git a/include/linux/futex.h b/include/linux/futex.h index 359fc24eb37ff..838a5a6be0444 100644 --- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -85,6 +85,7 @@ void futex_hash_free(struct mm_struct *mm); static inline void futex_mm_init(struct mm_struct *mm) { rcu_assign_pointer(mm->futex_hash_bucket, NULL); + mutex_init(&mm->futex_hash_lock); } =20 #else diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 057ad1de59ca0..5bf86ea363780 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -899,6 +899,7 @@ struct mm_struct { int mm_lock_seq; #endif =20 + struct mutex 
futex_hash_lock; struct futex_hash_bucket_private __rcu *futex_hash_bucket; =20 unsigned long hiwater_rss; /* High-watermark of RSS usage */ diff --git a/kernel/futex/core.c b/kernel/futex/core.c index cff5652a29917..70d4b1d93bbb8 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -595,6 +595,7 @@ struct futex_hash_bucket *futex_q_lock(struct futex_q *= q) { struct futex_hash_bucket *hb; =20 +try_again: hb =3D futex_hash(&q->key); =20 /* @@ -610,7 +611,13 @@ struct futex_hash_bucket *futex_q_lock(struct futex_q = *q) q->lock_ptr =3D &hb->lock; =20 spin_lock(&hb->lock); - return hb; + if (futex_check_hb_valid(hb)) + return hb; + + futex_hb_waiters_dec(hb); + spin_unlock(&hb->lock); + futex_hash_put(hb); + goto try_again; } =20 void futex_q_unlock(struct futex_hash_bucket *hb) @@ -1238,18 +1245,50 @@ void futex_hash_free(struct mm_struct *mm) futex_hash_priv_put(hb_p); } =20 +static void futex_put_old_hb_p(struct futex_hash_bucket_private *hb_p) +{ + unsigned int slots =3D hb_p->hash_mask + 1; + struct futex_hash_bucket *hb; + DEFINE_WAKE_Q(wake_q); + unsigned int i; + + for (i =3D 0; i < slots; i++) { + struct futex_q *this; + + hb =3D &hb_p->queues[i]; + + spin_lock(&hb->lock); + plist_for_each_entry(this, &hb->chain, list) + wake_q_add(&wake_q, this->task); + spin_unlock(&hb->lock); + } + futex_hash_priv_put(hb_p); + + wake_up_q(&wake_q); +} + +bool futex_check_hb_valid(struct futex_hash_bucket *hb) +{ + struct futex_hash_bucket_private *hb_p_now; + struct futex_hash_bucket_private *hb_p; + + if (hb->hb_slot =3D=3D 0) + return true; + guard(rcu)(); + hb_p_now =3D rcu_dereference(current->mm->futex_hash_bucket); + hb_p =3D container_of(hb, struct futex_hash_bucket_private, + queues[hb->hb_slot - 1]); + + return hb_p_now =3D=3D hb_p; +} + static int futex_hash_allocate(unsigned int hash_slots) { - struct futex_hash_bucket_private *hb_p; + struct futex_hash_bucket_private *hb_p, *hb_p_old =3D NULL; + struct mm_struct *mm; size_t alloc_size; int i; =20 - if 
(current->mm->futex_hash_bucket) - return -EALREADY; - - if (!thread_group_leader(current)) - return -EINVAL; - if (hash_slots < 2) hash_slots =3D 2; if (hash_slots > 131072) @@ -1277,7 +1316,14 @@ static int futex_hash_allocate(unsigned int hash_slo= ts) hb_p->queues[i].hb_slot =3D i + 1; } =20 - rcu_assign_pointer(current->mm->futex_hash_bucket, hb_p); + mm =3D current->mm; + scoped_guard(mutex, &mm->futex_hash_lock) { + hb_p_old =3D rcu_dereference_check(mm->futex_hash_bucket, + lockdep_is_held(&mm->futex_hash_lock)); + rcu_assign_pointer(mm->futex_hash_bucket, hb_p); + } + if (hb_p_old) + futex_put_old_hb_p(hb_p_old); return 0; } =20 diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index c6d59949766d2..b974d675730e4 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -204,6 +204,7 @@ futex_setup_timer(ktime_t *time, struct hrtimer_sleeper= *timeout, extern struct futex_hash_bucket *futex_hash(union futex_key *key); extern void futex_hash_put(struct futex_hash_bucket *hb); extern void futex_hash_get(struct futex_hash_bucket *hb); +extern bool futex_check_hb_valid(struct futex_hash_bucket *hb); =20 static inline struct futex_hash_bucket *futex_hb_from_futex_q(struct futex= _q *q) { diff --git a/kernel/futex/pi.c b/kernel/futex/pi.c index 399ac712f1fd6..1a0a9cd31f911 100644 --- a/kernel/futex/pi.c +++ b/kernel/futex/pi.c @@ -998,6 +998,7 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int flags= , ktime_t *time, int tryl rt_mutex_pre_schedule(); =20 rt_mutex_init_waiter(&rt_waiter); + rt_waiter.hb =3D hb; =20 /* * On PREEMPT_RT, when hb->lock becomes an rt_mutex, we must not @@ -1066,6 +1067,23 @@ int futex_lock_pi(u32 __user *uaddr, unsigned int fl= ags, ktime_t *time, int tryl */ rt_mutex_post_schedule(); no_block: + if (!futex_check_hb_valid(hb)) { + /* + * We might have got the lock, we might not. If the HB changed under + * us it was all for nothing. Try again from scratch. 
+ */ + futex_unqueue_pi(&q); + spin_unlock(q.lock_ptr); + futex_hash_put(hb); + + if (to) { + hrtimer_cancel(&to->timer); + destroy_hrtimer_on_stack(&to->timer); + } + if (refill_pi_state_cache()) + return -ENOMEM; + goto retry_private; + } /* * Fixup the pi_state owner and possibly acquire the lock if we * haven't already. @@ -1226,6 +1244,12 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int = flags) * space. */ return ret; + } else { + if (!futex_check_hb_valid(hb)) { + spin_unlock(&hb->lock); + futex_hash_put(hb); + goto retry; + } } =20 /* @@ -1250,6 +1274,7 @@ int futex_unlock_pi(u32 __user *uaddr, unsigned int f= lags) return ret; } } + /* XXX if the HB changed but uval did not, we might need to check if ther= e is a waiter pending */ =20 /* * If uval has changed, let user space handle it. diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c index ebebd0eec7f63..188a9b16412df 100644 --- a/kernel/locking/rtmutex.c +++ b/kernel/locking/rtmutex.c @@ -56,10 +56,29 @@ static inline int __ww_mutex_check_kill(struct rt_mutex= *lock, return 0; } =20 +extern bool futex_check_hb_valid(struct futex_hash_bucket *hb); + +static inline bool __internal_retry_reason(struct rt_mutex_waiter *waiter) +{ + if (!IS_ENABLED(CONFIG_FUTEX)) + return false; + + if (!waiter->hb) + return false; + if (futex_check_hb_valid(waiter->hb)) + return false; + return true; +} + #else # define build_ww_mutex() (true) # define ww_container_of(rtm) container_of(rtm, struct ww_mutex, base) # include "ww_mutex.h" + +static inline bool __internal_retry_reason(struct rt_mutex_waiter *waiter) +{ + return false; +} #endif =20 /* @@ -1626,6 +1645,13 @@ static int __sched rt_mutex_slowlock_block(struct rt= _mutex_base *lock, break; } =20 + if (!build_ww_mutex()) { + if (__internal_retry_reason(waiter)) { + ret =3D -EAGAIN; + break; + } + } + if (waiter =3D=3D rt_mutex_top_waiter(lock)) owner =3D rt_mutex_owner(lock); else diff --git a/kernel/locking/rtmutex_common.h 
b/kernel/locking/rtmutex_commo= n.h index 1162e07cdaea1..fb26ad08f259a 100644 --- a/kernel/locking/rtmutex_common.h +++ b/kernel/locking/rtmutex_common.h @@ -56,6 +56,7 @@ struct rt_mutex_waiter { struct rt_mutex_base *lock; unsigned int wake_state; struct ww_acquire_ctx *ww_ctx; + struct futex_hash_bucket *hb; }; =20 /** @@ -215,6 +216,7 @@ static inline void rt_mutex_init_waiter(struct rt_mutex= _waiter *waiter) RB_CLEAR_NODE(&waiter->tree.entry); waiter->wake_state =3D TASK_NORMAL; waiter->task =3D NULL; + waiter->hb =3D NULL; } =20 static inline void rt_mutex_init_rtlock_waiter(struct rt_mutex_waiter *wai= ter) --=20 2.45.2 From nobody Fri Nov 22 17:05:56 2024 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 709281D79B6 for ; Fri, 15 Nov 2024 17:20:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731691249; cv=none; b=omsSItzZOQ/whVBLB+2Nq6xdahv9i8vC3IcBmBJXxNYePDiNgFrq1yqCrUob8fGP1cuLdjNJFfoRKMdZFVTOYby7GnbUQU9SOo7bbveCXJRsSWHlCFuIBqnkggeo4LGL9DHomynpbdj4qOZy2hkZggzIHYYVVkUeAMsjPaZT7lk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731691249; c=relaxed/simple; bh=x6QccMB6BhKqfr3kAcVgmd1YAopzjROuqdjEMMellmA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=OOg8wqcXIGhXq/xGxBOn75kctrUvJxnfJ5ezF5cNSHbD0LS3uXrcuGm3U2kmQMAp1CI4IkU9moGXvgLx003XuHovJWcQ0zWSKNmkZ6ZToFiJ0wU1/i32Sr6nSuz98kSrmrUGYzueLbQurWliFoEB7jTRZGeVzjotY5l8nrkpO0I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=hWqHnchP; dkim=permerror 
(0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=83TaXFFc; arc=none smtp.client-ip=193.142.43.55 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="hWqHnchP"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="83TaXFFc" From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1731691244; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rAwV6nCJQNFw7kc3qJeMm8PhLtWI4KD/ANhIrG6zzns=; b=hWqHnchPKB7aoTiSd3TYJIDANsGO3prugD3sWFSFwF/lSYAmzgaH48KRmdqAFbdoZUhRcR 6O0DRFll5ChkSHPiafrweVfkfCqYRnbfaKcip+txowdOf9ZG3eezbgSY5b+aRCLv4DIAhq 7DXX5uiKaZOUq3aVhw/1aYIU3GZIz9Hv12pqky+8JlhvChRM8eRApx4ZrT7RFVpgerPuSm gSmxswY5mWR2wft7wtK6rP4aA0xsCKCGGHUb+tAF0hH/a67lrWvonCV8sR16KpfGc1TRAM I5XkD3momnHHWT8EcYp43XMpNVLE+YK5DI8tX3J/exuE3anlMj3WWW7D8fPf3A== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1731691244; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=rAwV6nCJQNFw7kc3qJeMm8PhLtWI4KD/ANhIrG6zzns=; b=83TaXFFcHk98NnjD1v8Pf/T2ZSsGxMxYg0yNntnuwWYviy9VpwqtTNnQkZ4hxM0C4xc6V8 Sla2oA559TSki3Ag== To: linux-kernel@vger.kernel.org Cc: =?UTF-8?q?Andr=C3=A9=20Almeida?= , Darren Hart , Davidlohr Bueso , Ingo Molnar , Juri Lelli , Peter Zijlstra , Thomas Gleixner , Valentin Schneider , Waiman Long , Sebastian Andrzej 
Siewior Subject: [RFC PATCH v3 7/9] =?UTF-8?q?tools/perf:=20Add=20the=20prctl(PR?= =?UTF-8?q?=5FFUTEX=5FHASH,=E2=80=A6)=20to=20futex-hash.?= Date: Fri, 15 Nov 2024 17:58:48 +0100 Message-ID: <20241115172035.795842-8-bigeasy@linutronix.de> In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de> References: <20241115172035.795842-1-bigeasy@linutronix.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Wire up PR_FUTEX_HASH to futex-hash. Use the `-b' argument to specify the number of buckets. Read it back and show during invocation. Signed-off-by: Sebastian Andrzej Siewior --- tools/perf/bench/futex-hash.c | 19 +++++++++++++++++-- tools/perf/bench/futex.h | 1 + 2 files changed, 18 insertions(+), 2 deletions(-) diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c index b472eded521b1..1f7a33f8d078e 100644 --- a/tools/perf/bench/futex-hash.c +++ b/tools/perf/bench/futex-hash.c @@ -22,6 +22,7 @@ #include #include #include +#include =20 #include "../util/mutex.h" #include "../util/stat.h" @@ -53,6 +54,7 @@ static struct bench_futex_parameters params =3D { }; =20 static const struct option options[] =3D { + OPT_UINTEGER('b', "buckets", &params.nbuckets, "Task local futex buckets = to allocate"), OPT_UINTEGER('t', "threads", &params.nthreads, "Specify amount of threads= "), OPT_UINTEGER('r', "runtime", &params.runtime, "Specify runtime (in second= s)"), OPT_UINTEGER('f', "futexes", &params.nfutexes, "Specify amount of futexes= per threads"), @@ -120,6 +122,10 @@ static void print_summary(void) (int)bench__runtime.tv_sec); } =20 +#define PR_FUTEX_HASH 74 +# define PR_FUTEX_HASH_SET_SLOTS 1 +# define PR_FUTEX_HASH_GET_SLOTS 2 + int bench_futex_hash(int argc, const char **argv) { int ret =3D 0; @@ -131,6 +137,7 @@ int bench_futex_hash(int argc, const char **argv) struct perf_cpu_map *cpu; int nrcpus; size_t 
size; + int num_buckets; =20 argc =3D parse_options(argc, argv, options, bench_futex_hash_usage, 0); if (argc) { @@ -147,6 +154,14 @@ int bench_futex_hash(int argc, const char **argv) act.sa_sigaction =3D toggle_done; sigaction(SIGINT, &act, NULL); =20 + ret =3D prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, params.nbuckets); + if (ret) { + printf("Allocation of %u hash buckets failed: %d/%m\n", + params.nbuckets, ret); + goto errmem; + } + num_buckets =3D prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_GET_SLOTS); + if (params.mlockall) { if (mlockall(MCL_CURRENT | MCL_FUTURE)) err(EXIT_FAILURE, "mlockall"); @@ -162,8 +177,8 @@ int bench_futex_hash(int argc, const char **argv) if (!params.fshared) futex_flag =3D FUTEX_PRIVATE_FLAG; =20 - printf("Run summary [PID %d]: %d threads, each operating on %d [%s] futex= es for %d secs.\n\n", - getpid(), params.nthreads, params.nfutexes, params.fshared ? "shar= ed":"private", params.runtime); + printf("Run summary [PID %d]: %d threads, hash slots: %d each operating o= n %d [%s] futexes for %d secs.\n\n", + getpid(), params.nthreads, num_buckets, params.nfutexes, params.fs= hared ? 
"shared":"private", params.runtime); =20 init_stats(&throughput_stats); mutex_init(&thread_lock); diff --git a/tools/perf/bench/futex.h b/tools/perf/bench/futex.h index ebdc2b032afc1..abc353c63a9a4 100644 --- a/tools/perf/bench/futex.h +++ b/tools/perf/bench/futex.h @@ -20,6 +20,7 @@ struct bench_futex_parameters { bool multi; /* lock-pi */ bool pi; /* requeue-pi */ bool broadcast; /* requeue */ + unsigned int nbuckets; unsigned int runtime; /* seconds*/ unsigned int nthreads; unsigned int nfutexes; --=20 2.45.2 From nobody Fri Nov 22 17:05:56 2024 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 994861D8DE0 for ; Fri, 15 Nov 2024 17:20:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=193.142.43.55 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731691249; cv=none; b=hi4MI0uJCMJ2i8NBIuLUhnVs5C55497PRGtCZS9Cu54bjrabnuWxHc3GvccSZhTqVbhU0vnzjtn0q/DgK6xc2eBaEvMAFMxUrIGUQCNK3DPFRJjFNYyQsgiw7aSAB7IIXsR5M3Ii5gTEfVwqqbXT74S4fAjIDxfwf7a2vGjkCqU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731691249; c=relaxed/simple; bh=IVK4Dsc/GpAtVlvaa/UZo/8T7lHRrgN1aJSTQfnI5OY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gjY1BtLYIzUKNdNpxEv0JmECoiEwA6L292asKFPpy7fREcP/4dxY/RrDFtNTqcKcbYLYlT1LwZ+C61yKYcPMMSP2HPWv3GRg0GyFBFt1+SwcBDdRKgPkz/G9TejjcOsttmLZw97d6vc2LxPbiQbXQMFdG7zqGouWm1n3mdttcqY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de; spf=pass smtp.mailfrom=linutronix.de; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=lAxdx/Gu; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b=knwb9IIN; arc=none smtp.client-ip=193.142.43.55 
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [RFC PATCH v3 8/9] tools/perf: Use the current affinity for CPU pinning in futex-hash.
Date: Fri, 15 Nov 2024 17:58:49 +0100
Message-ID: <20241115172035.795842-9-bigeasy@linutronix.de>
In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de>
References: <20241115172035.795842-1-bigeasy@linutronix.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In order to simplify NUMA local testing, let futex-hash use the current
affinity mask and pin the individual threads based on that mask.

Signed-off-by: Sebastian Andrzej Siewior
---
 tools/perf/bench/futex-hash.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 1f7a33f8d078e..f40b7df6ef3d0 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -126,10 +126,24 @@ static void print_summary(void)
 # define PR_FUTEX_HASH_SET_SLOTS 1
 # define PR_FUTEX_HASH_GET_SLOTS 2
 
+static unsigned int get_cpu_bit(cpu_set_t *set, size_t set_size, unsigned int r_cpu)
+{
+	unsigned int cpu = 0;
+
+	do {
+		if (CPU_ISSET_S(cpu, set_size, set)) {
+			if (!r_cpu)
+				return cpu;
+			r_cpu--;
+		}
+		cpu++;
+	} while (1);
+}
+
 int bench_futex_hash(int argc, const char **argv)
 {
 	int ret = 0;
-	cpu_set_t *cpuset;
+	cpu_set_t *cpuset, cpuset_;
 	struct sigaction act;
 	unsigned int i;
 	pthread_attr_t thread_attr;
@@ -167,8 +181,12 @@ int bench_futex_hash(int argc, const char **argv)
 			err(EXIT_FAILURE, "mlockall");
 	}
 
+	ret = pthread_getaffinity_np(pthread_self(), sizeof(cpuset_), &cpuset_);
+	BUG_ON(ret);
+	nrcpus = CPU_COUNT(&cpuset_);
+
 	if (!params.nthreads) /* default to the number of CPUs */
-		params.nthreads = perf_cpu_map__nr(cpu);
+		params.nthreads = nrcpus;
 
 	worker = calloc(params.nthreads, sizeof(*worker));
 	if (!worker)
@@ -189,10 +207,9 @@ int bench_futex_hash(int argc, const char **argv)
 	pthread_attr_init(&thread_attr);
 	gettimeofday(&bench__start, NULL);
 
-	nrcpus = cpu__max_cpu().cpu;
-	cpuset = CPU_ALLOC(nrcpus);
+	cpuset = CPU_ALLOC(4096);
 	BUG_ON(!cpuset);
-	size = CPU_ALLOC_SIZE(nrcpus);
+	size = CPU_ALLOC_SIZE(4096);
 
 	for (i = 0; i < params.nthreads; i++) {
 		worker[i].tid = i;
@@ -202,7 +219,8 @@ int bench_futex_hash(int argc, const char **argv)
 
 		CPU_ZERO_S(size, cpuset);
 
-		CPU_SET_S(perf_cpu_map__cpu(cpu, i % perf_cpu_map__nr(cpu)).cpu, size, cpuset);
+		CPU_SET_S(get_cpu_bit(&cpuset_, sizeof(cpuset_), i % nrcpus), size, cpuset);
+
 		ret = pthread_attr_setaffinity_np(&thread_attr, size, cpuset);
 		if (ret) {
 			CPU_FREE(cpuset);
-- 
2.45.2

From nobody Fri Nov 22 17:05:56 2024
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: André Almeida, Darren Hart, Davidlohr Bueso, Ingo Molnar, Juri Lelli, Peter Zijlstra, Thomas Gleixner, Valentin Schneider, Waiman Long, Sebastian Andrzej Siewior
Subject: [RFC PATCH v3 9/9] tools/perf: Allocate futex locks on the local CPU-node.
Date: Fri, 15 Nov 2024 17:58:50 +0100
Message-ID: <20241115172035.795842-10-bigeasy@linutronix.de>
In-Reply-To: <20241115172035.795842-1-bigeasy@linutronix.de>
References: <20241115172035.795842-1-bigeasy@linutronix.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Sebastian Andrzej Siewior
---
 tools/perf/bench/futex-hash.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index f40b7df6ef3d0..7c8f3cff3c611 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -122,6 +122,8 @@ static void print_summary(void)
 	       (int)bench__runtime.tv_sec);
 }
 
+#include
+
 #define PR_FUTEX_HASH 74
 # define PR_FUTEX_HASH_SET_SLOTS 1
 # define PR_FUTEX_HASH_GET_SLOTS 2
@@ -212,14 +214,19 @@ int bench_futex_hash(int argc, const char **argv)
 	size = CPU_ALLOC_SIZE(4096);
 
 	for (i = 0; i < params.nthreads; i++) {
+		unsigned int cpu_num;
 		worker[i].tid = i;
-		worker[i].futex = calloc(params.nfutexes, sizeof(*worker[i].futex));
-		if (!worker[i].futex)
-			goto errmem;
 
 		CPU_ZERO_S(size, cpuset);
+		cpu_num = get_cpu_bit(&cpuset_, sizeof(cpuset_), i % nrcpus);
+		//worker[i].futex = calloc(params.nfutexes, sizeof(*worker[i].futex));
 
-		CPU_SET_S(get_cpu_bit(&cpuset_, sizeof(cpuset_), i % nrcpus), size, cpuset);
+		worker[i].futex = numa_alloc_onnode(params.nfutexes * sizeof(*worker[i].futex),
+						    numa_node_of_cpu(cpu_num));
+		if (worker[i].futex == MAP_FAILED || worker[i].futex == NULL)
+			goto errmem;
+
+		CPU_SET_S(cpu_num, size, cpuset);
 
 		ret = pthread_attr_setaffinity_np(&thread_attr, size, cpuset);
 		if (ret) {
@@ -271,7 +278,7 @@ int bench_futex_hash(int argc, const char **argv)
 				       &worker[i].futex[params.nfutexes-1], t);
 		}
 
-		zfree(&worker[i].futex);
+		numa_free(worker[i].futex, params.nfutexes * sizeof(*worker[i].futex));
 	}
 
 	print_summary();
-- 
2.45.2
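
For illustration only, the round-robin pinning described in 8/9 (worker i
goes to the (i % nrcpus)-th CPU of the current affinity mask) could be
sketched as below. This is not the code from the series: the names
nth_allowed_cpu() and cpu_for_worker() are hypothetical, and the bounded
scan with a -1 return is an assumption added here so that the lookup
terminates if the mask holds fewer CPUs than requested.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the CPU number of the n-th (0-based) CPU
 * that is set in the mask, or -1 if the mask has fewer than n + 1 CPUs.
 * The scan is bounded by the number of bits the mask can hold.
 */
static int nth_allowed_cpu(const cpu_set_t *set, size_t set_size, unsigned int n)
{
	size_t max_bits = set_size * 8;
	size_t cpu;

	assert(set != NULL);

	for (cpu = 0; cpu < max_bits; cpu++) {
		if (!CPU_ISSET_S(cpu, set_size, set))
			continue;
		if (n == 0)
			return (int)cpu;
		n--;
	}
	return -1;
}

/*
 * Hypothetical helper: map worker index i onto the allowed CPUs in
 * round-robin order. Returns -1 if the mask is empty.
 */
static int cpu_for_worker(const cpu_set_t *set, size_t set_size, unsigned int i)
{
	int allowed = CPU_COUNT_S(set_size, set);

	if (allowed <= 0)
		return -1;
	return nth_allowed_cpu(set, set_size, i % (unsigned int)allowed);
}
```

A caller would presumably fill the mask with pthread_getaffinity_np(), as
the patch does, and pass the result of cpu_for_worker() to CPU_SET_S() after
checking it for -1.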