From nobody Fri Dec 26 19:32:49 2025
Received: from eu-smtp-delivery-151.mimecast.com (eu-smtp-delivery-151.mimecast.com [185.58.85.151])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA524BE48
	for ; Sun, 31 Dec 2023 21:51:44 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=ACULAB.COM
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=aculab.com
Received: from AcuMS.aculab.com (156.67.243.121 [156.67.243.121])
	by relay.mimecast.com with ESMTP with both STARTTLS and AUTH
	(version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384)
	id uk-mta-176-11hdVQWzMwqIR5R8bjWLew-1; Sun, 31 Dec 2023 21:51:41 +0000
X-MC-Unique: 11hdVQWzMwqIR5R8bjWLew-1
Received: from AcuMS.Aculab.com (10.202.163.6) by AcuMS.aculab.com (10.202.163.6)
	with Microsoft SMTP Server (TLS) id 15.0.1497.48; Sun, 31 Dec 2023 21:51:20 +0000
Received: from AcuMS.Aculab.com ([::1]) by AcuMS.aculab.com ([::1])
	with mapi id 15.00.1497.048; Sun, 31 Dec 2023 21:51:20 +0000
From: David Laight
To: "'linux-kernel@vger.kernel.org'", "'peterz@infradead.org'", "'longman@redhat.com'"
CC: "'mingo@redhat.com'", "'will@kernel.org'", "'boqun.feng@gmail.com'",
	"'Linus Torvalds'", "'virtualization@lists.linux-foundation.org'", 'Zeng Heng'
Subject: [PATCH next v2 1/5] locking/osq_lock: Defer clearing node->locked until the slow osq_lock() path.
Thread-Topic: [PATCH next v2 1/5] locking/osq_lock: Defer clearing node->locked until the slow osq_lock() path.
Thread-Index: Ado8M4Xvt5eQHXh8TpKpithIgw++9g==
Date: Sun, 31 Dec 2023 21:51:20 +0000
Message-ID: <714ca2e587cf4cd485ae04e5afb8d5bb@AcuMS.aculab.com>
References: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
In-Reply-To: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
Accept-Language: en-GB, en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-ms-exchange-transport-fromentityheader: Hosted
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: aculab.com
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Since node->locked cannot be set before the assignment to prev->next
it is safe to clear it in the slow path.

Signed-off-by: David Laight
Reviewed-by: Waiman Long
---
 kernel/locking/osq_lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 75a6f6133866..e0bc74d85a76 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -97,7 +97,6 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	int curr = encode_cpu(smp_processor_id());
 	int old;
 
-	node->locked = 0;
 	node->next = NULL;
 	node->cpu = curr;
 
@@ -111,6 +110,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	if (old == OSQ_UNLOCKED_VAL)
 		return true;
 
+	node->locked = 0;
 	prev = decode_cpu(old);
 	node->prev = prev;
 
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

From nobody Fri Dec 26 19:32:49 2025
Received: from eu-smtp-delivery-151.mimecast.com (eu-smtp-delivery-151.mimecast.com [185.58.86.151])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5A62CC8CA
	for ; Sun, 31 Dec 2023 21:53:17 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=ACULAB.COM
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=aculab.com
Received: from AcuMS.aculab.com (156.67.243.121 [156.67.243.121])
	by relay.mimecast.com with ESMTP with both STARTTLS and AUTH
	(version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384)
	id uk-mta-29-UXCaayHdPeixgRLFCuK6JQ-1; Sun, 31 Dec 2023 21:53:13 +0000
X-MC-Unique: UXCaayHdPeixgRLFCuK6JQ-1
Received: from AcuMS.Aculab.com (10.202.163.6) by AcuMS.aculab.com (10.202.163.6)
	with Microsoft SMTP Server (TLS) id 15.0.1497.48; Sun, 31 Dec 2023 21:52:51 +0000
Received: from AcuMS.Aculab.com ([::1]) by AcuMS.aculab.com ([::1])
	with mapi id 15.00.1497.048; Sun, 31 Dec 2023 21:52:51 +0000
From: David Laight
To: "'linux-kernel@vger.kernel.org'", "'peterz@infradead.org'", "'longman@redhat.com'"
CC: "'mingo@redhat.com'", "'will@kernel.org'", "'boqun.feng@gmail.com'",
	"'Linus Torvalds'", "'virtualization@lists.linux-foundation.org'", 'Zeng Heng'
Subject: [PATCH next v2 2/5] locking/osq_lock: Optimise the vcpu_is_preempted() check.
Thread-Topic: [PATCH next v2 2/5] locking/osq_lock: Optimise the vcpu_is_preempted() check.
Thread-Index: Ado8M69TGuBFtEJaSr+qwMS7CJLhkw==
Date: Sun, 31 Dec 2023 21:52:51 +0000
Message-ID: <3a9d1782cd50436c99ced8c10175bae6@AcuMS.aculab.com>
References: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
In-Reply-To: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
Accept-Language: en-GB, en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-ms-exchange-transport-fromentityheader: Hosted
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: aculab.com
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The vcpu_is_preempted() test stops osq_lock() spinning if a virtual
cpu is no longer running.
Although patched out for bare-metal the code still needs the cpu number.
Reading this from 'prev->cpu' is pretty much guaranteed to have a cache miss
when osq_unlock() is waking up the next cpu.

Instead save 'prev->cpu' in 'node->prev_cpu' and use that value.
Update it in the osq_lock() 'unqueue' path when 'node->prev' is changed.

This is simpler than checking for 'node->prev' changing and caching
'prev->cpu'.

Signed-off-by: David Laight
Reviewed-by: Waiman Long
---
 kernel/locking/osq_lock.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index e0bc74d85a76..eb8a6dfdb79d 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -14,8 +14,9 @@
 
 struct optimistic_spin_node {
 	struct optimistic_spin_node *next, *prev;
-	int locked; /* 1 if lock acquired */
-	int cpu; /* encoded CPU # + 1 value */
+	int locked;		/* 1 if lock acquired */
+	int cpu;		/* encoded CPU # + 1 value */
+	int prev_cpu;		/* encoded CPU # + 1 value */
 };
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct optimistic_spin_node, osq_node);
@@ -29,11 +30,6 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
-static inline int node_cpu(struct optimistic_spin_node *node)
-{
-	return node->cpu - 1;
-}
-
 static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
 {
 	int cpu_nr = encoded_cpu_val - 1;
@@ -110,9 +106,10 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	if (old == OSQ_UNLOCKED_VAL)
 		return true;
 
-	node->locked = 0;
+	node->prev_cpu = old;
 	prev = decode_cpu(old);
 	node->prev = prev;
+	node->locked = 0;
 
 	/*
 	 * osq_lock() unqueue
@@ -144,7 +141,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * polling, be careful.
 	 */
 	if (smp_cond_load_relaxed(&node->locked, VAL || need_resched() ||
-				  vcpu_is_preempted(node_cpu(node->prev))))
+				  vcpu_is_preempted(READ_ONCE(node->prev_cpu) - 1)))
 		return true;
 
 	/* unqueue */
@@ -201,6 +198,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * it will wait in Step-A.
 	 */
 
+	WRITE_ONCE(next->prev_cpu, prev->cpu);
 	WRITE_ONCE(next->prev, prev);
 	WRITE_ONCE(prev->next, next);
 
-- 
2.17.1

From nobody Fri Dec 26 19:32:49 2025
Received: from eu-smtp-delivery-151.mimecast.com (eu-smtp-delivery-151.mimecast.com [185.58.85.151])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id ABEC9DF42
	for ; Sun, 31 Dec 2023 21:54:27 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=ACULAB.COM
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=aculab.com
Received: from AcuMS.aculab.com (156.67.243.121 [156.67.243.121])
	by relay.mimecast.com with ESMTP with both STARTTLS and AUTH
	(version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384)
	id uk-mta-254-BKaLleaNN2m97ul2XwiUmA-1; Sun, 31 Dec 2023 21:54:24 +0000
X-MC-Unique: BKaLleaNN2m97ul2XwiUmA-1
Received: from AcuMS.Aculab.com (10.202.163.6) by AcuMS.aculab.com (10.202.163.6)
	with Microsoft SMTP Server (TLS) id 15.0.1497.48; Sun, 31 Dec 2023 21:54:02 +0000
Received: from AcuMS.Aculab.com ([::1]) by AcuMS.aculab.com ([::1])
	with mapi id 15.00.1497.048; Sun, 31 Dec 2023 21:54:02 +0000
From: David Laight
To: "'linux-kernel@vger.kernel.org'", "'peterz@infradead.org'", "'longman@redhat.com'"
CC: "'mingo@redhat.com'", "'will@kernel.org'", "'boqun.feng@gmail.com'",
	"'Linus Torvalds'", "'virtualization@lists.linux-foundation.org'", 'Zeng Heng'
Subject: [PATCH next v2 3/5] locking/osq_lock: Use node->prev_cpu instead of saving node->prev.
Thread-Topic: [PATCH next v2 3/5] locking/osq_lock: Use node->prev_cpu instead of saving node->prev.
Thread-Index: Ado8M93l5L/nOR+URXCdpqS8EUheiQ==
Date: Sun, 31 Dec 2023 21:54:02 +0000
Message-ID: <7906aaa73f93493c873e6286c1f96645@AcuMS.aculab.com>
References: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
In-Reply-To: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
Accept-Language: en-GB, en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-ms-exchange-transport-fromentityheader: Hosted
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: aculab.com
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

node->prev is only used to update 'prev' in the unlikely case
of concurrent unqueues.
This can be replaced by a check for node->prev_cpu changing
and then calling decode_cpu() to get the changed 'prev' pointer.

node->cpu (or, more particularly, prev->cpu) is only used for the
osq_wait_next() call in the unqueue path.
Normally this is exactly the value that the initial xchg() read
from lock->tail (used to obtain 'prev'), but it can be updated by
concurrent unqueues.

Both the 'prev' and 'cpu' members of optimistic_spin_node are
now unused and can be deleted.
Signed-off-by: David Laight
Reviewed-by: Waiman Long
---
 kernel/locking/osq_lock.c | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index eb8a6dfdb79d..27324b509f68 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -13,9 +13,8 @@
  */
 
 struct optimistic_spin_node {
-	struct optimistic_spin_node *next, *prev;
+	struct optimistic_spin_node *next;
 	int locked;		/* 1 if lock acquired */
-	int cpu;		/* encoded CPU # + 1 value */
 	int prev_cpu;		/* encoded CPU # + 1 value */
 };
 
@@ -91,10 +90,9 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
 	struct optimistic_spin_node *prev, *next;
 	int curr = encode_cpu(smp_processor_id());
-	int old;
+	int prev_cpu;
 
 	node->next = NULL;
-	node->cpu = curr;
 
 	/*
 	 * We need both ACQUIRE (pairs with corresponding RELEASE in
@@ -102,13 +100,12 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * the node fields we just initialised) semantics when updating
 	 * the lock tail.
 	 */
-	old = atomic_xchg(&lock->tail, curr);
-	if (old == OSQ_UNLOCKED_VAL)
+	prev_cpu = atomic_xchg(&lock->tail, curr);
+	if (prev_cpu == OSQ_UNLOCKED_VAL)
 		return true;
 
-	node->prev_cpu = old;
-	prev = decode_cpu(old);
-	node->prev = prev;
+	node->prev_cpu = prev_cpu;
+	prev = decode_cpu(prev_cpu);
 	node->locked = 0;
 
 	/*
@@ -174,9 +171,16 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 
 		/*
 		 * Or we race against a concurrent unqueue()'s step-B, in which
-		 * case its step-C will write us a new @node->prev pointer.
+		 * case its step-C will write us a new @node->prev_cpu value.
 		 */
-		prev = READ_ONCE(node->prev);
+		{
+			int new_prev_cpu = READ_ONCE(node->prev_cpu);
+
+			if (new_prev_cpu == prev_cpu)
+				continue;
+			prev_cpu = new_prev_cpu;
+			prev = decode_cpu(prev_cpu);
+		}
 	}
 
 	/*
@@ -186,7 +190,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * back to @prev.
 	 */
 
-	next = osq_wait_next(lock, node, prev->cpu);
+	next = osq_wait_next(lock, node, prev_cpu);
 	if (!next)
 		return false;
 
@@ -198,8 +202,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	 * it will wait in Step-A.
 	 */
 
-	WRITE_ONCE(next->prev_cpu, prev->cpu);
-	WRITE_ONCE(next->prev, prev);
+	WRITE_ONCE(next->prev_cpu, prev_cpu);
 	WRITE_ONCE(prev->next, next);
 
 	return false;
-- 
2.17.1

From nobody Fri Dec 26 19:32:49 2025
Received: from eu-smtp-delivery-151.mimecast.com (eu-smtp-delivery-151.mimecast.com [185.58.85.151])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 48BC1F9CC
	for ; Sun, 31 Dec 2023 21:55:23 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=ACULAB.COM
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=aculab.com
Received: from AcuMS.aculab.com (156.67.243.121 [156.67.243.121])
	by relay.mimecast.com with ESMTP with both STARTTLS and AUTH
	(version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384)
	id uk-mta-44-6gjeuhImOKioWY8extJLBA-1; Sun, 31 Dec 2023 21:55:20 +0000
X-MC-Unique: 6gjeuhImOKioWY8extJLBA-1
Received: from AcuMS.Aculab.com (10.202.163.6) by AcuMS.aculab.com (10.202.163.6)
	with Microsoft SMTP Server (TLS) id 15.0.1497.48; Sun, 31 Dec 2023 21:54:59 +0000
Received: from AcuMS.Aculab.com ([::1]) by AcuMS.aculab.com ([::1])
	with mapi id 15.00.1497.048; Sun, 31 Dec 2023 21:54:59 +0000
From: David Laight
To: "'linux-kernel@vger.kernel.org'", "'peterz@infradead.org'", "'longman@redhat.com'"
CC: "'mingo@redhat.com'", "'will@kernel.org'", "'boqun.feng@gmail.com'",
	"'Linus Torvalds'", "'virtualization@lists.linux-foundation.org'", 'Zeng Heng'
Subject: [PATCH next v2 4/5] locking/osq_lock: Avoid writing to node->next in the osq_lock() fast path.
Thread-Topic: [PATCH next v2 4/5] locking/osq_lock: Avoid writing to node->next in the osq_lock() fast path.
Thread-Index: Ado8NAjQtRL812H3R1Kc4G+FOscjCQ==
Date: Sun, 31 Dec 2023 21:54:59 +0000
Message-ID: <06a11b2c7d784f2d80dc8e81c7175c57@AcuMS.aculab.com>
References: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
In-Reply-To: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
Accept-Language: en-GB, en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-ms-exchange-transport-fromentityheader: Hosted
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: aculab.com
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

When osq_lock() returns false or osq_unlock() returns, static
analysis shows that node->next should always be NULL.
This means that it isn't necessary to explicitly set it to NULL
prior to atomic_xchg(&lock->tail, curr) on entry to osq_lock().

Just in case there is a non-obvious race condition that can leave it
non-NULL, check with WARN_ON_ONCE() and set it to NULL if it is set.
Note that without this check the fast path (adding at the list head)
doesn't need to access the per-cpu osq_node at all.
Signed-off-by: David Laight
Reviewed-by: Waiman Long
---
 kernel/locking/osq_lock.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 27324b509f68..35bb99e96697 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -87,12 +87,17 @@ osq_wait_next(struct optimistic_spin_queue *lock,
 
 bool osq_lock(struct optimistic_spin_queue *lock)
 {
-	struct optimistic_spin_node *node = this_cpu_ptr(&osq_node);
-	struct optimistic_spin_node *prev, *next;
+	struct optimistic_spin_node *node, *prev, *next;
 	int curr = encode_cpu(smp_processor_id());
 	int prev_cpu;
 
-	node->next = NULL;
+	/*
+	 * node->next should be NULL on entry.
+	 * Check just in case there is a race somewhere.
+	 * Note that this is probably an unnecessary cache miss in the fast path.
+	 */
+	if (WARN_ON_ONCE(raw_cpu_read(osq_node.next) != NULL))
+		raw_cpu_write(osq_node.next, NULL);
 
 	/*
 	 * We need both ACQUIRE (pairs with corresponding RELEASE in
@@ -104,8 +109,9 @@ bool osq_lock(struct optimistic_spin_queue *lock)
 	if (prev_cpu == OSQ_UNLOCKED_VAL)
 		return true;
 
-	node->prev_cpu = prev_cpu;
+	node = this_cpu_ptr(&osq_node);
 	prev = decode_cpu(prev_cpu);
+	node->prev_cpu = prev_cpu;
 	node->locked = 0;
 
 	/*
-- 
2.17.1

From nobody Fri Dec 26 19:32:49 2025
Received: from eu-smtp-delivery-151.mimecast.com (eu-smtp-delivery-151.mimecast.com [185.58.86.151])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 84870FBE5
	for ; Sun, 31 Dec 2023 21:56:15 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=ACULAB.COM
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=aculab.com
Received: from AcuMS.aculab.com (156.67.243.121 [156.67.243.121])
	by relay.mimecast.com with ESMTP with both STARTTLS and AUTH
	(version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384)
	id uk-mta-169-VmXxqLQzMMGaMcPZew1zOA-1; Sun, 31 Dec 2023 21:56:12 +0000
X-MC-Unique: VmXxqLQzMMGaMcPZew1zOA-1
Received: from AcuMS.Aculab.com (10.202.163.6) by AcuMS.aculab.com (10.202.163.6)
	with Microsoft SMTP Server (TLS) id 15.0.1497.48; Sun, 31 Dec 2023 21:55:50 +0000
Received: from AcuMS.Aculab.com ([::1]) by AcuMS.aculab.com ([::1])
	with mapi id 15.00.1497.048; Sun, 31 Dec 2023 21:55:50 +0000
From: David Laight
To: "'linux-kernel@vger.kernel.org'", "'peterz@infradead.org'", "'longman@redhat.com'"
CC: "'mingo@redhat.com'", "'will@kernel.org'", "'boqun.feng@gmail.com'",
	"'Linus Torvalds'", "'virtualization@lists.linux-foundation.org'", 'Zeng Heng'
Subject: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().
Thread-Topic: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().
Thread-Index: Ado8NCf0vtha6NqURtGgfE7//QxHew==
Date: Sun, 31 Dec 2023 21:55:50 +0000
Message-ID: <7c1148fe64fb46a7a81c984776cd91df@AcuMS.aculab.com>
References: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
In-Reply-To: <2b4e8a5816a742d2bd23fdbaa8498e80@AcuMS.aculab.com>
Accept-Language: en-GB, en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-ms-exchange-transport-fromentityheader: Hosted
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: aculab.com
Content-Language: en-US
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
This requires the cpu number be 64bit.
However the value in osq_lock() comes from a 32bit xchg() and there
isn't a way of telling gcc the high bits are zero (they are) so
there will always be an instruction to clear the high bits.

The cpu number is also offset by one (to make the initialiser 0).
It seems to be impossible to get gcc to convert __per_cpu_offset[cpu_p1 - 1]
into (__per_cpu_offset - 1)[cpu_p1] (transferring the offset to the address).

Converting the cpu number to 32bit unsigned prior to the decrement means
that gcc knows the decrement has set the high bits to zero and doesn't
add a register-register move (or cltq) to zero/sign extend the value.

Not massive but saves two instructions.

Signed-off-by: David Laight
Reviewed-by: Waiman Long
---
 kernel/locking/osq_lock.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
index 35bb99e96697..37a4fa872989 100644
--- a/kernel/locking/osq_lock.c
+++ b/kernel/locking/osq_lock.c
@@ -29,11 +29,9 @@ static inline int encode_cpu(int cpu_nr)
 	return cpu_nr + 1;
 }
 
-static inline struct optimistic_spin_node *decode_cpu(int encoded_cpu_val)
+static inline struct optimistic_spin_node *decode_cpu(unsigned int encoded_cpu_val)
 {
-	int cpu_nr = encoded_cpu_val - 1;
-
-	return per_cpu_ptr(&osq_node, cpu_nr);
+	return per_cpu_ptr(&osq_node, encoded_cpu_val - 1);
 }
 
 /*
-- 
2.17.1
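
For readers who want to experiment with the indexing change in patch 5/5 outside the kernel, the following is a minimal user-space sketch, not the kernel code. DEMO_NR_CPUS, demo_per_cpu_offset[] and the two decode_offset_*() helpers are hypothetical stand-ins for __per_cpu_offset[] and decode_cpu(); only encode_cpu() mirrors a function from the series. Both helpers return the same value for any valid encoded cpu, so any difference is confined to the generated instructions, which can be compared with something like "gcc -O2 -S".

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's __per_cpu_offset[] table. */
#define DEMO_NR_CPUS 4
static const unsigned long demo_per_cpu_offset[DEMO_NR_CPUS] = {
	0x1000UL, 0x2000UL, 0x3000UL, 0x4000UL
};

/* As in the series: cpu numbers are stored offset by one, so 0 means "unlocked". */
static inline int encode_cpu(int cpu_nr)
{
	return cpu_nr + 1;
}

/*
 * Shape of decode_cpu() before patch 5/5: the index is a signed int, so a
 * 64-bit target may need an explicit sign extension before the array lookup.
 */
static inline unsigned long decode_offset_signed(int encoded_cpu_val)
{
	int cpu_nr = encoded_cpu_val - 1;

	return demo_per_cpu_offset[cpu_nr];
}

/*
 * Shape of decode_cpu() after patch 5/5: the decrement is done on an
 * unsigned 32-bit value, so the index is already zero-extended.
 * Callers must not pass 0 (the unlocked value); the index would wrap.
 */
static inline unsigned long decode_offset_unsigned(unsigned int encoded_cpu_val)
{
	return demo_per_cpu_offset[encoded_cpu_val - 1];
}
```

Calling decode_offset_unsigned(encode_cpu(2)) looks up entry 2 of the demo table, exactly as decode_offset_signed() does. Whether the two-instruction saving described in the commit message shows up depends on the compiler version and target, so treat the comparison as an experiment rather than a guarantee.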