[PATCH 0/8] Introduce simple hazard pointers for lockdep

Boqun Feng posted 8 patches 3 months, 2 weeks ago
include/linux/shazptr.h  |  73 +++++++++
kernel/locking/Makefile  |   2 +-
kernel/locking/lockdep.c |  11 +-
kernel/locking/shazptr.c | 318 +++++++++++++++++++++++++++++++++++++++
kernel/rcu/rcuscale.c    |  60 +++++++-
kernel/rcu/refscale.c    |  77 ++++++++++
6 files changed, 534 insertions(+), 7 deletions(-)
create mode 100644 include/linux/shazptr.h
create mode 100644 kernel/locking/shazptr.c
[PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Boqun Feng 3 months, 2 weeks ago
Hi,

This is the official first version of simple hazard pointers following
the RFC:

	https://lore.kernel.org/lkml/20250414060055.341516-1-boqun.feng@gmail.com/

I rebase it onto v6.16-rc3 and hope to get more feedback this time.

Thanks a lot for Breno Leitao to try the RFC out and share the numbers.

I did an extra comparison this time, between the shazptr solution and
the synchronize_rcu_expedited() solution. In my test, during a 100 times
"tc qdisc replace" run:

* IPI rate with the shazptr solution: ~14 per second per core.
* IPI rate with synchronize_rcu_expedited(): ~140 per second per core.

(IPI results were from the 'CAL' line in /proc/interrupt)

This shows that while both solutions have the similar speedup, shazptr
solution avoids the introduce of high IPI rate compared to
synchronize_rcu_expedited().

Feedback is welcome and please let know if there is any concern or
suggestion. Thanks!

Regards,
Boqun

--------------------------------------
Please find the old performance below:

On my system (a 96-cpu VMs), the results of:

	time /usr/sbin/tc qdisc replace dev eth0 root handle 0x1: mq

are (with lockdep enabled):

	(without the patchset)
	real    0m1.039s
	user    0m0.001s
	sys     0m0.069s

	(with the patchset)
	real    0m0.053s
	user    0m0.000s
	sys     0m0.051s

i.e. almost 20x speed-up.

Other comparisons between RCU and shazptr, the rcuscale results (using
default configuration from
tools/testing/selftests/rcutorture/bin/kvm.sh):

RCU:

	Average grace-period duration: 7470.02 microseconds
	Minimum grace-period duration: 3981.6
	50th percentile grace-period duration: 6002.73
	90th percentile grace-period duration: 7008.93
	99th percentile grace-period duration: 10015
	Maximum grace-period duration: 142228

shazptr:

	Average grace-period duration: 0.845825 microseconds
	Minimum grace-period duration: 0.199
	50th percentile grace-period duration: 0.585
	90th percentile grace-period duration: 1.656
	99th percentile grace-period duration: 3.872
	Maximum grace-period duration: 3049.05

shazptr (skip_synchronize_self_scan=1, i.e. always let scan kthread to
wakeup):

	Average grace-period duration: 467.861 microseconds
	Minimum grace-period duration: 92.913
	50th percentile grace-period duration: 440.691
	90th percentile grace-period duration: 460.623
	99th percentile grace-period duration: 650.068
	Maximum grace-period duration: 5775.46

shazptr_wildcard (i.e. readers always use SHAZPTR_WILDCARD):

	Average grace-period duration: 599.569 microseconds
	Minimum grace-period duration: 1.432
	50th percentile grace-period duration: 582.631
	90th percentile grace-period duration: 781.704
	99th percentile grace-period duration: 1160.26
	Maximum grace-period duration: 6727.53

shazptr_wildcard (skip_synchronize_self_scan=1):

	Average grace-period duration: 460.466 microseconds
	Minimum grace-period duration: 303.546
	50th percentile grace-period duration: 424.334
	90th percentile grace-period duration: 482.637
	99th percentile grace-period duration: 600.214
	Maximum grace-period duration: 4126.94

Boqun Feng (8):
  Introduce simple hazard pointers
  shazptr: Add refscale test
  shazptr: Add refscale test for wildcard
  shazptr: Avoid synchronize_shaptr() busy waiting
  shazptr: Allow skip self scan in synchronize_shaptr()
  rcuscale: Allow rcu_scale_ops::get_gp_seq to be NULL
  rcuscale: Add tests for simple hazard pointers
  locking/lockdep: Use shazptr to protect the key hashlist

 include/linux/shazptr.h  |  73 +++++++++
 kernel/locking/Makefile  |   2 +-
 kernel/locking/lockdep.c |  11 +-
 kernel/locking/shazptr.c | 318 +++++++++++++++++++++++++++++++++++++++
 kernel/rcu/rcuscale.c    |  60 +++++++-
 kernel/rcu/refscale.c    |  77 ++++++++++
 6 files changed, 534 insertions(+), 7 deletions(-)
 create mode 100644 include/linux/shazptr.h
 create mode 100644 kernel/locking/shazptr.c

-- 
2.39.5 (Apple Git-154)
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Mathieu Desnoyers 3 months, 2 weeks ago
On 2025-06-24 23:10, Boqun Feng wrote:
> Hi,
> 
> This is the official first version of simple hazard pointers following
> the RFC:
> 
> 	https://lore.kernel.org/lkml/20250414060055.341516-1-boqun.feng@gmail.com/
> 
> I rebase it onto v6.16-rc3 and hope to get more feedback this time.
> 
> Thanks a lot for Breno Leitao to try the RFC out and share the numbers.
> 
> I did an extra comparison this time, between the shazptr solution and
> the synchronize_rcu_expedited() solution. In my test, during a 100 times
> "tc qdisc replace" run:
> 
> * IPI rate with the shazptr solution: ~14 per second per core.
> * IPI rate with synchronize_rcu_expedited(): ~140 per second per core.
> 
> (IPI results were from the 'CAL' line in /proc/interrupt)
> 
> This shows that while both solutions have the similar speedup, shazptr
> solution avoids the introduce of high IPI rate compared to
> synchronize_rcu_expedited().
> 
> Feedback is welcome and please let know if there is any concern or
> suggestion. Thanks!

Hi Boqun,

What is unclear to me is what is the delta wrt:

https://lore.kernel.org/lkml/20241008135034.1982519-4-mathieu.desnoyers@efficios.com/

and whether this helper against compiler optimizations would still be needed here:

https://lore.kernel.org/lkml/20241008135034.1982519-2-mathieu.desnoyers@efficios.com/

Thanks,

Mathieu

> 
> Regards,
> Boqun
> 
> --------------------------------------
> Please find the old performance below:
> 
> On my system (a 96-cpu VMs), the results of:
> 
> 	time /usr/sbin/tc qdisc replace dev eth0 root handle 0x1: mq
> 
> are (with lockdep enabled):
> 
> 	(without the patchset)
> 	real    0m1.039s
> 	user    0m0.001s
> 	sys     0m0.069s
> 
> 	(with the patchset)
> 	real    0m0.053s
> 	user    0m0.000s
> 	sys     0m0.051s
> 
> i.e. almost 20x speed-up.
> 
> Other comparisons between RCU and shazptr, the rcuscale results (using
> default configuration from
> tools/testing/selftests/rcutorture/bin/kvm.sh):
> 
> RCU:
> 
> 	Average grace-period duration: 7470.02 microseconds
> 	Minimum grace-period duration: 3981.6
> 	50th percentile grace-period duration: 6002.73
> 	90th percentile grace-period duration: 7008.93
> 	99th percentile grace-period duration: 10015
> 	Maximum grace-period duration: 142228
> 
> shazptr:
> 
> 	Average grace-period duration: 0.845825 microseconds
> 	Minimum grace-period duration: 0.199
> 	50th percentile grace-period duration: 0.585
> 	90th percentile grace-period duration: 1.656
> 	99th percentile grace-period duration: 3.872
> 	Maximum grace-period duration: 3049.05
> 
> shazptr (skip_synchronize_self_scan=1, i.e. always let scan kthread to
> wakeup):
> 
> 	Average grace-period duration: 467.861 microseconds
> 	Minimum grace-period duration: 92.913
> 	50th percentile grace-period duration: 440.691
> 	90th percentile grace-period duration: 460.623
> 	99th percentile grace-period duration: 650.068
> 	Maximum grace-period duration: 5775.46
> 
> shazptr_wildcard (i.e. readers always use SHAZPTR_WILDCARD):
> 
> 	Average grace-period duration: 599.569 microseconds
> 	Minimum grace-period duration: 1.432
> 	50th percentile grace-period duration: 582.631
> 	90th percentile grace-period duration: 781.704
> 	99th percentile grace-period duration: 1160.26
> 	Maximum grace-period duration: 6727.53
> 
> shazptr_wildcard (skip_synchronize_self_scan=1):
> 
> 	Average grace-period duration: 460.466 microseconds
> 	Minimum grace-period duration: 303.546
> 	50th percentile grace-period duration: 424.334
> 	90th percentile grace-period duration: 482.637
> 	99th percentile grace-period duration: 600.214
> 	Maximum grace-period duration: 4126.94
> 
> Boqun Feng (8):
>    Introduce simple hazard pointers
>    shazptr: Add refscale test
>    shazptr: Add refscale test for wildcard
>    shazptr: Avoid synchronize_shaptr() busy waiting
>    shazptr: Allow skip self scan in synchronize_shaptr()
>    rcuscale: Allow rcu_scale_ops::get_gp_seq to be NULL
>    rcuscale: Add tests for simple hazard pointers
>    locking/lockdep: Use shazptr to protect the key hashlist
> 
>   include/linux/shazptr.h  |  73 +++++++++
>   kernel/locking/Makefile  |   2 +-
>   kernel/locking/lockdep.c |  11 +-
>   kernel/locking/shazptr.c | 318 +++++++++++++++++++++++++++++++++++++++
>   kernel/rcu/rcuscale.c    |  60 +++++++-
>   kernel/rcu/refscale.c    |  77 ++++++++++
>   6 files changed, 534 insertions(+), 7 deletions(-)
>   create mode 100644 include/linux/shazptr.h
>   create mode 100644 kernel/locking/shazptr.c
> 


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Boqun Feng 3 months, 2 weeks ago
On Wed, Jun 25, 2025 at 08:25:52AM -0400, Mathieu Desnoyers wrote:
> On 2025-06-24 23:10, Boqun Feng wrote:
> > Hi,
> > 
> > This is the official first version of simple hazard pointers following
> > the RFC:
> > 
> > 	https://lore.kernel.org/lkml/20250414060055.341516-1-boqun.feng@gmail.com/
> > 
> > I rebase it onto v6.16-rc3 and hope to get more feedback this time.
> > 
> > Thanks a lot for Breno Leitao to try the RFC out and share the numbers.
> > 
> > I did an extra comparison this time, between the shazptr solution and
> > the synchronize_rcu_expedited() solution. In my test, during a 100 times
> > "tc qdisc replace" run:
> > 
> > * IPI rate with the shazptr solution: ~14 per second per core.
> > * IPI rate with synchronize_rcu_expedited(): ~140 per second per core.
> > 
> > (IPI results were from the 'CAL' line in /proc/interrupt)
> > 
> > This shows that while both solutions have the similar speedup, shazptr
> > solution avoids the introduce of high IPI rate compared to
> > synchronize_rcu_expedited().
> > 
> > Feedback is welcome and please let know if there is any concern or
> > suggestion. Thanks!
> 
> Hi Boqun,
> 
> What is unclear to me is what is the delta wrt:
> 
> https://lore.kernel.org/lkml/20241008135034.1982519-4-mathieu.desnoyers@efficios.com/
> 

shazptr is more close to the general hazptr I proposed earlier:

	https://lore.kernel.org/lkml/20240917143402.930114-1-boqun.feng@gmail.com/

, it's aimed as a global facility therefore no hazptr_domain is needed,
plus it supports non-busy waiting synchronize_shazptr() at the
beginning.

> and whether this helper against compiler optimizations would still be needed here:
> 
> https://lore.kernel.org/lkml/20241008135034.1982519-2-mathieu.desnoyers@efficios.com/
> 

For the current user, no, but eventually we are going to need it.

Regards,
Boqun

> Thanks,
> 
> Mathieu
> 
> > 
> > Regards,
> > Boqun
> > 
> > --------------------------------------
> > Please find the old performance below:
> > 
> > On my system (a 96-cpu VMs), the results of:
> > 
> > 	time /usr/sbin/tc qdisc replace dev eth0 root handle 0x1: mq
> > 
> > are (with lockdep enabled):
> > 
> > 	(without the patchset)
> > 	real    0m1.039s
> > 	user    0m0.001s
> > 	sys     0m0.069s
> > 
> > 	(with the patchset)
> > 	real    0m0.053s
> > 	user    0m0.000s
> > 	sys     0m0.051s
> > 
> > i.e. almost 20x speed-up.
> > 
> > Other comparisons between RCU and shazptr, the rcuscale results (using
> > default configuration from
> > tools/testing/selftests/rcutorture/bin/kvm.sh):
> > 
> > RCU:
> > 
> > 	Average grace-period duration: 7470.02 microseconds
> > 	Minimum grace-period duration: 3981.6
> > 	50th percentile grace-period duration: 6002.73
> > 	90th percentile grace-period duration: 7008.93
> > 	99th percentile grace-period duration: 10015
> > 	Maximum grace-period duration: 142228
> > 
> > shazptr:
> > 
> > 	Average grace-period duration: 0.845825 microseconds
> > 	Minimum grace-period duration: 0.199
> > 	50th percentile grace-period duration: 0.585
> > 	90th percentile grace-period duration: 1.656
> > 	99th percentile grace-period duration: 3.872
> > 	Maximum grace-period duration: 3049.05
> > 
> > shazptr (skip_synchronize_self_scan=1, i.e. always let scan kthread to
> > wakeup):
> > 
> > 	Average grace-period duration: 467.861 microseconds
> > 	Minimum grace-period duration: 92.913
> > 	50th percentile grace-period duration: 440.691
> > 	90th percentile grace-period duration: 460.623
> > 	99th percentile grace-period duration: 650.068
> > 	Maximum grace-period duration: 5775.46
> > 
> > shazptr_wildcard (i.e. readers always use SHAZPTR_WILDCARD):
> > 
> > 	Average grace-period duration: 599.569 microseconds
> > 	Minimum grace-period duration: 1.432
> > 	50th percentile grace-period duration: 582.631
> > 	90th percentile grace-period duration: 781.704
> > 	99th percentile grace-period duration: 1160.26
> > 	Maximum grace-period duration: 6727.53
> > 
> > shazptr_wildcard (skip_synchronize_self_scan=1):
> > 
> > 	Average grace-period duration: 460.466 microseconds
> > 	Minimum grace-period duration: 303.546
> > 	50th percentile grace-period duration: 424.334
> > 	90th percentile grace-period duration: 482.637
> > 	99th percentile grace-period duration: 600.214
> > 	Maximum grace-period duration: 4126.94
> > 
> > Boqun Feng (8):
> >    Introduce simple hazard pointers
> >    shazptr: Add refscale test
> >    shazptr: Add refscale test for wildcard
> >    shazptr: Avoid synchronize_shaptr() busy waiting
> >    shazptr: Allow skip self scan in synchronize_shaptr()
> >    rcuscale: Allow rcu_scale_ops::get_gp_seq to be NULL
> >    rcuscale: Add tests for simple hazard pointers
> >    locking/lockdep: Use shazptr to protect the key hashlist
> > 
> >   include/linux/shazptr.h  |  73 +++++++++
> >   kernel/locking/Makefile  |   2 +-
> >   kernel/locking/lockdep.c |  11 +-
> >   kernel/locking/shazptr.c | 318 +++++++++++++++++++++++++++++++++++++++
> >   kernel/rcu/rcuscale.c    |  60 +++++++-
> >   kernel/rcu/refscale.c    |  77 ++++++++++
> >   6 files changed, 534 insertions(+), 7 deletions(-)
> >   create mode 100644 include/linux/shazptr.h
> >   create mode 100644 kernel/locking/shazptr.c
> > 
> 
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> https://www.efficios.com
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Christoph Hellwig 3 months, 2 weeks ago
On Tue, Jun 24, 2025 at 08:10:53PM -0700, Boqun Feng wrote:
> Hi,
> 
> This is the official first version of simple hazard pointers following
> the RFC:

Can you please put an explanation of what hazard pointers are
prominently into this cover letter?
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Boqun Feng 3 months, 2 weeks ago
On Wed, Jun 25, 2025 at 05:05:11AM -0700, Christoph Hellwig wrote:
> On Tue, Jun 24, 2025 at 08:10:53PM -0700, Boqun Feng wrote:
> > Hi,
> > 
> > This is the official first version of simple hazard pointers following
> > the RFC:
> 
> Can you please put an explanation of what hazard pointers are
> prominently into this cover letter?
> 

Sure, I will put one for the future version, here is the gist:


Hazard pointers provide the similar synchronzation behavior as RCU:
readers are cheap, updaters need to wait for existing readers to go
before they can free the objects.

The difference between hazard pointers and RCU is that instead of
waiting for a grace period, which all the readers have to exit the RCU
read-side critical sections, the updaters of hazard pointers only need
to wait for the readers that are accessing the objects they are about to
free. For example, if we have 2 readers accessing different objects and
1 updater is freeing one of them:

using RCU:

	Reader 1		Reader 2		Updater
	========		========		=======
	rcu_read_lock();
	r = rcu_dereference(a);
				rcu_read_lock();
				r = rcu_dereference(b);
							synchronize_rcu();
	rcu_read_unlock();
				rcu_read_unlock();
							<synchronize_rcu() returns>
							free(a);

The updater will need to wait for reader 2 to finish before it can
free 'a', however when using hazard pointers:

	Reader 1		Reader 2		Updater
	========		========		=======
	g = shazptr_acquire(a);
				g = shazptr_acqurie(b);
							synchronize_shazptr(a);
	shazptr_clear(g);
							<synchronize_shazptr(a) returns>
							free(a);

				shazptr_clear(g); // <- updater doesn't
						  //    need to wait for
						  //    this.

The updater's wait can finish immediately if no one is accessing 'a', in
other words it doesn't need to wait for reader 2.

This means for a particular workload, hazard pointers may have smaller
memory footprint and less updater wait time compared to RCU, while still
have the similar performance on the reader side.


That being said, it does come with some cost, the readers would need to
provide their own hazard pointer slots (allocating memory) in general
cases. And in the simple hazard pointer implementation in this series,
although readers don't need to provide their own hazard pointer slots,
they need to disable the preemption to use the hazard pointer, and the
performance would downgrade (to a naive SRCU implementation probably) if
they want to protect multiple objects in one read-side critical section.


Regards,
Boqun
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Christoph Hellwig 3 months, 2 weeks ago
On Wed, Jun 25, 2025 at 07:08:57AM -0700, Boqun Feng wrote:
> Sure, I will put one for the future version, here is the gist:

Thanks a lot!

> The updater's wait can finish immediately if no one is accessing 'a', in
> other words it doesn't need to wait for reader 2.

So basically it is the RCU concept, but limited to protecting exactly
one pointer update per critical section with no ability for the read
to e.g. acquire a refcount on the objected pointed to by that pointer?
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Boqun Feng 3 months, 2 weeks ago
On Thu, Jun 26, 2025 at 03:16:49AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 25, 2025 at 07:08:57AM -0700, Boqun Feng wrote:
> > Sure, I will put one for the future version, here is the gist:
> 
> Thanks a lot!
> 
> > The updater's wait can finish immediately if no one is accessing 'a', in
> > other words it doesn't need to wait for reader 2.
> 
> So basically it is the RCU concept, but limited to protecting exactly
> one pointer update per critical section with no ability for the read
> to e.g. acquire a refcount on the objected pointed to by that pointer?

For the current simple hazard pointer, yes. But simple hazard pointers
is easily to extend so support reading:

	{ gp is a global pointer }

	Reader				Updater
	======				=======
	g = shazptr_acquire(p):
	      WRITE_ONCE(*this_cpu_ptr(slot), gp);
	      smp_mb();
	
	if (READ_ONCE(gp) == *this_cpu_ptr(slot)) {
	    // still being protected.
	    <can read gp here>
					to_free = READ_ONCE(gp);
					WRITE_ONCE(gp, new);
					synchronize_shazptr(to_free):
					  smp_mb();
					  // wait on the slot of reader
					  // CPU being 0.
					  READ_ONCE(per_cpu(reader, slot));
	}

	shazptr_clear(g):
	  WRITE_ONCE(*this_cpu_ptr(slot), NULL); // unblock synchronize_shazptr()


Usually the shazptr_acqurie() + "pointer comparison"* is called
shazptr_try_protect().

I will add a document about this in the next version along with other
bits of hazard pointers.

[*]: The pointer comparison is more complicated topic, but Mathieu has
     figured out how to do it correctly:

     https://lore.kernel.org/lkml/20241008135034.1982519-2-mathieu.desnoyers@efficios.com/

Regards,
Boqun
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Paul E. McKenney 3 months, 1 week ago
On Thu, Jun 26, 2025 at 08:47:25AM -0700, Boqun Feng wrote:
> On Thu, Jun 26, 2025 at 03:16:49AM -0700, Christoph Hellwig wrote:
> > On Wed, Jun 25, 2025 at 07:08:57AM -0700, Boqun Feng wrote:
> > > Sure, I will put one for the future version, here is the gist:
> > 
> > Thanks a lot!
> > 
> > > The updater's wait can finish immediately if no one is accessing 'a', in
> > > other words it doesn't need to wait for reader 2.
> > 
> > So basically it is the RCU concept, but limited to protecting exactly
> > one pointer update per critical section with no ability for the read
> > to e.g. acquire a refcount on the objected pointed to by that pointer?
> 
> For the current simple hazard pointer, yes. But simple hazard pointers
> is easily to extend so support reading:
> 
> 	{ gp is a global pointer }
> 
> 	Reader				Updater
> 	======				=======
> 	g = shazptr_acquire(p):
> 	      WRITE_ONCE(*this_cpu_ptr(slot), gp);
> 	      smp_mb();
> 	
> 	if (READ_ONCE(gp) == *this_cpu_ptr(slot)) {
> 	    // still being protected.
> 	    <can read gp here>
> 					to_free = READ_ONCE(gp);
> 					WRITE_ONCE(gp, new);
> 					synchronize_shazptr(to_free):
> 					  smp_mb();
> 					  // wait on the slot of reader
> 					  // CPU being 0.
> 					  READ_ONCE(per_cpu(reader, slot));
> 	}
> 
> 	shazptr_clear(g):
> 	  WRITE_ONCE(*this_cpu_ptr(slot), NULL); // unblock synchronize_shazptr()
> 
> 
> Usually the shazptr_acqurie() + "pointer comparison"* is called
> shazptr_try_protect().
> 
> I will add a document about this in the next version along with other
> bits of hazard pointers.
> 
> [*]: The pointer comparison is more complicated topic, but Mathieu has
>      figured out how to do it correctly:
> 
>      https://lore.kernel.org/lkml/20241008135034.1982519-2-mathieu.desnoyers@efficios.com/

It might be helpful to add that, at a high level, hazard pointers
are a scalable replacement for reference counting.  At a similarly
high level, RCU is a scalable replacement for reader-writer locking.
At lower levels, there is considerable overlap in applicability, so that
you can use RCU to replace many reference-counting use cases and hazard
pointers to replace many reader-writer-locking use cases..

Plus, as both Mathieu and Boqun pointed out, both RCU and hazard pointers
can be combined with other synchronization mechanisms, including each
other.

							Thanx, Paul
Re: [PATCH 0/8] Introduce simple hazard pointers for lockdep
Posted by Mathieu Desnoyers 3 months, 2 weeks ago
On 2025-06-26 06:16, Christoph Hellwig wrote:
> On Wed, Jun 25, 2025 at 07:08:57AM -0700, Boqun Feng wrote:
>> Sure, I will put one for the future version, here is the gist:
> 
> Thanks a lot!
> 
>> The updater's wait can finish immediately if no one is accessing 'a', in
>> other words it doesn't need to wait for reader 2.
> 
> So basically it is the RCU concept, but limited to protecting exactly
> one pointer update per critical section with no ability for the read
> to e.g. acquire a refcount on the objected pointed to by that pointer?

FWIW, hazard pointers can be chained with other existence guarantee
mechanisms. I've done prototypes that use hazard pointers chained
with a reference counter in the object to implement something similar to
smart pointers. Let me know if you are interested in the details.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com