[v1] RE: [PATCH 0/5] sched: Lazy preemption muck

Posted by David Laight 1 year, 3 months ago
From: Thomas Gleixner
> Sent: 13 October 2024 20:02
> 
> On Thu, Oct 10 2024 at 10:23, David Laight wrote:
> > ...
> >> And once all the problems with LAZY are sorted then this cond_resched()
> >> line just goes away and the loop looks like this:
> >>
> >>     while ($cond) {
> >>           spin_lock(L);
> >>           do_stuff();
> >>           spin_unlock(L);
> >>     }
> >
> > The problem with that pattern is the cost of the atomics.
> > They can easily be significant, especially if there are
> > a lot of iterations and do_stuff() is cheap.
> >
> > If $cond needs the lock, the code is really:
> > 	spin_lock(L);
> > 	while ($cond) {
> > 		do_stuff();
> > 		spin_unlock(L);
> > 		spin_lock(L);
> > 	}
> > 	spin_unlock(L);
> >
> > which makes it even more obvious that you need a cheap
> > test to optimise away the unlock/lock pair.
> 
> You cannot optimize the unlock/lock pair away for a large number of
> iterations because then you bring back the problem of extended
> latencies.
> 
> It does not matter whether $cond is cheap and do_stuff() is cheap. If
> you have enough iterations then even a cheap do_stuff() causes massive
> latencies, unless you keep the horrible cond_resched() mess, which we
> are trying to remove.

While cond_resched() can probably go, you need a cheap need_resched()
so the loop above can contain:
		if (need_resched()) {
			spin_unlock(L);
			spin_lock(L);
		}
to avoid the atomics when both $cond and do_stuff() are cheap
but there are a lot of iterations.
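
A minimal userspace sketch of that loop, to make the cost argument
concrete. It is not kernel code: a pthread mutex stands in for the
spinlock, and need_resched_sketch(), resched_requested, lock_breaks and
sum_items() are hypothetical names invented for illustration. The sketch
also clears the flag itself, which is an assumption; in the kernel the
flag is managed by the scheduler.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for need_resched(): set by whoever wants the
 * lock holder to give way. */
static atomic_bool resched_requested;

static bool need_resched_sketch(void)
{
	/* A relaxed load is a plain read, not an atomic read-modify-write. */
	return atomic_load_explicit(&resched_requested, memory_order_relaxed);
}

static pthread_mutex_t L = PTHREAD_MUTEX_INITIALIZER;
static unsigned long lock_breaks;	/* unlock/lock pairs taken, protected by L */

/* Sum n items under the lock, dropping it only when asked to. */
static unsigned long sum_items(const unsigned int *items, size_t n)
{
	unsigned long sum = 0;

	assert(items != NULL);

	pthread_mutex_lock(&L);
	for (size_t i = 0; i < n; i++) {
		sum += items[i];	/* the cheap do_stuff() */

		if (need_resched_sketch()) {
			/* Assumption: the sketch acknowledges the request itself. */
			atomic_store_explicit(&resched_requested, false,
					      memory_order_relaxed);
			lock_breaks++;
			pthread_mutex_unlock(&L);
			pthread_mutex_lock(&L);
		}
	}
	pthread_mutex_unlock(&L);
	return sum;
}
```

With the flag clear, each iteration costs one extra load and a
predictable branch; the unlock/lock pair is only paid when something
has actually asked for the lock holder to yield.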

There will also be cases where it isn't anywhere near as simple
as unlock/lock (e.g. traversing a linked list) because additional
code is needed to ensure the loop can be continued.
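
As a rough illustration of the linked-list case, here is one possible
way to make such a loop resumable, again as a userspace sketch rather
than kernel code. It assumes the list is kept sorted by a unique key, so
that after a lock break the walk can be restarted from the head and
skip everything already handled. struct node, first_after(),
visit_all() and list_resched_requested are all hypothetical names, and
other schemes (reference counts, cursors) would work as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	unsigned int key;	/* assumed unique, list sorted ascending */
	unsigned int visits;	/* what do_stuff() updates in this sketch */
	struct node *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool list_resched_requested;	/* stand-in for need_resched() */

/* First node with key > after, or NULL. Caller holds list_lock. */
static struct node *first_after(struct node *head, unsigned int after)
{
	while (head && head->key <= after)
		head = head->next;
	return head;
}

/* Visit every node once, breaking the lock only on request. */
static unsigned int visit_all(struct node **headp)
{
	unsigned int visited = 0;
	struct node *n;

	assert(headp != NULL);

	pthread_mutex_lock(&list_lock);
	n = *headp;
	while (n) {
		unsigned int key = n->key;

		n->visits++;		/* the cheap do_stuff() */
		visited++;

		if (atomic_load_explicit(&list_resched_requested,
					 memory_order_relaxed)) {
			atomic_store_explicit(&list_resched_requested, false,
					      memory_order_relaxed);
			pthread_mutex_unlock(&list_lock);
			/* n may have been unlinked or freed by now. */
			pthread_mutex_lock(&list_lock);
			/* Resume by key instead of trusting the stale pointer. */
			n = first_after(*headp, key);
		} else {
			n = n->next;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return visited;
}
```

The resume step is a walk from the head, so it is only worth paying
when the lock really has to be dropped, which is the point of gating
it on a cheap test.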

> What you are proposing is a programming antipattern and the lock/unlock
> around do_stuff() in the clean loop I outlined is mostly free when there
> is no contention, unless you use a pointless micro benchmark which has
> an empty (or almost empty) do_stuff() implementation. We are not
> optimizing for completely irrelevant theoretical nonsense.

Aren't you adding an extra pair of atomics on every iteration?
That is going to be noticeable.
Never mind the cases where it isn't that simple.

	David

> 
> Thanks,
> 
>         tglx
> 
