[PATCH v4 0/1] eventpoll: Fix priority inversion problem

Nam Cao posted 1 patch 2 months, 3 weeks ago
fs/eventpoll.c | 139 +++++++++----------------------------------------
1 file changed, 26 insertions(+), 113 deletions(-)
[PATCH v4 0/1] eventpoll: Fix priority inversion problem
Posted by Nam Cao 2 months, 3 weeks ago
Hi,

This v4 is the follow-up to v3 at:
https://lore.kernel.org/linux-fsdevel/20250527090836.1290532-1-namcao@linutronix.de/
which resolves a priority inversion problem.

The v3 patch was merged, but then got reverted due to a regression.

The direction of v3 was wrong in the first place. It changed the
eventpoll's event list to be lockless, making the code harder to read. I
stared at the patch again, but still couldn't figure out what the bug is.

The performance numbers with the lockless list were indeed impressive, but
they come from a benchmark, and it is unclear whether that benchmark
reflects real workloads.

This v4 takes a completely different approach: it converts the rwlock to a
spinlock. Unfortunately, unlike an rwlock, a spinlock does not allow
concurrent readers, so this patch reduces the performance numbers.
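
To illustrate the trade-off, here is a minimal userspace sketch, not the
kernel code. It assumes a hypothetical fixed-size ready list (struct
ready_list, ready_list_add(), ready_list_drain() are made-up names) guarded
by a simple C11 atomic_flag spinlock. Every caller takes the same exclusive
lock, so there is no shared "reader" mode in which several callers could
make progress at once:

```c
#include <stdatomic.h>
#include <stddef.h>

#define READY_MAX 64

/* Hypothetical userspace stand-in for an event ready list. */
struct ready_list {
	atomic_flag lock;	/* exclusive spinlock, no reader/writer split */
	int events[READY_MAX];
	size_t count;
};

static void ready_list_init(struct ready_list *rl)
{
	atomic_flag_clear(&rl->lock);
	rl->count = 0;
}

/* Spin until the flag is acquired; only one holder at a time. */
static void ready_list_lock(struct ready_list *rl)
{
	while (atomic_flag_test_and_set_explicit(&rl->lock,
						 memory_order_acquire))
		;
}

static void ready_list_unlock(struct ready_list *rl)
{
	atomic_flag_clear_explicit(&rl->lock, memory_order_release);
}

/* Producer side: returns 0 on success, -1 if the list is full. */
static int ready_list_add(struct ready_list *rl, int ev)
{
	int ret = -1;

	ready_list_lock(rl);
	if (rl->count < READY_MAX) {
		rl->events[rl->count++] = ev;
		ret = 0;
	}
	ready_list_unlock(rl);
	return ret;
}

/* Consumer side: moves up to max events into out, oldest first. */
static size_t ready_list_drain(struct ready_list *rl, int *out, size_t max)
{
	size_t n, i;

	ready_list_lock(rl);
	n = rl->count < max ? rl->count : max;
	for (i = 0; i < n; i++)
		out[i] = rl->events[i];
	/* Shift any remaining events to the front of the array. */
	for (i = n; i < rl->count; i++)
		rl->events[i - n] = rl->events[i];
	rl->count -= n;
	ready_list_unlock(rl);
	return n;
}
```

With a reader/writer lock, ready_list_add() callers could in principle
share the lock; with the spinlock above they serialize, which is simpler to
reason about but costs throughput under contention.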

I have some optimization tricks to reduce spinlock contention and bring the
numbers back. But Linus appeared and declared that epoll's performance
shouldn't be the priority. So I decided not to post those optimization
patches.

For those who are curious, the optimization patches are at:
    git@github.com:covanam/linux.git epoll_optimize
Be warned that they have not been well-tested.

The regression with v3 has put me in paranoid mode, so I made this v4 as
obvious as I can.

Nam Cao (1):
  eventpoll: Replace rwlock with spinlock

 fs/eventpoll.c | 139 +++++++++----------------------------------------
 1 file changed, 26 insertions(+), 113 deletions(-)

-- 
2.39.5
Re: [PATCH v4 0/1] eventpoll: Fix priority inversion problem
Posted by Christian Brauner 1 month ago
On Tue, 15 Jul 2025 14:46:33 +0200, Nam Cao wrote:
> This v4 is the follow-up to v3 at:
> https://lore.kernel.org/linux-fsdevel/20250527090836.1290532-1-namcao@linutronix.de/
> which resolves a priority inversion problem.
> 
> The v3 patch was merged, but then got reverted due to regression.
> 
> The direction of v3 was wrong in the first place. It changed the
> eventpoll's event list to be lockless, making the code harder to read. I
> stared at the patch again, but still couldn't figure out what the bug is.
> 
> [...]

Applied to the vfs-6.18.misc branch of the vfs/vfs.git tree.
Patches in the vfs-6.18.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.18.misc

[1/1] eventpoll: Replace rwlock with spinlock
      https://git.kernel.org/vfs/vfs/c/0c43094f8cc9