From: Li Zhe <lizhe.67@bytedance.com>

In the current kernel rwsem implementation, there is an interface to
downgrade write lock to read lock, but there is no interface to upgrade
a read lock to write lock. This means that in order to acquire write
lock while holding read lock, we have to release the read lock first and
then acquire the write lock, which will introduce some troubles in
concurrent programming. This patch set provides the 'upgrade_read' interface
to solve this problem. This interface can change a read lock to a write
lock.

Li Zhe (2):
  rwsem: introduce upgrade_read interface
  khugepaged: use upgrade_read() to optimize collapse_huge_page

 include/linux/rwsem.h  |  1 +
 kernel/locking/rwsem.c | 87 ++++++++++++++++++++++++++++++++++++++++--
 mm/khugepaged.c        | 36 ++++++++---------
 3 files changed, 104 insertions(+), 20 deletions(-)

-- 
2.20.1
On Wed, Oct 16, 2024 at 12:35:58PM +0800, lizhe.67@bytedance.com wrote:
> From: Li Zhe <lizhe.67@bytedance.com>
>
> In the current kernel rwsem implementation, there is an interface to
> downgrade write lock to read lock, but there is no interface to upgrade
> a read lock to write lock. This means that in order to acquire write
> lock while holding read lock, we have to release the read lock first and
> then acquire the write lock, which will introduce some troubles in
> concurrent programming. This patch set provides the 'upgrade_read' interface
> to solve this problem. This interface can change a read lock to a write
> lock.

upgrade-read is fundamentally prone to deadlocks. Imagine two concurrent
invocations, each waiting for all readers to go away before proceeding
to upgrade to a writer.

Any solution to fixing that will end up being semantically similar to
dropping the read lock and acquiring a write lock -- there will not be a
single continuous critical section.

As such, this interface makes no sense.
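The deadlock described above can be illustrated with a toy, single-threaded model. This is only a sketch: `struct toy_rwlock` and the `toy_*` helpers are hypothetical names invented for illustration, and the rule that an upgrade completes only when the caller is the sole reader is an assumption, not the kernel rwsem implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-threaded model of a reader/writer lock; NOT the kernel rwsem. */
struct toy_rwlock {
	int readers;	/* tasks currently holding the lock for read */
	bool writer;	/* true while some task holds it for write */
};

static void toy_read_lock(struct toy_rwlock *l)
{
	assert(!l->writer);	/* readers and a writer never coexist */
	l->readers++;
}

/*
 * Assumed rule: an in-place upgrade can only complete once the caller is
 * the sole remaining reader.
 */
static bool toy_upgrade_can_finish(const struct toy_rwlock *l)
{
	return !l->writer && l->readers == 1;
}

/* Returns true when two readers that both wait to upgrade can never proceed. */
static bool toy_two_upgraders_stuck(void)
{
	struct toy_rwlock l = { .readers = 0, .writer = false };

	toy_read_lock(&l);	/* hypothetical task A */
	toy_read_lock(&l);	/* hypothetical task B */

	/* Both keep their read hold while waiting, so the count stays at 2. */
	return !toy_upgrade_can_finish(&l) && !toy_upgrade_can_finish(&l);
}
```

In this model each waiter is itself one of the readers the other is waiting on, so neither condition can ever become true unless one of them gives up its read hold.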
On Wed, 16 Oct 2024 10:09:55 +0200, peterz@infradead.org wrote:

> On Wed, Oct 16, 2024 at 12:35:58PM +0800, lizhe.67@bytedance.com wrote:
> > From: Li Zhe <lizhe.67@bytedance.com>
> >
> > In the current kernel rwsem implementation, there is an interface to
> > downgrade write lock to read lock, but there is no interface to upgrade
> > a read lock to write lock. This means that in order to acquire write
> > lock while holding read lock, we have to release the read lock first and
> > then acquire the write lock, which will introduce some troubles in
> > concurrent programming. This patch set provides the 'upgrade_read' interface
> > to solve this problem. This interface can change a read lock to a write
> > lock.
>
> upgrade-read is fundamentally prone to deadlocks. Imagine two concurrent
> invocations, each waiting for all readers to go away before proceeding
> to upgrade to a writer.
>
> Any solution to fixing that will end up being semantically similar to
> dropping the read lock and acquiring a write lock -- there will not be a
> single continuous critical section.

According to the implementation of this patch, one of the invocations will
get '-EBUSY' in this case. If -EBUSY is obtained and the invoking thread
keeps retrying instead of dropping the read lock and acquiring a write lock,
it may cause problems. Of course, this patchset only tries its best to achieve
a single continuous critical section as much as possible, and there is no
guarantee.

> As such, this interface makes no sense.

This interface is just trying to reduce, as much as possible, the overhead
of the additional checks that non-continuous critical sections require,
rather than eliminating it in all scenarios. So would it be better to change
the error code to something else, so that the caller will not retry this
interface?
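The -EBUSY behaviour described in this reply can be sketched with a toy, single-threaded model. Everything here is an assumption made for illustration: `struct model_rwsem`, the `upgrade_pending` flag and the `model_*` helpers are hypothetical and are not taken from the patch, whose actual implementation is not shown in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy single-threaded model; names and fields are hypothetical. */
struct model_rwsem {
	int readers;		/* tasks holding the lock for read */
	bool upgrade_pending;	/* one reader has claimed the upgrade */
};

/*
 * Non-blocking upgrade attempt as described in the reply: the first caller
 * claims the upgrade, a concurrent caller gets -EBUSY and still holds its
 * read lock.
 */
static int model_try_upgrade(struct model_rwsem *s)
{
	assert(s->readers > 0);		/* caller must hold the read lock */
	if (s->upgrade_pending)
		return -EBUSY;
	s->upgrade_pending = true;
	return 0;
}

static void model_read_unlock(struct model_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

/* The winner becomes the writer only once it is the last reader left. */
static bool model_upgrade_can_finish(const struct model_rwsem *s)
{
	return s->upgrade_pending && s->readers == 1;
}
```

In this model a loser that keeps calling `model_try_upgrade()` without calling `model_read_unlock()` gets -EBUSY every time and also keeps the winner from finishing, which is the retry problem the reply mentions.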
On Wed, Oct 16, 2024 at 04:53:45PM +0800, lizhe.67@bytedance.com wrote:
> On Wed, 16 Oct 2024 10:09:55 +0200, peterz@infradead.org wrote:
>
> > On Wed, Oct 16, 2024 at 12:35:58PM +0800, lizhe.67@bytedance.com wrote:
> > > From: Li Zhe <lizhe.67@bytedance.com>
> > >
> > > In the current kernel rwsem implementation, there is an interface to
> > > downgrade write lock to read lock, but there is no interface to upgrade
> > > a read lock to write lock. This means that in order to acquire write
> > > lock while holding read lock, we have to release the read lock first and
> > > then acquire the write lock, which will introduce some troubles in
> > > concurrent programming. This patch set provides the 'upgrade_read' interface
> > > to solve this problem. This interface can change a read lock to a write
> > > lock.
> >
> > upgrade-read is fundamentally prone to deadlocks. Imagine two concurrent
> > invocations, each waiting for all readers to go away before proceeding
> > to upgrade to a writer.
> >
> > Any solution to fixing that will end up being semantically similar to
> > dropping the read lock and acquiring a write lock -- there will not be a
> > single continuous critical section.
>
> According to the implementation of this patch, one of the invocations will

Since the premise as described here is utter nonsense, I didn't get to
actually reading the implementation -- why continue to waste time etc.

> get '-EBUSY' in this case. If -EBUSY is obtained and the invoking thread
> keeps retrying instead of dropping the read lock and acquiring a write lock,
> it may cause problems.

Failure should drop the read lock, otherwise it is too easy to mess
things up.

> Of course, this patchset only tries its best to achieve
> a single continuous critical section as much as possible, and there is no
> guarantee.

As already stated, nothing like that was mentioned.

> > As such, this interface makes no sense.
>
> This interface is just trying to reduce, as much as possible, the overhead
> of the additional checks that non-continuous critical sections require,
> rather than eliminating it in all scenarios. So would it be better to change
> the error code to something else, so that the caller will not retry this
> interface?

You fail to quantify the gains. How am I supposed to know if the
(significant?) increase in complexity is worth it?

Why should I accept this increase in complexity for the sake of
khugepaged, something which I care very little about?
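The suggestion in this reply that failure should drop the read lock can be sketched in a toy, single-threaded model. This is only an illustration under stated assumptions: `struct sketch_rwsem` and the `sketch_*` helpers are hypothetical names, and nothing here is taken from the patch or from the kernel rwsem code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy single-threaded model; names and fields are hypothetical. */
struct sketch_rwsem {
	int readers;		/* tasks holding the lock for read */
	bool upgrade_pending;	/* one reader has claimed the upgrade */
};

/*
 * Upgrade attempt where failure releases the caller's read hold, so a
 * loser cannot keep the lock and spin on retries.
 */
static int sketch_upgrade_or_drop(struct sketch_rwsem *s)
{
	assert(s->readers > 0);		/* caller must hold the read lock */
	if (s->upgrade_pending) {
		s->readers--;		/* failure path drops the read lock */
		return -EBUSY;
	}
	s->upgrade_pending = true;
	return 0;
}

/* The winner becomes the writer once it is the last reader left. */
static bool sketch_upgrade_can_finish(const struct sketch_rwsem *s)
{
	return s->upgrade_pending && s->readers == 1;
}
```

With two readers in this model, the second caller returns -EBUSY without holding the lock, so the first can complete its upgrade. The loser then has to take the write lock from scratch and revalidate its state, which is the non-continuous critical section discussed earlier in the thread.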