[RFC PATCH 0/6] job: replace AioContext lock with job_mutex

Emanuele Giuseppe Esposito posted 6 patches 2 years, 9 months ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20210707165813.55361-1-eesposit@redhat.com
Maintainers: John Snow <jsnow@redhat.com>, Markus Armbruster <armbru@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>, Wen Congyang <wencongyang2@huawei.com>, Xie Changlong <xiechanglong.d@gmail.com>, Max Reitz <mreitz@redhat.com>
There is a newer version of this series
[RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Emanuele Giuseppe Esposito 2 years, 9 months ago
This is a continuation of the work to reduce (and possibly get rid of) the usage of the AioContext lock, by introducing smaller-granularity locks that preserve thread safety.

This series aims to:
1) remove the AioContext lock and substitute it with the already existing
   global job_mutex
2) fix what looks like an oversight from when the blockjob.c logic was
   moved into the more generic job.c: job_mutex was introduced specifically
   to protect the job->busy flag, but it seems it was not used in
   subsequent patches, since there are multiple code sections that access
   the field directly without any locking.
3) use job_mutex instead of the AioContext lock
4) extend the reach of the job_mutex to protect all of the job
   structure's shared fields.
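
To give an idea of what 2) and 4) mean in practice, the sketch below shows
the direction: every access to a shared field such as job->busy goes
through a helper that takes the global job_mutex. The helper names here
are purely illustrative, not necessarily the final API:

    /* Sketch only: accessor names are illustrative. */
    static QemuMutex job_mutex;      /* already exists in job.c today */

    bool job_is_busy(Job *job)
    {
        bool busy;

        qemu_mutex_lock(&job_mutex);
        busy = job->busy;
        qemu_mutex_unlock(&job_mutex);
        return busy;
    }

    void job_set_busy(Job *job, bool busy)
    {
        qemu_mutex_lock(&job_mutex);
        job->busy = busy;
        qemu_mutex_unlock(&job_mutex);
    }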

The reason why we propose to use the existing job_mutex and not make one for
each job is to keep things as simple as possible for now, and because
the jobs are not in the execution critical path, so we can afford
some delays.
Having a lock per job would increase overall complexity and
increase the chances of deadlocks (one good example could be job
transactions, where multiple jobs are grouped together).
Anyway, the per-job mutex can always be added in the future.

Patches 1-4 are in preparation for patch 5. They try to simplify and clarify
the job_mutex usage. Patch 5 tries to add proper synchronization to the job
structure, replacing the AioContext lock where necessary.
Patch 6 just removes AioContext locks that are now unneeded.


RFC: I am not sure the way I laid out the locks is ideal.
But their usage should not introduce deadlocks. I also made sure
the series passes all qemu-iotests.

What is very clear from this patch is that it
is strictly related to the bdrv_* and lower-level calls, because
they also internally check for or even use the AioContext lock.
Therefore, in order to make it work, I temporarily added some
aio_context_acquire/release pairs around the functions that
still assert that the lock is held, or assume it is held and
temporarily unlock it (unlock() - lock()).
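
To make that concrete, the temporary pattern looks roughly like this
(illustrative only; some_bdrv_operation() is a placeholder for whatever
bdrv_*() call still asserts or takes the AioContext lock):

    /* Temporary bridge until bdrv_*() no longer relies on the caller
     * holding the AioContext lock. */
    AioContext *ctx = job->aio_context;

    job_unlock();
    aio_context_acquire(ctx);
    some_bdrv_operation();          /* placeholder bdrv_*() call */
    aio_context_release(ctx);
    job_lock();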

I also apologize for the amount of changes in patch 5; any suggestion on
how to improve the patch layout is very much appreciated.

Emanuele Giuseppe Esposito (6):
  job: use getter/setters instead of accessing the Job fields directly
  job: _locked functions and public job_lock/unlock for next patch
  job: minor changes to simplify locking
  job.h: categorize job fields
  job: use global job_mutex to protect struct Job
  jobs: remove unnecessary AioContext acquire/release pairs

 include/block/blockjob_int.h   |   1 +
 include/qemu/job.h             | 159 ++++++++++--
 block.c                        |   2 +-
 block/backup.c                 |   4 +
 block/commit.c                 |   4 +-
 block/mirror.c                 |  30 ++-
 block/monitor/block-hmp-cmds.c |   6 -
 block/replication.c            |   3 +-
 blockdev.c                     | 235 ++++++------------
 blockjob.c                     | 140 +++++++----
 job-qmp.c                      |  65 +++--
 job.c                          | 432 ++++++++++++++++++++++++++-------
 qemu-img.c                     |  19 +-
 13 files changed, 724 insertions(+), 376 deletions(-)

-- 
2.31.1


Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Stefan Hajnoczi 2 years, 9 months ago
On Wed, Jul 07, 2021 at 06:58:07PM +0200, Emanuele Giuseppe Esposito wrote:
> This is a continuation of the work to reduce (and possibly get rid of) the usage of the AioContext lock, by introducing smaller-granularity locks that preserve thread safety.
> 
> This series aims to:
> 1) remove the AioContext lock and substitute it with the already existing
>    global job_mutex
> 2) fix what looks like an oversight from when the blockjob.c logic was
>    moved into the more generic job.c: job_mutex was introduced specifically
>    to protect the job->busy flag, but it seems it was not used in
>    subsequent patches, since there are multiple code sections that access
>    the field directly without any locking.
> 3) use job_mutex instead of the AioContext lock
> 4) extend the reach of the job_mutex to protect all of the job
>    structure's shared fields.
> 
> The reason why we propose to use the existing job_mutex and not make one for
> each job is to keep things as simple as possible for now, and because
> the jobs are not in the execution critical path, so we can afford
> some delays.
> Having a lock per job would increase overall complexity and
> increase the chances of deadlocks (one good example could be job
> transactions, where multiple jobs are grouped together).
> Anyway, the per-job mutex can always be added in the future.
> 
> Patches 1-4 are in preparation for patch 5. They try to simplify and clarify
> the job_mutex usage. Patch 5 tries to add proper synchronization to the job
> structure, replacing the AioContext lock where necessary.
> Patch 6 just removes AioContext locks that are now unneeded.
> 
> 
> RFC: I am not sure the way I laid out the locks is ideal.
> But their usage should not introduce deadlocks. I also made sure
> the series passes all qemu-iotests.
> 
> What is very clear from this patch is that it
> is strictly related to the bdrv_* and lower-level calls, because
> they also internally check for or even use the AioContext lock.
> Therefore, in order to make it work, I temporarily added some
> aio_context_acquire/release pairs around the functions that
> still assert that the lock is held, or assume it is held and
> temporarily unlock it (unlock() - lock()).

Sounds like the issue is that this patch series assumes AioContext locks
are no longer required for calling the blk_*()/bdrv_*() APIs? That is
not the case yet, so you had to then add those aio_context_lock() calls
back in elsewhere. This approach introduces unnecessary risk. I think we
should wait until blk_*()/bdrv_*() no longer requires the caller to hold
the AioContext lock before applying this series.

Stefan
Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Paolo Bonzini 2 years, 9 months ago
On 08/07/21 12:36, Stefan Hajnoczi wrote:
>> What is very clear from this patch is that it
>> is strictly related to the bdrv_* and lower-level calls, because
>> they also internally check for or even use the AioContext lock.
>> Therefore, in order to make it work, I temporarily added some
>> aio_context_acquire/release pairs around the functions that
>> still assert that the lock is held, or assume it is held and
>> temporarily unlock it (unlock() - lock()).
>
> Sounds like the issue is that this patch series assumes AioContext locks
> are no longer required for calling the blk_*()/bdrv_*() APIs? That is
> not the case yet, so you had to then add those aio_context_lock() calls
> back in elsewhere. This approach introduces unnecessary risk. I think we
> should wait until blk_*()/bdrv_*() no longer requires the caller to hold
> the AioContext lock before applying this series.

In general I'm in favor of pushing the lock further down into smaller 
and smaller critical sections; it's a good approach to make further 
audits easier until it's "obvious" that the lock is unnecessary.  I 
haven't yet reviewed Emanuele's patches to see if this is what he's 
doing where he's adding the acquire/release calls, but that's my 
understanding of both his cover letter and your reply.

The I/O blk_*()/bdrv_*() *should* not require the caller to hold the 
AioContext lock; all drivers use their own CoMutex or QemuMutex when 
needed, and generic code should also be ready (caveat emptor).  Others, 
such as reopen, are a mess that requires a separate audit.  Restricting 
acquire/release to be only around those seems like a good starting point.
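
For instance, a sketch of what restricting acquire/release to be only
around those operations could look like (bdrv_reopen_set_read_only() is
just one example of a call that still expects the caller to hold the
lock, and bs is whatever node we operate on):

    AioContext *ctx = bdrv_get_aio_context(bs);
    Error *local_err = NULL;
    int ret;

    /* Narrow critical section: only the call that still needs it. */
    aio_context_acquire(ctx);
    ret = bdrv_reopen_set_read_only(bs, true, &local_err);
    aio_context_release(ctx);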

Paolo


Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Stefan Hajnoczi 2 years, 9 months ago
On Thu, Jul 08, 2021 at 01:32:12PM +0200, Paolo Bonzini wrote:
> On 08/07/21 12:36, Stefan Hajnoczi wrote:
> > > What is very clear from this patch is that it
> > > is strictly related to the bdrv_* and lower-level calls, because
> > > they also internally check for or even use the AioContext lock.
> > > Therefore, in order to make it work, I temporarily added some
> > > aio_context_acquire/release pairs around the functions that
> > > still assert that the lock is held, or assume it is held and
> > > temporarily unlock it (unlock() - lock()).
> > 
> > Sounds like the issue is that this patch series assumes AioContext locks
> > are no longer required for calling the blk_*()/bdrv_*() APIs? That is
> > not the case yet, so you had to then add those aio_context_lock() calls
> > back in elsewhere. This approach introduces unnecessary risk. I think we
> > should wait until blk_*()/bdrv_*() no longer requires the caller to hold
> > the AioContext lock before applying this series.
> 
> In general I'm in favor of pushing the lock further down into smaller and
> smaller critical sections; it's a good approach to make further audits
> easier until it's "obvious" that the lock is unnecessary.  I haven't yet
> reviewed Emanuele's patches to see if this is what he's doing where he's
> adding the acquire/release calls, but that's my understanding of both his
> cover letter and your reply.

The problem is the unnecessary risk. We know what the goal is for
blk_*()/bdrv_*() but it's not quite there yet. Does making changes in
block jobs help solve the final issues with blk_*()/bdrv_*()?

If yes, then it's a risk worth taking. If no, then spending time
developing interim code, reviewing those patches, and risking breakage
doesn't seem worth it. I'd rather wait for blk_*()/bdrv_*() to be fully
complete and then see patches that delete aio_context_acquire() in most
places or add locks in the remaining places where the caller was relying
on the AioContext lock.

Stefan
Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Emanuele Giuseppe Esposito 2 years, 9 months ago

On 08/07/2021 15:04, Stefan Hajnoczi wrote:
> On Thu, Jul 08, 2021 at 01:32:12PM +0200, Paolo Bonzini wrote:
>> On 08/07/21 12:36, Stefan Hajnoczi wrote:
>>>> What is very clear from this patch is that it
>>>> is strictly related to the bdrv_* and lower-level calls, because
>>>> they also internally check for or even use the AioContext lock.
>>>> Therefore, in order to make it work, I temporarily added some
>>>> aio_context_acquire/release pairs around the functions that
>>>> still assert that the lock is held, or assume it is held and
>>>> temporarily unlock it (unlock() - lock()).
>>>
>>> Sounds like the issue is that this patch series assumes AioContext locks
>>> are no longer required for calling the blk_*()/bdrv_*() APIs? That is
>>> not the case yet, so you had to then add those aio_context_lock() calls
>>> back in elsewhere. This approach introduces unnecessary risk. I think we
>>> should wait until blk_*()/bdrv_*() no longer requires the caller to hold
>>> the AioContext lock before applying this series.
>>
>> In general I'm in favor of pushing the lock further down into smaller and
>> smaller critical sections; it's a good approach to make further audits
>> easier until it's "obvious" that the lock is unnecessary.  I haven't yet
>> reviewed Emanuele's patches to see if this is what he's doing where he's
>> adding the acquire/release calls, but that's my understanding of both his
>> cover letter and your reply.
> 
> The problem is the unnecessary risk. We know what the goal is for
> blk_*()/bdrv_*() but it's not quite there yet. Does making changes in
> block jobs help solve the final issues with blk_*()/bdrv_*()?

Correct me if I am wrong, but it seems to me that the bdrv_*()/blk_*() 
operations mostly take care of building, modifying and walking the BDS 
graph. So since graph nodes can belong to different AioContexts, it makes 
sense that we have a lock when modifying the graph, right?

If so, we can simply try to replace the AioContext lock with a graph 
lock, or something like that. But I am not sure of this.
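
Purely as a thought experiment, such a graph lock could start as something
as simple as the following (none of these names exist today, and the
initialization is omitted; it is just to show what I mean):

    /* Hypothetical: one global lock serializing modifications to the
     * BlockDriverState graph, taken instead of the AioContext lock. */
    static QemuMutex graph_mutex;

    static void graph_lock(void)
    {
        qemu_mutex_lock(&graph_mutex);
    }

    static void graph_unlock(void)
    {
        qemu_mutex_unlock(&graph_mutex);
    }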

Emanuele
> 
> If yes, then it's a risk worth taking. If no, then spending time
> developing interim code, reviewing those patches, and risking breakage
> doesn't seem worth it. I'd rather wait for blk_*()/bdrv_*() to be fully
> complete and then see patches that delete aio_context_acquire() in most
> places or add locks in the remaining places where the caller was relying
> on the AioContext lock.
> 
> Stefan
> 


Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Stefan Hajnoczi 2 years, 9 months ago
On Wed, Jul 07, 2021 at 06:58:07PM +0200, Emanuele Giuseppe Esposito wrote:
> This is a continuation of the work to reduce (and possibly get rid of) the usage of the AioContext lock, by introducing smaller-granularity locks that preserve thread safety.
> 
> This series aims to:
> 1) remove the AioContext lock and substitute it with the already existing
>    global job_mutex
> 2) fix what looks like an oversight from when the blockjob.c logic was
>    moved into the more generic job.c: job_mutex was introduced specifically
>    to protect the job->busy flag, but it seems it was not used in
>    subsequent patches, since there are multiple code sections that access
>    the field directly without any locking.
> 3) use job_mutex instead of the AioContext lock
> 4) extend the reach of the job_mutex to protect all of the job
>    structure's shared fields.

Can you explain the big picture:

1. What are the rules for JobDrivers? Imagine you are implementing a new
   JobDriver. What do you need to know in order to write correct code?

2. What are the rules for monitor? The main pattern is looking up a job,
   invoking a job API on it, and then calling job_unlock().

Stefan
Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Emanuele Giuseppe Esposito 2 years, 9 months ago

On 08/07/2021 15:09, Stefan Hajnoczi wrote:
> On Wed, Jul 07, 2021 at 06:58:07PM +0200, Emanuele Giuseppe Esposito wrote:
>> This is a continuation of the work to reduce (and possibly get rid of) the usage of the AioContext lock, by introducing smaller-granularity locks that preserve thread safety.
>>
>> This series aims to:
>> 1) remove the AioContext lock and substitute it with the already existing
>>    global job_mutex
>> 2) fix what looks like an oversight from when the blockjob.c logic was
>>    moved into the more generic job.c: job_mutex was introduced specifically
>>    to protect the job->busy flag, but it seems it was not used in
>>    subsequent patches, since there are multiple code sections that access
>>    the field directly without any locking.
>> 3) use job_mutex instead of the AioContext lock
>> 4) extend the reach of the job_mutex to protect all of the job
>>    structure's shared fields.
> 
> Can you explain the big picture:
> 
> 1. What are the rules for JobDrivers? Imagine you are implementing a new
>     JobDriver. What do you need to know in order to write correct code?

I think that in general, the rules for JobDrivers remain the same. The 
job_mutex lock should be invisible (or almost) from the point of view of 
a JobDriver, because the job API available for it should take care of 
the necessary locking/unlocking.
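
In other words, the driver-facing helpers would take and drop the lock
internally; roughly something like this (the _locked variant is only
illustrative, not the final API):

    /* Sketch: the public API a JobDriver calls hides the locking. */
    void job_enter(Job *job)
    {
        job_lock();
        job_enter_locked(job);     /* illustrative internal variant */
        job_unlock();
    }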

> 
> 2. What are the rules for monitor? The main pattern is looking up a job,
>     invoking a job API on it, and then calling job_unlock().

The monitor instead is aware of this lock: the reason for that is 
exactly what you have described here.
Looking up + invoking a job API operation (for example calling 
find_job() and then job_pause()) must be performed with the same lock 
held all the time, otherwise other threads could modify the job while 
the monitor runs its command.
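
As a sketch of the pattern (error handling trimmed; the existing helpers
used here just stand in for whatever locked variants the series ends up
exposing):

    Error *local_err = NULL;
    Job *job;

    /* One job_lock() critical section covers both the lookup and the
     * operation, so no other thread can touch the job in between. */
    job_lock();
    job = job_get("job0");
    if (job) {
        job_user_pause(job, &local_err);
    }
    job_unlock();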

Please let me know if something is not clear and/or if you have 
additional comments on this!

Emanuele

> 
> Stefan
> 


Re: [RFC PATCH 0/6] job: replace AioContext lock with job_mutex
Posted by Stefan Hajnoczi 2 years, 9 months ago
On Mon, Jul 12, 2021 at 10:42:47AM +0200, Emanuele Giuseppe Esposito wrote:
> On 08/07/2021 15:09, Stefan Hajnoczi wrote:
> > On Wed, Jul 07, 2021 at 06:58:07PM +0200, Emanuele Giuseppe Esposito wrote:
> > > This is a continuation of the work to reduce (and possibly get rid of) the usage of the AioContext lock, by introducing smaller-granularity locks that preserve thread safety.
> > > 
> > > This series aims to:
> > > 1) remove the AioContext lock and substitute it with the already existing
> > >    global job_mutex
> > > 2) fix what looks like an oversight from when the blockjob.c logic was
> > >    moved into the more generic job.c: job_mutex was introduced specifically
> > >    to protect the job->busy flag, but it seems it was not used in
> > >    subsequent patches, since there are multiple code sections that access
> > >    the field directly without any locking.
> > > 3) use job_mutex instead of the AioContext lock
> > > 4) extend the reach of the job_mutex to protect all of the job
> > >    structure's shared fields.
> > 
> > Can you explain the big picture:
> > 
> > 1. What are the rules for JobDrivers? Imagine you are implementing a new
> >     JobDriver. What do you need to know in order to write correct code?
> 
> I think that in general, the rules for JobDrivers remain the same. The
> job_mutex lock should be invisible (or almost) from the point of view of a
> JobDriver, because the job API available for it should take care of the
> necessary locking/unlocking.
> 
> > 
> > 2. What are the rules for monitor? The main pattern is looking up a job,
> >     invoking a job API on it, and then calling job_unlock().
> 
> The monitor instead is aware of this lock: the reason for that is exactly
> what you have described here.
> Looking up + invoking a job API operation (for example calling find_job()
> and then job_pause()) must be performed with the same lock held all the
> time, otherwise other threads could modify the job while the monitor runs
> its command.

That helps, thanks!

Stefan