It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
with the following motivation:
Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
qemu_cpu_list_lock: together with a call to exclusive_idle (via
cpu_exec_start/end) in cpu_list_add, this protects exclusive work
against concurrent CPU addition and removal.
However, it seems to be redundant, because the cpu-exclusive
infrastructure provides sufficient protection against the newly added
CPU starting execution while the cpu-exclusive work is running, and the
aforementioned traversing of the cpu list is protected by
qemu_cpu_list_lock.
Besides, this appears to be the only place where the cpu-exclusive
section is entered with the BQL taken, which has been found to trigger
AB-BA deadlock as follows:
    vCPU thread                             main thread
    -----------                             -----------
async_safe_run_on_cpu(self,
                      async_synic_update)
...                                         [cpu hot-add]
process_queued_cpu_work()
  qemu_mutex_unlock_iothread()
                                            [grab BQL]
  start_exclusive()                         cpu_list_add()
  async_synic_update()                        finish_safe_work()
    qemu_mutex_lock_iothread()                  cpu_exec_start()
So remove it. This paves the way to establishing a strict nesting rule
of never entering the exclusive section with the BQL taken.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
cpus-common.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/cpus-common.c b/cpus-common.c
index 3ca58c64e8..023cfebfa3 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
     return cpu_index;
 }
 
-static void finish_safe_work(CPUState *cpu)
-{
-    cpu_exec_start(cpu);
-    cpu_exec_end(cpu);
-}
-
 void cpu_list_add(CPUState *cpu)
 {
     qemu_mutex_lock(&qemu_cpu_list_lock);
@@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
     }
     QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
     qemu_mutex_unlock(&qemu_cpu_list_lock);
-
-    finish_safe_work(cpu);
 }
 
 void cpu_list_remove(CPUState *cpu)
--
2.21.0
Roman Kagan <rkagan@virtuozzo.com> writes:
> It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
> with the following motivation:
I can't find this commit in my tree.
>
> Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
> qemu_cpu_list_lock: together with a call to exclusive_idle (via
> cpu_exec_start/end) in cpu_list_add, this protects exclusive work
> against concurrent CPU addition and removal.
>
> However, it seems to be redundant, because the cpu-exclusive
> infrastructure provides sufficient protection against the newly added
> CPU starting execution while the cpu-exclusive work is running, and the
> aforementioned traversing of the cpu list is protected by
> qemu_cpu_list_lock.
>
> Besides, this appears to be the only place where the cpu-exclusive
> section is entered with the BQL taken, which has been found to trigger
> AB-BA deadlock as follows:
>
>     vCPU thread                             main thread
>     -----------                             -----------
> async_safe_run_on_cpu(self,
>                       async_synic_update)
> ...                                         [cpu hot-add]
> process_queued_cpu_work()
>   qemu_mutex_unlock_iothread()
>                                             [grab BQL]
>   start_exclusive()                         cpu_list_add()
>   async_synic_update()                        finish_safe_work()
>     qemu_mutex_lock_iothread()                  cpu_exec_start()
>
> So remove it. This paves the way to establishing a strict nesting rule
> of never entering the exclusive section with the BQL taken.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
> cpus-common.c | 8 --------
> 1 file changed, 8 deletions(-)
>
> diff --git a/cpus-common.c b/cpus-common.c
> index 3ca58c64e8..023cfebfa3 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
>      return cpu_index;
>  }
>
> -static void finish_safe_work(CPUState *cpu)
> -{
> -    cpu_exec_start(cpu);
> -    cpu_exec_end(cpu);
> -}
> -
This makes sense to me intellectually but I'm worried I've missed the
reason for it being introduced. Without finish_safe_work we have to wait
for the actual vCPU thread function to acquire and release the BQL and
enter its first cpu_exec_start().
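
For reference, the property being relied on can be sketched with a toy
model - this is not QEMU code, the names only loosely follow
cpus-common.c, and it glosses over qemu_cpu_kick(), the
has_waiter/pending_cpus accounting and the memory barriers of the real
implementation - showing that a CPU which calls cpu_exec_start() while
exclusive work is pending parks itself until that work finishes:

/* Toy model, NOT QEMU code: a simplified rendering of the idea in
 * cpus-common.c.  Build with: gcc -pthread toy_exclusive.c */
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t cpu_list_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t exclusive_resume = PTHREAD_COND_INITIALIZER;
static atomic_bool exclusive_pending;   /* stands in for pending_cpus != 0 */

/* Called with cpu_list_lock held: wait until no exclusive work is pending. */
static void toy_exclusive_idle(void)
{
    while (atomic_load(&exclusive_pending)) {
        pthread_cond_wait(&exclusive_resume, &cpu_list_lock);
    }
}

/* Simplified cpu_exec_start(): a vCPU may not enter guest code while
 * exclusive work is pending. */
static void toy_cpu_exec_start(void)
{
    if (atomic_load(&exclusive_pending)) {
        pthread_mutex_lock(&cpu_list_lock);
        toy_exclusive_idle();
        pthread_mutex_unlock(&cpu_list_lock);
    }
}

static void toy_start_exclusive(void)
{
    pthread_mutex_lock(&cpu_list_lock);
    toy_exclusive_idle();               /* one exclusive section at a time */
    atomic_store(&exclusive_pending, true);
    /* The real code additionally waits here for already-running vCPUs to
     * leave their execution loop before returning. */
    pthread_mutex_unlock(&cpu_list_lock);
}

static void toy_end_exclusive(void)
{
    pthread_mutex_lock(&cpu_list_lock);
    atomic_store(&exclusive_pending, false);
    pthread_cond_broadcast(&exclusive_resume);
    pthread_mutex_unlock(&cpu_list_lock);
}

/* A freshly hot-added vCPU only reaches guest code via toy_cpu_exec_start(). */
static void *hotplugged_vcpu(void *arg)
{
    (void)arg;
    toy_cpu_exec_start();
    printf("vCPU enters guest code only after the exclusive work is done\n");
    return NULL;
}

int main(void)
{
    pthread_t t;

    toy_start_exclusive();              /* exclusive work in progress... */
    pthread_create(&t, NULL, hotplugged_vcpu, NULL);
    sleep(1);                           /* ...the new vCPU blocks meanwhile */
    printf("exclusive work finished\n");
    toy_end_exclusive();
    pthread_join(t, NULL);
    return 0;
}

So, as far as I can tell, all finish_safe_work() added was doing that
wait synchronously in cpu_list_add() rather than at the vCPU thread's
first real cpu_exec_start().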
I guess I'd be happier if we had a hotplug test where we could stress
test the operation and be sure we've not just moved the deadlock
somewhere else.
>  void cpu_list_add(CPUState *cpu)
>  {
>      qemu_mutex_lock(&qemu_cpu_list_lock);
> @@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
>      }
>      QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
>      qemu_mutex_unlock(&qemu_cpu_list_lock);
> -
> -    finish_safe_work(cpu);
>  }
>
>  void cpu_list_remove(CPUState *cpu)
--
Alex Bennée
On Mon, Jun 24, 2019 at 11:58:23AM +0100, Alex Bennée wrote:
> Roman Kagan <rkagan@virtuozzo.com> writes:
>
> > It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
> > with the following motivation:
>
> I can't find this commit in my tree.
OOPS, that was supposed to be ab129972c8b41e15b0521895a46fd9c752b68a5e,
sorry.
>
> >
> > Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
> > qemu_cpu_list_lock: together with a call to exclusive_idle (via
> > cpu_exec_start/end) in cpu_list_add, this protects exclusive work
> > against concurrent CPU addition and removal.
> >
> > However, it seems to be redundant, because the cpu-exclusive
> > infrastructure provides sufficient protection against the newly added
> > CPU starting execution while the cpu-exclusive work is running, and the
> > aforementioned traversing of the cpu list is protected by
> > qemu_cpu_list_lock.
> >
> > Besides, this appears to be the only place where the cpu-exclusive
> > section is entered with the BQL taken, which has been found to trigger
> > AB-BA deadlock as follows:
> >
> >     vCPU thread                             main thread
> >     -----------                             -----------
> > async_safe_run_on_cpu(self,
> >                       async_synic_update)
> > ...                                         [cpu hot-add]
> > process_queued_cpu_work()
> >   qemu_mutex_unlock_iothread()
> >                                             [grab BQL]
> >   start_exclusive()                         cpu_list_add()
> >   async_synic_update()                        finish_safe_work()
> >     qemu_mutex_lock_iothread()                  cpu_exec_start()
> >
> > So remove it. This paves the way to establishing a strict nesting rule
> > of never entering the exclusive section with the BQL taken.
> >
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > ---
> > cpus-common.c | 8 --------
> > 1 file changed, 8 deletions(-)
> >
> > diff --git a/cpus-common.c b/cpus-common.c
> > index 3ca58c64e8..023cfebfa3 100644
> > --- a/cpus-common.c
> > +++ b/cpus-common.c
> > @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
> >      return cpu_index;
> >  }
> >
> > -static void finish_safe_work(CPUState *cpu)
> > -{
> > -    cpu_exec_start(cpu);
> > -    cpu_exec_end(cpu);
> > -}
> > -
>
> This makes sense to me intellectually but I'm worried I've missed the
> reason for it being introduced. Without finish_safe_work we have to wait
> for the actual vCPU thread function to acquire and release the BQL and
> enter its first cpu_exec_start().
>
> I guess I'd be happier if we had a hotplug test where we could stress
> test the operation and be sure we've not just moved the deadlock
> somewhere else.
Me too. Unfortunately I haven't managed to come up with an idea how to
do this test. One of the race participants, the safe work in a vCPU
thread, happens in response to an MSR write by the guest. ATM there's
no way to do it without an actual guest running. I'll have a look if I
can make a vm test for it, using a linux guest and its /dev/cpu/*/msr.
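
An untested guest-side sketch of what that could look like (it assumes
the guest is started with hv-synic, that the msr kernel module is
loaded, and that 0x40000083 is HV_X64_MSR_SIMP from the Hyper-V TLFS,
i.e. one of the SynIC MSRs whose write ends up in
async_safe_run_on_cpu() on the QEMU side), to be run while CPUs are
hot-added on the host in a loop:

/* Untested sketch: hammer a Hyper-V SynIC MSR via /dev/cpu/N/msr.
 * Needs root and "modprobe msr" in the guest; stop with Ctrl-C. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define HV_X64_MSR_SIMP 0x40000083u    /* per the Hyper-V TLFS (assumption) */

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/cpu/0/msr";
    int fd = open(dev, O_RDWR);
    if (fd < 0) {
        perror(dev);
        return 1;
    }

    for (;;) {
        uint64_t val;
        /* Read-modify-write the current value so the SynIC state is left
         * unchanged; only the MSR-write path in QEMU is exercised. */
        if (pread(fd, &val, sizeof(val), HV_X64_MSR_SIMP) != sizeof(val)) {
            perror("pread");
            return 1;
        }
        if (pwrite(fd, &val, sizeof(val), HV_X64_MSR_SIMP) != sizeof(val)) {
            perror("pwrite");
            return 1;
        }
        usleep(1000);
    }
    return 0;
}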
Thanks,
Roman.
>
> >  void cpu_list_add(CPUState *cpu)
> >  {
> >      qemu_mutex_lock(&qemu_cpu_list_lock);
> > @@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
> >      }
> >      QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
> >      qemu_mutex_unlock(&qemu_cpu_list_lock);
> > -
> > -    finish_safe_work(cpu);
> >  }
> >
> >  void cpu_list_remove(CPUState *cpu)
>
>
> --
> Alex Bennée
>
Roman Kagan <rkagan@virtuozzo.com> writes:
> On Mon, Jun 24, 2019 at 11:58:23AM +0100, Alex Bennée wrote:
>> Roman Kagan <rkagan@virtuozzo.com> writes:
>>
>> > It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
>> > with the following motivation:
>>
>> I can't find this commit in my tree.
>
> OOPS, that was supposed to be ab129972c8b41e15b0521895a46fd9c752b68a5e,
> sorry.
>
>>
>> >
>> > Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
>> > qemu_cpu_list_lock: together with a call to exclusive_idle (via
>> > cpu_exec_start/end) in cpu_list_add, this protects exclusive work
>> > against concurrent CPU addition and removal.
>> >
>> > However, it seems to be redundant, because the cpu-exclusive
>> > infrastructure provides sufficient protection against the newly added
>> > CPU starting execution while the cpu-exclusive work is running, and the
>> > aforementioned traversing of the cpu list is protected by
>> > qemu_cpu_list_lock.
>> >
>> > Besides, this appears to be the only place where the cpu-exclusive
>> > section is entered with the BQL taken, which has been found to trigger
>> > AB-BA deadlock as follows:
>> >
>> >     vCPU thread                             main thread
>> >     -----------                             -----------
>> > async_safe_run_on_cpu(self,
>> >                       async_synic_update)
>> > ...                                         [cpu hot-add]
>> > process_queued_cpu_work()
>> >   qemu_mutex_unlock_iothread()
>> >                                             [grab BQL]
>> >   start_exclusive()                         cpu_list_add()
>> >   async_synic_update()                        finish_safe_work()
>> >     qemu_mutex_lock_iothread()                  cpu_exec_start()
>> >
>> > So remove it. This paves the way to establishing a strict nesting rule
>> > of never entering the exclusive section with the BQL taken.
>> >
>> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
>> > ---
>> > cpus-common.c | 8 --------
>> > 1 file changed, 8 deletions(-)
>> >
>> > diff --git a/cpus-common.c b/cpus-common.c
>> > index 3ca58c64e8..023cfebfa3 100644
>> > --- a/cpus-common.c
>> > +++ b/cpus-common.c
>> > @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
>> >      return cpu_index;
>> >  }
>> >
>> > -static void finish_safe_work(CPUState *cpu)
>> > -{
>> > -    cpu_exec_start(cpu);
>> > -    cpu_exec_end(cpu);
>> > -}
>> > -
>>
>> This makes sense to me intellectually but I'm worried I've missed the
>> reason for it being introduced. Without finish_safe_work we have to wait
>> for the actual vCPU thread function to acquire and release the BQL and
>> enter its first cpu_exec_start().
>>
>> I guess I'd be happier if we had a hotplug test where we could stress
>> test the operation and be sure we've not just moved the deadlock
>> somewhere else.
>
> Me too. Unfortunately I haven't managed to come up with an idea how to
> do this test. One of the race participants, the safe work in a vCPU
> thread, happens in response to an MSR write by the guest. ATM there's
> no way to do it without an actual guest running. I'll have a look if I
> can make a vm test for it, using a linux guest and its /dev/cpu/*/msr.
Depending on how much machinery is required to trigger this we could
add a system-mode test. However, there isn't much point if it requires
duplicating the entire guest hotplug stack. It may be easier to trigger
on ARM - the PSCI sequence isn't overly complicated to deal with, but I
don't know what the impact of MSIs is.
--
Alex Bennée