In vcpu_create after scheduler data is allocated, if
vmtrace_alloc_buffer fails, it will jump to the wrong cleanup label
resulting in a memory leak. Correct the label.
Fixes: 217dd79ee292 ("xen/domain: Add vmtrace_size domain creation parameter")
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
xen/common/domain.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 3c65cca5b0ff..b1e2709d3d82 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -451,7 +451,7 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
goto fail_wq;
if ( vmtrace_alloc_buffer(v) != 0 )
- goto fail_wq;
+ goto fail_sched;
if ( arch_vcpu_create(v) != 0 )
goto fail_sched;
base-commit: a845b50c12f3ec3a14255f5eb3062d0c9801a356
--
2.50.1
On 28/07/2025 8:52 pm, Stewart Hildebrand wrote:
> In vcpu_create after scheduler data is allocated, if
> vmtrace_alloc_buffer fails, it will jump to the wrong cleanup label
> resulting in a memory leak. Correct the label.
>
> Fixes: 217dd79ee292 ("xen/domain: Add vmtrace_size domain creation parameter")
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Urgh, sorry for breaking this. Ultimately it comes from having two
different error handling schemes.
This patch is probably ok to start with (and to backport), but a better
fix would be to handle sched and wq in vcpu_teardown(). That way we get
a single failure path that does the correct thing irrespective.
An unrelated observation, but there's a waitqueue vcpu allocated in the
common path, but I was under the impression that only x86 had any need
for wqv (and I still need to get around to fixing introspection so we
can drop wait.c entirely).
~Andrew
On 28.07.2025 22:09, Andrew Cooper wrote:
> On 28/07/2025 8:52 pm, Stewart Hildebrand wrote:
>> In vcpu_create after scheduler data is allocated, if
>> vmtrace_alloc_buffer fails, it will jump to the wrong cleanup label
>> resulting in a memory leak. Correct the label.
>>
>> Fixes: 217dd79ee292 ("xen/domain: Add vmtrace_size domain creation parameter")
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>
> Urgh, sorry for breaking this. Ultimately it comes from having two
> different error handling schemes.
>
> This patch is probably ok to start with (and to backport), but a better
> fix would be to handle sched and wq in vcpu_teardown(). That way we get
> a single failure path that does the correct thing irrespective.
I agree, and that variant would apparently be as easily backportable.
Stewart, are you up for going that route?
Jan
> An unrelated observation, but there's a waitqueue vcpu allocated in the
> common path, but I was under the impression that only x86 had any need
> for wqv (and I still need to get around to fixing introspection so we
> can drop wait.c entirely).
>
> ~Andrew
On 7/29/25 04:32, Jan Beulich wrote:
> On 28.07.2025 22:09, Andrew Cooper wrote:
>> On 28/07/2025 8:52 pm, Stewart Hildebrand wrote:
>>> In vcpu_create after scheduler data is allocated, if
>>> vmtrace_alloc_buffer fails, it will jump to the wrong cleanup label
>>> resulting in a memory leak. Correct the label.
>>>
>>> Fixes: 217dd79ee292 ("xen/domain: Add vmtrace_size domain creation parameter")
>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>
>> Urgh, sorry for breaking this. Ultimately it comes from having two
>> different error handling schemes.
>>
>> This patch is probably ok to start with (and to backport), but a better
>> fix would be to handle sched and wq in vcpu_teardown(). That way we get
>> a single failure path that does the correct thing irrespective.
>
> I agree, and that variant would apparently be as easily backportable.
> Stewart, are you up for going that route?
Yep, I'll give it a try
>
> Jan
>
>> An unrelated observation, but there's a waitqueue vcpu allocated in the
>> common path, but I was under the impression that only x86 had any need
>> for wqv (and I still need to get around to fixing introspection so we
>> can drop wait.c entirely).
>>
>> ~Andrew
>
On 29/07/2025 10:08 pm, Stewart Hildebrand wrote:
> On 7/29/25 04:32, Jan Beulich wrote:
>> On 28.07.2025 22:09, Andrew Cooper wrote:
>>> On 28/07/2025 8:52 pm, Stewart Hildebrand wrote:
>>>> In vcpu_create after scheduler data is allocated, if
>>>> vmtrace_alloc_buffer fails, it will jump to the wrong cleanup label
>>>> resulting in a memory leak. Correct the label.
>>>>
>>>> Fixes: 217dd79ee292 ("xen/domain: Add vmtrace_size domain creation parameter")
>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>> Urgh, sorry for breaking this. Ultimately it comes from having two
>>> different error handling schemes.
>>>
>>> This patch is probably ok to start with (and to backport), but a better
>>> fix would be to handle sched and wq in vcpu_teardown(). That way we get
>>> a single failure path that does the correct thing irrespective.
>> I agree, and that variant would apparently be as easily backportable.
>> Stewart, are you up for going that route?
> Yep, I'll give it a try
Just as a heads up before you start, the key is to ensure that anything
called in *_teardown() is idempotent.
~Andrew
On 28/07/2025 9:09 pm, Andrew Cooper wrote:
> On 28/07/2025 8:52 pm, Stewart Hildebrand wrote:
>> In vcpu_create after scheduler data is allocated, if
>> vmtrace_alloc_buffer fails, it will jump to the wrong cleanup label
>> resulting in a memory leak. Correct the label.
>>
>> Fixes: 217dd79ee292 ("xen/domain: Add vmtrace_size domain creation parameter")
>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> Urgh, sorry for breaking this. Ultimately it comes from having two
> different error handling schemes.
>
> This patch is probably ok to start with (and to backport), but a better
> fix would be to handle sched and wq in vcpu_teardown(). That way we get
> a single failure path that does the correct thing irrespective.
>
> An unrelated observation, but there's a waitqueue vcpu allocated in the
> common path, but I was under the impression that only x86 had any need
> for wqv (and I still need to get around to fixing introspection so we
> can drop wait.c entirely).
P.S. we allocate full wqv for idle CPUs, and they definitely do not need
it on any architecture. Looks like there's some low hanging fruit here too.
~Andrew
On 28.07.2025 22:12, Andrew Cooper wrote:
> On 28/07/2025 9:09 pm, Andrew Cooper wrote:
>> On 28/07/2025 8:52 pm, Stewart Hildebrand wrote:
>>> In vcpu_create after scheduler data is allocated, if
>>> vmtrace_alloc_buffer fails, it will jump to the wrong cleanup label
>>> resulting in a memory leak. Correct the label.
>>>
>>> Fixes: 217dd79ee292 ("xen/domain: Add vmtrace_size domain creation parameter")
>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>> Urgh, sorry for breaking this. Ultimately it comes from having two
>> different error handling schemes.
>>
>> This patch is probably ok to start with (and to backport), but a better
>> fix would be to handle sched and wq in vcpu_teardown(). That way we get
>> a single failure path that does the correct thing irrespective.
>>
>> An unrelated observation, but there's a waitqueue vcpu allocated in the
>> common path, but I was under the impression that only x86 had any need
>> for wqv (and I still need to get around to fixing introspection so we
>> can drop wait.c entirely).
>
> P.S. we allocate full wqv for idle CPUs, and they definitely do not need
> it on any architecture. Looks like there's some low hanging fruit here too.
Hmm, the only init_waitqueue_vcpu() that I see is in an "else" to an
"if ( is_idle_domain(d) )".
Jan
© 2016 - 2025 Red Hat, Inc.