It is only in the error case that we want to clean up the new pool's
scheduler data; in the success case it's rather the old scheduler's
data which needs cleaning up.

Reported-by: René Winther Højgaard <renewin@proton.me>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -810,7 +810,7 @@ int sched_move_domain(struct domain *d,
     for ( unit = old_units; unit; )
     {
         if ( unit->priv )
-            sched_free_udata(c->sched, unit->priv);
+            sched_free_udata(ret ? c->sched : old_ops, unit->priv);
         old_unit = unit;
         unit = unit->next_in_list;
         xfree(old_unit);

On Mon, Dec 4, 2023 at 10:57 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> It is only in the error case that we want to clean up the new pool's
> scheduler data; in the success case it's rather the old scheduler's
> data which needs cleaning up.
>
> Reported-by: René Winther Højgaard <renewin@proton.me>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Juergen Gross <jgross@suse.com>
>
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -810,7 +810,7 @@ int sched_move_domain(struct domain *d,
>      for ( unit = old_units; unit; )
>      {
>          if ( unit->priv )
> -            sched_free_udata(c->sched, unit->priv);
> +            sched_free_udata(ret ? c->sched : old_ops, unit->priv);
>          old_unit = unit;
>          unit = unit->next_in_list;
>          xfree(old_unit);

This code is unfortunately written in a "clever" way which seems to
have introduced some confusion.  The one place which calls "goto
out_free" goes through and replaces *most* of the "old_*" variables
with the "new" equivalents.  That's why we're iterating over
`old_units` even on the failure path.

The result is that this change doesn't catch another bug on the
following line, in the error case:

    sched_free_domdata(old_ops, old_domdata);

At this point, old_ops is still the old ops, but old_domdata is the
*new* domdata.

A patch like the following (compile tested only) would fix it along
the lines of the original intent:
8<-------
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index eba0cea4bb..78f21839d3 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -720,6 +720,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     {
         old_units = new_units;
         old_domdata = domdata;
+        old_ops = c->sched;
         ret = -ENOMEM;
         goto out_free;
     }
@@ -809,10 +810,15 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     domain_unpause(d);
 
 out_free:
+    /*
+     * NB if we've jumped here, "old_units", "old_ops", and so on will
+     * actually be pointing to the new ops, since when aborting it's
+     * the new ops we want to free.
+     */
     for ( unit = old_units; unit; )
     {
         if ( unit->priv )
-            sched_free_udata(c->sched, unit->priv);
+            sched_free_udata(old_ops, unit->priv);
         old_unit = unit;
         unit = unit->next_in_list;
         xfree(old_unit);
---->8

But given that this kind of cleverness has already fooled two of our
most senior developers, I'd suggest making the whole thing more
explicit; something like the attached (again compile-tested only)?

 -George

From ab7ecae921d0fa21bd2561990c91aeda79275cef Mon Sep 17 00:00:00 2001
From: George Dunlap <george.dunlap@cloud.com>
Date: Mon, 4 Dec 2023 12:25:03 +0000
Subject: [PATCH] sched: clarify and correct sched_move_domain()'s cleanup path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

While moving a domain, sched_move_domain first makes per-scheduler
allocations for the scheduler of the new cpupool, then deletes the
old ones.

In the event of an error, the *new* allocations must be freed via the
new scheduler.  In order to avoid code duplication, the error path
commandeered the old_units and old_domdata variables to point them at
the new units and domdata, so that the code which on the non-error
path freed the old units would now free the new units.

This code, however, had two bugs in it.  First, when looping over the
units, it always frees using "c->sched" (the new ops); meaning on the
success path we're freeing the old unit data using the new scheduler.
Secondly, when freeing the domdata, it always frees using old_ops,
meaning on the failure path we're freeing the new domdata using the
old scheduler.

We could follow suit with the existing code, and on the error path set
old_ops to c->sched, and always use "old_ops" in the freeing code.
This would solve both problems.

However, given the number of mistakes already made in this code, it
seems like a better option is to make a new set of explicitly-named
"free_*" variables to use for this purpose.

While here, rename "domdata" to "new_domdata", in line with
"new_units", for clarity.

Reported-by: René Winther Højgaard <renewin@proton.me>
Initial-fix-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: George Dunlap <george.dunlap@cloud.com>
---
 xen/common/sched/core.c | 34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index eba0cea4bb..b4d0785903 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -678,12 +678,11 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
 {
     struct vcpu *v;
     struct sched_unit *unit, *old_unit;
-    struct sched_unit *new_units = NULL, *old_units;
+    struct sched_unit *new_units = NULL, *old_units, *free_units;
     struct sched_unit **unit_ptr = &new_units;
     unsigned int new_p, unit_idx;
-    void *domdata;
-    struct scheduler *old_ops = dom_scheduler(d);
-    void *old_domdata;
+    struct scheduler *old_ops = dom_scheduler(d), *free_ops;
+    void *new_domdata, *old_domdata, *free_domdata;
     unsigned int gran = cpupool_get_granularity(c);
     unsigned int n_units = d->vcpu[0] ? DIV_ROUND_UP(d->max_vcpus, gran) : 0;
     int ret = 0;
@@ -696,10 +695,10 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
 
     rcu_read_lock(&sched_res_rculock);
 
-    domdata = sched_alloc_domdata(c->sched, d);
-    if ( IS_ERR(domdata) )
+    new_domdata = sched_alloc_domdata(c->sched, d);
+    if ( IS_ERR(new_domdata) )
     {
-        ret = PTR_ERR(domdata);
+        ret = PTR_ERR(new_domdata);
         goto out;
     }
 
@@ -712,14 +711,16 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
         unit->domain = d;
         unit->unit_id = unit_idx * gran;
         unit->vcpu_list = d->vcpu[unit->unit_id];
-        unit->priv = sched_alloc_udata(c->sched, unit, domdata);
+        unit->priv = sched_alloc_udata(c->sched, unit, new_domdata);
         *unit_ptr = unit;
     }
 
     if ( !unit || !unit->priv )
     {
-        old_units = new_units;
-        old_domdata = domdata;
+        /* Failure path frees the new units/domdata from the new ops */
+        free_units = new_units;
+        free_domdata = new_domdata;
+        free_ops = c->sched;
         ret = -ENOMEM;
         goto out_free;
     }
@@ -751,7 +752,7 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
     old_units = d->sched_unit_list;
 
     d->cpupool = c;
-    d->sched_priv = domdata;
+    d->sched_priv = new_domdata;
 
     unit = new_units;
     for_each_vcpu ( d, v )
@@ -808,17 +809,22 @@ int sched_move_domain(struct domain *d, struct cpupool *c)
 
     domain_unpause(d);
 
+    /* Success path frees the old units/domdata from the old ops */
+    free_ops = old_ops;
+    free_units = old_units;
+    free_domdata = old_domdata;
+
 out_free:
-    for ( unit = old_units; unit; )
+    for ( unit = free_units; unit; )
     {
         if ( unit->priv )
-            sched_free_udata(c->sched, unit->priv);
+            sched_free_udata(free_ops, unit->priv);
         old_unit = unit;
         unit = unit->next_in_list;
         xfree(old_unit);
     }
 
-    sched_free_domdata(old_ops, old_domdata);
+    sched_free_domdata(free_ops, free_domdata);
 
 out:
     rcu_read_unlock(&sched_res_rculock);
-- 
2.25.1
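
Illustration (not part of the posted patches): the idea behind the explicit "free_*" variables can be tried out in a small stand-alone C sketch. Everything named toy_* below is a hypothetical stand-in rather than Xen's real types or functions; each toy object simply remembers which scheduler allocated it, so that a release through the wrong scheduler can be counted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scheduler; it only keeps statistics. */
struct toy_sched {
    unsigned int frees;      /* objects released through this scheduler */
    unsigned int mismatches; /* of those, objects it did not allocate */
};

/* Hypothetical per-domain data that remembers its allocating scheduler. */
struct toy_data {
    const struct toy_sched *owner;
};

static struct toy_data *toy_alloc(const struct toy_sched *ops)
{
    struct toy_data *p = malloc(sizeof(*p));

    if ( p )
        p->owner = ops;
    return p;
}

static void toy_free(struct toy_sched *ops, struct toy_data *p)
{
    if ( !p )
        return;
    if ( p->owner != ops )
        ops->mismatches++;   /* the class of bug discussed in this thread */
    ops->frees++;
    free(p);
}

/*
 * Move *cur from old_ops to new_ops.  "fail" simulates an allocation
 * failure after the new data has already been set up.
 */
static int toy_move(struct toy_sched *old_ops, struct toy_sched *new_ops,
                    struct toy_data **cur, bool fail)
{
    struct toy_sched *free_ops;
    struct toy_data *free_data;
    struct toy_data *new_data = toy_alloc(new_ops);
    int ret = 0;

    if ( !new_data )
        return -ENOMEM;

    if ( fail )
    {
        /* Failure path: release the new data through the new ops. */
        free_ops = new_ops;
        free_data = new_data;
        ret = -ENOMEM;
        goto out_free;
    }

    /* Success path: release the old data through the old ops. */
    free_ops = old_ops;
    free_data = *cur;
    *cur = new_data;

 out_free:
    toy_free(free_ops, free_data);
    return ret;
}
```

Because free_ops and free_data are always assigned as a pair right before the shared cleanup code, neither path can combine one scheduler's data with the other scheduler's ops, which is what both original bugs amounted to.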

On 04.12.23 14:00, George Dunlap wrote:
> But given that this kind of cleverness has already fooled two of our
> most senior developers, I'd suggest making the whole thing more
> explicit; something like the attached (again compile-tested only)?

And I have again a third approach, making it crystal clear what is happening
with which data. No need to explain what is freed via which variables. See
attached patch (this time it should be really there).

Thoughts?

Juergen

On 04.12.2023 15:10, Juergen Gross wrote:
> And I have again a third approach, making it crystal clear what is happening
> with which data. No need to explain what is freed via which variables. See
> attached patch (this time it should be really there).

Looks more neat to me than George's. Just one minor thing: Please can the
first parameter of sched_move_domain_cleanup() be constified?

Jan

On 04.12.23 15:38, Jan Beulich wrote:
> On 04.12.2023 15:10, Juergen Gross wrote:
>> And I have again a third approach, making it crystal clear what is happening
>> with which data. No need to explain what is freed via which variables. See
>> attached patch (this time it should be really there).
>
> Looks more neat to me than George's. Just one minor thing: Please can the
> first parameter of sched_move_domain_cleanup() be constified?

Yes, will do that.

I'll send out V2 soon together with the other fix (this probably wants an
update of the commit message) and a small cleanup patch I have.

Juergen
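
Illustration (not part of the thread): the attached patch is not reproduced in this archive, so the following is only a guess at the general shape of a cleanup helper with a const first parameter. The toy_* names, the hook layout and the signature are all assumptions made for this sketch; it is not the posted sched_move_domain_cleanup().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scheduling unit in a singly linked list. */
struct toy_unit {
    struct toy_unit *next_in_list;
    void *priv;                      /* per-unit scheduler data */
};

/* Hypothetical stand-in for a scheduler's free hooks. */
struct toy_sched {
    void (*free_udata)(void *priv);
    void (*free_domdata)(void *domdata);
};

static unsigned int toy_frees;       /* non-NULL objects released so far */

/* One release hook, used for both kinds of data in this sketch. */
static void toy_release(void *p)
{
    if ( p )
        toy_frees++;
    free(p);
}

/*
 * Release a list of units and the domain data through ONE scheduler.
 * The caller states explicitly which (ops, units, domdata) triple it
 * means; "ops" is const because the helper only reads the hooks.
 */
static void toy_move_cleanup(const struct toy_sched *ops,
                             struct toy_unit *units, void *domdata)
{
    struct toy_unit *unit, *old_unit;

    for ( unit = units; unit; )
    {
        if ( unit->priv )
            ops->free_udata(unit->priv);
        old_unit = unit;
        unit = unit->next_in_list;
        free(old_unit);
    }

    ops->free_domdata(domdata);
}

/* Build a list of n units, each with a small private allocation. */
static struct toy_unit *toy_make_units(unsigned int n)
{
    struct toy_unit *head = NULL;

    while ( n-- )
    {
        struct toy_unit *unit = calloc(1, sizeof(*unit));

        if ( !unit )
            break;
        unit->priv = malloc(1);
        unit->next_in_list = head;
        head = unit;
    }

    return head;
}
```

With a helper of this kind, the error path would pass the new scheduler together with the new units and new domdata, and the success path the old scheduler with the old ones, so no variable has to change meaning between the two paths.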

On Mon, Dec 4, 2023 at 2:10 PM Juergen Gross <jgross@suse.com> wrote:
>
> On 04.12.23 14:00, George Dunlap wrote:
> > But given that this kind of cleverness has already fooled two of our
> > most senior developers, I'd suggest making the whole thing more
> > explicit; something like the attached (again compile-tested only)?
>
> And I have again a third approach, making it crystal clear what is happening
> with which data. No need to explain what is freed via which variables. See
> attached patch (this time it should be really there).

Yes, I thought about making a function as well -- that works for me too.

Personally I prefer to keep the "goto out", rather than duplicating
the rcu_read_unlock().  I'd yield if Jan said he preferred
duplication, however.

 -George

On 04.12.2023 15:18, George Dunlap wrote:
> On Mon, Dec 4, 2023 at 2:10 PM Juergen Gross <jgross@suse.com> wrote:
>> And I have again a third approach, making it crystal clear what is happening
>> with which data. No need to explain what is freed via which variables. See
>> attached patch (this time it should be really there).
>
> Yes, I thought about making a function as well -- that works for me too.
>
> Personally I prefer to keep the "goto out", rather than duplicating
> the rcu_read_unlock().  I'd yield if Jan said he preferred
> duplication, however.

I'm on the edge there actually.

Jan

On 04.12.23 15:39, Jan Beulich wrote:
> On 04.12.2023 15:18, George Dunlap wrote:
>> Yes, I thought about making a function as well -- that works for me too.
>>
>> Personally I prefer to keep the "goto out", rather than duplicating
>> the rcu_read_unlock().  I'd yield if Jan said he preferred
>> duplication, however.
>
> I'm on the edge there actually.

In this case I'd prefer it my way, as it avoids having to scroll down to the
out: label to see what is happening there. Additionally it allows getting rid
of the ret variable.

In the end the main part of the patch is the new function, so I don't feel
strongly about dropping the "goto out".

Juergen

On Mon, Dec 4, 2023 at 2:44 PM Juergen Gross <jgross@suse.com> wrote:
> >> Personally I prefer to keep the "goto out", rather than duplicating
> >> the rcu_read_unlock().  I'd yield if Jan said he preferred
> >> duplication, however.
> >
> > I'm on the edge there actually.
>
> In this case I'd prefer it my way, as it avoids having to scroll down to the
> out: label to see what is happening there. Additionally it enables to get rid
> of the ret variable.

The issue is, suppose we change something else, like needing to grab
(and release) another lock?  Sharing the exit path makes it easier to
avoid those kinds of mistakes.

 -George

On 04.12.23 20:33, George Dunlap wrote:
> On Mon, Dec 4, 2023 at 2:44 PM Juergen Gross <jgross@suse.com> wrote:
>>>> Personally I prefer to keep the "goto out", rather than duplicating
>>>> the rcu_read_unlock().  I'd yield if Jan said he preferred
>>>> duplication, however.
>>>
>>> I'm on the edge there actually.
>>
>> In this case I'd prefer it my way, as it avoids having to scroll down to the
>> out: label to see what is happening there. Additionally it enables to get rid
>> of the ret variable.
>
> The issue is, suppose we change something else, like needing to grab
> (and release) another lock?  Sharing the exit path makes it easier to
> avoid those kinds of mistakes.

Yes, this could happen.

OTOH it could happen that an action is added on the exit path which should
_not_ be executed in the error case.

I agree that in case of more than one exit action needed the goto approach
is superior. For zero exit actions it is inferior, while for one it will
depend on the specific case IMHO.

Juergen
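
Illustration (not part of the thread): the two exit styles being weighed here, side by side, in a minimal stand-alone C sketch. The toy_lock()/toy_unlock() pair is a hypothetical stand-in for rcu_read_lock()/rcu_read_unlock(); it only counts nesting so that a missed unlock would be visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lock stand-in: counts how many holders are outstanding. */
static int toy_lock_depth;

static void toy_lock(void)   { toy_lock_depth++; }
static void toy_unlock(void) { toy_lock_depth--; }

/* Variant A: every path funnels through one shared exit label. */
static int toy_op_goto(bool fail_early, bool fail_late)
{
    int ret = 0;

    toy_lock();

    if ( fail_early )
    {
        ret = -ENOMEM;
        goto out;
    }

    if ( fail_late )
    {
        ret = -EINVAL;
        goto out;
    }

 out:
    /* A second exit action (e.g. another unlock) is added once, here. */
    toy_unlock();
    return ret;
}

/* Variant B: each exit releases the lock itself; no "ret" variable. */
static int toy_op_return(bool fail_early, bool fail_late)
{
    toy_lock();

    if ( fail_early )
    {
        toy_unlock();        /* must be repeated on every exit */
        return -ENOMEM;
    }

    if ( fail_late )
    {
        toy_unlock();
        return -EINVAL;
    }

    toy_unlock();
    return 0;
}
```

Both variants behave identically with a single exit action. The difference only shows up when the set of exit actions changes: variant A needs one edit at the label, while variant B needs one edit per return statement but keeps each path readable without scrolling to the end of the function.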

On 04.12.23 14:00, George Dunlap wrote:
> But given that this kind of cleverness has already fooled two of our
> most senior developers, I'd suggest making the whole thing more
> explicit; something like the attached (again compile-tested only)?

And I have again a third approach, making it crystal clear what is happening
with which data. No need to explain what is freed via which variables. See
attached patch.

Thoughts?

Juergen

On Mon, Dec 4, 2023 at 1:47 PM Juergen Gross <jgross@suse.com> wrote:
>
> On 04.12.23 14:00, George Dunlap wrote:
> > But given that this kind of cleverness has already fooled two of our
> > most senior developers, I'd suggest making the whole thing more
> > explicit; something like the attached (again compile-tested only)?
>
> And I have again a third approach, making it crystal clear what is happening
> with which data. No need to explain what is freed via which variables. See
> attached patch.
>
> Thoughts?

I only see a PGP key and signature.  Did you forget to attach the patch?

 -George