In function virt_cpu_plug(), it will send cpu plug message to interrupt
controller extioi and ipi irqchip. If there is problem in this function,
system should continue to run and keep state the same before cpu is
added.
Object cpuslot::cpu is set at last only when there is no any error.
If there is, send cpu unplug message to extioi and ipi irqchip, and then
return immediately.
Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface)
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
hw/loongarch/virt.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index a5840ff968..5118f01e4b 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -981,8 +981,6 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
LoongArchVirtMachineState *lvms = LOONGARCH_VIRT_MACHINE(hotplug_dev);
Error *err = NULL;
- cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id);
- cpu_slot->cpu = CPU(dev);
if (lvms->ipi) {
hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &err);
if (err) {
@@ -995,6 +993,10 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
if (err) {
error_propagate(errp, err);
+ if (lvms->ipi) {
+ /* Send unplug message to restore, discard error here */
+ hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev, NULL);
+ }
return;
}
}
@@ -1003,9 +1005,20 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev,
hotplug_handler_plug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
if (err) {
error_propagate(errp, err);
+ if (lvms->ipi) {
+ hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev, NULL);
+ }
+
+ if (lvms->extioi) {
+ hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi),
+ dev, NULL);
+ }
+ return;
}
}
+ cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id);
+ cpu_slot->cpu = CPU(dev);
return;
}
--
2.39.3
Bibo Mao <maobibo@loongson.cn> writes: > In function virt_cpu_plug(), it will send cpu plug message to interrupt > controller extioi and ipi irqchip. If there is problem in this function, > system should continue to run and keep state the same before cpu is > added. > > Object cpuslot::cpu is set at last only when there is no any error. > If there is, send cpu unplug message to extioi and ipi irqchip, and then > return immediately. > > Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface) > Signed-off-by: Bibo Mao <maobibo@loongson.cn> > --- > hw/loongarch/virt.c | 17 +++++++++++++++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > > diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c > index a5840ff968..5118f01e4b 100644 > --- a/hw/loongarch/virt.c > +++ b/hw/loongarch/virt.c > @@ -981,8 +981,6 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, > LoongArchVirtMachineState *lvms = LOONGARCH_VIRT_MACHINE(hotplug_dev); > Error *err = NULL; > > - cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id); > - cpu_slot->cpu = CPU(dev); > if (lvms->ipi) { > hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &err); > if (err) { > @@ -995,6 +993,10 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, > hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); > if (err) { > error_propagate(errp, err); > + if (lvms->ipi) { > + /* Send unplug message to restore, discard error here */ > + hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev, NULL); > + } > return; > } > } > @@ -1003,9 +1005,20 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, > hotplug_handler_plug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); > if (err) { > error_propagate(errp, err); > + if (lvms->ipi) { > + hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev, NULL); > + } > + > + if (lvms->extioi) { > + hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), > + dev, NULL); > + } > + return; > } > } > > + cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id); > + cpu_slot->cpu = CPU(dev); > return; > } Hmm. You're right about the problem: virt_cpu_plug() neglects to revert changes when it fails. You're probably right to move the assignment to cpu_slot->cpu to the end. Anything you can delay until success is assured you don't have to revert. I say "probably" because the code that now runs before the assignment might theoretically "see" the assignment, and I didn't examine it to exclude that. Where I have doubts is the code to revert changes. The hotplug_handler_plug() error checkign suggests it can fail. Can hotplug_handler_unplug() fail, too? The error checking in virt_cpu_unplug() right above suggests it can. What happens if it fails in virt_cpu_plug()?
On 2025/3/20 下午2:16, Markus Armbruster wrote: > Bibo Mao <maobibo@loongson.cn> writes: > >> In function virt_cpu_plug(), it will send cpu plug message to interrupt >> controller extioi and ipi irqchip. If there is problem in this function, >> system should continue to run and keep state the same before cpu is >> added. >> >> Object cpuslot::cpu is set at last only when there is no any error. >> If there is, send cpu unplug message to extioi and ipi irqchip, and then >> return immediately. >> >> Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface) >> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >> --- >> hw/loongarch/virt.c | 17 +++++++++++++++-- >> 1 file changed, 15 insertions(+), 2 deletions(-) >> >> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >> index a5840ff968..5118f01e4b 100644 >> --- a/hw/loongarch/virt.c >> +++ b/hw/loongarch/virt.c >> @@ -981,8 +981,6 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, >> LoongArchVirtMachineState *lvms = LOONGARCH_VIRT_MACHINE(hotplug_dev); >> Error *err = NULL; >> >> - cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id); >> - cpu_slot->cpu = CPU(dev); >> if (lvms->ipi) { >> hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &err); >> if (err) { >> @@ -995,6 +993,10 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, >> hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); >> if (err) { >> error_propagate(errp, err); >> + if (lvms->ipi) { >> + /* Send unplug message to restore, discard error here */ >> + hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev, NULL); >> + } >> return; >> } >> } >> @@ -1003,9 +1005,20 @@ static void virt_cpu_plug(HotplugHandler *hotplug_dev, >> hotplug_handler_plug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); >> if (err) { >> error_propagate(errp, err); >> + if (lvms->ipi) { >> + hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->ipi), dev, NULL); >> + } >> + >> + if (lvms->extioi) { >> + hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), >> + dev, NULL); >> + } >> + return; >> } >> } >> >> + cpu_slot = virt_find_cpu_slot(MACHINE(lvms), cpu->phy_id); >> + cpu_slot->cpu = CPU(dev); >> return; >> } > > Hmm. > > You're right about the problem: virt_cpu_plug() neglects to revert > changes when it fails. > > You're probably right to move the assignment to cpu_slot->cpu to the > end. Anything you can delay until success is assured you don't have to > revert. I say "probably" because the code that now runs before the > assignment might theoretically "see" the assignment, and I didn't > examine it to exclude that. > > Where I have doubts is the code to revert changes. > > The hotplug_handler_plug() error checkign suggests it can fail. > > Can hotplug_handler_unplug() fail, too? The error checking in > virt_cpu_unplug() right above suggests it can. Basically from existing code about ipi/extioi hotplug handler, it is impossible to there is error, here is only for future use. If there is error in function virt_cpu_plug(), undo() such as hotplug_handler_unplug() should be called. However if undo() reports error, I do not know how to handle it, here just discard error in function undo(). Regards Bibo Mao > > What happens if it fails in virt_cpu_plug()? >
bibo mao <maobibo@loongson.cn> writes: On 2025/3/20 下午2:16, Markus Armbruster wrote: >> Bibo Mao <maobibo@loongson.cn> writes: >> >>> In function virt_cpu_plug(), it will send cpu plug message to interrupt >>> controller extioi and ipi irqchip. If there is problem in this function, >>> system should continue to run and keep state the same before cpu is >>> added. >>> >>> Object cpuslot::cpu is set at last only when there is no any error. >>> If there is, send cpu unplug message to extioi and ipi irqchip, and then >>> return immediately. >>> >>> Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface) >>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> [...] >> Hmm. >> >> You're right about the problem: virt_cpu_plug() neglects to revert >> changes when it fails. >> >> You're probably right to move the assignment to cpu_slot->cpu to the >> end. Anything you can delay until success is assured you don't have to >> revert. I say "probably" because the code that now runs before the >> assignment might theoretically "see" the assignment, and I didn't >> examine it to exclude that. >> >> Where I have doubts is the code to revert changes. >> >> The hotplug_handler_plug() error checkign suggests it can fail. >> >> Can hotplug_handler_unplug() fail, too? The error checking in >> virt_cpu_unplug() right above suggests it can. > > Basically from existing code about ipi/extioi hotplug handler, it is > impossible to there is error, here is only for future use. Aha. More at the end of my reply. > If there is error in function virt_cpu_plug(), undo() such as > hotplug_handler_unplug() should be called. However if undo() reports > error, I do not know how to handle it, here just discard error in > function undo(). Steinbach's Guideline for Systems Programming: Never test for an error condition you don't know how to handle. This old quip is a funny way to say that errors we don't know how to handle are *bad*, and should be avoided. > Regards > Bibo Mao >> >> What happens if it fails in virt_cpu_plug()? You assure us this can't happen today. Because of that, broken error recovery is not an actual problem. However, if things change some day so it can happen, broken error recovery becomes an actual problem. so, broken error recovery just "for future use" is actually just for silent future breakage. But is it broken? This is what I'm trying to find out with my "what happens if" question. If it is broken, then passing &error_abort would likely be less bad: crash instead of silent breakage. Also makes it completely obvious in the code that these errors are not handled, whereas broken error handling looks like it is until you actually think about it.
On 2025/3/20 下午3:25, Markus Armbruster wrote: > bibo mao <maobibo@loongson.cn> writes: > > On 2025/3/20 下午2:16, Markus Armbruster wrote: >>> Bibo Mao <maobibo@loongson.cn> writes: >>> >>>> In function virt_cpu_plug(), it will send cpu plug message to interrupt >>>> controller extioi and ipi irqchip. If there is problem in this function, >>>> system should continue to run and keep state the same before cpu is >>>> added. >>>> >>>> Object cpuslot::cpu is set at last only when there is no any error. >>>> If there is, send cpu unplug message to extioi and ipi irqchip, and then >>>> return immediately. >>>> >>>> Fixes: ab9935d2991e (hw/loongarch/virt: Implement cpu plug interface) >>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> > > [...] > >>> Hmm. >>> >>> You're right about the problem: virt_cpu_plug() neglects to revert >>> changes when it fails. >>> >>> You're probably right to move the assignment to cpu_slot->cpu to the >>> end. Anything you can delay until success is assured you don't have to >>> revert. I say "probably" because the code that now runs before the >>> assignment might theoretically "see" the assignment, and I didn't >>> examine it to exclude that. >>> >>> Where I have doubts is the code to revert changes. >>> >>> The hotplug_handler_plug() error checkign suggests it can fail. >>> >>> Can hotplug_handler_unplug() fail, too? The error checking in >>> virt_cpu_unplug() right above suggests it can. >> >> Basically from existing code about ipi/extioi hotplug handler, it is >> impossible to there is error, here is only for future use. > > Aha. More at the end of my reply. > >> If there is error in function virt_cpu_plug(), undo() such as >> hotplug_handler_unplug() should be called. However if undo() reports >> error, I do not know how to handle it, here just discard error in >> function undo(). > > Steinbach's Guideline for Systems Programming: Never test for an error > condition you don't know how to handle. > > This old quip is a funny way to say that errors we don't know how to > handle are *bad*, and should be avoided. > >> Regards >> Bibo Mao >>> >>> What happens if it fails in virt_cpu_plug()? > > You assure us this can't happen today. Because of that, broken error > recovery is not an actual problem. > > However, if things change some day so it can happen, broken error > recovery becomes an actual problem. > > so, broken error recovery just "for future use" is actually just for > silent future breakage. > > But is it broken? This is what I'm trying to find out with my "what > happens if" question. > > If it is broken, then passing &error_abort would likely be less bad: > crash instead of silent breakage. Also makes it completely obvious in > the code that these errors are not handled, whereas broken error > handling looks like it is until you actually think about it. > yes, it seems that &error_abort is better than NULL, it is better than slient and do nothing. If error really happens, we had to solve it then. And I will refresh the patch in next version. Regards Bibo Mao
© 2016 - 2025 Red Hat, Inc.