In function virt_cpu_unplug(), it will send cpu unplug message to
interrupt controller extioi and ipi irqchip. If there is problem in
this function, system should continue to run and keep state the same
before cpu is removed.
If error happends in cpu unplug stage, send cpu plug message to extioi
and ipi irqchip to restore to previous stage, and then return immediately.
Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface)
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
hw/loongarch/virt.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 8563967c8b..503362a69e 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err);
if (err) {
error_propagate(errp, err);
+ hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
+ &error_abort);
return;
}
@@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev,
hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err);
if (err) {
error_propagate(errp, err);
+ hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev,
+ &error_abort);
+ hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev,
+ &error_abort);
return;
}
--
2.39.3
Bibo Mao <maobibo@loongson.cn> writes: > In function virt_cpu_unplug(), it will send cpu unplug message to > interrupt controller extioi and ipi irqchip. If there is problem in > this function, system should continue to run and keep state the same > before cpu is removed. > > If error happends in cpu unplug stage, send cpu plug message to extioi > and ipi irqchip to restore to previous stage, and then return immediately. > > Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) > Signed-off-by: Bibo Mao <maobibo@loongson.cn> > --- > hw/loongarch/virt.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c > index 8563967c8b..503362a69e 100644 > --- a/hw/loongarch/virt.c > +++ b/hw/loongarch/virt.c > @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, > hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); > if (err) { > error_propagate(errp, err); > + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, > + &error_abort); > return; > } > > @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, > hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); > if (err) { > error_propagate(errp, err); > + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, > + &error_abort); > + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, > + &error_abort); > return; > } virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. This must not fail. virt_cpu_plug() does it the other way round (see previous patch). So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we check for it to fail in virt_cpu_plug(). Can it really fail in virt_cpu_plug()? If yes, why can't it fail in virt_cpu_unplug()? Same questions for hotplug_handler_unplug().
On 2025/3/21 下午2:47, Markus Armbruster wrote: > Bibo Mao <maobibo@loongson.cn> writes: > >> In function virt_cpu_unplug(), it will send cpu unplug message to >> interrupt controller extioi and ipi irqchip. If there is problem in >> this function, system should continue to run and keep state the same >> before cpu is removed. >> >> If error happends in cpu unplug stage, send cpu plug message to extioi >> and ipi irqchip to restore to previous stage, and then return immediately. >> >> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) >> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >> --- >> hw/loongarch/virt.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >> index 8563967c8b..503362a69e 100644 >> --- a/hw/loongarch/virt.c >> +++ b/hw/loongarch/virt.c >> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); >> if (err) { >> error_propagate(errp, err); >> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >> + &error_abort); >> return; >> } >> >> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); >> if (err) { >> error_propagate(errp, err); >> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >> + &error_abort); >> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, >> + &error_abort); >> return; >> } > > virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify > ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() > calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. > This must not fail. > > virt_cpu_plug() does it the other way round (see previous patch). > > So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we > check for it to fail in virt_cpu_plug(). > > Can it really fail in virt_cpu_plug()? > > If yes, why can't it fail in virt_cpu_unplug()? I do not know what is you meaning. In last email I said it was impossible. un-notify is for future use. And you reply such as: *You assure us this can't happen today. Because of that, broken error recovery is not an actual problem. However, if things change some day so it can happen, broken error recovery becomes an actual problem. so, broken error recovery just "for future use" is actually just for silent future breakage. But is it broken? This is what I'm trying to find out with my "what happens if" question. If it is broken, then passing &error_abort would likely be less bad: crash instead of silent breakage. Also makes it completely obvious in the code that these errors are not handled, whereas broken error handling looks like it is until you actually think about it.* Sorry for my bad English, so what is your option about here? Regards Bibo Mao > > Same questions for hotplug_handler_unplug(). >
+Igor On 2025/3/21 下午2:47, Markus Armbruster wrote: > Bibo Mao <maobibo@loongson.cn> writes: > >> In function virt_cpu_unplug(), it will send cpu unplug message to >> interrupt controller extioi and ipi irqchip. If there is problem in >> this function, system should continue to run and keep state the same >> before cpu is removed. >> >> If error happends in cpu unplug stage, send cpu plug message to extioi >> and ipi irqchip to restore to previous stage, and then return immediately. >> >> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) >> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >> --- >> hw/loongarch/virt.c | 6 ++++++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >> index 8563967c8b..503362a69e 100644 >> --- a/hw/loongarch/virt.c >> +++ b/hw/loongarch/virt.c >> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); >> if (err) { >> error_propagate(errp, err); >> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >> + &error_abort); >> return; >> } >> >> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); >> if (err) { >> error_propagate(errp, err); >> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >> + &error_abort); >> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, >> + &error_abort); >> return; >> } > > virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify > ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() > calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. > This must not fail. > > virt_cpu_plug() does it the other way round (see previous patch). > > So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we > check for it to fail in virt_cpu_plug(). > > Can it really fail in virt_cpu_plug()? > > If yes, why can't it fail in virt_cpu_unplug()? you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that is cpuplug callback for acpi_ged and ipi. it will not fail. If *virt_cpu_pre_plug()* pass, it will succeed. Regards Bibo Mao > > Same questions for hotplug_handler_unplug(). >
bibo mao <maobibo@loongson.cn> writes: > +Igor > > > On 2025/3/21 下午2:47, Markus Armbruster wrote: >> Bibo Mao <maobibo@loongson.cn> writes: >> >>> In function virt_cpu_unplug(), it will send cpu unplug message to >>> interrupt controller extioi and ipi irqchip. If there is problem in >>> this function, system should continue to run and keep state the same >>> before cpu is removed. >>> >>> If error happends in cpu unplug stage, send cpu plug message to extioi >>> and ipi irqchip to restore to previous stage, and then return immediately. >>> >>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) >>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>> --- >>> hw/loongarch/virt.c | 6 ++++++ >>> 1 file changed, 6 insertions(+) >>> >>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >>> index 8563967c8b..503362a69e 100644 >>> --- a/hw/loongarch/virt.c >>> +++ b/hw/loongarch/virt.c >>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); >>> if (err) { >>> error_propagate(errp, err); >>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>> + &error_abort); >>> return; >>> } >>> >>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); >>> if (err) { >>> error_propagate(errp, err); >>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>> + &error_abort); >>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, >>> + &error_abort); >>> return; >>> } >> >> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify >> ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() >> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. >> This must not fail. >> >> virt_cpu_plug() does it the other way round (see previous patch). >> >> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we >> check for it to fail in virt_cpu_plug(). >> >> Can it really fail in virt_cpu_plug()? >> >> If yes, why can't it fail in virt_cpu_unplug()? > you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that > is cpuplug callback for acpi_ged and ipi. it will not fail. > > If *virt_cpu_pre_plug()* pass, it will succeed. > > Regards > Bibo Mao > >> >> Same questions for hotplug_handler_unplug(). Let me restate my argument. We call hotplug_handler_plug() on the happy path, and on error recovery paths. Four cases: 1. Can fail on the happy path Error recovery is required. 1.1 Can fail on the error recovery path Error recovery is required, but broken. 1.2 Can't fail on the error recovery path Error recovery is required and works, but why it works is not obvious. Deserves a comment explaining why hotplug_handler_plug() can't fail here even though it can fail on the happy path next door. 2. Can't fail on the happy path Error recovery is unreachable. 2.1 Can fail on the error recovery path Error recovery is unreachable and broken. Possibly a time bomb, and possibly misleading readers. 2.2 Can't fail on the error recovery path Error recovery is unreachable and would work, but why it would work is again a not obvious. Which of the four cases is it?
On 2025/3/21 下午3:21, Markus Armbruster wrote: > bibo mao <maobibo@loongson.cn> writes: > >> +Igor >> >> >> On 2025/3/21 下午2:47, Markus Armbruster wrote: >>> Bibo Mao <maobibo@loongson.cn> writes: >>> >>>> In function virt_cpu_unplug(), it will send cpu unplug message to >>>> interrupt controller extioi and ipi irqchip. If there is problem in >>>> this function, system should continue to run and keep state the same >>>> before cpu is removed. >>>> >>>> If error happends in cpu unplug stage, send cpu plug message to extioi >>>> and ipi irqchip to restore to previous stage, and then return immediately. >>>> >>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) >>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>> --- >>>> hw/loongarch/virt.c | 6 ++++++ >>>> 1 file changed, 6 insertions(+) >>>> >>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >>>> index 8563967c8b..503362a69e 100644 >>>> --- a/hw/loongarch/virt.c >>>> +++ b/hw/loongarch/virt.c >>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); >>>> if (err) { >>>> error_propagate(errp, err); >>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>>> + &error_abort); >>>> return; >>>> } >>>> >>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); >>>> if (err) { >>>> error_propagate(errp, err); >>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>>> + &error_abort); >>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, >>>> + &error_abort); >>>> return; >>>> } >>> >>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify >>> ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() >>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. >>> This must not fail. >>> >>> virt_cpu_plug() does it the other way round (see previous patch). >>> >>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we >>> check for it to fail in virt_cpu_plug(). >>> >>> Can it really fail in virt_cpu_plug()? >>> >>> If yes, why can't it fail in virt_cpu_unplug()? >> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that >> is cpuplug callback for acpi_ged and ipi. it will not fail. >> >> If *virt_cpu_pre_plug()* pass, it will succeed. >> >> Regards >> Bibo Mao >> >>> >>> Same questions for hotplug_handler_unplug(). > > Let me restate my argument. > > We call hotplug_handler_plug() on the happy path, and on error recovery > paths. Four cases: > > 1. Can fail on the happy path > > Error recovery is required. > > 1.1 Can fail on the error recovery path > > Error recovery is required, but broken. > > 1.2 Can't fail on the error recovery path > > Error recovery is required and works, but why it works is not > obvious. Deserves a comment explaining why hotplug_handler_plug() > can't fail here even though it can fail on the happy path next door. > > 2. Can't fail on the happy path > > Error recovery is unreachable. > > 2.1 Can fail on the error recovery path > > Error recovery is unreachable and broken. Possibly a time bomb, and > possibly misleading readers. > > 2.2 Can't fail on the error recovery path > > Error recovery is unreachable and would work, but why it would work > is again a not obvious. > > Which of the four cases is it? By my understanding, it is "2. Can't fail on the happy path", and Error recovery is unreachable. I have said that it is impossible and recovery is only for future use. do you mean recovery should be removed? And directly &error_abort is used in virt_cpu_plug() such as: static void virt_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { if (lvms->ipi) { hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort); Regards Bibo Mao
bibo mao <maobibo@loongson.cn> writes: > On 2025/3/21 下午3:21, Markus Armbruster wrote: >> bibo mao <maobibo@loongson.cn> writes: >> >>> +Igor >>> >>> >>> On 2025/3/21 下午2:47, Markus Armbruster wrote: >>>> Bibo Mao <maobibo@loongson.cn> writes: >>>> >>>>> In function virt_cpu_unplug(), it will send cpu unplug message to >>>>> interrupt controller extioi and ipi irqchip. If there is problem in >>>>> this function, system should continue to run and keep state the same >>>>> before cpu is removed. >>>>> >>>>> If error happends in cpu unplug stage, send cpu plug message to extioi >>>>> and ipi irqchip to restore to previous stage, and then return immediately. >>>>> >>>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) >>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>>> --- >>>>> hw/loongarch/virt.c | 6 ++++++ >>>>> 1 file changed, 6 insertions(+) >>>>> >>>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >>>>> index 8563967c8b..503362a69e 100644 >>>>> --- a/hw/loongarch/virt.c >>>>> +++ b/hw/loongarch/virt.c >>>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); >>>>> if (err) { >>>>> error_propagate(errp, err); >>>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>>>> + &error_abort); >>>>> return; >>>>> } >>>>> >>>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); >>>>> if (err) { >>>>> error_propagate(errp, err); >>>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>>>> + &error_abort); >>>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, >>>>> + &error_abort); >>>>> return; >>>>> } >>>> >>>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify >>>> ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() >>>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. >>>> This must not fail. >>>> >>>> virt_cpu_plug() does it the other way round (see previous patch). >>>> >>>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we >>>> check for it to fail in virt_cpu_plug(). >>>> >>>> Can it really fail in virt_cpu_plug()? >>>> >>>> If yes, why can't it fail in virt_cpu_unplug()? >>> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that >>> is cpuplug callback for acpi_ged and ipi. it will not fail. >>> >>> If *virt_cpu_pre_plug()* pass, it will succeed. >>> >>> Regards >>> Bibo Mao >>> >>>> >>>> Same questions for hotplug_handler_unplug(). >> >> Let me restate my argument. >> >> We call hotplug_handler_plug() on the happy path, and on error recovery >> paths. Four cases: >> >> 1. Can fail on the happy path >> >> Error recovery is required. >> >> 1.1 Can fail on the error recovery path >> >> Error recovery is required, but broken. >> >> 1.2 Can't fail on the error recovery path >> >> Error recovery is required and works, but why it works is not >> obvious. Deserves a comment explaining why hotplug_handler_plug() >> can't fail here even though it can fail on the happy path next door. >> >> 2. Can't fail on the happy path >> >> Error recovery is unreachable. >> >> 2.1 Can fail on the error recovery path >> >> Error recovery is unreachable and broken. Possibly a time bomb, and >> possibly misleading readers. >> >> 2.2 Can't fail on the error recovery path >> >> Error recovery is unreachable and would work, but why it would work >> is again a not obvious. >> >> Which of the four cases is it? > By my understanding, it is "2. Can't fail on the happy path", and Error > recovery is unreachable. Got it. > I have said that it is impossible and recovery is only for future use. > > do you mean recovery should be removed? And directly &error_abort is > used in virt_cpu_plug() such as: > static void virt_cpu_plug(HotplugHandler *hotplug_dev, > DeviceState *dev, Error **errp) > { > if (lvms->ipi) { > hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort); Yes, I prefer this. Here's why. Error recovery that is unreachable now but might become reachable at some future time is untestable now. Mind, this does not necessarily make it a bad idea by itself. But there's more. Anything that makes this error recovery reachable either breaks it or makes correctness locally unobvious. Why? To make it reachable, plug / unplug must be able to fail on the happy path. But then they either can fail on the error recovery path as well (which breaks error recovery), or they can't fail there for reasons that are not locally obvious. This sets a trap for readers. An attentive reader will see the problem (like I did), but to see why the code is not broken right now will take digging (like we did together). And after such digging, we're left with a queasy feeling about robustness of the code (like we are now). Passing &error_abort on the happy path avoids all this. Instead it clearly tells the reader that this is not expected to fail. If failure becomes possible at some future time, this should crash in testing. If we neglect to test the new failure (and we really shouldn't), we crash on error in production right away instead of risking botched error recovery messing up the program's state. Both are bad outcomes, but which one's less bad I find impossible to predict.
On 2025/3/21 下午4:08, Markus Armbruster wrote: > bibo mao <maobibo@loongson.cn> writes: > >> On 2025/3/21 下午3:21, Markus Armbruster wrote: >>> bibo mao <maobibo@loongson.cn> writes: >>> >>>> +Igor >>>> >>>> >>>> On 2025/3/21 下午2:47, Markus Armbruster wrote: >>>>> Bibo Mao <maobibo@loongson.cn> writes: >>>>> >>>>>> In function virt_cpu_unplug(), it will send cpu unplug message to >>>>>> interrupt controller extioi and ipi irqchip. If there is problem in >>>>>> this function, system should continue to run and keep state the same >>>>>> before cpu is removed. >>>>>> >>>>>> If error happends in cpu unplug stage, send cpu plug message to extioi >>>>>> and ipi irqchip to restore to previous stage, and then return immediately. >>>>>> >>>>>> Fixes: 2cd6857f6f5b (hw/loongarch/virt: Implement cpu unplug interface) >>>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn> >>>>>> --- >>>>>> hw/loongarch/virt.c | 6 ++++++ >>>>>> 1 file changed, 6 insertions(+) >>>>>> >>>>>> diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c >>>>>> index 8563967c8b..503362a69e 100644 >>>>>> --- a/hw/loongarch/virt.c >>>>>> +++ b/hw/loongarch/virt.c >>>>>> @@ -958,6 +958,8 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>>>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->extioi), dev, &err); >>>>>> if (err) { >>>>>> error_propagate(errp, err); >>>>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>>>>> + &error_abort); >>>>>> return; >>>>>> } >>>>>> >>>>>> @@ -965,6 +967,10 @@ static void virt_cpu_unplug(HotplugHandler *hotplug_dev, >>>>>> hotplug_handler_unplug(HOTPLUG_HANDLER(lvms->acpi_ged), dev, &err); >>>>>> if (err) { >>>>>> error_propagate(errp, err); >>>>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, >>>>>> + &error_abort); >>>>>> + hotplug_handler_plug(HOTPLUG_HANDLER(lvms->extioi), dev, >>>>>> + &error_abort); >>>>>> return; >>>>>> } >>>>> >>>>> virt_cpu_unplug() calls hotplug_handler_unplug() three times to notify >>>>> ipi, extioi, and acpi_get. If any notification fails, virt_cpu_unplug() >>>>> calls hotplug_handler_plug() to "un-notify" the preceeding ones, if any. >>>>> This must not fail. >>>>> >>>>> virt_cpu_plug() does it the other way round (see previous patch). >>>>> >>>>> So, hotplug_handler_plug() must not fail in virt_cpu_unplug(), yet we >>>>> check for it to fail in virt_cpu_plug(). >>>>> >>>>> Can it really fail in virt_cpu_plug()? >>>>> >>>>> If yes, why can't it fail in virt_cpu_unplug()? >>>> you can check function acpi_cpu_plug_cb()/loongarch_ipi_cpu_plug(), that >>>> is cpuplug callback for acpi_ged and ipi. it will not fail. >>>> >>>> If *virt_cpu_pre_plug()* pass, it will succeed. >>>> >>>> Regards >>>> Bibo Mao >>>> >>>>> >>>>> Same questions for hotplug_handler_unplug(). >>> >>> Let me restate my argument. >>> >>> We call hotplug_handler_plug() on the happy path, and on error recovery >>> paths. Four cases: >>> >>> 1. Can fail on the happy path >>> >>> Error recovery is required. >>> >>> 1.1 Can fail on the error recovery path >>> >>> Error recovery is required, but broken. >>> >>> 1.2 Can't fail on the error recovery path >>> >>> Error recovery is required and works, but why it works is not >>> obvious. Deserves a comment explaining why hotplug_handler_plug() >>> can't fail here even though it can fail on the happy path next door. >>> >>> 2. Can't fail on the happy path >>> >>> Error recovery is unreachable. >>> >>> 2.1 Can fail on the error recovery path >>> >>> Error recovery is unreachable and broken. Possibly a time bomb, and >>> possibly misleading readers. >>> >>> 2.2 Can't fail on the error recovery path >>> >>> Error recovery is unreachable and would work, but why it would work >>> is again a not obvious. >>> >>> Which of the four cases is it? >> By my understanding, it is "2. Can't fail on the happy path", and Error >> recovery is unreachable. > > Got it. > >> I have said that it is impossible and recovery is only for future use. >> >> do you mean recovery should be removed? And directly &error_abort is >> used in virt_cpu_plug() such as: >> static void virt_cpu_plug(HotplugHandler *hotplug_dev, >> DeviceState *dev, Error **errp) >> { >> if (lvms->ipi) { >> hotplug_handler_plug(HOTPLUG_HANDLER(lvms->ipi), dev, &error_abort); > > Yes, I prefer this. Here's why. This is ok, however I do not think there is obvious advantage. V6 is ok also, there is recovery path only that recovery has problem, system will crash, through it is impossible now. Maybe someone jumps out and say that there is error handling in cpu hotplug, however no such in pc_memory_plug(), why does not LoongArch in such way. If so, there will endless discussion. void x86_cpu_plug(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { CPUArchId *found_cpu; Error *local_err = NULL; X86CPU *cpu = X86_CPU(dev); X86MachineState *x86ms = X86_MACHINE(hotplug_dev); if (x86ms->acpi_dev) { hotplug_handler_plug(x86ms->acpi_dev, dev, &local_err); static void pc_memory_plug(HotplugHandler *hotplug_dev, DeviceState *dev, Error **errp) { PCMachineState *pcms = PC_MACHINE(hotplug_dev); X86MachineState *x86ms = X86_MACHINE(hotplug_dev); MachineState *ms = MACHINE(hotplug_dev); bool is_nvdimm = object_dynamic_cast(OBJECT(dev), TYPE_NVDIMM); pc_dimm_plug(PC_DIMM(dev), MACHINE(pcms)); if (is_nvdimm) { nvdimm_plug(ms->nvdimms_state); } hotplug_handler_plug(x86ms->acpi_dev, dev, &error_abort); } Regards Bibo Mao > > Error recovery that is unreachable now but might become reachable at > some future time is untestable now. Mind, this does not necessarily > make it a bad idea by itself. But there's more. > > Anything that makes this error recovery reachable either breaks it or > makes correctness locally unobvious. Why? To make it reachable, plug / > unplug must be able to fail on the happy path. But then they either can > fail on the error recovery path as well (which breaks error recovery), > or they can't fail there for reasons that are not locally obvious. > > This sets a trap for readers. An attentive reader will see the problem > (like I did), but to see why the code is not broken right now will take > digging (like we did together). And after such digging, we're left with > a queasy feeling about robustness of the code (like we are now). > > Passing &error_abort on the happy path avoids all this. Instead it > clearly tells the reader that this is not expected to fail. > > If failure becomes possible at some future time, this should crash in > testing. If we neglect to test the new failure (and we really > shouldn't), we crash on error in production right away instead of > risking botched error recovery messing up the program's state. > Both are bad outcomes, but which one's less bad I find impossible to > predict. >
© 2016 - 2025 Red Hat, Inc.