[Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent

Peter Maydell posted 1 patch 4 years, 10 months ago
Test docker-clang@ubuntu passed
Test checkpatch passed
Test asan passed
Test docker-mingw@fedora passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20190514171545.24961-1-peter.maydell@linaro.org
There is a newer version of this series
hw/core/bus.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
[Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Peter Maydell 4 years, 10 months ago
In commit 80376c3fc2c38fdd453 in 2010 we added a workaround for
some qbus buses not being connected to qdev devices -- if the
bus has no parent object then we register a reset function which
resets the bus on system reset.

Nearly a decade later, we have now no buses in the tree which
are created with non-NULL parents, so we can remove the
workaround and instead just assert that if the bus has a NULL
parent then it is the main system bus.

(The absence of other parentless buses was confirmed by
code inspection of all the callsites of qbus_create() and
qbus_create_inplace() and cross-checked by 'make check'.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
While I was reviewing Damian's reset patchset I noticed this
code which meant that we theoretically had multiple 'roots' to
the set of things being reset, so I wondered what was actually
using it. It turns out nothing was :-)

Commit 80376c3fc2c38fdd453 also added a TODO in vl.c suggesting
that there is the wrong place to register the reset function
which effectively resets the whole system starting at the
root which is the main system bus:
   qemu_register_reset(qbus_reset_all_fn, sysbus_get_default());
I don't understand why vl.c is a bad place to put that, and I'd
rather not move it to qdev.c (where in qdev.c?) because that
would reshuffle reset ordering which seems liable to cause
regressions. So maybe we should just delete that TODO comment?

---
 hw/core/bus.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/core/bus.c b/hw/core/bus.c
index e09843f6abe..e50287c2b35 100644
--- a/hw/core/bus.c
+++ b/hw/core/bus.c
@@ -96,10 +96,9 @@ static void qbus_realize(BusState *bus, DeviceState *parent, const char *name)
         bus->parent->num_child_bus++;
         object_property_add_child(OBJECT(bus->parent), bus->name, OBJECT(bus), NULL);
         object_unref(OBJECT(bus));
-    } else if (bus != sysbus_get_default()) {
-        /* TODO: once all bus devices are qdevified,
-           only reset handler for main_system_bus should be registered here. */
-        qemu_register_reset(qbus_reset_all_fn, bus);
+    } else {
+        /* The only bus without a parent is the main system bus */
+        assert(bus == sysbus_get_default());
     }
 }
 
-- 
2.20.1


Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Markus Armbruster 4 years, 10 months ago
Peter Maydell <peter.maydell@linaro.org> writes:

> In commit 80376c3fc2c38fdd453 in 2010 we added a workaround for
> some qbus buses not being connected to qdev devices -- if the
> bus has no parent object then we register a reset function which
> resets the bus on system reset.
>
> Nearly a decade later, we have now no buses in the tree which
> are created with non-NULL parents, so we can remove the
> workaround and instead just assert that if the bus has a NULL
> parent then it is the main system bus.
>
> (The absence of other parentless buses was confirmed by
> code inspection of all the callsites of qbus_create() and
> qbus_create_inplace() and cross-checked by 'make check'.)

Could we assert(parent || bus == main_system_bus) in qbus_realize()?

Aside: I hate sysbus_get_default().  It creates main_system_bus on first
call, wherever that call may be hiding.  I feel we should create it
explicitly.  I'd then make main_system_bus public, and delete
sysbus_get_default().

> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> ---
> While I was reviewing Damian's reset patchset I noticed this
> code which meant that we theoretically had multiple 'roots' to
> the set of things being reset, so I wondered what was actually
> using it. It turns out nothing was :-)
>
> Commit 80376c3fc2c38fdd453 also added a TODO in vl.c suggesting
> that there is the wrong place to register the reset function
> which effectively resets the whole system starting at the
> root which is the main system bus:
>    qemu_register_reset(qbus_reset_all_fn, sysbus_get_default());
> I don't understand why vl.c is a bad place to put that, and I'd
> rather not move it to qdev.c (where in qdev.c?) because that
> would reshuffle reset ordering which seems liable to cause
> regressions. So maybe we should just delete that TODO comment?

Hmm.

The one in vl.c arranges to run qbus_reset_all(main_system_bus), which
walks the tree rooted at main_system_bus, resetting its buses and
devices in post-order.

A registry of callbacks to run on certain events is a fine technique.
Relying on registration order, however, is in bad taste.  We should
model dependencies between reset functions explicitly.

That said, we can't ignore dependencies just because we've coded them
badly.

I count more than 100 qemu_register_reset(), and most of them look like
they reset hardware.  Why do devices use qemu_register_reset() instead
of DeviceClass method reset?

Registered handlers run in (implicitly defined) registration order,
reset methods in (explicit) qdev tree post order.  Much better as long
as that's the order we want.

Say we managed to clean up this mess somehow, so reset handler
registration order doesn't matter anymore.  Then moving the
qemu_register_reset() for main_system_bus from main() to wherever we
create main_system_bus would make sense, wouldn't it?

If it does make sense, we should keep the TODO in main(), because it
asks for exactly that.  Perhaps delete "by qdev.c".

> ---
>  hw/core/bus.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/hw/core/bus.c b/hw/core/bus.c
> index e09843f6abe..e50287c2b35 100644
> --- a/hw/core/bus.c
> +++ b/hw/core/bus.c
> @@ -96,10 +96,9 @@ static void qbus_realize(BusState *bus, DeviceState *parent, const char *name)
>          bus->parent->num_child_bus++;
>          object_property_add_child(OBJECT(bus->parent), bus->name, OBJECT(bus), NULL);
>          object_unref(OBJECT(bus));
> -    } else if (bus != sysbus_get_default()) {
> -        /* TODO: once all bus devices are qdevified,
> -           only reset handler for main_system_bus should be registered here. */
> -        qemu_register_reset(qbus_reset_all_fn, bus);
> +    } else {
> +        /* The only bus without a parent is the main system bus */
> +        assert(bus == sysbus_get_default());
>      }
>  }

You delete a qemu_register_reset() because it's unreachable.  The commit
that added it also added a qemu_unregister_reset().  It's now in
bus_unparent().  Why is it still needed?

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Peter Maydell 4 years, 10 months ago
On Thu, 16 May 2019 at 06:37, Markus Armbruster <armbru@redhat.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > In commit 80376c3fc2c38fdd453 in 2010 we added a workaround for
> > some qbus buses not being connected to qdev devices -- if the
> > bus has no parent object then we register a reset function which
> > resets the bus on system reset.
> >
> > Nearly a decade later, we have now no buses in the tree which
> > are created with non-NULL parents, so we can remove the
> > workaround and instead just assert that if the bus has a NULL
> > parent then it is the main system bus.
> >
> > (The absence of other parentless buses was confirmed by
> > code inspection of all the callsites of qbus_create() and
> > qbus_create_inplace() and cross-checked by 'make check'.)
>
> Could we assert(parent || bus == main_system_bus) in qbus_realize()?

Er, that's what this patch is doing.

> Aside: I hate sysbus_get_default().  It creates main_system_bus on first
> call, wherever that call may be hiding.  I feel we should create it
> explicitly.  I'd then make main_system_bus public, and delete
> sysbus_get_default().

Yes, I think that would be a reasonable thing to do.
The implicit creation is weird since we effectively
rely on a main system bus existing anyway (it is the root
of the reset tree).

> > Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> > ---
> > While I was reviewing Damian's reset patchset I noticed this
> > code which meant that we theoretically had multiple 'roots' to
> > the set of things being reset, so I wondered what was actually
> > using it. It turns out nothing was :-)
> >
> > Commit 80376c3fc2c38fdd453 also added a TODO in vl.c suggesting
> > that there is the wrong place to register the reset function
> > which effectively resets the whole system starting at the
> > root which is the main system bus:
> >    qemu_register_reset(qbus_reset_all_fn, sysbus_get_default());
> > I don't understand why vl.c is a bad place to put that, and I'd
> > rather not move it to qdev.c (where in qdev.c?) because that
> > would reshuffle reset ordering which seems liable to cause
> > regressions. So maybe we should just delete that TODO comment?
>
> Hmm.
>
> The one in vl.c arranges to run qbus_reset_all(main_system_bus), which
> walks the tree rooted at main_system_bus, resetting its buses and
> devices in post-order.
>
> A registry of callbacks to run on certain events is a fine technique.
> Relying on registration order, however, is in bad taste.  We should
> model dependencies between reset functions explicitly.

That might be nice, but in practice we have no such model at
all, and I don't think I've seen anybody propose one. I hope we
don't have too many accidental ordering dependencies, but I'm
not confident that we have none at all, and would prefer not to
prod that sleeping dragon...

The multi-phase-reset patches Damien has on list at the moment
would allow some of the reset ordering issues to be sidestepped
because "phase 1" for all devices happens before "phase 2" so
you have "before" and "after" places to put the logic in different
devices.

> That said, we can't ignore dependencies just because we've coded them
> badly.
>
> I count more than 100 qemu_register_reset(), and most of them look like
> they reset hardware.  Why do devices use qemu_register_reset() instead
> of DeviceClass method reset?

Most of the ones for hardware are "this device hasn't been
converted to be a QOM Device" (eg hw/arm/omap1.c, hw/input/pckbd.c,
lots of the stuff in hw/ppc).

The other reason for having to have a qemu_register_reset() handler
to reset something that's a Device is if that Device is not on
a qbus. The most common example of this is CPUs -- since those
don't have a bus to live on they don't get reset by the "reset
everything that's on a QOM bus reachable from the main system
bus" logic. I'm not sure what the nicest way to address this is:
transitioning away from "reset of devices is based on the qdev tree"
to something else seems between difficult and impossible, even
though logically speaking the QOM tree is in many cases closer
to the actual hardware hierarchy of reset.

> Registered handlers run in (implicitly defined) registration order,
> reset methods in (explicit) qdev tree post order.  Much better as long
> as that's the order we want.
>
> Say we managed to clean up this mess somehow, so reset handler
> registration order doesn't matter anymore.  Then moving the
> qemu_register_reset() for main_system_bus from main() to wherever we
> create main_system_bus would make sense, wouldn't it?

I guess so... (There's an argument that the main system bus
should be a child bus of the Machine object, logically speaking,
but Machines aren't subtypes of Device so that doesn't work.)

> If it does make sense, we should keep the TODO in main(), because it
> asks for exactly that.  Perhaps delete "by qdev.c".
>
> > ---
> >  hw/core/bus.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/core/bus.c b/hw/core/bus.c
> > index e09843f6abe..e50287c2b35 100644
> > --- a/hw/core/bus.c
> > +++ b/hw/core/bus.c
> > @@ -96,10 +96,9 @@ static void qbus_realize(BusState *bus, DeviceState *parent, const char *name)
> >          bus->parent->num_child_bus++;
> >          object_property_add_child(OBJECT(bus->parent), bus->name, OBJECT(bus), NULL);
> >          object_unref(OBJECT(bus));
> > -    } else if (bus != sysbus_get_default()) {
> > -        /* TODO: once all bus devices are qdevified,
> > -           only reset handler for main_system_bus should be registered here. */
> > -        qemu_register_reset(qbus_reset_all_fn, bus);
> > +    } else {
> > +        /* The only bus without a parent is the main system bus */
> > +        assert(bus == sysbus_get_default());
> >      }
> >  }
>
> You delete a qemu_register_reset() because it's unreachable.  The commit
> that added it also added a qemu_unregister_reset().  It's now in
> bus_unparent().  Why is it still needed?

You're right, the bus_unparent() logic should also be cleaned up.

thanks
-- PMM

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Damien Hedde 4 years, 10 months ago

On 5/16/19 11:19 AM, Peter Maydell wrote:
> On Thu, 16 May 2019 at 06:37, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>>> In commit 80376c3fc2c38fdd453 in 2010 we added a workaround for
>>> some qbus buses not being connected to qdev devices -- if the
>>> bus has no parent object then we register a reset function which
>>> resets the bus on system reset.
>>>
>>> Nearly a decade later, we have now no buses in the tree which
>>> are created with non-NULL parents, so we can remove the
>>> workaround and instead just assert that if the bus has a NULL
>>> parent then it is the main system bus.
>>>
>>> (The absence of other parentless buses was confirmed by
>>> code inspection of all the callsites of qbus_create() and
>>> qbus_create_inplace() and cross-checked by 'make check'.)
>>
>> Could we assert(parent || bus == main_system_bus) in qbus_realize()?
> 
> Er, that's what this patch is doing.
> 
>> Aside: I hate sysbus_get_default().  It creates main_system_bus on first
>> call, wherever that call may be hiding.  I feel we should create it
>> explicitly.  I'd then make main_system_bus public, and delete
>> sysbus_get_default().
> 
> Yes, I think that would be a reasonable thing to do.
> The implicit creation is weird since we effectively
> rely on a main system bus existing anyway (it is the root
> of the reset tree).
> 
>>> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>>> ---
>>> While I was reviewing Damian's reset patchset I noticed this
>>> code which meant that we theoretically had multiple 'roots' to
>>> the set of things being reset, so I wondered what was actually
>>> using it. It turns out nothing was :-)
>>>
>>> Commit 80376c3fc2c38fdd453 also added a TODO in vl.c suggesting
>>> that there is the wrong place to register the reset function
>>> which effectively resets the whole system starting at the
>>> root which is the main system bus:
>>>    qemu_register_reset(qbus_reset_all_fn, sysbus_get_default());
>>> I don't understand why vl.c is a bad place to put that, and I'd
>>> rather not move it to qdev.c (where in qdev.c?) because that
>>> would reshuffle reset ordering which seems liable to cause
>>> regressions. So maybe we should just delete that TODO comment?
>>
>> Hmm.
>>
>> The one in vl.c arranges to run qbus_reset_all(main_system_bus), which
>> walks the tree rooted at main_system_bus, resetting its buses and
>> devices in post-order.
>>
>> A registry of callbacks to run on certain events is a fine technique.
>> Relying on registration order, however, is in bad taste.  We should
>> model dependencies between reset functions explicitly.
> 
> That might be nice, but in practice we have no such model at
> all, and I don't think I've seen anybody propose one. I hope we
> don't have too many accidental ordering dependencies, but I'm
> not confident that we have none at all, and would prefer not to
> prod that sleeping dragon...
> 
> The multi-phase-reset patches Damien has on list at the moment
> would allow some of the reset ordering issues to be sidestepped
> because "phase 1" for all devices happens before "phase 2" so
> you have "before" and "after" places to put the logic in different
> devices.
> 
>> That said, we can't ignore dependencies just because we've coded them
>> badly.
>>
>> I count more than 100 qemu_register_reset(), and most of them look like
>> they reset hardware.  Why do devices use qemu_register_reset() instead
>> of DeviceClass method reset?
> 
> Most of the ones for hardware are "this device hasn't been
> converted to be a QOM Device" (eg hw/arm/omap1.c, hw/input/pckbd.c,
> lots of the stuff in hw/ppc).
> 
> The other reason for having to have a qemu_register_reset() handler
> to reset something that's a Device is if that Device is not on
> a qbus. The most common example of this is CPUs -- since those
> don't have a bus to live on they don't get reset by the "reset
> everything that's on a QOM bus reachable from the main system
> bus" logic. I'm not sure what the nicest way to address this is:
> transitioning away from "reset of devices is based on the qdev tree"
> to something else seems between difficult and impossible, even
> though logically speaking the QOM tree is in many cases closer
> to the actual hardware hierarchy of reset.

One "solution" to reduce the qemu_register_reset usage would be to do
handle in the Device base class (at creation or realize) if it has no
parent bus like it is done for buses. But this would probably have an
impact on reset ordering.

> 
>> Registered handlers run in (implicitly defined) registration order,
>> reset methods in (explicit) qdev tree post order.  Much better as long
>> as that's the order we want.
>>
>> Say we managed to clean up this mess somehow, so reset handler
>> registration order doesn't matter anymore.  Then moving the
>> qemu_register_reset() for main_system_bus from main() to wherever we
>> create main_system_bus would make sense, wouldn't it?
> 
> I guess so... (There's an argument that the main system bus
> should be a child bus of the Machine object, logically speaking,
> but Machines aren't subtypes of Device so that doesn't work.)
> 
>> If it does make sense, we should keep the TODO in main(), because it
>> asks for exactly that.  Perhaps delete "by qdev.c".
>>
>>> ---
>>>  hw/core/bus.c | 7 +++----
>>>  1 file changed, 3 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/hw/core/bus.c b/hw/core/bus.c
>>> index e09843f6abe..e50287c2b35 100644
>>> --- a/hw/core/bus.c
>>> +++ b/hw/core/bus.c
>>> @@ -96,10 +96,9 @@ static void qbus_realize(BusState *bus, DeviceState *parent, const char *name)
>>>          bus->parent->num_child_bus++;
>>>          object_property_add_child(OBJECT(bus->parent), bus->name, OBJECT(bus), NULL);
>>>          object_unref(OBJECT(bus));
>>> -    } else if (bus != sysbus_get_default()) {
>>> -        /* TODO: once all bus devices are qdevified,
>>> -           only reset handler for main_system_bus should be registered here. */
>>> -        qemu_register_reset(qbus_reset_all_fn, bus);
>>> +    } else {
>>> +        /* The only bus without a parent is the main system bus */
>>> +        assert(bus == sysbus_get_default());
>>>      }
>>>  }
>>
>> You delete a qemu_register_reset() because it's unreachable.  The commit
>> that added it also added a qemu_unregister_reset().  It's now in
>> bus_unparent().  Why is it still needed?
> 
> You're right, the bus_unparent() logic should also be cleaned up.
> 
> thanks
> -- PMM
> 

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Markus Armbruster 4 years, 10 months ago
Damien Hedde <damien.hedde@greensocs.com> writes:

> On 5/16/19 11:19 AM, Peter Maydell wrote:
>> On Thu, 16 May 2019 at 06:37, Markus Armbruster <armbru@redhat.com> wrote:
>>>
>>> Peter Maydell <peter.maydell@linaro.org> writes:
>>>
>>>> In commit 80376c3fc2c38fdd453 in 2010 we added a workaround for
>>>> some qbus buses not being connected to qdev devices -- if the
>>>> bus has no parent object then we register a reset function which
>>>> resets the bus on system reset.
>>>>
>>>> Nearly a decade later, we have now no buses in the tree which
>>>> are created with non-NULL parents, so we can remove the
>>>> workaround and instead just assert that if the bus has a NULL
>>>> parent then it is the main system bus.
>>>>
>>>> (The absence of other parentless buses was confirmed by
>>>> code inspection of all the callsites of qbus_create() and
>>>> qbus_create_inplace() and cross-checked by 'make check'.)
>>>
>>> Could we assert(parent || bus == main_system_bus) in qbus_realize()?
>> 
>> Er, that's what this patch is doing.

You're right; I got confused.

>>> Aside: I hate sysbus_get_default().  It creates main_system_bus on first
>>> call, wherever that call may be hiding.  I feel we should create it
>>> explicitly.  I'd then make main_system_bus public, and delete
>>> sysbus_get_default().
>> 
>> Yes, I think that would be a reasonable thing to do.
>> The implicit creation is weird since we effectively
>> rely on a main system bus existing anyway (it is the root
>> of the reset tree).
>> 
>>>> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>>>> ---
>>>> While I was reviewing Damian's reset patchset I noticed this
>>>> code which meant that we theoretically had multiple 'roots' to
>>>> the set of things being reset, so I wondered what was actually
>>>> using it. It turns out nothing was :-)
>>>>
>>>> Commit 80376c3fc2c38fdd453 also added a TODO in vl.c suggesting
>>>> that there is the wrong place to register the reset function
>>>> which effectively resets the whole system starting at the
>>>> root which is the main system bus:
>>>>    qemu_register_reset(qbus_reset_all_fn, sysbus_get_default());
>>>> I don't understand why vl.c is a bad place to put that, and I'd
>>>> rather not move it to qdev.c (where in qdev.c?) because that
>>>> would reshuffle reset ordering which seems liable to cause
>>>> regressions. So maybe we should just delete that TODO comment?
>>>
>>> Hmm.
>>>
>>> The one in vl.c arranges to run qbus_reset_all(main_system_bus), which
>>> walks the tree rooted at main_system_bus, resetting its buses and
>>> devices in post-order.
>>>
>>> A registry of callbacks to run on certain events is a fine technique.
>>> Relying on registration order, however, is in bad taste.  We should
>>> model dependencies between reset functions explicitly.
>> 
>> That might be nice, but in practice we have no such model at
>> all, and I don't think I've seen anybody propose one.

Well, we do have qbus_reset_all() & friends reset buses and devices in
post order.  That's a model, isn't it?  I guess it can't model *all*
dependencies.  Still, shouldn't we use it wherever it actually suffices?

>>                                                       I hope we
>> don't have too many accidental ordering dependencies, but I'm
>> not confident that we have none at all, and would prefer not to
>> prod that sleeping dragon...
>> 
>> The multi-phase-reset patches Damien has on list at the moment
>> would allow some of the reset ordering issues to be sidestepped
>> because "phase 1" for all devices happens before "phase 2" so
>> you have "before" and "after" places to put the logic in different
>> devices.
>> 
>>> That said, we can't ignore dependencies just because we've coded them
>>> badly.
>>>
>>> I count more than 100 qemu_register_reset(), and most of them look like
>>> they reset hardware.  Why do devices use qemu_register_reset() instead
>>> of DeviceClass method reset?
>> 
>> Most of the ones for hardware are "this device hasn't been
>> converted to be a QOM Device" (eg hw/arm/omap1.c, hw/input/pckbd.c,
>> lots of the stuff in hw/ppc).

hw/input/pckbd.c is instructive.  The qemu_register_reset() in
i8042_mm_init() is inded for a non-qdevified device.  The one in
i8042_realizefn() has no such excuse.

Does not contradict what you wrote, of course.  Still, shouldn't we at
least get rid of the latter kind?

>> The other reason for having to have a qemu_register_reset() handler
>> to reset something that's a Device is if that Device is not on
>> a qbus. The most common example of this is CPUs -- since those
>> don't have a bus to live on they don't get reset by the "reset
>> everything that's on a QOM bus reachable from the main system
>> bus" logic. I'm not sure what the nicest way to address this is:
>> transitioning away from "reset of devices is based on the qdev tree"
>> to something else seems between difficult and impossible, even
>> though logically speaking the QOM tree is in many cases closer
>> to the actual hardware hierarchy of reset.
>
> One "solution" to reduce the qemu_register_reset usage would be to do
> handle in the Device base class (at creation or realize) if it has no
> parent bus like it is done for buses. But this would probably have an
> impact on reset ordering.

I'm afraid *any* improvement will have an impact on reset ordering.
Most reorderings will be just fine.  How terrible could the
less-than-fine ones be?

>>> Registered handlers run in (implicitly defined) registration order,
>>> reset methods in (explicit) qdev tree post order.  Much better as long
>>> as that's the order we want.
>>>
>>> Say we managed to clean up this mess somehow, so reset handler
>>> registration order doesn't matter anymore.  Then moving the
>>> qemu_register_reset() for main_system_bus from main() to wherever we
>>> create main_system_bus would make sense, wouldn't it?
>> 
>> I guess so... (There's an argument that the main system bus
>> should be a child bus of the Machine object, logically speaking,
>> but Machines aren't subtypes of Device so that doesn't work.)

We could replace the special case "bus's parent is null" by the special
case "bus's parent is a machine instead of a device", but I'm not sure
what exactly it would buy us.

>>> If it does make sense, we should keep the TODO in main(), because it
>>> asks for exactly that.  Perhaps delete "by qdev.c".
[...]

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Peter Maydell 4 years, 10 months ago
On Tue, 21 May 2019 at 15:34, Markus Armbruster <armbru@redhat.com> wrote:
>
> Damien Hedde <damien.hedde@greensocs.com> writes:
>
> > On 5/16/19 11:19 AM, Peter Maydell wrote:
> >> On Thu, 16 May 2019 at 06:37, Markus Armbruster <armbru@redhat.com> wrote:
> >>>
> >>> A registry of callbacks to run on certain events is a fine technique.
> >>> Relying on registration order, however, is in bad taste.  We should
> >>> model dependencies between reset functions explicitly.
> >>
> >> That might be nice, but in practice we have no such model at
> >> all, and I don't think I've seen anybody propose one.
>
> Well, we do have qbus_reset_all() & friends reset buses and devices in
> post order.  That's a model, isn't it?  I guess it can't model *all*
> dependencies.  Still, shouldn't we use it wherever it actually suffices?

It's a well-defined order, but it doesn't actually help in a
lot of cases, because often the thing you care about ordering
on is not a device or is not in the same tree as the thing
it depends on. (For instance, there's an annoying ordering
issue between the rom-loader's "reset" function which copies rom
blob contents into RAM, and the Arm M-profile CPU reset method,
which needs to read the starting PC and SP out of RAM. [*])
It's also still an implicit ordering, in the sense that if
there's a dependency between device A (in subtree A') and device
B (in subtree B') then this will all work fine up until somebody
at the top level innocently reorders A' and B' in the list of
children of their mutual parent for some reason and then finds
they've broken an implicit dependency.

[*] aside: this one would actually be fixed by the multi-phase reset
proposal, since the definition of the reset phases is such that
the rom-loader should write to memory in phase 2 ('hold') and
the CPU should read from it in phase 3 ('exit').

> hw/input/pckbd.c is instructive.  The qemu_register_reset() in
> i8042_mm_init() is inded for a non-qdevified device.  The one in
> i8042_realizefn() has no such excuse.
>
> Does not contradict what you wrote, of course.  Still, shouldn't we at
> least get rid of the latter kind?

Yes, absolutely. Also we should qdevify the non-qdev devices.
This part is something where we have a clear path forwards
for making cleanups (no tricky design decisions/debate required),
it just requires somebody to write the actual code.

> >> The other reason for having to have a qemu_register_reset() handler
> >> to reset something that's a Device is if that Device is not on
> >> a qbus. The most common example of this is CPUs -- since those
> >> don't have a bus to live on they don't get reset by the "reset
> >> everything that's on a QOM bus reachable from the main system
> >> bus" logic. I'm not sure what the nicest way to address this is:
> >> transitioning away from "reset of devices is based on the qdev tree"
> >> to something else seems between difficult and impossible, even
> >> though logically speaking the QOM tree is in many cases closer
> >> to the actual hardware hierarchy of reset.
> >
> > One "solution" to reduce the qemu_register_reset usage would be to do
> > handle in the Device base class (at creation or realize) if it has no
> > parent bus like it is done for buses. But this would probably have an
> > impact on reset ordering.
>
> I'm afraid *any* improvement will have an impact on reset ordering.
> Most reorderings will be just fine.  How terrible could the
> less-than-fine ones be?

If you get "CPU reset" and "built in bootloader sets the PC to the
initial address specified by the -kernel file" the wrong way around
then we break booting :-)

> >>> Registered handlers run in (implicitly defined) registration order,
> >>> reset methods in (explicit) qdev tree post order.  Much better as long
> >>> as that's the order we want.
> >>>
> >>> Say we managed to clean up this mess somehow, so reset handler
> >>> registration order doesn't matter anymore.  Then moving the
> >>> qemu_register_reset() for main_system_bus from main() to wherever we
> >>> create main_system_bus would make sense, wouldn't it?
> >>
> >> I guess so... (There's an argument that the main system bus
> >> should be a child bus of the Machine object, logically speaking,
> >> but Machines aren't subtypes of Device so that doesn't work.)
>
> We could replace the special case "bus's parent is null" by the special
> case "bus's parent is a machine instead of a device", but I'm not sure
> what exactly it would buy us.

It's mostly just logically neater -- you could imagine a future
QEMU version that supported one simulation which had models
of more than one machine simultaneously, in which case there
ought to be two system buses, one per machine. And it's
logical that vl.c has to create the machine that the user
asked for, but it's a bit odder that it also has to create the
system bus specially extra, even though it's really just part
of the machine. But as I say, because Machine isn't a subtype of
Device you can't make buses be children of Machine anyway.
Fixing that is more effort than would be warranted for "it looks
slightly nicer this way around".

thanks
-- PMM

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Markus Armbruster 4 years, 10 months ago
Peter Maydell <peter.maydell@linaro.org> writes:

> On Tue, 21 May 2019 at 15:34, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Damien Hedde <damien.hedde@greensocs.com> writes:
>>
>> > On 5/16/19 11:19 AM, Peter Maydell wrote:
>> >> On Thu, 16 May 2019 at 06:37, Markus Armbruster <armbru@redhat.com> wrote:
>> >>>
>> >>> A registry of callbacks to run on certain events is a fine technique.
>> >>> Relying on registration order, however, is in bad taste.  We should
>> >>> model dependencies between reset functions explicitly.
>> >>
>> >> That might be nice, but in practice we have no such model at
>> >> all, and I don't think I've seen anybody propose one.
>>
>> Well, we do have qbus_reset_all() & friends reset buses and devices in
>> post order.  That's a model, isn't it?  I guess it can't model *all*
>> dependencies.  Still, shouldn't we use it wherever it actually suffices?
>
> It's a well-defined order, but it doesn't actually help in a
> lot of cases, because often the thing you care about ordering
> on is not a device or is not in the same tree as the thing
> it depends on.

Dependencies need some kind of connection.

In the physical world, connections are a scarce resource.  This leads to
somewhat regular dependencies.  Reset order follows wiring.  The wiring
isn't always a tree, but it's tree-like enough to make trees a useful
concept there.

In the virtual world, connections aren't scarce; we can create
dependencies between anything.  Some dependencies, however, are just
sloppy modelling.

>                (For instance, there's an annoying ordering
> issue between the rom-loader's "reset" function which copies rom
> blob contents into RAM, and the Arm M-profile CPU reset method,
> which needs to read the starting PC and SP out of RAM. [*])

As your [*] explains, this is an artifact of sloppy modelling.

> It's also still an implicit ordering, in the sense that if
> there's a dependency between device A (in subtree A') and device
> B (in subtree B') then this will all work fine up until somebody
> at the top level innocently reorders A' and B' in the list of
> children of their mutual parent for some reason and then finds
> they've broken an implicit dependency.

Yes, implicit dependencies are brittle.  One of the reasons why making
them explicit can be worthwhile.

> [*] aside: this one would actually be fixed by the multi-phase reset
> proposal, since the definition of the reset phases is such that
> the rom-loader should write to memory in phase 2 ('hold') and
> the CPU should read from it in phase 3 ('exit').
>
>> hw/input/pckbd.c is instructive.  The qemu_register_reset() in
>> i8042_mm_init() is inded for a non-qdevified device.  The one in
>> i8042_realizefn() has no such excuse.
>>
>> Does not contradict what you wrote, of course.  Still, shouldn't we at
>> least get rid of the latter kind?
>
> Yes, absolutely. Also we should qdevify the non-qdev devices.
> This part is something where we have a clear path forwards
> for making cleanups (no tricky design decisions/debate required),
> it just requires somebody to write the actual code.

After all these years, the transition to qdev is still incomplete, and
the incompleteness still bogs us down.

We don't even know what still needs to be converted.  If we had a list
of such device models, and which machines depend on them, we could apply
a bit more force to the problem.

>> >> The other reason for having to have a qemu_register_reset() handler
>> >> to reset something that's a Device is if that Device is not on
>> >> a qbus. The most common example of this is CPUs -- since those
>> >> don't have a bus to live on they don't get reset by the "reset
>> >> everything that's on a QOM bus reachable from the main system
>> >> bus" logic. I'm not sure what the nicest way to address this is:
>> >> transitioning away from "reset of devices is based on the qdev tree"
>> >> to something else seems between difficult and impossible, even
>> >> though logically speaking the QOM tree is in many cases closer
>> >> to the actual hardware hierarchy of reset.
>> >
>> > One "solution" to reduce the qemu_register_reset usage would be to do
>> > handle in the Device base class (at creation or realize) if it has no
>> > parent bus like it is done for buses. But this would probably have an
>> > impact on reset ordering.
>>
>> I'm afraid *any* improvement will have an impact on reset ordering.
>> Most reorderings will be just fine.  How terrible could the
>> less-than-fine ones be?
>
> If you get "CPU reset" and "built in bootloader sets the PC to the
> initial address specified by the -kernel file" the wrong way around
> then we break booting :-)

Wonderfully unsubtle failure!  Even the stupidest of smoke tests should
catch it.

Would you be willing to hazard a guess on the risk of creating bugs
subtle enough to survive basic smoke tests?

>> >>> Registered handlers run in (implicitly defined) registration order,
>> >>> reset methods in (explicit) qdev tree post order.  Much better as long
>> >>> as that's the order we want.
>> >>>
>> >>> Say we managed to clean up this mess somehow, so reset handler
>> >>> registration order doesn't matter anymore.  Then moving the
>> >>> qemu_register_reset() for main_system_bus from main() to wherever we
>> >>> create main_system_bus would make sense, wouldn't it?
>> >>
>> >> I guess so... (There's an argument that the main system bus
>> >> should be a child bus of the Machine object, logically speaking,
>> >> but Machines aren't subtypes of Device so that doesn't work.)
>>
>> We could replace the special case "bus's parent is null" by the special
>> case "bus's parent is a machine instead of a device", but I'm not sure
>> what exactly it would buy us.
>
> It's mostly just logically neater -- you could imagine a future
> QEMU version that supported one simulation which had models
> of more than one machine simultaneously, in which case there
> ought to be two system buses, one per machine. And it's
> logical that vl.c has to create the machine that the user
> asked for, but it's a bit odder that it also has to create the
> system bus specially extra, even though it's really just part
> of the machine. But as I say, because Machine isn't a subtype of
> Device you can't make buses be children of Machine anyway.
> Fixing that is more effort than would be warranted for "it looks
> slightly nicer this way around".

Concur.

On a green field, we'd perhaps create things so that these buses can be
children of machines.  But our field looks quite ploughed.

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Peter Maydell 4 years, 10 months ago
On Thu, 23 May 2019 at 07:39, Markus Armbruster <armbru@redhat.com> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
> > It's a well-defined order, but it doesn't actually help in a
> > lot of cases, because often the thing you care about ordering
> > on is not a device or is not in the same tree as the thing
> > it depends on.
>
> Dependencies need some kind of connection.
>
> In the physical world, connections are a scarce resource.  This leads to
> somewhat regular dependencies.  Reset order follows wiring.  The wiring
> isn't always a tree, but it's tree-like enough to make trees a useful
> concept there.

I would say rather than in the physical world reset tends
to happen in parallel -- somebody pulls a reset line high
and everything connected to it simultaneously goes into a
state where it resets its internal state, and stays there.
Things like "don't read PC before memory is initialized" tend
to be handled by "memory is initialized while the CPU is being
held in reset, and then the CPU only reads it when it *leaves*
the reset state". (So the suggested multi-phasing does try
to make our emulation match the real hardware).

There is also some 'tree-like' behaviour in that where a
collection of devices can all be reset at once this is usually
based on the "composition tree", eg if you reset an SoC then
it's resetting all the devices in the SoC. Unfortunately QEMU's
qbus tree does not at all match this "composition tree". Our
QOM object tree gets closer, but I have no idea how you might
get from the qbus-reset world we have today to a world where
reset propagated down the QOM object tree.

> >> hw/input/pckbd.c is instructive.  The qemu_register_reset() in
> >> i8042_mm_init() is inded for a non-qdevified device.  The one in
> >> i8042_realizefn() has no such excuse.
> >>
> >> Does not contradict what you wrote, of course.  Still, shouldn't we at
> >> least get rid of the latter kind?
> >
> > Yes, absolutely. Also we should qdevify the non-qdev devices.
> > This part is something where we have a clear path forwards
> > for making cleanups (no tricky design decisions/debate required),
> > it just requires somebody to write the actual code.
>
> After all these years, the transition to qdev is still incomplete, and
> the incompleteness still bogs us down.
>
> We don't even know what still needs to be converted.  If we had a list
> of such device models, and which machines depend on them, we could apply
> a bit more force to the problem.

In this case we *do* have a clear list of things to fix:
we just need to search for places that are directly calling
qemu_register_reset() (and weed out the handful of "no, really
not a device" false positives). I think that this is actually
the case for most situations where failure-to-qdevify is a
blocker -- it's a blocker because there's a particular API we
want to get rid of or clean up or whatever, and then we just
need to look for uses of the API. If there happen to be non-qdev
devices which (perhaps buggily) don't use the API, we don't
need to care about them because they don't block us from the cleanup.

> >> I'm afraid *any* improvement will have an impact on reset ordering.
> >> Most reorderings will be just fine.  How terrible could the
> >> less-than-fine ones be?
> >
> > If you get "CPU reset" and "built in bootloader sets the PC to the
> > initial address specified by the -kernel file" the wrong way around
> > then we break booting :-)
>
> Wonderfully unsubtle failure!  Even the stupidest of smoke tests should
> catch it.
>
> Would you be willing to hazard a guess on the risk of creating bugs
> subtle enough to survive basic smoke tests?

Probably quite high, given that we don't have very good test
coverage even for 'basic smoke tests', and we certainly don't
have a smoke test that covers every device model. Which doesn't
mean we should be afraid of touching reset ordering entirely:
it just means that for this particular patch I wanted to
play safe, because "remove a TODO comment" doesn't seem like a
solid enough payoff for running the risk.

thanks
-- PMM

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Markus Armbruster 4 years, 10 months ago
Peter Maydell <peter.maydell@linaro.org> writes:

> On Thu, 23 May 2019 at 07:39, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Peter Maydell <peter.maydell@linaro.org> writes:
>> > It's a well-defined order, but it doesn't actually help in a
>> > lot of cases, because often the thing you care about ordering
>> > on is not a device or is not in the same tree as the thing
>> > it depends on.
>>
>> Dependencies need some kind of connection.
>>
>> In the physical world, connections are a scarce resource.  This leads to
>> somewhat regular dependencies.  Reset order follows wiring.  The wiring
>> isn't always a tree, but it's tree-like enough to make trees a useful
>> concept there.
>
> I would say rather than in the physical world reset tends
> to happen in parallel -- somebody pulls a reset line high
> and everything connected to it simultaneously goes into a
> state where it resets its internal state, and stays there.

Yes.  Can be viewed as a tree of depth 1.  Parallel doesn't contradict
tree.

> Things like "don't read PC before memory is initialized" tend
> to be handled by "memory is initialized while the CPU is being
> held in reset, and then the CPU only reads it when it *leaves*
> the reset state". (So the suggested multi-phasing does try
> to make our emulation match the real hardware).

No argument.

> There is also some 'tree-like' behaviour in that where a
> collection of devices can all be reset at once this is usually
> based on the "composition tree", eg if you reset an SoC then
> it's resetting all the devices in the SoC. Unfortunately QEMU's
> qbus tree does not at all match this "composition tree". Our
> QOM object tree gets closer, but I have no idea how you might
> get from the qbus-reset world we have today to a world where
> reset propagated down the QOM object tree.

Supports the idea that inadequate device modelling contributes to the
problem.

One of the stated reasons for the invention of QOM was better device
modelling.  Sadly, we've done that only in a few corners.  There's too
much left of qdev's oversimplified design.  Or maybe too little left of
qdev's initial simplicity.  We're not exactly in a happy place there,
I'm afraid.

>> >> hw/input/pckbd.c is instructive.  The qemu_register_reset() in
>> >> i8042_mm_init() is inded for a non-qdevified device.  The one in
>> >> i8042_realizefn() has no such excuse.
>> >>
>> >> Does not contradict what you wrote, of course.  Still, shouldn't we at
>> >> least get rid of the latter kind?
>> >
>> > Yes, absolutely. Also we should qdevify the non-qdev devices.
>> > This part is something where we have a clear path forwards
>> > for making cleanups (no tricky design decisions/debate required),
>> > it just requires somebody to write the actual code.
>>
>> After all these years, the transition to qdev is still incomplete, and
>> the incompleteness still bogs us down.
>>
>> We don't even know what still needs to be converted.  If we had a list
>> of such device models, and which machines depend on them, we could apply
>> a bit more force to the problem.
>
> In this case we *do* have a clear list of things to fix:
> we just need to search for places that are directly calling
> qemu_register_reset() (and weed out the handful of "no, really
> not a device" false positives). I think that this is actually
> the case for most situations where failure-to-qdevify is a
> blocker -- it's a blocker because there's a particular API we
> want to get rid of or clean up or whatever, and then we just
> need to look for uses of the API. If there happen to be non-qdev
> devices which (perhaps buggily) don't use the API, we don't
> need to care about them because they don't block us from the cleanup.

Fair enough.

>> >> I'm afraid *any* improvement will have an impact on reset ordering.
>> >> Most reorderings will be just fine.  How terrible could the
>> >> less-than-fine ones be?
>> >
>> > If you get "CPU reset" and "built in bootloader sets the PC to the
>> > initial address specified by the -kernel file" the wrong way around
>> > then we break booting :-)
>>
>> Wonderfully unsubtle failure!  Even the stupidest of smoke tests should
>> catch it.
>>
>> Would you be willing to hazard a guess on the risk of creating bugs
>> subtle enough to survive basic smoke tests?
>
> Probably quite high, given that we don't have very good test
> coverage even for 'basic smoke tests', and we certainly don't
> have a smoke test that covers every device model. Which doesn't
> mean we should be afraid of touching reset ordering entirely:
> it just means that for this particular patch I wanted to
> play safe, because "remove a TODO comment" doesn't seem like a
> solid enough payoff for running the risk.

Makes sense.

Registering qbus_reset_all_fn() in main() is kind of ugly, but it works.
There's a comment pointing out it's ugly.  Right now it's a TODO
comment, which maybe expresses more hope for cleanup than there really
is.  I'd leave it alone anyway.

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Peter Maydell 4 years, 10 months ago
On Thu, 23 May 2019 at 13:09, Markus Armbruster <armbru@redhat.com> wrote:
> Registering qbus_reset_all_fn() in main() is kind of ugly, but it works.
> There's a comment pointing out it's ugly.  Right now it's a TODO
> comment, which maybe expresses more hope for cleanup than there really
> is.  I'd leave it alone anyway.

...so after this long thread, are we at a conclusion?
I think my view is that the patch-as-sent needs to be
revised to also fix bus_unparent() but is otherwise OK.

thanks
-- PMM

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Damien Hedde 4 years, 10 months ago
On 5/23/19 2:12 PM, Peter Maydell wrote:
> ...so after this long thread, are we at a conclusion?
> I think my view is that the patch-as-sent needs to be
> revised to also fix bus_unparent() but is otherwise OK.
> 

I think so too.

Damien

Re: [Qemu-devel] [RFC] hw/core/bus.c: Only the main system bus can have no parent
Posted by Markus Armbruster 4 years, 10 months ago
Peter Maydell <peter.maydell@linaro.org> writes:

> On Thu, 23 May 2019 at 13:09, Markus Armbruster <armbru@redhat.com> wrote:
>> Registering qbus_reset_all_fn() in main() is kind of ugly, but it works.
>> There's a comment pointing out it's ugly.  Right now it's a TODO
>> comment, which maybe expresses more hope for cleanup than there really
>> is.  I'd leave it alone anyway.
>
> ...so after this long thread, are we at a conclusion?
> I think my view is that the patch-as-sent needs to be
> revised to also fix bus_unparent() but is otherwise OK.

Yup.  You can add my
Reviewed-by: Markus Armbruster <armbru@redhat.com>
to it then