[Qemu-devel] [RFC 00/17] multi-phase reset mechanism

Damien Hedde posted 17 patches 5 years, 1 month ago
Test docker-mingw@fedora passed
Test docker-clang@ubuntu failed
Test asan failed
Test checkpatch passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/cover.1553510737.git.damien.hedde@greensocs.com
Maintainers: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>, Peter Maydell <peter.maydell@linaro.org>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Alistair Francis <alistair@alistair23.me>
hw/arm/xilinx_zynq.c           |  14 +-
hw/char/cadence_uart.c         |  48 ++-
hw/core/Makefile.objs          |   3 +
hw/core/bus.c                  |  64 +++-
hw/core/qdev-vmstate.c         |  27 ++
hw/core/qdev.c                 | 166 ++++++++++-
hw/core/reset-domain.c         | 121 ++++++++
hw/core/reset.c                | 149 +++++++++-
hw/core/resettable.c           |  69 +++++
hw/misc/zynq_slcr.c            | 515 ++++++++++++++++++---------------
include/hw/char/cadence_uart.h |  10 +-
include/hw/qdev-core.h         |  97 +++++++
include/hw/reset-domain.h      |  49 ++++
include/hw/resettable.h        |  83 ++++++
include/sysemu/reset.h         |  47 +++
vl.c                           |   3 +-
16 files changed, 1195 insertions(+), 270 deletions(-)
create mode 100644 hw/core/qdev-vmstate.c
create mode 100644 hw/core/reset-domain.c
create mode 100644 hw/core/resettable.c
create mode 100644 include/hw/reset-domain.h
create mode 100644 include/hw/resettable.h
[Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by Damien Hedde 5 years, 1 month ago
Hi all,

This series is a proposal to implement the multi-phase reset we've discussed
here (https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00310.html) and
more recently there
(https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg00081.html).

To summarize, we need a multi-phase reset to allow for a more complex
initialization of a platform. In particular we need to "propagate" a clock
tree, but we have to ensure that every device is initialized first.

To solve this problem, the following 3-phases reset mechanism is proposed (I've
removed the 4th given our last discussion).

#DESCRIPTION

INIT PHASE: Reset the object internal state, put a resetting flag and do the
    same for the reset subtree. No side effect on other devices to guarantee
    that, in a reset domain, everything get initialized first. This corresponds
    mostly to what is currently done in the device/bus reset method.

HOLD PHASE: This phase allows to control a reset with a I/O. When a I/O control
    a reset procedure based on the I/O level (not edge), we may need to assert
    the reset, wait some time, and finally de-assert the reset. The consequence
    is that such a device can stay in a "resetting state" and may need to show
    this state to other devices through its outputs. For example, a clock
    controller will typically shutdown its clocks when it is in resetting state.

EXIT PHASE (previously named 'release'): This phase sets outputs to state after
     reset. For a clock controller it starts the clocks. It also clears the
     "resetting" flag. A device should not react to inputs until this flag has
     been cleared. During this phase, outputs are propagated in the reset domain
     (and outside the reset domain).

To implement this, this series add a Resettable interface containing 3 methods:
one for each phase. The init phase's method takes a boolean argument allowing to
distinguish cold and warm resets.

In this series, Device and Bus implement the interface. A vmstate description
subsection is added to allow migration of reset related state for device.
3 methods (one per phase) are also added in Device's class. They correspond to
the local part (see the code example below) of the reset. They avoid device
specialization to have to handle the sub-domain part and "resetting" state.

Functions to add gpio control over the reset are added. It is possible to add
an active low/high reset pin to control the warm reset and an active low/high
power gating pin to control the cold reset.

The bus/device reset tree is converted to this 3-phases mechanism. I also
added a new ResetDomain object which is just a Resettable container. It is
used to have a global "system reset domain" to handle main 3-phases reset and
replace the current single-phase-reset handler mechanism.

As an example, in the xilinx_zynq machine, I've converted the slcr (the resets
and clocks controller device) and the uart peripheral to this 3-phases reset
mechanism. Gpio are added between the devices to transmit reset control from
the slcr to the uarts.

All changes have been made so that existing reset behavior is not modified.

#INTERFACE CHOICE

To be honest, I have some doubt about the interface. I've kept it minimal and
the consequence is implementation is complex (and kind of duplicated in every
base implementation like Device or Bus). One of the problem is the `resetting`
flag, which in practical must be a counter to handle reset "reentrance".
Indeed nothing forbids to reset some device that is already held in "resetting"
state by some other means. As an example, in the zynq machine, there can be 
a global/system reset while an uart is reset by the slcr. It it also possible
to cold and warm resets triggered concurrently.

The Resettable methods implementation have to looks this (methods are called
with iothread locked):
```
void init_phase(Device *dev, bool cold)
{
    /* call sub-resettable init phases */

    dev->resetting += 1;
    /* do local init phase (eg: state reset) */
}
void hold_phase(Device *dev)
{
    /* call sub-resettable hold phases */

    /* do local hold phase (eg: set some I/O level) */
}
void exit_phase(Device *dev)
{
    /* call sub-resettable exit phases (independently of resetting value since,
       every resettable have its own resetting counter) */
   
    dev->resetting -= 1;
    if (dev->resetting == 0) {
        /* do local exit phase (eg: set some I/O level) only if the device is
           really leaving the resetting state */
    }
}
```
Since I don't think specialization should care about sub-resettable and the
resetting counter, I've added the local init/hold/exit phases as DeviceClass
methods.
Otherwise, I see two other solutions:
+ keep the interface as it is
+ add some state knowledge in the interface (with some kind of get_state method)
  so that resetting counter and some kind of sub-resettable list are handled in
  the common code in the interface. The 3 methods will then handle only the local
  actions.
+ have 6 methods in the interface, one for the local actions, one for the
  sub-resettable so that sub-resettable is not handled in the common code. And we
  need also a get_resetting method to access/update the counter.

#DEVICE RESET CHOICE

The Device is a special case, it has 2 reset entry point: `qdev_reset_all` and
`device_reset`. The latter doing a device reset only, while the former also
reset all buses hierarchy under the device.

I choose the put the sub-buses into the device reset domain, so that the
behavior of the resettable interface on Device is the same as qdev_reset_all.
I don't know if `device_reset` has some real meaning (resetting only the
Device). It is not often used and I think every time it is used there is no
sub-buses so the behavior is the same for both functions.

If I am mistaken about putting buses into device reset domain, it is possible
to make a special bus-hierarchy-reset-domain for every Device/Bus that differs
from the Resettable interface on Device/Bus.

# SYSBUS SUPPORT

Regarding the sysbus support Edgar mentioned in the prevous discussion: In this
series, there is no support in sysbus base class for disabling memory regions.
Having each sysbus device specialization, in every memory region handler, to
check if `resetting` is set or not is not user-friendly. But for we can't 
modify memory region `enabled` flags since devices may already set/unset them.
Maybe we could have some kind of super-switch to disable all memory regions in
a sysbus device but I don't know how this could be done.

The series is organised as follow:
Patches 1 and 2 adds Resettable interface and ResetDomain object.
Patches 3 to 7 converts Device and Bus to Resettable.
Patches 8 to 12 handles the global system reset domain
Patches 13 to 17 do the zynq implementation (patch 13 is an already-reviewed
patch from the clock api patch series)

Thank you for your feedback,
Damien

Damien Hedde (17):
  Create Resettable QOM interface
  Create the ResetDomain QOM object
  make Device and Bus Resettable
  Add local reset methods in Device class
  add vmstate description for device reset state
  Add function to control reset with gpio inputs
  convert qdev/bus_reset_all to Resettable
  Add a global ResetDomain object for system emulation
  global ResetDomain support for legacy reset handlers
  Delete the system ResetDomain at the end of emulation
  Put orphan buses in system reset domain
  Put default sysbus in system reset domain
  hw/misc/zynq_slcr: use standard register definition
  convert cadence_uart to 3-phases reset
  Convert zynq's slcr to 3-phases reset
  Add uart reset support in zynq_slcr
  Connect the uart reset gpios in the zynq platform

 hw/arm/xilinx_zynq.c           |  14 +-
 hw/char/cadence_uart.c         |  48 ++-
 hw/core/Makefile.objs          |   3 +
 hw/core/bus.c                  |  64 +++-
 hw/core/qdev-vmstate.c         |  27 ++
 hw/core/qdev.c                 | 166 ++++++++++-
 hw/core/reset-domain.c         | 121 ++++++++
 hw/core/reset.c                | 149 +++++++++-
 hw/core/resettable.c           |  69 +++++
 hw/misc/zynq_slcr.c            | 515 ++++++++++++++++++---------------
 include/hw/char/cadence_uart.h |  10 +-
 include/hw/qdev-core.h         |  97 +++++++
 include/hw/reset-domain.h      |  49 ++++
 include/hw/resettable.h        |  83 ++++++
 include/sysemu/reset.h         |  47 +++
 vl.c                           |   3 +-
 16 files changed, 1195 insertions(+), 270 deletions(-)
 create mode 100644 hw/core/qdev-vmstate.c
 create mode 100644 hw/core/reset-domain.c
 create mode 100644 hw/core/resettable.c
 create mode 100644 include/hw/reset-domain.h
 create mode 100644 include/hw/resettable.h

-- 
2.21.0


Re: [Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by no-reply@patchew.org 5 years, 1 month ago
Patchew URL: https://patchew.org/QEMU/cover.1553510737.git.damien.hedde@greensocs.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

/usr/bin/ld: /tmp/qemu-test/src/include/hw/resettable.h:80: undefined reference to `resettable_deassert_reset'
clang++ -L/tmp/qemu-test/build/dtc/libfdt  -I/usr/include/pixman-1  -I/tmp/qemu-test/src/dtc/libfdt -Werror -DHAS_LIBSSH2_SFTP_FSYNC  -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include  -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99  -Wno-string-plus-int -Wno-initializer-overrides -Wexpansion-to-defined -Wendif-labels -Wno-shift-negative-value -Wno-missing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-definition -Wtype-limits -fstack-protector-strong  -I/usr/include/p11-kit-1    -I/usr/include/libpng16  -I/usr/include/spice-1 -I/usr/include/spice-server -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/nss3 -I/usr/include/nspr4 -pthread -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/uuid -I/usr/include/pixman-1  -I/tmp/qemu-test/src/tests -fsanitize=undefined -fsanitize=address -g  -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g  -o tests/test-crypto-secret tests/test-crypto-secret.o authz/base.o authz/simple.o authz/list.o authz/listfile.o crypto/init.o crypto/hash.o crypto/hash-nettle.o crypto/hmac.o crypto/hmac-nettle.o crypto/aes.o crypto/desrfb.o crypto/cipher.o crypto/tlscreds.o crypto/tlscredsanon.o crypto/tlscredspsk.o crypto/tlscredsx509.o crypto/tlssession.o crypto/secret.o crypto/random-gnutls.o crypto/pbkdf.o crypto/pbkdf-nettle.o crypto/ivgen.o crypto/ivgen-essiv.o crypto/ivgen-plain.o crypto/ivgen-plain64.o crypto/afsplit.o crypto/xts.o crypto/block.o crypto/block-qcow.o crypto/block-luks.o qom/object.o qom/container.o qom/qom-qobject.o qom/object_interfaces.o  libqemuutil.a   -lm -lz  -lgthread-2.0 -pthread -lglib-2.0   -lrt -lz -lutil -lcap-ng -lnettle  -lgnutls  
clang++ -L/tmp/qemu-test/build/dtc/libfdt  -I/usr/include/pixman-1  -I/tmp/qemu-test/src/dtc/libfdt -Werror -DHAS_LIBSSH2_SFTP_FSYNC  -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include  -fPIE -DPIE -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wall -Wundef -Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common -fwrapv -std=gnu99  -Wno-string-plus-int -Wno-initializer-overrides -Wexpansion-to-defined -Wendif-labels -Wno-shift-negative-value -Wno-missing-include-dirs -Wempty-body -Wnested-externs -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wold-style-definition -Wtype-limits -fstack-protector-strong  -I/usr/include/p11-kit-1    -I/usr/include/libpng16  -I/usr/include/spice-1 -I/usr/include/spice-server -I/usr/include/cacard -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/nss3 -I/usr/include/nspr4 -pthread -I/usr/include/libmount -I/usr/include/blkid -I/usr/include/uuid -I/usr/include/pixman-1  -I/tmp/qemu-test/src/tests -fsanitize=undefined -fsanitize=address -g  -Wl,--warn-common -Wl,-z,relro -Wl,-z,now -pie -m64 -g  -o tests/test-crypto-tlscredsx509 tests/test-crypto-tlscredsx509.o tests/crypto-tls-x509-helpers.o tests/pkix_asn1_tab.o authz/base.o authz/simple.o authz/list.o authz/listfile.o crypto/init.o crypto/hash.o crypto/hash-nettle.o crypto/hmac.o crypto/hmac-nettle.o crypto/aes.o crypto/desrfb.o crypto/cipher.o crypto/tlscreds.o crypto/tlscredsanon.o crypto/tlscredspsk.o crypto/tlscredsx509.o crypto/tlssession.o crypto/secret.o crypto/random-gnutls.o crypto/pbkdf.o crypto/pbkdf-nettle.o crypto/ivgen.o crypto/ivgen-essiv.o crypto/ivgen-plain.o crypto/ivgen-plain64.o crypto/afsplit.o crypto/xts.o crypto/block.o crypto/block-qcow.o crypto/block-luks.o qom/object.o qom/container.o qom/qom-qobject.o qom/object_interfaces.o  libqemuutil.a  -ltasn1 -lm -lz  -lgthread-2.0 -pthread -lglib-2.0   -lrt -lz -lutil -lcap-ng -lnettle  -lgnutls  
clang-7: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [/tmp/qemu-test/src/rules.mak:124: tests/test-qdev-global-props] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):


The full log is available at
http://patchew.org/logs/cover.1553510737.git.damien.hedde@greensocs.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by Peter Maydell 5 years ago
On Mon, 25 Mar 2019 at 11:02, Damien Hedde <damien.hedde@greensocs.com> wrote:
>
> Hi all,
>
> This series is a proposal to implement the multi-phase reset we've discussed
> here (https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00310.html) and
> more recently there
> (https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg00081.html).
>
> To summarize, we need a multi-phase reset to allow for a more complex
> initialization of a platform. In particular we need to "propagate" a clock
> tree, but we have to ensure that every device is initialized first.

Hi; I finally managed to get my thoughts about reset coherent
enough to write down in an email.


OVERVIEW

I had a read through the patchset and spent a while trying to
understand what we currently have.

Our current (device) reset model is:
 * single-phase
    -- there is only one 'reset' method
 * implicit
    -- devices don't need to be explicitly registered anywhere
       in a "reset hierarchy"; instead they are reset by virtue
       of being in the bus heirarchy they are already in
 * bus-based
    -- sysbus reset is registered in vl.c; free-floating other buses[*]
       (ie those with a NULL parent) have a reset registered in qbus_realize;
       buses with parents (ie anything with a non-NULL parent passed to
       qbus_create() or qbus_create_inplace()) will get traversed by
       the recursive traversal of whatever bus their parent is on.
 * not exhaustive
    -- any devices not put on a bus, or put on a bus whose parent device
       is not on a bus, will not get reset
 * not modelling GPIO reset signal lines

[*] It turns out we actually don't have any of these any more, so
we can remove the code that deals with them. The only parentless bus
is the main system bus, which is the root of the "reset hierarchy".

This patchset is trying to address:
 * changing to multi-phase
 * modelling of GPIO reset signal lines

It leaves reset as bus-based: currently we do this via qbus_walk_children/
qdev_walk_children, which traverse the bus->children and dev->child_bus
lists, and in the patchset's implementation of Resettable for qdev/qbus
the methods iterate through those.

I think this is reasonable -- we don't want to try to tackle every
reset related problem at once. The two issues the patchset is looking
at fit well together, because the GPIO-reset-lines are a motivation
for switching to multiphase (you need to handle both entering and
leaving reset).

("not exhaustive" is the thing we should really try to
fix at some point, but I have no good ideas for how to do this.)

API DESIGN

On what the right APIs should be: I think we should separate
"the API that's nice for devices to implement" from "the API that's
nice for callers wanting to reset a device". Here's my suggestion
for doing that:


Have the Resettable interface be:
 * init
 * hold
 * exit
 * get_reset_count
 * increment_reset_count (returns previous count)
 * decrement_reset_count (returns new count)
 * some method for "iterate over any Resettable child objects"
   (probably a "call this callback for each child" type API)

Individual devices implement init/hold/exit
Device base class implements the reset_count methods
Device base class implements a default 'init' method that calls dc->reset
  and default hold/exit that are no-ops
Device base class has a new vmstate subsection which migrates the
  reset count, subsection's needed function says "only send
  if reset count is non-zero". Back-compat here should be fine
  as current machines don't implement any way that a device in
  the system can be being held in reset at the point where a
  migration happens. If future changes add features to a
  machine model that permit that, the migration is forwards-compatible.
Device base class implements the iterate-over-children method to
 iterate over the dev->child_bus buses
Bus base class implements reset_count methods
Bus base class implements default no-op init/hold/exit
Bus base class implements the iterate-over-children method to
 iterate over the bus->children devices
Handling migration of the bus base class reset count is a little
 awkward -- we'll need to put it in a vmstate subsection which
 is part of the vmstate of the device which owns the bus.
 (I talked to David Gilbert on IRC about this and have some
 possibilities for how to do it, but will postpone writing them
 out until we've decided whether we like the APIs.)

The "API for devices to implement" is then the init/hold/exit
methods of the Resettable interface -- they don't need to worry
about these methods possibly being called multiple times, and
they don't need to handle the reference count or passing on
the calling of the phase methods to their children. They
just need to implement the correct behaviour for their device
for this phase.

The "API for callers wanting to reset a device" is a set of
helper functions that take a pointer to a Resettable object.
It's these that deal with the reset count and children:

resettable_assert_reset(Resettable *r)
{
   if (r->increment_reset_count() == 0) {
       r->init();
       r->foreach_child(do_call_init);
       r->hold();
       r->foreach_child(do_call_hold);
   }
}

resettable_deassert_reset(Resettable *r)
{
   if (r->decrement_reset_count() == 0) {
      r->foreach_child(do_call_exit);
      r->exit();
   }
}

plus a utility function for "call assert then deassert",
and maybe one that wraps the get_reset_count method.
So callers that want to reset devices (or buses) don't need
to care about phases, they just assert and then deassert reset.

Do you think that works ?


API TRANSITIONS

The other issue here is API transitions: the patchset essentially
obsoletes the old DeviceClass::reset function, for instance. I think
we should be clear about what the old and new APIs here are, and
what our plans for transitioning to the new ones are. In some cases
there are really very few users of the old API -- for instance the
patchset makes qbus_reset_all(bus) a synonym for qbus_reset(bus, false),
but there are only a dozen or so users of qbus_reset_all(). I think
we should just go ahead and convert them all. (For purposes of
structuring the patchset starting with a patch that says "reimplement
qbus_reset_all() in terms of qbus_reset()" is OK, but then we should
fix up the callers afterwards.) There are of course a lot more
implementations of DeviceState::reset so transitioning away from
that is a lot trickier, but we could look at a coccinelle script
that could automate it.

If you could describe in the cover letter of the next version of
the patchseries all the old APIs being deprecated and the new
ones that replace them, I think that would be very helpful.


MISCELLANEOUS MINOR POINTS

I know I suggested the idea of a ResetDomain object, but in the
series as it stands I'm not sure it's serving very much purpose;
perhaps we should drop it for the moment (just leaving the legacy
reset handlers and sysbus reset the way they are) ? We can come back
to it later as a concept.

The "support reset via GPIO line" patch looks generally OK, but
you can't just add fields to the DeviceState vmstate struct -- you'll
break migration from older QEMU versions. The new fields need to
go in a vmstate subsection with an appropriate 'needed' function.

We should definitely make sure we have good documentation for
what device authors should do to implement reset.

thanks
-- PMM

Re: [Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by Damien Hedde 4 years, 12 months ago
Hi,

On 5/16/19 2:41 PM, Peter Maydell wrote:
> On Mon, 25 Mar 2019 at 11:02, Damien Hedde <damien.hedde@greensocs.com> wrote:
>>
>> Hi all,
>>
>> This series is a proposal to implement the multi-phase reset we've discussed
>> here (https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00310.html) and
>> more recently there
>> (https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg00081.html).
>>
>> To summarize, we need a multi-phase reset to allow for a more complex
>> initialization of a platform. In particular we need to "propagate" a clock
>> tree, but we have to ensure that every device is initialized first.
> 
> Hi; I finally managed to get my thoughts about reset coherent
> enough to write down in an email.
> 
> 
> OVERVIEW
> 
> I had a read through the patchset and spent a while trying to
> understand what we currently have.
> 
> Our current (device) reset model is:
>  * single-phase
>     -- there is only one 'reset' method
>  * implicit
>     -- devices don't need to be explicitly registered anywhere
>        in a "reset hierarchy"; instead they are reset by virtue
>        of being in the bus heirarchy they are already in
>  * bus-based
>     -- sysbus reset is registered in vl.c; free-floating other buses[*]
>        (ie those with a NULL parent) have a reset registered in qbus_realize;
>        buses with parents (ie anything with a non-NULL parent passed to
>        qbus_create() or qbus_create_inplace()) will get traversed by
>        the recursive traversal of whatever bus their parent is on.
>  * not exhaustive
>     -- any devices not put on a bus, or put on a bus whose parent device
>        is not on a bus, will not get reset
>  * not modelling GPIO reset signal lines
> 
> [*] It turns out we actually don't have any of these any more, so
> we can remove the code that deals with them. The only parentless bus
> is the main system bus, which is the root of the "reset hierarchy".
> 
> This patchset is trying to address:
>  * changing to multi-phase
>  * modelling of GPIO reset signal lines
> 
> It leaves reset as bus-based: currently we do this via qbus_walk_children/
> qdev_walk_children, which traverse the bus->children and dev->child_bus
> lists, and in the patchset's implementation of Resettable for qdev/qbus
> the methods iterate through those.
> 
> I think this is reasonable -- we don't want to try to tackle every
> reset related problem at once. The two issues the patchset is looking
> at fit well together, because the GPIO-reset-lines are a motivation
> for switching to multiphase (you need to handle both entering and
> leaving reset).
> 
> ("not exhaustive" is the thing we should really try to
> fix at some point, but I have no good ideas for how to do this.)
> 
> API DESIGN
> 
> On what the right APIs should be: I think we should separate
> "the API that's nice for devices to implement" from "the API that's
> nice for callers wanting to reset a device". Here's my suggestion
> for doing that:
> 
> 
> Have the Resettable interface be:
>  * init
>  * hold
>  * exit
>  * get_reset_count
>  * increment_reset_count (returns previous count)
>  * decrement_reset_count (returns new count)
>  * some method for "iterate over any Resettable child objects"
>    (probably a "call this callback for each child" type API)
> 
> Individual devices implement init/hold/exit
> Device base class implements the reset_count methods
> Device base class implements a default 'init' method that calls dc->reset
>   and default hold/exit that are no-ops
> Device base class has a new vmstate subsection which migrates the
>   reset count, subsection's needed function says "only send
>   if reset count is non-zero". Back-compat here should be fine
>   as current machines don't implement any way that a device in
>   the system can be being held in reset at the point where a
>   migration happens. If future changes add features to a
>   machine model that permit that, the migration is forwards-compatible.
> Device base class implements the iterate-over-children method to
>  iterate over the dev->child_bus buses
> Bus base class implements reset_count methods
> Bus base class implements default no-op init/hold/exit
> Bus base class implements the iterate-over-children method to
>  iterate over the bus->children devices
> Handling migration of the bus base class reset count is a little
>  awkward -- we'll need to put it in a vmstate subsection which
>  is part of the vmstate of the device which owns the bus.
>  (I talked to David Gilbert on IRC about this and have some
>  possibilities for how to do it, but will postpone writing them
>  out until we've decided whether we like the APIs.)

If:
  1. we say we cannot use a Bus as holdable reset entry point (we can
reset it but not (de)assert_reset it without passing by its parent Device)
  2. if there is no additional reset dedicated state (right now a reset
on a bus mostly propagates it to sub-devices)
Then we have no state migration needed for buses and we can probably
find a way to use the parent Device's reset_count.

This is somewhat related to the API TRANSITION part below but I was
wondering if I should add some kind of flag in the interface class to
indicate the support of "migration-during-reset" functionality. It could
be used to at least warn when trying to hold the reset state in such an
class.
This flag will default to false and one would have to set it when
implementing the functionality for a given object class (like I did with
the cadence_uart device).
Maybe a similar flag can be used to allow/forbid using a class as an
holdable reset entry point.

> 
> The "API for devices to implement" is then the init/hold/exit
> methods of the Resettable interface -- they don't need to worry
> about these methods possibly being called multiple times, and
> they don't need to handle the reference count or passing on
> the calling of the phase methods to their children. They
> just need to implement the correct behaviour for their device
> for this phase.
> 
> The "API for callers wanting to reset a device" is a set of
> helper functions that take a pointer to a Resettable object.
> It's these that deal with the reset count and children:
> 
> resettable_assert_reset(Resettable *r)
> {
>    if (r->increment_reset_count() == 0) {
>        r->init();
>        r->foreach_child(do_call_init);
>        r->hold();
>        r->foreach_child(do_call_hold);
>    }
> }
> 
> resettable_deassert_reset(Resettable *r)
> {
>    if (r->decrement_reset_count() == 0) {
>       r->foreach_child(do_call_exit);
>       r->exit();
>    }
> }
> 
> plus a utility function for "call assert then deassert",
> and maybe one that wraps the get_reset_count method.
> So callers that want to reset devices (or buses) don't need
> to care about phases, they just assert and then deassert reset.
> 
> Do you think that works ?

I'll have to handle cold/warm cases but I think it will work.

> 
> 
> API TRANSITIONS
> 
> The other issue here is API transitions: the patchset essentially
> obsoletes the old DeviceClass::reset function, for instance. I think
> we should be clear about what the old and new APIs here are, and
> what our plans for transitioning to the new ones are. In some cases
> there are really very few users of the old API -- for instance the
> patchset makes qbus_reset_all(bus) a synonym for qbus_reset(bus, false),
> but there are only a dozen or so users of qbus_reset_all(). I think
> we should just go ahead and convert them all. (For purposes of
> structuring the patchset starting with a patch that says "reimplement
> qbus_reset_all() in terms of qbus_reset()" is OK, but then we should
> fix up the callers afterwards.) There are of course a lot more
> implementations of DeviceState::reset so transitioning away from
> that is a lot trickier, but we could look at a coccinelle script
> that could automate it.
> 
> If you could describe in the cover letter of the next version of
> the patchseries all the old APIs being deprecated and the new
> ones that replace them, I think that would be very helpful.>
> 
> MISCELLANEOUS MINOR POINTS
> 
> I know I suggested the idea of a ResetDomain object, but in the
> series as it stands I'm not sure it's serving very much purpose;
> perhaps we should drop it for the moment (just leaving the legacy
> reset handlers and sysbus reset the way they are) ? We can come back
> to it later as a concept.

I'm ok with that. It seems unnecessary at that point given that it was
not used by Device/Bus classes. It will make the series simplier.

> 
> The "support reset via GPIO line" patch looks generally OK, but
> you can't just add fields to the DeviceState vmstate struct -- you'll
> break migration from older QEMU versions. The new fields need to
> go in a vmstate subsection with an appropriate 'needed' function.
> 
> We should definitely make sure we have good documentation for
> what device authors should do to implement reset.

I agree, I did not want to commit too much into it before having a more
satisfying API. I'll work on it for the next version.

> 
> thanks
> -- PMM
> 

Thanks a lot for the review,
Damien

Re: [Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by Peter Maydell 4 years, 12 months ago
On Mon, 20 May 2019 at 15:32, Damien Hedde <damien.hedde@greensocs.com> wrote:
>
> Hi,
>
> On 5/16/19 2:41 PM, Peter Maydell wrote:
> > Handling migration of the bus base class reset count is a little
> >  awkward -- we'll need to put it in a vmstate subsection which
> >  is part of the vmstate of the device which owns the bus.
> >  (I talked to David Gilbert on IRC about this and have some
> >  possibilities for how to do it, but will postpone writing them
> >  out until we've decided whether we like the APIs.)
>
> If:
>   1. we say we cannot use a Bus as holdable reset entry point (we can
> reset it but not (de)assert_reset it without passing by its parent Device)
>   2. if there is no additional reset dedicated state (right now a reset
> on a bus mostly propagates it to sub-devices)
> Then we have no state migration needed for buses and we can probably
> find a way to use the parent Device's reset_count.

Yeah, I thought about this, but I decided I didn't like having
"there's a Resettable interface but you have to know that
some implementations are special and it's not possible to
do the same things with them you can with some others".
Implementing migration of the child bus reset state as part of
the parent device's vmstate is a bit tedious but not impossible.
(Basically in the parent device's pre-save function you have
to create a temporary array, populate it by walking the list
of child buses and looking at their reset flags, then migrate
the contents of the array, and in post-load you do "walk child
buses and fill in the reset flags from the array". There's a
VMSTATE_WITH_TMP macro that can apparently assist with this.)

> This is somewhat related to the API TRANSITION part below but I was
> wondering if I should add some kind of flag in the interface class to
> indicate the support of "migration-during-reset" functionality. It could
> be used to at least warn when trying to hold the reset state in such an
> class.
> This flag will default to false and one would have to set it when
> implementing the functionality for a given object class (like I did with
> the cadence_uart device).

I think that migration-during-reset should Just Work, without
necessarily needing any specific changes for a device.
I'm not sure a warning that only prints when the guest does
something to cause device-reset is going to be very helpful:
what would be the thing that the warning would tell us we needed
to do?

thanks
-- PMM

Re: [Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by Damien Hedde 5 years ago
Hi All,

Any comment about this ?

Thanks,
Damien

On 3/25/19 12:01 PM, Damien Hedde wrote:
> Hi all,
> 
> This series is a proposal to implement the multi-phase reset we've discussed
> here (https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg00310.html) and
> more recently there
> (https://lists.gnu.org/archive/html/qemu-devel/2019-03/msg00081.html).
> 
> To summarize, we need a multi-phase reset to allow for a more complex
> initialization of a platform. In particular we need to "propagate" a clock
> tree, but we have to ensure that every device is initialized first.
> 
> To solve this problem, the following 3-phases reset mechanism is proposed (I've
> removed the 4th given our last discussion).
> 
> #DESCRIPTION
> 
> INIT PHASE: Reset the object internal state, put a resetting flag and do the
>     same for the reset subtree. No side effect on other devices to guarantee
>     that, in a reset domain, everything get initialized first. This corresponds
>     mostly to what is currently done in the device/bus reset method.
> 
> HOLD PHASE: This phase allows to control a reset with a I/O. When a I/O control
>     a reset procedure based on the I/O level (not edge), we may need to assert
>     the reset, wait some time, and finally de-assert the reset. The consequence
>     is that such a device can stay in a "resetting state" and may need to show
>     this state to other devices through its outputs. For example, a clock
>     controller will typically shutdown its clocks when it is in resetting state.
> 
> EXIT PHASE (previously named 'release'): This phase sets outputs to state after
>      reset. For a clock controller it starts the clocks. It also clears the
>      "resetting" flag. A device should not react to inputs until this flag has
>      been cleared. During this phase, outputs are propagated in the reset domain
>      (and outside the reset domain).
> 
> To implement this, this series add a Resettable interface containing 3 methods:
> one for each phase. The init phase's method takes a boolean argument allowing to
> distinguish cold and warm resets.
> 
> In this series, Device and Bus implement the interface. A vmstate description
> subsection is added to allow migration of reset related state for device.
> 3 methods (one per phase) are also added in Device's class. They correspond to
> the local part (see the code example below) of the reset. They avoid device
> specialization to have to handle the sub-domain part and "resetting" state.
> 
> Functions to add gpio control over the reset are added. It is possible to add
> an active low/high reset pin to control the warm reset and an active low/high
> power gating pin to control the cold reset.
> 
> The bus/device reset tree is converted to this 3-phases mechanism. I also
> added a new ResetDomain object which is just a Resettable container. It is
> used to have a global "system reset domain" to handle main 3-phases reset and
> replace the current single-phase-reset handler mechanism.
> 
> As an example, in the xilinx_zynq machine, I've converted the slcr (the resets
> and clocks controller device) and the uart peripheral to this 3-phases reset
> mechanism. Gpio are added between the devices to transmit reset control from
> the slcr to the uarts.
> 
> All changes have been made so that existing reset behavior is not modified.
> 
> #INTERFACE CHOICE
> 
> To be honest, I have some doubt about the interface. I've kept it minimal and
> the consequence is implementation is complex (and kind of duplicated in every
> base implementation like Device or Bus). One of the problem is the `resetting`
> flag, which in practical must be a counter to handle reset "reentrance".
> Indeed nothing forbids to reset some device that is already held in "resetting"
> state by some other means. As an example, in the zynq machine, there can be 
> a global/system reset while an uart is reset by the slcr. It it also possible
> to cold and warm resets triggered concurrently.
> 
> The Resettable methods implementation have to looks this (methods are called
> with iothread locked):
> ```
> void init_phase(Device *dev, bool cold)
> {
>     /* call sub-resettable init phases */
> 
>     dev->resetting += 1;
>     /* do local init phase (eg: state reset) */
> }
> void hold_phase(Device *dev)
> {
>     /* call sub-resettable hold phases */
> 
>     /* do local hold phase (eg: set some I/O level) */
> }
> void exit_phase(Device *dev)
> {
>     /* call sub-resettable exit phases (independently of resetting value since,
>        every resettable have its own resetting counter) */
>    
>     dev->resetting -= 1;
>     if (dev->resetting == 0) {
>         /* do local exit phase (eg: set some I/O level) only if the device is
>            really leaving the resetting state */
>     }
> }
> ```
> Since I don't think specialization should care about sub-resettable and the
> resetting counter, I've added the local init/hold/exit phases as DeviceClass
> methods.
> Otherwise, I see two other solutions:
> + keep the interface as it is
> + add some state knowledge in the interface (with some kind of get_state method)
>   so that resetting counter and some kind of sub-resettable list are handled in
>   the common code in the interface. The 3 methods will then handle only the local
>   actions.
> + have 6 methods in the interface, one for the local actions, one for the
>   sub-resettable so that sub-resettable is not handled in the common code. And we
>   need also a get_resetting method to access/update the counter.
> 
> #DEVICE RESET CHOICE
> 
> The Device is a special case, it has 2 reset entry point: `qdev_reset_all` and
> `device_reset`. The latter doing a device reset only, while the former also
> reset all buses hierarchy under the device.
> 
> I choose the put the sub-buses into the device reset domain, so that the
> behavior of the resettable interface on Device is the same as qdev_reset_all.
> I don't know if `device_reset` has some real meaning (resetting only the
> Device). It is not often used and I think every time it is used there is no
> sub-buses so the behavior is the same for both functions.
> 
> If I am mistaken about putting buses into device reset domain, it is possible
> to make a special bus-hierarchy-reset-domain for every Device/Bus that differs
> from the Resettable interface on Device/Bus.
> 
> # SYSBUS SUPPORT
> 
> Regarding the sysbus support Edgar mentioned in the prevous discussion: In this
> series, there is no support in sysbus base class for disabling memory regions.
> Having each sysbus device specialization, in every memory region handler, to
> check if `resetting` is set or not is not user-friendly. But for we can't 
> modify memory region `enabled` flags since devices may already set/unset them.
> Maybe we could have some kind of super-switch to disable all memory regions in
> a sysbus device but I don't know how this could be done.
> 
> The series is organised as follow:
> Patches 1 and 2 adds Resettable interface and ResetDomain object.
> Patches 3 to 7 converts Device and Bus to Resettable.
> Patches 8 to 12 handles the global system reset domain
> Patches 13 to 17 do the zynq implementation (patch 13 is an already-reviewed
> patch from the clock api patch series)
> 
> Thank you for your feedback,
> Damien
> 
> Damien Hedde (17):
>   Create Resettable QOM interface
>   Create the ResetDomain QOM object
>   make Device and Bus Resettable
>   Add local reset methods in Device class
>   add vmstate description for device reset state
>   Add function to control reset with gpio inputs
>   convert qdev/bus_reset_all to Resettable
>   Add a global ResetDomain object for system emulation
>   global ResetDomain support for legacy reset handlers
>   Delete the system ResetDomain at the end of emulation
>   Put orphan buses in system reset domain
>   Put default sysbus in system reset domain
>   hw/misc/zynq_slcr: use standard register definition
>   convert cadence_uart to 3-phases reset
>   Convert zynq's slcr to 3-phases reset
>   Add uart reset support in zynq_slcr
>   Connect the uart reset gpios in the zynq platform
> 
>  hw/arm/xilinx_zynq.c           |  14 +-
>  hw/char/cadence_uart.c         |  48 ++-
>  hw/core/Makefile.objs          |   3 +
>  hw/core/bus.c                  |  64 +++-
>  hw/core/qdev-vmstate.c         |  27 ++
>  hw/core/qdev.c                 | 166 ++++++++++-
>  hw/core/reset-domain.c         | 121 ++++++++
>  hw/core/reset.c                | 149 +++++++++-
>  hw/core/resettable.c           |  69 +++++
>  hw/misc/zynq_slcr.c            | 515 ++++++++++++++++++---------------
>  include/hw/char/cadence_uart.h |  10 +-
>  include/hw/qdev-core.h         |  97 +++++++
>  include/hw/reset-domain.h      |  49 ++++
>  include/hw/resettable.h        |  83 ++++++
>  include/sysemu/reset.h         |  47 +++
>  vl.c                           |   3 +-
>  16 files changed, 1195 insertions(+), 270 deletions(-)
>  create mode 100644 hw/core/qdev-vmstate.c
>  create mode 100644 hw/core/reset-domain.c
>  create mode 100644 hw/core/resettable.c
>  create mode 100644 include/hw/reset-domain.h
>  create mode 100644 include/hw/resettable.h
> 

Re: [Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by Peter Maydell 5 years ago
On Mon, 29 Apr 2019 at 10:36, Damien Hedde <damien.hedde@greensocs.com> wrote:
>
> Hi All,
>
> Any comment about this ?

Sorry we haven't got to this yet. This is on my to-review list,
but so are 23 other series, and I've been on holiday for the
past week or so. I will try to get to it this week.

thanks
-- PMM

Re: [Qemu-devel] [RFC 00/17] multi-phase reset mechanism
Posted by Peter Maydell 5 years ago
On Mon, 29 Apr 2019 at 10:41, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Mon, 29 Apr 2019 at 10:36, Damien Hedde <damien.hedde@greensocs.com> wrote:
> >
> > Hi All,
> >
> > Any comment about this ?
>
> Sorry we haven't got to this yet. This is on my to-review list,
> but so are 23 other series, and I've been on holiday for the
> past week or so. I will try to get to it this week.

I made a start on this, but it's going to take me some time
to wrap my head around the design and the issues you raise
in the cover letter. Given the bank holiday on Monday and
some other commitments I have on Tuesday I think it's going
to be the second half of next week at best before I can
do a proper response.

In general I like the code (though as you say we may not
have got the interface quite right -- I had not appreciated
some of the nested-reset cases when I proposed it I think).

I think we could use a document (which would eventually live
in docs/devel) that describes the reset system and provides some
how-to style documentation for
 * what you need to do if you're writing a device
 * what you need to do if you are an external user of
   a device model and you want to reset it
plus some notes on whatever we still have left over as
remnants of the old reset scheme and any intended transitions
to try to remove those remnants. I'll have a go at writing
something because I think that will help me personally in
trying to understand the problem domain and what the new
design is doing.

thanks
-- PMM