tcg-plugins: add hooks for interrupts, exceptions and traps

[RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Julian Ganz 3 days, 4 hours ago

Some analysis greatly benefits, or depends on, information about
interrupts. For example, we may need to handle the execution of a new
translation block differently if it is not the result of normal program
flow but of an interrupt.

Even with the existing interfaces, it is more or less possible to
discern these situations, e.g. as done by the cflow plugin. However,
this process poses a considerable overhead to the core analysis one may
intend to perform.

These changes introduce a generic and easy-to-use interface for plugin
authors in the form of callbacks for different types of traps. Patch 1
defines the callback registration functions and supplies a somewhat
narrow definition of the events for which the callbacks are called. Patch 2 adds
some hooks for triggering the callbacks. Patch 3 adds an example plugin
showcasing the new API. The remaining patches call the hooks for a
selection of architectures, mapping architecture specific events to the
three categories defined in patch 1. Future non-RFC patchsets will call
these hooks for all architectures (that have some concept of trap or
interrupt).
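
To illustrate the shape of such an interface, here is a minimal,
self-contained sketch. It does not use the identifiers from the patches:
trap_kind, register_interrupt_cb(), register_exception_cb(),
register_semihosting_cb() and trap_hook() are hypothetical stand-ins, the
three category names are assumptions, and the callback type is simplified.
It only shows one registration function per category, a hook that target
code would call, and a plugin counting events.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical categories; the names used by the series may differ. */
enum trap_kind {
    TRAP_INTERRUPT,
    TRAP_EXCEPTION,
    TRAP_SEMIHOSTING,
    TRAP_KIND_COUNT
};

/* One (simplified) callback type shared by every category. */
typedef void (*trap_cb_t)(unsigned int vcpu_index);

#define MAX_CBS 4

static trap_cb_t cbs[TRAP_KIND_COUNT][MAX_CBS];
static size_t n_cbs[TRAP_KIND_COUNT];

static void register_cb(enum trap_kind kind, trap_cb_t cb)
{
    assert(cb != NULL && n_cbs[kind] < MAX_CBS);
    cbs[kind][n_cbs[kind]++] = cb;
}

/* One registration function per category. */
void register_interrupt_cb(trap_cb_t cb)   { register_cb(TRAP_INTERRUPT, cb); }
void register_exception_cb(trap_cb_t cb)   { register_cb(TRAP_EXCEPTION, cb); }
void register_semihosting_cb(trap_cb_t cb) { register_cb(TRAP_SEMIHOSTING, cb); }

/* Hook that architecture code would call where the trap is taken. */
void trap_hook(enum trap_kind kind, unsigned int vcpu_index)
{
    for (size_t i = 0; i < n_cbs[kind]; i++) {
        cbs[kind][i](vcpu_index);
    }
}

/* Plugin side: count events per category. */
unsigned long interrupts_seen, exceptions_seen;

void on_interrupt(unsigned int vcpu_index) { (void)vcpu_index; interrupts_seen++; }
void on_exception(unsigned int vcpu_index) { (void)vcpu_index; exceptions_seen++; }
```

A plugin would call register_interrupt_cb(on_interrupt) once at install
time; categories nobody registered for cost only an empty loop in the hook.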

Sidenote: I'm likely doing something wrong for one architecture or
the other. For example, with the old version Alex Bennée suggested
registering a helper function with arm_register_el_change_hook() for
arm, which is not what I ended up doing. And for AVR my approach to just
assume only (asynchronous) interrupts exist is also likely too naïve.

Since v1:
  - Split the one callback into multiple callbacks
  - Added a target-agnostic definition of the relevant event(s)
  - Call hooks from architecture-code rather than accel/tcg/cpu-exec.c
  - Added a plugin showcasing API usage

Julian Ganz (7):
  plugins: add API for registering trap related callbacks
  plugins: add hooks for new trap related callbacks
  contrib/plugins: add plugin showcasing new trap related API
  target/arm: call plugin trap callbacks
  target/avr: call plugin trap callbacks
  target/riscv: call plugin trap callbacks
  target/sparc: call plugin trap callbacks

 contrib/plugins/Makefile     |  1 +
 contrib/plugins/traps.c      | 89 ++++++++++++++++++++++++++++++++++++
 include/qemu/plugin-event.h  |  3 ++
 include/qemu/plugin.h        | 12 +++++
 include/qemu/qemu-plugin.h   | 45 ++++++++++++++++++
 plugins/core.c               | 42 +++++++++++++++++
 plugins/qemu-plugins.symbols |  3 ++
 target/arm/helper.c          | 23 ++++++++++
 target/avr/helper.c          |  3 ++
 target/riscv/cpu_helper.c    |  8 ++++
 target/sparc/int32_helper.c  |  7 +++
 target/sparc/int64_helper.c  | 10 ++++
 12 files changed, 246 insertions(+)
 create mode 100644 contrib/plugins/traps.c

-- 
2.45.2

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Pierrick Bouvier 1 day, 3 hours ago

Hi Julian,

On 10/19/24 09:39, Julian Ganz wrote:
> Some analysis greatly benefits, or depends on, information about
> interrupts. For example, we may need to handle the execution of a new
> translation block differently if it is not the result of normal program
> flow but of an interrupt.
> 
> Even with the existing interfaces, it is more or less possible to
> discern these situations, e.g. as done by the cflow plugin. However,
> this process poses a considerable overhead to the core analysis one may
> intend to perform.
>

I agree it would be useful. Beyond the scope of this series, it would be
nice if we could add a control flow related API instead of asking
plugins to do it themselves.

If we provided something like this, would there still be value in adding
an API to detect interrupt/exception/trap events?

Note: This is not a criticism of what you sent, just an open question on
*why* it's useful to access this QEMU implementation related information
vs something more generic.

> These changes introduce a generic and easy-to-use interface for plugin
> authors in the form of callbacks for different types of traps. Patch 1
> defines the callback registration functions and supplies a somewhat
> narrow definition of the events for which the callbacks are called. Patch 2 adds
> some hooks for triggering the callbacks. Patch 3 adds an example plugin
> showcasing the new API. The remaining patches call the hooks for a
> selection of architectures, mapping architecture specific events to the
> three categories defined in patch 1. Future non-RFC patchsets will call
> these hooks for all architectures (that have some concept of trap or
> interrupt).
> 
> Sidenote: I'm likely doing something wrong for one architecture or
> the other. For example, with the old version Alex Bennée suggested
> registering a helper function with arm_register_el_change_hook() for
> arm, which is not what I ended up doing. And for AVR my approach to just
> assume only (asynchronous) interrupts exist is also likely too naïve.
> 
> Since v1:
>    - Split the one callback into multiple callbacks
>    - Added a target-agnostic definition of the relevant event(s)
>    - Call hooks from architecture-code rather than accel/tcg/cpu-exec.c
>    - Added a plugin showcasing API usage
> 
> Julian Ganz (7):
>    plugins: add API for registering trap related callbacks
>    plugins: add hooks for new trap related callbacks
>    contrib/plugins: add plugin showcasing new trap related API
>    target/arm: call plugin trap callbacks
>    target/avr: call plugin trap callbacks
>    target/riscv: call plugin trap callbacks
>    target/sparc: call plugin trap callbacks
> 
>   contrib/plugins/Makefile     |  1 +
>   contrib/plugins/traps.c      | 89 ++++++++++++++++++++++++++++++++++++
>   include/qemu/plugin-event.h  |  3 ++
>   include/qemu/plugin.h        | 12 +++++
>   include/qemu/qemu-plugin.h   | 45 ++++++++++++++++++
>   plugins/core.c               | 42 +++++++++++++++++
>   plugins/qemu-plugins.symbols |  3 ++
>   target/arm/helper.c          | 23 ++++++++++
>   target/avr/helper.c          |  3 ++
>   target/riscv/cpu_helper.c    |  8 ++++
>   target/sparc/int32_helper.c  |  7 +++
>   target/sparc/int64_helper.c  | 10 ++++
>   12 files changed, 246 insertions(+)
>   create mode 100644 contrib/plugins/traps.c
>

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Julian Ganz 23 hours ago

Hi, Pierrick,

October 21, 2024 at 8:00 PM, "Pierrick Bouvier" wrote:
> I agree it would be useful. Beyond the scope of this series, it would be
> nice if we could add a control flow related API instead of asking
> plugins to do it themselves.
> 
> If we provided something like this, would there still be value in adding
> an API to detect interrupt/exception/trap events?
> 
> Note: This is not a criticism of what you sent, just an open question on
> *why* it's useful to access this QEMU implementation related information
> vs something more generic.

The motivation for this API is a plugin that simulates a RISC-V tracing
unit (and produces a trace). For that we actually also needed to
track the "regular" control flow, i.e. find out whether a branch was
taken or where a jump went. That wasn't hard, especially considering
that the TCG API already gives you (more or less) basic blocks. Still,
we ended up tracing every instruction because that made some of the logic
much simpler and easier to reason about.

We realized that we need a trap API because traps:
* can occur at any time/point of execution
* usually come with additional effects such as mode changes.

Helpers for discerning whether an instruction is a jump, a branch
instruction or something else would certainly be helpful if you wanted
cross-platform control flow tracing of some sort, but afaik given such
helpers you would just need to check the last instruction in a
translation block and check where the PC goes after that. Additional
callbacks for specifically this situation strike me as a bit
excessive.

But I could be wrong about that.
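
For illustration, the end-of-block check described above might look
roughly like this. All names (tb_end, classify_transition(), trap_pending)
are hypothetical, and the sketch assumes the plugin already knows the
address and size of the last instruction of the block, the PC that
executed next, and whether a trap callback fired in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum cf_kind { CF_FALLTHROUGH, CF_TAKEN, CF_TRAP };

/* What the plugin remembers about the end of a translation block. */
struct tb_end {
    uint64_t last_insn_vaddr;
    uint32_t last_insn_size;
};

/*
 * Classify how execution left the block. A trap can cut a block short at
 * any instruction, so it takes precedence over the PC comparison.
 */
enum cf_kind classify_transition(const struct tb_end *tb, uint64_t next_pc,
                                 bool trap_pending)
{
    assert(tb->last_insn_size > 0);
    if (trap_pending) {
        return CF_TRAP;
    }
    if (next_pc == tb->last_insn_vaddr + tb->last_insn_size) {
        return CF_FALLTHROUGH;
    }
    return CF_TAKEN;
}
```

Note that a branch whose target happens to be the next sequential
instruction is indistinguishable from a fall-through here.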

Regards,
Julian

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Pierrick Bouvier 23 hours ago

On 10/21/24 14:02, Julian Ganz wrote:
> Hi, Pierrick,
> 
> October 21, 2024 at 8:00 PM, "Pierrick Bouvier" wrote:
>> I agree it would be useful. Beyond the scope of this series, it would be
>> nice if we could add a control flow related API instead of asking
>> plugins to do it themselves.
>>
>> If we provided something like this, would there still be value in adding
>> an API to detect interrupt/exception/trap events?
>>
>> Note: This is not a criticism of what you sent, just an open question on
>> *why* it's useful to access this QEMU implementation related information
>> vs something more generic.
> 
> The motivation for this API is a plugin that simulates a RISC-V tracing
> unit (and produces a trace). For that we actually also needed to
> track the "regular" control flow, i.e. find out whether a branch was
> taken or where a jump went. That wasn't hard, especially considering
> that the TCG API already gives you (more or less) basic blocks. Still,
> we ended up tracing every instruction because that made some of the logic
> much simpler and easier to reason about.
> 
> We realized that we need a trap API because traps:
> * can occur at any time/point of execution
> * usually come with additional effects such as mode changes.
> 

Thanks for sharing your insights.
I think there is definitely value in what you offer, and I'm trying to
think about how we could easily extend it in the future without adding
another callback whenever a new event appears. In my experience with
plugins, the fewer callbacks we have, and the simpler they are, the better.

Maybe we could have a single API like:

enum qemu_plugin_cf_event_type {
	QEMU_PLUGIN_CF_INTERRUPT,
	QEMU_PLUGIN_CF_TRAP,
	QEMU_PLUGIN_CF_SEMIHOSTING,
};

/* Sum type, a.k.a. "Rust-like" enum */
typedef struct {
     enum qemu_plugin_cf_event_type type;
     union {
         data_for_interrupt interrupt;
         data_for_trap trap;
         data_for_semihosting semihosting;
     };
} qemu_plugin_cf_event;
/* data_for_... could contain things like from/to addresses, interrupt 
id, ... */

...

void on_cf_event(qemu_plugin_cf_event ev)
{
	switch (ev.type) {
		case QEMU_PLUGIN_CF_TRAP:
			...
			break;
		case QEMU_PLUGIN_CF_SEMIHOSTING:
			...
			break;
		default:
			g_assert_not_reached();
	}
}

/* a plugin can register for one or several events - we could provide a
QEMU_PLUGIN_CF_ALL for plugins tracking all events. */
qemu_plugin_register_cf_cb(QEMU_PLUGIN_CF_TRAP, &on_cf_event);
qemu_plugin_register_cf_cb(QEMU_PLUGIN_CF_SEMIHOSTING, &on_cf_event);

This way, a single callback can be registered for one or several events. 
And in the future, we are free to attach more data for every event, and 
add other events (TB_FALLTHROUGH, TB_JUMP, etc).

> Helpers for discerning whether an instruction is a jump, a branch
> instruction or something else would certainly be helpful if you wanted
> cross-platform control flow tracing of some sort, but afaik given such
> helpers you would just need to check the last instruction in a
> translation block and check where the PC goes after that. Additional
> callbacks for specifically this situation strike me as a bit
> excessive.
>
> But I could be wrong about that.
>

You're right, and the current cflow plugin is more a demonstration of
the existing API than an efficient solution to this problem.
For cflow detection specifically, I think we can do better by adding
instrumentation right where we chain/jump between TBs and, of course,
by tracking other events like you did in this series.

> Regards,
> Julian

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Julian Ganz 12 hours ago

Hi, Pierrick,

October 21, 2024 at 11:59 PM, "Pierrick Bouvier" wrote:
> On 10/21/24 14:02, Julian Ganz wrote:
> >  The motivation for this API is a plugin that simulates a RISC-V tracing
> >  unit (and produces a trace). For that we actually also needed to
> >  track the "regular" control flow, i.e. find out whether a branch was
> >  taken or where a jump went. That wasn't hard, especially considering
> >  that the TCG API already gives you (more or less) basic blocks. Still,
> >  we ended up tracing every instruction because that made some of the logic
> >  much simpler and easier to reason about.
> >  We realized that we need a trap API because traps:
> >  * can occur at any time/point of execution
> >  * usually come with additional effects such as mode changes.
> > 
> Thanks for sharing your insights.
> I think there is definitely value in what you offer, and I'm trying to think about how we could easily extend it in the future without adding another callback whenever a new event appears. In my experience with plugins, the fewer callbacks we have, and the simpler they are, the better.
> 
> Maybe we could have a single API like:
> 
> enum qemu_plugin_cf_event_type {
>  QEMU_PLUGIN_CF_INTERRUPT,
>  QEMU_PLUGIN_CF_TRAP,
>  QEMU_PLUGIN_CF_SEMIHOSTING,
> };

I have considered such an enum, as an input for the callback, as a
parameter of the registration function, and both. Of course, if you were
to add a selection parameter for the registration function, you likely
want OR-able flags.
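
A sketch of what OR-able flags on a single registration function could
look like. None of these names exist in QEMU: cf_event_flags,
register_cf_cb() and emit_cf_event() are made up for the example, and the
callback receives the event as an extra argument, which implies a callback
type of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event flags; each event gets its own bit. */
enum cf_event_flags {
    CF_EV_INTERRUPT   = 1u << 0,
    CF_EV_TRAP        = 1u << 1,
    CF_EV_SEMIHOSTING = 1u << 2,
    CF_EV_ALL         = CF_EV_INTERRUPT | CF_EV_TRAP | CF_EV_SEMIHOSTING
};

typedef void (*cf_cb_t)(unsigned int event, unsigned int vcpu_index);

struct cf_sub {
    unsigned int mask;
    cf_cb_t cb;
};

#define MAX_SUBS 8

static struct cf_sub subs[MAX_SUBS];
static size_t n_subs;

/* Register one callback for any combination of events. */
void register_cf_cb(unsigned int mask, cf_cb_t cb)
{
    assert(cb != NULL && n_subs < MAX_SUBS);
    assert((mask & ~(unsigned int)CF_EV_ALL) == 0);
    subs[n_subs].mask = mask;
    subs[n_subs].cb = cb;
    n_subs++;
}

/* Deliver a single event to every subscriber whose mask selects it. */
void emit_cf_event(unsigned int event, unsigned int vcpu_index)
{
    for (size_t i = 0; i < n_subs; i++) {
        if (subs[i].mask & event) {
            subs[i].cb(event, vcpu_index);
        }
    }
}

/* Plugin side: remember which events arrived and how many. */
unsigned int seen_mask, calls;

void on_event(unsigned int event, unsigned int vcpu_index)
{
    (void)vcpu_index;
    seen_mask |= event;
    calls++;
}
```

A plugin interested in two kinds of events would then call
register_cf_cb(CF_EV_TRAP | CF_EV_SEMIHOSTING, on_event) once.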

An additional input for the callback type would obviously require a new
function type just for that callback. Since the callbacks are somewhat
similar to the VCPU init, exit, resume, ... ones it felt appropriate
to use the same function type for all of them.

As for the registration, it may make the registration a bit more
convenient and maybe keep the API clutter a bit lower, but not by that
much.

> /* Sum type, a.k.a. "Rust-like" enum */
> typedef struct {
>  enum qemu_plugin_cf_event_type type;
>  union {
>  data_for_interrupt interrupt;
>  data_for_trap trap;
>  data_for_semihosting semihosting;
>  };
> } qemu_plugin_cf_event;
> /* data_for_... could contain things like from/to addresses, interrupt id, ... */

I don't think this is a good idea.

Traps are just too diverse, imo there is too little overlap between
different architectures, with the sole exception maybe being the PC
prior to the trap. "Interrupt id" sounds like a reasonably common
concept, but then you would need to define a mapping for each and every
architecture. What integer type do you use? In RISC-V, for example,
exceptions and interrupt "ids" are differentiated via the most
significant bit. Do you keep that or do you zero it? And then there's
ring/privilege mode, cause (sometimes for each mode), ...
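
To make the RISC-V case concrete: the cause register (mcause/scause)
keeps an interrupt flag in its most significant bit and the code in the
remaining bits, so a target-agnostic "id" has to decide what to do with
that flag. A minimal sketch of splitting the two, with hypothetical names
(trap_cause, decode_cause()) and XLEN passed explicitly:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct trap_cause {
    bool is_interrupt;  /* most significant bit of the cause register */
    uint64_t code;      /* remaining bits: interrupt or exception code */
};

/* Split a raw RISC-V cause value for a hart with the given XLEN. */
struct trap_cause decode_cause(uint64_t cause, unsigned int xlen)
{
    assert(xlen == 32 || xlen == 64);
    uint64_t irq_bit = UINT64_C(1) << (xlen - 1);
    struct trap_cause out = {
        .is_interrupt = (cause & irq_bit) != 0,
        .code = cause & (irq_bit - 1),
    };
    return out;
}
```

Whether a generic API would expose the raw value, the split pair, or
neither is exactly the kind of per-architecture decision in question.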

It would also complicate call sites for hooks in target code. You'd
either need awkwardly long function signatures or setup code for that
struct. Both are things you don't want, as a hook call site should
never distract from the actual logic surrounding them. You could
probably have something reasonable in Rust, using a builder/command
pattern. But in C this would require more boilerplate code than
I'd be comfortable with.

Regards,
Julian

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Alex Bennée 12 hours ago

"Julian Ganz" <nenut@skiff.uberspace.de> writes:

> Hi, Pierrick,
>
> October 21, 2024 at 11:59 PM, "Pierrick Bouvier" wrote:
>> On 10/21/24 14:02, Julian Ganz wrote:
>> >  The motivation for this API is a plugin that simulates a RISC-V tracing
>> >  unit (and produces a trace). For that we actually also needed to
>> >  track the "regular" control flow, i.e. find out whether a branch was
>> >  taken or where a jump went. That wasn't hard, especially considering
>> >  that the TCG API already gives you (more or less) basic blocks. Still,
>> >  we ended up tracing every instruction because that made some of the logic
>> >  much simpler and easier to reason about.
>> >  We realized that we need a trap API because traps:
>> >  * can occur at any time/point of execution
>> >  * usually come with additional effects such as mode changes.
>> > 
>> Thanks for sharing your insights.
>> I think there is definitely value in what you offer, and I'm trying
>> to think about how we could easily extend it in the future without
>> adding another callback whenever a new event appears. In my experience
>> with plugins, the fewer callbacks we have, and the simpler they are,
>> the better.
>> 
>> Maybe we could have a single API like:
>> 
<snip>
>
> Traps are just too diverse, imo there is too little overlap between
> different architectures, with the sole exception maybe being the PC
> prior to the trap. "Interrupt id" sounds like a reasonably common
> concept, but then you would need to define a mapping for each and every
> architecture. What integer type do you use? In RISC-V, for example,
> exceptions and interrupt "ids" are differentiated via the most
> significant bit. Do you keep that or do you zero it? And then there's
> ring/privilege mode, cause (sometimes for each mode), ...
>
> It would also complicate call sites for hooks in target code. You'd
> either need awkwardly long function signatures or setup code for that
> struct. Both are things you don't want, as a hook call site should
> never distract from the actual logic surrounding them. You could
> probably have something reasonable in Rust, using a builder/command
> pattern. But in C this would require more boilerplate code than
> I'd be comfortable with.

How easy would it be to expose a Rust API? I'm curious because, now that
we are looking to integrate Rust into QEMU, we could consider transitioning
to a Rust API for plugins. It has been done before:

  https://github.com/novafacing/qemu-rs/tree/main/qemu-plugin-sys

and

  https://github.com/novafacing/qemu-rs/tree/main/qemu-plugin

I'm curious as to what it would look like. I don't know how tenable it
would be to run 2 plugin APIs side-by-side long term though. We would
probably want to make a choice. Also, how would that affect other C-like
APIs, like Python?

>
> Regards,
> Julian

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Julian Ganz 48 minutes ago

Hi, Alex,

October 22, 2024 at 10:58 AM, "Alex Bennée" wrote:
> How easy would it be to expose a Rust API? I'm curious because, now that
> we are looking to integrate Rust into QEMU, we could consider transitioning
> to a Rust API for plugins. It has been done before:
> 
>  https://github.com/novafacing/qemu-rs/tree/main/qemu-plugin-sys
> 
> and
> 
>  https://github.com/novafacing/qemu-rs/tree/main/qemu-plugin
> 
> I'm curious as to what it would look like. I don't know how tenable it
> would be to run 2 plugin APIs side-by-side long term though. We would
> probably want to make a choice. Also, how would that affect other C-like
> APIs, like Python?

I may not be an expert w.r.t. plugins with Rust, but here are my
thoughts:

Calling C code from Rust is obviously not an issue. For idiomatic Rust
(not littered with "unsafe") you want an abstraction over that, but as
QEMU is C you need that somewhere anyway.

Things that are generally easy to handle are opaque types behind
pointers (this is probably true for most language bindings) and C
strings, as long as you can figure out who owns them and how long they
live. Things in the current API which make things a bit awkward are
(probably) unions and GLib types such as the GArray returned by
qemu_plugin_get_registers. Also, you can use Rust functions for
callbacks, but ideally you want to allow using all types implementing
the Fn trait, e.g. closures carrying some context. For that, you need to
transport that context from the point of registration to the callback,
i.e. you need some udata. Not all callbacks have udata, but looking
closer, the ones lacking it are those you register only once, which means
we could have some "static" storage for that context. It's not ideal,
but not a show-stopper either. I didn't check how the qemu-plugin crate
handles that situation.
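
A rough C-side sketch of the two situations, using made-up names rather
than the real plugin API: a registration function that carries udata to
the callback, and a callback type without udata where the context is
parked in static storage, which only works because such a callback is
registered once.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*udata_cb_t)(unsigned int vcpu_index, void *udata);
typedef void (*simple_cb_t)(unsigned int vcpu_index);

/* The "context" a closure would capture. */
struct counter {
    unsigned long hits;
};

/* Case 1: the registry stores udata next to the callback. */
static udata_cb_t reg_cb;
static void *reg_udata;

void register_with_udata(udata_cb_t cb, void *udata)
{
    assert(cb != NULL);
    reg_cb = cb;
    reg_udata = udata;
}

void fire_with_udata(unsigned int vcpu_index)
{
    if (reg_cb) {
        reg_cb(vcpu_index, reg_udata);
    }
}

void count_hit(unsigned int vcpu_index, void *udata)
{
    (void)vcpu_index;
    ((struct counter *)udata)->hits++;
}

/* Case 2: no udata, so the binding keeps the context in static storage. */
static struct counter *static_ctx;
static simple_cb_t simple_cb;

void simple_trampoline(unsigned int vcpu_index)
{
    assert(static_ctx != NULL);
    count_hit(vcpu_index, static_ctx);
}

void register_simple(simple_cb_t cb)
{
    simple_cb = cb;
}

void fire_simple(unsigned int vcpu_index)
{
    if (simple_cb) {
        simple_cb(vcpu_index);
    }
}
```

The second variant breaks down as soon as the same callback type can be
registered more than once with different contexts.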

With a native Rust interface, you would not have those problems.
However, for plugins you would need a dylib interface, which comes with
restrictions. In particular, you cannot use generics in the interface.
To allow for the very extensibility we want, the interface would make heavy
use of Box<dyn SomeTrait>, in particular Box<dyn Fn(...) -> ...>.

The awkward thing about those is that you cannot simply convert them
into a void pointer because the "dyn" means fat pointers are involved:
unlike in C++, the vtable is embedded in the "pointer". Since we want to
invoke those from the C side, we need another level of indirection which
lets us operate on an opaque Rust type through non-generic functions
that then do what we want with the boxed value. Ironically, you
don't have the same problem with a Rust plugin developed against a C
interface because you can just use a generic function unpacking (i.e.
casting) the context and throw its instantiation's pointer over the
fence.
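
The extra level of indirection can be modelled in plain C. This is only
an analogy under stated assumptions: fat_ptr stands in for a
Box<dyn Fn(u32)>, the vtable layout is invented (Rust does not guarantee
one), and all helper names are hypothetical. The fat pointer is boxed
once more on the heap so that a thin void pointer can cross the boundary,
and non-generic trampolines operate on it.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented vtable: how to call and how to destroy the captured state. */
struct fn_vtable {
    void (*call)(void *data, unsigned int arg);
    void (*drop)(void *data);   /* may be NULL if nothing is owned */
};

/* Two words wide, so it does not fit into a single void pointer. */
struct fat_ptr {
    void *data;
    const struct fn_vtable *vtable;
};

/* Box the fat pointer so that a thin pointer can be handed to C. */
void *fat_into_thin(struct fat_ptr fp)
{
    struct fat_ptr *boxed = malloc(sizeof(*boxed));
    assert(boxed != NULL);
    *boxed = fp;
    return boxed;
}

/* Non-generic trampolines operating on the opaque thin pointer. */
void trampoline_call(void *thin, unsigned int arg)
{
    struct fat_ptr *fp = thin;
    fp->vtable->call(fp->data, arg);
}

void trampoline_drop(void *thin)
{
    struct fat_ptr *fp = thin;
    if (fp->vtable->drop) {
        fp->vtable->drop(fp->data);
    }
    free(fp);
}

/* Example "closure": adds every argument to a captured total. */
struct sum_ctx {
    unsigned long total;
};

void sum_call(void *data, unsigned int arg)
{
    ((struct sum_ctx *)data)->total += arg;
}

const struct fn_vtable sum_vtable = { sum_call, NULL };
```

On the Rust side, the corresponding trick is usually to box the trait
object a second time and pass the raw pointer of the outer box around.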

So... I'm not sure about the benefits of a native Rust plugin API
compared to the qemu-plugin crate or something similar. Especially
considering that we would want to use the very same callback registry in
the back, anyway. That is, if you want feature parity between the
different plugin APIs.

There are some things that would make language bindings easier in
general, but some of those would involve breaking changes and may not
be worth it.

Regards,
Julian

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Alex Bennée 1 day, 2 hours ago

Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:

> Hi Julian,
>
> On 10/19/24 09:39, Julian Ganz wrote:
>> Some analysis greatly benefits, or depends on, information about
>> interrupts. For example, we may need to handle the execution of a new
>> translation block differently if it is not the result of normal program
>> flow but of an interrupt.
>> Even with the existing interfaces, it is more or less possible to
>> discern these situations, e.g. as done by the cflow plugin. However,
>> this process poses a considerable overhead to the core analysis one may
>> intend to perform.
>>
>
> I agree it would be useful. Beyond the scope of this series, it would
> be nice if we could add a control flow related API instead of asking
> plugins to do it themselves.

I think there is a balance to be had here. We don't want to
inadvertently expose QEMU internals to the plugin API. With this series
at least we rely on stuff the front-end knows which can at least be
tweaked relatively easily.

> If we provided something like this, would there still be value in adding
> an API to detect interrupt/exception/trap events?
>
> Note: This is not a criticism of what you sent, just an open question
> on *why* it's useful to access this QEMU implementation related
> information vs something more generic.
<snip>

It would be good to have the opinion of the front-end maintainers on
whether this is too burdensome or easy enough to manage.


-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Pierrick Bouvier 1 day ago


On 10/21/24 11:47, Alex Bennée wrote:
> Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> 
>> Hi Julian,
>>
>> On 10/19/24 09:39, Julian Ganz wrote:
>>> Some analysis greatly benefits, or depends on, information about
>>> interrupts. For example, we may need to handle the execution of a new
>>> translation block differently if it is not the result of normal program
>>> flow but of an interrupt.
>>> Even with the existing interfaces, it is more or less possible to
>>> discern these situations, e.g. as done by the cflow plugin. However,
>>> this process poses a considerable overhead to the core analysis one may
>>> intend to perform.
>>>
>>
>> I agree it would be useful. Beyond the scope of this series, it would
>> be nice if we could add a control flow related API instead of asking
>> plugins to do it themselves.
> 
> I think there is a balance to be had here. We don't want to
> inadvertently expose QEMU internals to the plugin API. With this series
> at least we rely on stuff the front-end knows which can at least be
> tweaked relatively easily.
> 

You're right. Maybe a good way to find the balance is to identify the 
real use cases behind this need.

>> If we provided something like this, would there still be value in adding
>> an API to detect interrupt/exception/trap events?
>>
>> Note: This is not a criticism of what you sent, just an open question
>> on *why* it's useful to access this QEMU implementation related
>> information vs something more generic.
> <snip>
> 
> It would be good to have the opinion of the front-end maintainers on
> whether this is too burdensome or easy enough to manage.
> 
>

Re: [RFC PATCH v2 0/7] tcg-plugins: add hooks for interrupts, exceptions and traps

Posted by Alex Bennée 2 days, 1 hour ago

Julian Ganz <neither@nut.email> writes:

> Some analysis greatly benefits, or depends on, information about
> interrupts. For example, we may need to handle the execution of a new
> translation block differently if it is not the result of normal program
> flow but of an interrupt.

For future iterations, please post as a new series, as tagging onto an old
series will confuse tooling like patchew. I shall try and get around to
reviewing later this week.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro