Hi,
This series is a combination of general clean-ups to logging and
tracing and an RFC for extending the tracepoint API by the addition of
plugins. An example usage would be:
./aarch64-linux-user/qemu-aarch64 -d plugin,nochain -plugin file=trace/plugins/hotblocks.so,args=5 ./aarch64-linux-user/tests/sha1
SHA1=15dd99a1991e0b3826fede3deffc1feba42278e6
trace/plugins/hotblocks.so: collected 858d entries in the hash table
pc: 0x00000000400260 (64004 hits) 4257 ns between returns
pc: 0x00000000419280 (64004 hits) 4256 ns between returns
pc: 0x00000000419290 (64003 hits) 4063 ns between returns
pc: 0x000000004192d4 (64002 hits) 4063 ns between returns
pc: 0x000000004192f0 (64002 hits) 4062 ns between returns
After reviewing Pavel's series I wondered if we could re-use the
trace-point API as potential hooks for a plugin API. Trace points
already exist as a series of interesting places in QEMU exposing
information that can be used for analysis. By re-using them we avoid
potential duplication of concerns. Adding new hook points becomes a
simple case of adding a new trace point.
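For example, the whole definition of a new hook can be a one-line
trace-events stanza; something like this (illustrative spelling, the
cpu_exit event added later in this series may differ in detail):

    # qom/trace-events
    cpu_exit(int cpu_index) "cpu %d"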
This also allows us to simplify some of the facilities provided to the
plugins. I currently envision two types of plugin:
Filter plugins
These are plugins that sit at a trace point and return true/false to
the trace code, short-circuiting the trace from continuing and
essentially filtering out unwanted information. The final data is
still exported in the usual way, using the trace API's many output
modes.
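As a sketch, the hook for a filter plugin could look something like
this (the name and signature here are illustrative rather than the
API this series generates):

    #include <stdbool.h>
    #include <stdint.h>

    /* Return true to let the event continue on to the trace
     * backends, false to filter it out. */
    bool plugin_guest_tlb_flush(uint32_t cpu_index)
    {
        /* e.g. only report TLB flushes coming from vCPU 0 */
        return cpu_index == 0;
    }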
Analysis plugins
These are plugins that aggregate data in real-time to provide some
sort of analysis as they go. They may or may not pass on the trace
events to the other trace sub-systems. The hotblocks plugin included
here is one such example.
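A minimal analysis hook in the style of hotblocks might look like the
following (illustrative only; the real plugin is in
trace/plugins/hotblocks/hotblocks.c):

    #include <glib.h>
    #include <stdbool.h>
    #include <stdint.h>

    static GHashTable *counts;   /* pc -> hit count */

    /* Called each time a translation block executes. Returning false
     * consumes the event rather than passing it on to the trace
     * backends. */
    bool plugin_exec_tb(void *tb, uint64_t pc)
    {
        gpointer key = (gpointer)(uintptr_t)pc; /* assumes 64-bit host */
        uintptr_t hits;

        if (!counts) {
            counts = g_hash_table_new(g_direct_hash, g_direct_equal);
        }
        hits = (uintptr_t)g_hash_table_lookup(counts, key);
        g_hash_table_insert(counts, key, (gpointer)(hits + 1));
        return false;
    }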
While the first type of plugin can re-use the existing trace
infrastructure to output information, the second type needs a
reporting mechanism. As a result I have added a plugin_status()
function which can either report to the monitor or, in linux-user, be
called on program exit. In both cases the function simply returns a
string and leaves the details of where it gets output to QEMU itself.
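Continuing the sketch above, the reporting half of such a plugin then
reduces to formatting a string (the exact prototype in the series may
differ):

    /* QEMU decides where the returned string ends up: the monitor,
     * or wherever linux-user reports at program exit. */
    char *plugin_status(void)
    {
        return g_strdup_printf("collected %u entries in the hash table",
                               counts ? g_hash_table_size(counts) : 0);
    }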
=Further Work=
==Known Limitations==
Currently there is only one hook allowed per trace event. We could
make this more flexible or simply just error out if two plugins try
and hook to the same point. What are the expectations of running
multiple plugins hooking into the same point in QEMU?
==TCG Hooks==
Thanks to Lluís' work the trace API already splits up TCG events into
translation time and exec time events and provides the machinery for
hooking a trace helper into the translation stream. Currently that
helper is unconditionally added although perhaps we could expand the
call convention a little for TCG events to allow the translation time
event to filter out planting the execution time helper?
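Concretely, the expanded convention might let the translation-time
event return whether to plant the helper at all, along these lines (an
illustrative signature, not this series' API):

    #include <stdbool.h>
    #include <stdint.h>

    /* Translation-time hook: return true to plant the execution-time
     * helper for this instance of the event, false to skip it. */
    bool plugin_exec_tb_trans(void *cpu, uint64_t pc)
    {
        /* e.g. only instrument code in a region of interest */
        return pc >= 0x400000 && pc < 0x500000;
    }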
Any serious analysis tool should allow for us to track all memory
accesses so I think the guest_mem_before trace point should probably
be split into guest_mem_before_store and guest_mem_after_load. We
could go the whole hog and add potential trace points for start/end of
all memory operations.
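In trace-events terms the split might look like this (illustrative
lines, not in the tree; the two format strings follow the existing
translation-time/exec-time convention used by guest_mem_before):

    vcpu tcg guest_mem_before_store(TCGv vaddr, uint16_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d"
    vcpu tcg guest_mem_after_load(TCGv vaddr, uint16_t info) "info=%d", "vaddr=0x%016"PRIx64" info=%d"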
I did start down this road but the code quickly got ugly when dealing
with our macrofied access functions. I plan to fish out the
de-macrofication of softmmu series I posted earlier this year:
https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg03403.html
And see if that can be applied to all our access functions.
===Instruction Tracing===
Pavel's series had a specific hook for instrumenting individual
instructions. I have not yet added it to this series but I think it be
done in a slightly cleaner way now we have the ability to insert TCG
ops into the instruction stream. If we add a tracepoint for post
instruction generation which passes a buffer with the instruction
translated and method to insert a helper before or after the
instruction. This would avoid exposing the cpu_ldst macros to the
plugins.
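The shape of that hook might be something like the following, with
everything the plugin touches kept behind opaque handles (all names
here are hypothetical):

    #include <stddef.h>
    #include <stdint.h>

    typedef struct InsnHandle InsnHandle;   /* opaque handle from QEMU */

    /* hypothetical insertion API provided to plugins */
    void plugin_insert_helper_before(InsnHandle *insn,
                                     void (*cb)(void *), void *udata);

    /* Execution-time callback the plugin asks to have planted. */
    static void my_insn_cb(void *udata)
    {
    }

    /* Post-translation hook: sees the translated instruction's bytes
     * and requests helpers without generating any TCG ops itself. */
    void plugin_insn_trans(InsnHandle *insn, const uint8_t *bytes,
                           size_t len)
    {
        if (len && bytes[0] == 0x0f) {   /* e.g. two-byte x86 opcodes */
            plugin_insert_helper_before(insn, my_insn_cb, NULL);
        }
    }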
So what do people think? Could this be a viable way to extend QEMU
with plugins?
Alex.
Aaron Lindsay (1):
trace: add linux-user plugin support
Alex Bennée (19):
util/log: allow -dfilter to stack
util/log: add qemu_dfilter_append_range()
linux-user: add -dfilter progtext shortcut
trace: enable the exec_tb trace events
trace: keep a count of trace-point hits
trace: show trace point counts in the monitor
accel/tcg/cputlb: convert tlb_flush debugging into trace events
accel/tcg/cputlb: convert remaining tlb_debug() to trace events
trace: suppress log output of trace points
qom/cpu: add a cpu_exit trace event
trace: expose a plugin fn pointer in TraceEvent
configure: expose a plugin to the trace-backends
tracetool: generate plugin snippets
trace: add infrastructure for building plugins
hmp: expose status of plugins to the monitor
linux-user: allow dumping of plugin status at end of run
plugins: add an example hotblocks plugin
plugins: add hotness summary to hotblocks
plugin: add tlbflush stats plugin
Pavel Dovgalyuk (1):
trace: add support for plugin infrastructure
Makefile | 5 +-
accel/tcg/cputlb.c | 99 +++++-----
accel/tcg/trace-events | 30 ++-
configure | 6 +
hmp-commands-info.hx | 17 ++
include/plugins/plugins.h | 14 ++
include/qemu/log.h | 2 +
include/qemu/plugins.h | 19 ++
linux-user/exit.c | 12 ++
linux-user/main.c | 31 ++-
monitor.c | 19 +-
qapi/trace.json | 3 +-
qemu-options.hx | 10 +
qom/cpu.c | 3 +
qom/trace-events | 4 +
scripts/tracetool/backend/plugin.py | 79 ++++++++
scripts/tracetool/backend/simple.py | 2 +
tests/test-logging.c | 14 ++
trace/Makefile.objs | 25 +++
trace/control-internal.h | 6 +
trace/event-internal.h | 4 +
trace/plugins.c | 154 +++++++++++++++
trace/plugins/Makefile.plugins | 32 ++++
trace/plugins/hotblocks/hotblocks.c | 100 ++++++++++
trace/plugins/tlbflush/tlbflush.c | 283 ++++++++++++++++++++++++++++
trace/qmp.c | 1 +
util/log.c | 34 +++-
vl.c | 26 ++-
28 files changed, 962 insertions(+), 72 deletions(-)
create mode 100644 include/plugins/plugins.h
create mode 100644 include/qemu/plugins.h
create mode 100644 scripts/tracetool/backend/plugin.py
create mode 100644 trace/plugins.c
create mode 100644 trace/plugins/Makefile.plugins
create mode 100644 trace/plugins/hotblocks/hotblocks.c
create mode 100644 trace/plugins/tlbflush/tlbflush.c
--
2.17.1
On Fri, Oct 05, 2018 at 16:48:49 +0100, Alex Bennée wrote:
(snip)
> ==Known Limitations==
>
> Currently there is only one hook allowed per trace event. We could
> make this more flexible or simply just error out if two plugins try
> and hook to the same point. What are the expectations of running
> multiple plugins hooking into the same point in QEMU?

It's very common. All popular instrumentation tools (e.g. PANDA,
DynamoRIO, Pin) support multiple plugins.

> ==TCG Hooks==
>
> Thanks to Lluís' work the trace API already splits up TCG events into
> translation time and exec time events and provides the machinery for
> hooking a trace helper into the translation stream. Currently that
> helper is unconditionally added although perhaps we could expand the
> call convention a little for TCG events to allow the translation time
> event to filter out planting the execution time helper?

A TCG helper is suboptimal for these kind of events, e.g.
instruction/TB/mem callbacks, because (1) these events happen *very*
often, and (2) a helper then has to iterate over a list of callbacks
(assuming we support multiple plugins). That is, one TCG helper call,
plus cache misses for the callback pointers, plus function calls to
call the callbacks. That adds up to 2x average slowdown for SPEC06int,
instead of 1.5x slowdown when embedding the callbacks directly into
the generated code. Yes, you have to flush the code when unsubscribing
from the event, but that cost is amortized by the savings you get when
the callbacks occur, which are way more frequent.

Besides performance, to provide a pleasant plugin experience we need
something better than the current tracing callbacks.

> ===Instruction Tracing===
>
> Pavel's series had a specific hook for instrumenting individual
> instructions. I have not yet added it to this series but I think it be
> done in a slightly cleaner way now we have the ability to insert TCG
> ops into the instruction stream.

I thought Peter explicitly disallowed TCG generation from plugins.
Also, IIRC others also mentioned that exposing QEMU internals
(e.g. "struct TranslationBlock", or "struct CPUState") to plugins
was not on the table.

> If we add a tracepoint for post
> instruction generation which passes a buffer with the instruction
> translated and method to insert a helper before or after the
> instruction. This would avoid exposing the cpu_ldst macros to the
> plugins.

Again, for performance you'd avoid the tracepoint (i.e. calling
a helper to call another function) and embed directly the
callback from TCG. Same thing applies to TB's.

> So what do people think? Could this be a viable way to extend QEMU
> with plugins?

For frequent events such as the ones mentioned above, I am
not sure plugins can be efficiently implemented under
tracing. For others (e.g. cpu_init events), sure, they could.
But still, differently from tracers, plugins can come and go
anytime, so I am not convinced that merging the two features
is a good idea.

Thanks,

		Emilio
Emilio G. Cota <cota@braap.org> writes:
> On Fri, Oct 05, 2018 at 16:48:49 +0100, Alex Bennée wrote:
> (snip)
>> ==Known Limitations==
>>
>> Currently there is only one hook allowed per trace event. We could
>> make this more flexible or simply just error out if two plugins try
>> and hook to the same point. What are the expectations of running
>> multiple plugins hooking into the same point in QEMU?
>
> It's very common. All popular instrumentation tools (e.g. PANDA,
> DynamoRIO, Pin) support multiple plugins.
Fair enough.
>
>> ==TCG Hooks==
>>
>> Thanks to Lluís' work the trace API already splits up TCG events into
>> translation time and exec time events and provides the machinery for
>> hooking a trace helper into the translation stream. Currently that
>> helper is unconditionally added although perhaps we could expand the
>> call convention a little for TCG events to allow the translation time
>> event to filter out planting the execution time helper?
>
> A TCG helper is suboptimal for these kind of events, e.g. instruction/TB/
> mem callbacks, because (1) these events happen *very* often, and
> (2) a helper then has to iterate over a list of callbacks (assuming
> we support multiple plugins). That is, one TCG helper call,
> plus cache misses for the callback pointers, plus function calls
> to call the callbacks. That adds up to 2x average slowdown
> for SPEC06int, instead of 1.5x slowdown when embedding the
> callbacks directly into the generated code. Yes, you have to
> flush the code when unsubscribing from the event, but that cost
> is amortized by the savings you get when the callbacks occur,
> which are way more frequent.
What would you want instead of a TCG helper? But certainly being able
to be selective about which instances of each trace point are
instrumented will be valuable.
> Besides performance, to provide a pleasant plugin experience we need
> something better than the current tracing callbacks.
What I hope to avoid in re-using trace points is having a whole bunch
of additional hook points just for plugins. However nothing stops us
adding more tracepoints at more useful places for instrumentation. We
could also do it on a whitelist basis similar to the way the tcg events
are marked.
>
>> ===Instruction Tracing===
>>
>> Pavel's series had a specific hook for instrumenting individual
>> instructions. I have not yet added it to this series but I think it be
>> done in a slightly cleaner way now we have the ability to insert TCG
>> ops into the instruction stream.
>
> I thought Peter explicitly disallowed TCG generation from plugins.
> Also, IIRC others also mentioned that exposing QEMU internals
> (e.g. "struct TranslationBlock", or "struct CPUState") to plugins
> was not on the table.
We definitely want to avoid plugin controlled code generation but the
tcg tracepoint mechanism is transparent to the plugin itself. I think
the pointers should really be treated as anonymous handles rather than
windows into QEMU's internals. Arguably some of the tracepoints should
be exporting more useful numbers (I used cpu->cpu_index in the TLB trace
points) but I don't know if we can change existing trace point
definitions to clean that up.
Again if we whitelist tracepoints for plugins we can be more careful
about the data exported.
>
>> If we add a tracepoint for post
>> instruction generation which passes a buffer with the instruction
>> translated and method to insert a helper before or after the
>> instruction. This would avoid exposing the cpu_ldst macros to the
>> plugins.
>
> Again, for performance you'd avoid the tracepoint (i.e. calling
> a helper to call another function) and embed directly the
> callback from TCG. Same thing applies to TB's.
OK I see what you mean. I think that is doable although it might take a
bit more tcg plumbing.
>
>> So what do people think? Could this be a viable way to extend QEMU
>> with plugins?
>
> For frequent events such as the ones mentioned above, I am
> not sure plugins can be efficiently implemented under
> tracing.
I assume some form of helper-per-instrumented-event/insn is still going
to be needed though? We are not considering some sort of eBPF craziness?
> For others (e.g. cpu_init events), sure, they could.
> But still, differently from tracers, plugins can come and go
> anytime, so I am not convinced that merging the two features
> is a good idea.
I don't think we have to mirror tracepoints and plugin points but I'm in
favour of sharing the general mechanism and tooling rather than having a
whole separate set of hooks. We certainly don't want anything like:
trace_exec_tb(tb, pc);
plugin_exec_tb(tb, pc);
scattered throughout the code where the two do align.
>
> Thanks,
>
> Emilio
--
Alex Bennée
Hi Alex,

On 08/10/2018 12:28, Alex Bennée wrote:
(snip)
>>> So what do people think? Could this be a viable way to extend QEMU
>>> with plugins?
>>
>> For frequent events such as the ones mentioned above, I am
>> not sure plugins can be efficiently implemented under
>> tracing.
>
> I assume some form of helper-per-instrumented-event/insn is still going
> to be needed though? We are not considering some sort of eBPF craziness?
>
>> For others (e.g. cpu_init events), sure, they could.
>> But still, differently from tracers, plugins can come and go
>> anytime, so I am not convinced that merging the two features
>> is a good idea.
>
> I don't think we have to mirror tracepoints and plugin points but I'm in
> favour of sharing the general mechanism and tooling rather than having a
> whole separate set of hooks. We certainly don't want anything like:
>
>     trace_exec_tb(tb, pc);
>     plugin_exec_tb(tb, pc);
>
> scattered throughout the code where the two do align.

What about turning the tracepoints into the default instrumentation
plugin? (the first of Emilio's list of plugins).
On Mon, Oct 08, 2018 at 11:28:38 +0100, Alex Bennée wrote:
> Emilio G. Cota <cota@braap.org> writes:
> > Again, for performance you'd avoid the tracepoint (i.e. calling
> > a helper to call another function) and embed directly the
> > callback from TCG. Same thing applies to TB's.
>
> OK I see what you mean. I think that is doable although it might take a
> bit more tcg plumbing.

I have patches to do it, it's not complicated.

> >> So what do people think? Could this be a viable way to extend QEMU
> >> with plugins?
> >
> > For frequent events such as the ones mentioned above, I am
> > not sure plugins can be efficiently implemented under
> > tracing.
>
> I assume some form of helper-per-instrumented-event/insn is still going
> to be needed though? We are not considering some sort of eBPF craziness?

Helper, yes. But one that points directly to plugin code.

> > For others (e.g. cpu_init events), sure, they could.
> > But still, differently from tracers, plugins can come and go
> > anytime, so I am not convinced that merging the two features
> > is a good idea.
>
> I don't think we have to mirror tracepoints and plugin points but I'm in
> favour of sharing the general mechanism and tooling rather than having a
> whole separate set of hooks. We certainly don't want anything like:
>
>     trace_exec_tb(tb, pc);
>     plugin_exec_tb(tb, pc);
>
> scattered throughout the code where the two do align.

We could have something like

    plugin_trace_exec_tb(tb, pc);

that would expand to the two lines above. Or similar.

So I agree with you that in some cases the "trace points"
for both tracing and plugin might be the same, perhaps
identical. But that doesn't necessarily mean that making
plugins a subset of tracing is a good idea.

I think sharing my plugin implementation will help the
discussion. I'll share it as soon as I can (my QEMU plate
is full already trying to merge a couple of other features
first).

Thanks,

		Emilio
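The wrapper described here could be as small as the following (an
illustrative macro that simply expands to the two calls above):

    #define plugin_trace_exec_tb(tb, pc) \
        do {                             \
            trace_exec_tb(tb, pc);       \
            plugin_exec_tb(tb, pc);      \
        } while (0)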
Emilio G. Cota <cota@braap.org> writes:

> On Mon, Oct 08, 2018 at 11:28:38 +0100, Alex Bennée wrote:
>> Emilio G. Cota <cota@braap.org> writes:
>> > Again, for performance you'd avoid the tracepoint (i.e. calling
>> > a helper to call another function) and embed directly the
>> > callback from TCG. Same thing applies to TB's.
>>
>> OK I see what you mean. I think that is doable although it might take a
>> bit more tcg plumbing.
>
> I have patches to do it, it's not complicated.

Right that would be useful.

>
>> >> So what do people think? Could this be a viable way to extend QEMU
>> >> with plugins?
>> >
>> > For frequent events such as the ones mentioned above, I am
>> > not sure plugins can be efficiently implemented under
>> > tracing.
>>
>> I assume some form of helper-per-instrumented-event/insn is still going
>> to be needed though? We are not considering some sort of eBPF craziness?
>
> Helper, yes. But one that points directly to plugin code.

It would be nice if the logic that inserts the trace helper vs a direct
call could be shared. I guess I'd have to see the implementation to see
how ugly it gets.

>
>> > For others (e.g. cpu_init events), sure, they could.
>> > But still, differently from tracers, plugins can come and go
>> > anytime, so I am not convinced that merging the two features
>> > is a good idea.
>>
>> I don't think we have to mirror tracepoints and plugin points but I'm in
>> favour of sharing the general mechanism and tooling rather than having a
>> whole separate set of hooks. We certainly don't want anything like:
>>
>>     trace_exec_tb(tb, pc);
>>     plugin_exec_tb(tb, pc);
>>
>> scattered throughout the code where the two do align.
>
> We could have something like
>
>     plugin_trace_exec_tb(tb, pc);
>
> that would expand to the two lines above. Or similar.
>
> So I agree with you that in some cases the "trace points"
> for both tracing and plugin might be the same, perhaps
> identical. But that doesn't necessarily mean that making
> plugins a subset of tracing is a good idea.

But we can avoid having plugin-points and trace-events duplicating
stuff as well? I guess you want to avoid having the generated code
fragments for plugins?

The other nice property was avoiding re-duplicating output logic for
"filter" style operations. However I didn't actually include such an
example in the series. I was pondering a QEMU powered PLT/library call
tracer to demonstrate that sort of thing.

> I think sharing my plugin implementation will help the
> discussion. I'll share it as soon as I can (my QEMU plate
> is full already trying to merge a couple of other features
> first).

Sounds good.

>
> Thanks,
>
> Emilio

--
Alex Bennée
> From: Alex Bennée [mailto:alex.bennee@linaro.org]
> Any serious analysis tool should allow for us to track all memory
> accesses so I think the guest_mem_before trace point should probably
> be split into guest_mem_before_store and guest_mem_after_load. We
> could go the whole hog and add potential trace points for start/end of
> all memory operations.

I wanted to ask about memory tracing and found this one.
Is it possible to use tracepoints for capturing all memory accesses?
In our implementation we insert helpers before and after tcg
read/write operations.

Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

>> From: Alex Bennée [mailto:alex.bennee@linaro.org]
>> Any serious analysis tool should allow for us to track all memory
>> accesses so I think the guest_mem_before trace point should probably
>> be split into guest_mem_before_store and guest_mem_after_load. We
>> could go the whole hog and add potential trace points for start/end of
>> all memory operations.
>
> I wanted to ask about memory tracing and found this one.
> Is it possible to use tracepoints for capturing all memory accesses?
> In our implementation we insert helpers before and after tcg
> read/write operations.

The current tracepoint isn't enough but yes I think we could. The first
thing I need to do is de-macrofy the atomic helpers a little just to
make it a bit simpler to add the before/after tracepoints.

>
> Pavel Dovgalyuk

--
Alex Bennée
> From: Alex Bennée [mailto:alex.bennee@linaro.org]
(snip)
> The current tracepoint isn't enough but yes I think we could. The first
> thing I need to do is de-macrofy the atomic helpers a little just to
> make it a bit simpler to add the before/after tracepoints.

But memory accesses can use 'fast path' without the helpers.
Thus you still need to insert the new helper for that case.

Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

(snip)
> But memory accesses can use 'fast path' without the helpers.
> Thus you still need to insert the new helper for that case.

trace_guest_mem_before_tcg in tcg-op.c already does this but currently
only before operations. That's why I want to split it and pass the
load/store value registers into the helpers.

>
> Pavel Dovgalyuk

--
Alex Bennée
> From: Alex Bennée [mailto:alex.bennee@linaro.org]
(snip)
> trace_guest_mem_before_tcg in tcg-op.c already does this but currently
> only before operations. That's why I want to split it and pass the
> load/store value registers into the helpers.

One more question about your trace points.
In case of using a trace point on every instruction execution, we may
need to access vCPU registers (including the flags). Are they valid in
such cases?
I'm asking, because at least i386 translation optimizes writebacks.

Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

(snip)
> One more question about your trace points.
> In case of using a trace point on every instruction execution, we may
> need to access vCPU registers (including the flags). Are they valid in
> such cases?

They are probably valid but the tricky bit will be doing it in a way
that doesn't expose the internals of the TCG. Maybe we could exploit the
GDB interface for this or come up with a named reference API.

> I'm asking, because at least i386 translation optimizes writebacks.

How so? I have to admit the i386 translation code is the most opaque to
me but I wouldn't have thought changing the semantics of the guest's
load/store operations would be a sensible idea.

Of course now you mention it my thoughts about memory tracing have been
influenced by nice clean RISCy load/store architectures where it's rare
to have calculation ops working directly with memory.

>
> Pavel Dovgalyuk

--
Alex Bennée
> From: Alex Bennée [mailto:alex.bennee@linaro.org]
(snip)
> > I'm asking, because at least i386 translation optimizes writebacks.
>
> How so? I have to admit the i386 translation code is the most opaque to
> me but I wouldn't have thought changing the semantics of the guest's
> load/store operations would be a sensible idea.

Writeback to the registers (say EFLAGS), not to the memory.

Pavel Dovgalyuk
Pavel Dovgalyuk <dovgaluk@ispras.ru> writes:

(snip)
>> How so? I have to admit the i386 translation code is the most opaque to
>> me but I wouldn't have thought changing the semantics of the guest's
>> load/store operations would be a sensible idea.
>
> Writeback to the registers (say EFLAGS), not to the memory.

Ahh, lazy evaluation of (status) flags. Well having dug around gdbstub
it looks like we may get that wrong for eflags anyway.

I think what is probably required is:

  - a hook in TranslatorOps, resolve_all?
  - an interface to named things registered with tcg_global_mem_new_*

And a way for the plugin to assert that any register accessed via this
is consistent at the point in the runtime the plugin hook is called.

I wonder what other front ends might have this sort of lazy/partial
evaluation?

--
Alex Bennée
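A sketch of those two pieces (resolve_all is hypothetical and not part
of the real TranslatorOps; the tcg_global_mem_new fragment mirrors
roughly what targets already do at translate-init time):

    typedef struct TranslatorOps {
        /* ... existing hooks: init_disas_context, translate_insn, ... */

        /* Hypothetical: force lazily-computed guest state (e.g. i386
         * eflags) to be written back so a plugin hook sees consistent
         * register values. */
        void (*resolve_all)(DisasContextBase *db, CPUState *cpu);
    } TranslatorOps;

    /* Guest registers already get names via tcg_global_mem_new_*,
     * e.g. target/i386/translate.c does roughly: */
    cpu_regs[R_EAX] = tcg_global_mem_new(cpu_env,
                                         offsetof(CPUX86State, regs[R_EAX]),
                                         "eax");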