include/qemu/qemu-plugin.h | 29 ++++++++++++++++++++++++++++- plugins/api.c | 14 ++++++++++++++ plugins/qemu-plugins.symbols | 1 + 3 files changed, 43 insertions(+), 1 deletion(-)
In system mode emulation, some of translation blocks could be
interrupted on memory I/O operation. That leads to artificial
construction of another translation block that contains memory
operation only. If TCG plugin is not aware of that TB kind, it
attempts to insert execution callbacks either on translation
block or instruction, which is silently ignored. As the result
it leads to potentially inconsistent processing of execution and
memory callbacks by the plugin.
Exposing appropriate translation block flag allows plugins to
handle "memory only" blocks in appropriate way.
Signed-off-by: Mikhail Tyutin <m.tyutin@yadro.com>
---
include/qemu/qemu-plugin.h | 29 ++++++++++++++++++++++++++++-
plugins/api.c | 14 ++++++++++++++
plugins/qemu-plugins.symbols | 1 +
3 files changed, 43 insertions(+), 1 deletion(-)
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 4daab6efd2..5f07fa497c 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -54,7 +54,7 @@ typedef uint64_t qemu_plugin_id_t;
extern QEMU_PLUGIN_EXPORT int qemu_plugin_version;
-#define QEMU_PLUGIN_VERSION 1
+#define QEMU_PLUGIN_VERSION 2
/**
* struct qemu_info_t - system information for plugins
@@ -236,6 +236,21 @@ enum qemu_plugin_cb_flags {
QEMU_PLUGIN_CB_RW_REGS,
};
+/**
+ * enum qemu_plugin_tb_flags - type of translation block
+ *
+ * @QEMU_PLUGIN_TB_MEM_ONLY:
+ * TB is special block to perform memory I/O operation only.
+ * Block- and instruction- level callbacks have no effect.
+ * @QEMU_PLUGIN_TB_MEM_OPS:
+ * TB has at least one instruction that access memory.
+ * Memory callbacks are applicable to this TB.
+ */
+enum qemu_plugin_tb_flags {
+ QEMU_PLUGIN_TB_MEM_ONLY = 0x01,
+ QEMU_PLUGIN_TB_MEM_OPS = 0x02
+};
+
enum qemu_plugin_mem_rw {
QEMU_PLUGIN_MEM_R = 1,
QEMU_PLUGIN_MEM_W,
@@ -360,6 +375,18 @@ size_t qemu_plugin_tb_n_insns(const struct qemu_plugin_tb *tb);
QEMU_PLUGIN_API
uint64_t qemu_plugin_tb_vaddr(const struct qemu_plugin_tb *tb);
+/**
+ * qemu_plugin_tb_flags() - returns combination of TB flags
+ * @tb: opaque handle to TB passed to callback
+ *
+ * Returned set of flags can be used to check if TB has a non-typical
+ * behaviour. For example: whether or not instruction execution
+ * callbacks are applicable for the block.
+ *
+ * Returns: 0 or combination of qemu_plugin_tb_flags
+ */
+int qemu_plugin_tb_flags(const struct qemu_plugin_tb *tb);
+
/**
* qemu_plugin_tb_get_insn() - retrieve handle for instruction
* @tb: opaque handle to TB passed to callback
diff --git a/plugins/api.c b/plugins/api.c
index 5521b0ad36..4e73aaf422 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -37,6 +37,7 @@
#include "qemu/osdep.h"
#include "qemu/plugin.h"
#include "qemu/log.h"
+#include "qemu/qemu-plugin.h"
#include "tcg/tcg.h"
#include "exec/exec-all.h"
#include "exec/ram_addr.h"
@@ -193,6 +194,19 @@ uint64_t qemu_plugin_tb_vaddr(const struct qemu_plugin_tb *tb)
return tb->vaddr;
}
+int qemu_plugin_tb_flags(const struct qemu_plugin_tb *tb)
+{
+ int ret = 0;
+ if (tb->mem_only) {
+ ret |= QEMU_PLUGIN_TB_MEM_ONLY;
+ }
+ if (tb->mem_helper) {
+ ret |= QEMU_PLUGIN_TB_MEM_OPS;
+ }
+
+ return ret;
+}
+
struct qemu_plugin_insn *
qemu_plugin_tb_get_insn(const struct qemu_plugin_tb *tb, size_t idx)
{
diff --git a/plugins/qemu-plugins.symbols b/plugins/qemu-plugins.symbols
index 71f6c90549..f11f633da6 100644
--- a/plugins/qemu-plugins.symbols
+++ b/plugins/qemu-plugins.symbols
@@ -40,6 +40,7 @@
qemu_plugin_tb_get_insn;
qemu_plugin_tb_n_insns;
qemu_plugin_tb_vaddr;
+ qemu_plugin_tb_flags;
qemu_plugin_uninstall;
qemu_plugin_vcpu_for_each;
};
--
2.34.1
On Wed, 22 Nov 2023 at 12:17, Mikhail Tyutin <m.tyutin@yadro.com> wrote: > > In system mode emulation, some of translation blocks could be > interrupted on memory I/O operation. That leads to artificial > construction of another translation block that contains memory > operation only. If TCG plugin is not aware of that TB kind, it > attempts to insert execution callbacks either on translation > block or instruction, which is silently ignored. As the result > it leads to potentially inconsistent processing of execution and > memory callbacks by the plugin. > Exposing appropriate translation block flag allows plugins to > handle "memory only" blocks in appropriate way. > > Signed-off-by: Mikhail Tyutin <m.tyutin@yadro.com> > --- > include/qemu/qemu-plugin.h | 29 ++++++++++++++++++++++++++++- > plugins/api.c | 14 ++++++++++++++ > plugins/qemu-plugins.symbols | 1 + > 3 files changed, 43 insertions(+), 1 deletion(-) > > diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h > index 4daab6efd2..5f07fa497c 100644 > --- a/include/qemu/qemu-plugin.h > +++ b/include/qemu/qemu-plugin.h > @@ -54,7 +54,7 @@ typedef uint64_t qemu_plugin_id_t; > > extern QEMU_PLUGIN_EXPORT int qemu_plugin_version; > > -#define QEMU_PLUGIN_VERSION 1 > +#define QEMU_PLUGIN_VERSION 2 > > /** > * struct qemu_info_t - system information for plugins > @@ -236,6 +236,21 @@ enum qemu_plugin_cb_flags { > QEMU_PLUGIN_CB_RW_REGS, > }; > > +/** > + * enum qemu_plugin_tb_flags - type of translation block > + * > + * @QEMU_PLUGIN_TB_MEM_ONLY: > + * TB is special block to perform memory I/O operation only. > + * Block- and instruction- level callbacks have no effect. > + * @QEMU_PLUGIN_TB_MEM_OPS: > + * TB has at least one instruction that access memory. > + * Memory callbacks are applicable to this TB. > + */ > +enum qemu_plugin_tb_flags { > + QEMU_PLUGIN_TB_MEM_ONLY = 0x01, > + QEMU_PLUGIN_TB_MEM_OPS = 0x02 > +}; > If we do go for this, can we pick a different naming than "TB flags", please? QEMU already has a "TB flags" concept for TCG -- it's the target-specific flags that encode bits of the CPU state that we baked into the generated code. Those flags are strictly TCG internal and we definitely don't want to expose them to a plugin because they're not a stable interface. So we should call these flags something else so we don't get confused. thanks -- PMM
Mikhail Tyutin <m.tyutin@yadro.com> writes: > In system mode emulation, some of translation blocks could be > interrupted on memory I/O operation. That leads to artificial > construction of another translation block that contains memory > operation only. If TCG plugin is not aware of that TB kind, it > attempts to insert execution callbacks either on translation > block or instruction, which is silently ignored. That was the intention - the instrumented instructions have already been executed. The only thing that matters now is the memory access: /* * Exit the loop and potentially generate a new TB executing the * just the I/O insns. We also limit instrumentation to memory * operations only (which execute after completion) so we don't * double instrument the instruction. */ cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | n; > As the result > it leads to potentially inconsistent processing of execution and > memory callbacks by the plugin. > Exposing appropriate translation block flag allows plugins to > handle "memory only" blocks in appropriate way. We don't want to expose internal details to the plugin. It shouldn't need to care. Do you have a test case where you missed counting the execution of the instruction? -- Alex Bennée Virtualisation Tech Lead @ Linaro
Hi Alex, > > Exposing appropriate translation block flag allows plugins to > > handle "memory only" blocks in appropriate way. > > We don't want to expose internal details to the plugin. It shouldn't > need to care. > > Do you have a test case where you missed counting the execution of the > instruction? To speedup plugin execution time we want to shift processing logic to block translation phase and keep execution callback light. Also moving instrumentation at block level as opposite to instruction level, helps to speedup up execution as well. Given that, we prepare structures in translation callback. For example: void handleBlockTranslation(qemu_plugin_id_t, qemu_plugin_tb* tblock) { BlockStats* s = new BlockStats(); s->init(tblock); g_stats->append(s); /* Then, insert execution callbacks and pass BlockStats as userdata for quick data lookup in run time at: 1) basic block prologue: qemu_plugin_register_vcpu_tb_exec_cb(tblock, cb, flags, s); 2) memory load/store: qemu_plugin_register_vcpu_mem_cb(tblock, cb, flags, mem_flags, s); */ } Having artificial "mem only" block leads to allocation of new BlockStats and memory access will be attributed to that block instead of "original" one which is supposed to be executed. If we know that block is "mem only" on translation phase, then memory callback is implemented differently to find relevant BlockStats. -- Mikhail
Mikhail Tyutin <m.tyutin@yadro.com> writes: > Hi Alex, > >> > Exposing appropriate translation block flag allows plugins to >> > handle "memory only" blocks in appropriate way. >> >> We don't want to expose internal details to the plugin. It shouldn't >> need to care. >> >> Do you have a test case where you missed counting the execution of the >> instruction? > > To speedup plugin execution time we want to shift processing logic to > block translation phase and keep execution callback light. Also moving > instrumentation at block level as opposite to instruction level, helps > to speedup up execution as well. The trouble is we don't guarantee that any given block will fully execute every instruction. To be sure you capture every instruction you need to instrument each one although as you point out this is expensive. We do have counters although currently they are not atomic so can only give a gross overview. There are plans to add a vCPU indexed score board to solve that which will be a fairly lightweight instrumentation. > Given that, we prepare structures in translation callback. For example: > > void handleBlockTranslation(qemu_plugin_id_t, qemu_plugin_tb* tblock) > { > BlockStats* s = new BlockStats(); > s->init(tblock); > g_stats->append(s); I guess this is a natural approach given we present a block translation event but even without IO recompilation there will be multiple translation blocks for any given set of addresses. We should make this clearer in the documentation. About the only thing you can be sure of is blocks won't cross page boundaries (although I guess in future they may in linux-user). > > /* Then, insert execution callbacks and pass BlockStats as > userdata for quick data lookup in run time at: > 1) basic block prologue: > qemu_plugin_register_vcpu_tb_exec_cb(tblock, cb, flags, s); > 2) memory load/store: > qemu_plugin_register_vcpu_mem_cb(tblock, cb, flags, > mem_flags, s); > */ > } > > > Having artificial "mem only" block leads to allocation of new > BlockStats and memory access will be attributed to that block instead > of "original" one which is supposed to be executed. If we know that > block is "mem only" on translation phase, then memory callback is > implemented differently to find relevant BlockStats. The alternative would be to have a page tracking structure based on the qemu_plugin_insn_haddr of the instructions and aggregate your data there. -- Alex Bennée Virtualisation Tech Lead @ Linaro
© 2016 - 2024 Red Hat, Inc.