[PATCH v12 5/7] plugins: Add memory hardware address read/write API

Rowan Hart posted 7 patches 8 months ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, "Alex Bennée" <alex.bennee@linaro.org>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Yanan Wang <wangyanan55@huawei.com>, Zhao Liu <zhao1.liu@intel.com>, Alexandre Iooss <erdnaxe@crans.org>, Mahmoud Mandour <ma.mandourr@gmail.com>, Pierrick Bouvier <pierrick.bouvier@linaro.org>
There is a newer version of this series
Posted by Rowan Hart 8 months ago
From: novafacing <rowanbhart@gmail.com>

This patch adds functions to the plugins API to allow plugins to read
and write memory via hardware addresses. The functions use the current
address space of the current CPU in order to avoid exposing address
space information to users. A later patch may add a function that takes
an explicit address space, to support architecture-specific plugins
that need one, for example to read ARM secure memory.

Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Rowan Hart <rowanbhart@gmail.com>
---
 include/qemu/qemu-plugin.h | 93 ++++++++++++++++++++++++++++++++++++
 plugins/api.c              | 97 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 190 insertions(+)

diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 4167c46c2a..5eecdccc67 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -979,6 +979,99 @@ QEMU_PLUGIN_API
 bool qemu_plugin_write_memory_vaddr(uint64_t addr,
                                    GByteArray *data);
 
+/**
+ * enum qemu_plugin_hwaddr_operation_result - result of a memory operation
+ *
+ * @QEMU_PLUGIN_HWADDR_OPERATION_OK: hwaddr operation succeeded
+ * @QEMU_PLUGIN_HWADDR_OPERATION_ERROR: unexpected error occurred
+ * @QEMU_PLUGIN_HWADDR_OPERATION_DEVICE_ERROR: error in memory device
+ * @QEMU_PLUGIN_HWADDR_OPERATION_ACCESS_DENIED: permission error
+ * @QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS: address was invalid
+ * @QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS_SPACE: invalid address space
+ */
+enum qemu_plugin_hwaddr_operation_result {
+    QEMU_PLUGIN_HWADDR_OPERATION_OK,
+    QEMU_PLUGIN_HWADDR_OPERATION_ERROR,
+    QEMU_PLUGIN_HWADDR_OPERATION_DEVICE_ERROR,
+    QEMU_PLUGIN_HWADDR_OPERATION_ACCESS_DENIED,
+    QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS,
+    QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS_SPACE,
+};
+
+/**
+ * qemu_plugin_read_memory_hwaddr() - read from memory using a hardware address
+ *
+ * @addr: The physical address to read from
+ * @data: A byte array to store data into
+ * @len: The number of bytes to read, starting from @addr
+ *
+ * @len bytes of data is read from the current memory space for the current
+ * vCPU starting at @addr and stored into @data. If @data is not large enough to
+ * hold @len bytes, it will be expanded to the necessary size, reallocating if
+ * necessary. @len must be greater than 0.
+ *
+ * This function does not ensure writes are flushed prior to reading, so
+ * callers should take care when calling this function in plugin callbacks to
+ * avoid attempting to read data which may not yet be written and should use
+ * the memory callback API instead.
+ *
+ * This function is only valid for softmmu targets.
+ *
+ * Returns a qemu_plugin_hwaddr_operation_result indicating the result of the
+ * operation.
+ */
+QEMU_PLUGIN_API
+enum qemu_plugin_hwaddr_operation_result
+qemu_plugin_read_memory_hwaddr(uint64_t addr, GByteArray *data, size_t len);
+
+/**
+ * qemu_plugin_write_memory_hwaddr() - write to memory using a hardware address
+ *
+ * @addr: A physical address to write to
+ * @data: A byte array containing the data to write
+ *
+ * The contents of @data will be written to memory starting at the hardware
+ * address @addr in the current address space for the current vCPU.
+ *
+ * This function does not guarantee consistency of writes, nor does it ensure
+ * that pending writes are flushed either before or after the write takes place,
+ * so callers should take care when calling this function in plugin callbacks to
+ * avoid depending on the existence of data written using this function which
+ * may be overwritten afterward. In addition, this function requires that the
+ * pages containing the address are not locked. Practically, this means that you
+ * should not write instruction memory in a current translation block inside a
+ * callback registered with qemu_plugin_register_vcpu_tb_trans_cb.
+ *
+ * You can, for example, write instruction memory in a current translation block
+ * in a callback registered with qemu_plugin_register_vcpu_tb_exec_cb, although
+ * be aware that the write will not be flushed until after the translation block
+ * has finished executing.  In general, this function should be used to write
+ * data memory or to patch code at a known address, not in a current translation
+ * block.
+ *
+ * This function is only valid for softmmu targets.
+ *
+ * Returns a qemu_plugin_hwaddr_operation_result indicating the result of the
+ * operation.
+ */
+QEMU_PLUGIN_API
+enum qemu_plugin_hwaddr_operation_result
+qemu_plugin_write_memory_hwaddr(uint64_t addr, GByteArray *data);
+
+/**
+ * qemu_plugin_translate_vaddr() - translate virtual address for current vCPU
+ *
+ * @vaddr: virtual address to translate
+ * @hwaddr: pointer to store the physical address
+ *
+ * This function is only valid in vCPU context (i.e. in callbacks) and is only
+ * valid for softmmu targets.
+ *
+ * Returns true on success and false on failure.
+ */
+QEMU_PLUGIN_API
+bool qemu_plugin_translate_vaddr(uint64_t vaddr, uint64_t *hwaddr);
+
 /**
  * qemu_plugin_scoreboard_new() - alloc a new scoreboard
  *
diff --git a/plugins/api.c b/plugins/api.c
index 1f64a9ea64..eac04cc1f6 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -39,6 +39,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/plugin.h"
 #include "qemu/log.h"
+#include "system/memory.h"
 #include "tcg/tcg.h"
 #include "exec/gdbstub.h"
 #include "exec/target_page.h"
@@ -494,6 +495,102 @@ bool qemu_plugin_write_memory_vaddr(uint64_t addr, GByteArray *data)
     return true;
 }
 
+enum qemu_plugin_hwaddr_operation_result
+qemu_plugin_read_memory_hwaddr(hwaddr addr, GByteArray *data, size_t len)
+{
+#ifdef CONFIG_SOFTMMU
+    if (len == 0) {
+        return QEMU_PLUGIN_HWADDR_OPERATION_ERROR;
+    }
+
+    g_assert(current_cpu);
+
+
+    int as_idx = cpu_asidx_from_attrs(current_cpu, MEMTXATTRS_UNSPECIFIED);
+    AddressSpace *as = cpu_get_address_space(current_cpu, as_idx);
+
+    if (as == NULL) {
+        return QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS_SPACE;
+    }
+
+    g_byte_array_set_size(data, len);
+    MemTxResult res = address_space_rw(as, addr,
+                                       MEMTXATTRS_UNSPECIFIED, data->data,
+                                       data->len, false);
+
+    switch (res) {
+    case MEMTX_OK:
+        return QEMU_PLUGIN_HWADDR_OPERATION_OK;
+    case MEMTX_ERROR:
+        return QEMU_PLUGIN_HWADDR_OPERATION_DEVICE_ERROR;
+    case MEMTX_DECODE_ERROR:
+        return QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS;
+    case MEMTX_ACCESS_ERROR:
+        return QEMU_PLUGIN_HWADDR_OPERATION_ACCESS_DENIED;
+    default:
+        return QEMU_PLUGIN_HWADDR_OPERATION_ERROR;
+    }
+#else
+    return QEMU_PLUGIN_HWADDR_OPERATION_ERROR;
+#endif
+}
+
+enum qemu_plugin_hwaddr_operation_result
+qemu_plugin_write_memory_hwaddr(hwaddr addr, GByteArray *data)
+{
+#ifdef CONFIG_SOFTMMU
+    if (data->len == 0) {
+        return QEMU_PLUGIN_HWADDR_OPERATION_ERROR;
+    }
+
+    g_assert(current_cpu);
+
+    int as_idx = cpu_asidx_from_attrs(current_cpu, MEMTXATTRS_UNSPECIFIED);
+    AddressSpace *as = cpu_get_address_space(current_cpu, as_idx);
+
+    if (as == NULL) {
+        return QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS_SPACE;
+    }
+
+    MemTxResult res = address_space_rw(as, addr,
+                                       MEMTXATTRS_UNSPECIFIED, data->data,
+                                       data->len, true);
+    switch (res) {
+    case MEMTX_OK:
+        return QEMU_PLUGIN_HWADDR_OPERATION_OK;
+    case MEMTX_ERROR:
+        return QEMU_PLUGIN_HWADDR_OPERATION_DEVICE_ERROR;
+    case MEMTX_DECODE_ERROR:
+        return QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS;
+    case MEMTX_ACCESS_ERROR:
+        return QEMU_PLUGIN_HWADDR_OPERATION_ACCESS_DENIED;
+    default:
+        return QEMU_PLUGIN_HWADDR_OPERATION_ERROR;
+    }
+#else
+    return QEMU_PLUGIN_HWADDR_OPERATION_ERROR;
+#endif
+}
+
+bool qemu_plugin_translate_vaddr(uint64_t vaddr, uint64_t *hwaddr)
+{
+#ifdef CONFIG_SOFTMMU
+    g_assert(current_cpu);
+
+    uint64_t res = cpu_get_phys_page_debug(current_cpu, vaddr);
+
+    if (res == (uint64_t)-1) {
+        return false;
+    }
+
+    *hwaddr = res | (vaddr & ~TARGET_PAGE_MASK);
+
+    return true;
+#else
+    return false;
+#endif
+}
+
 struct qemu_plugin_scoreboard *qemu_plugin_scoreboard_new(size_t element_size)
 {
     return plugin_scoreboard_new(element_size);
-- 
2.49.0
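
The address arithmetic in qemu_plugin_translate_vaddr() can be sketched
outside QEMU. This is a minimal model, not the patch's implementation:
EX_PAGE_BITS (4 KiB pages), EX_PAGE_MASK, ex_phys_page() and
ex_translate_vaddr() are hypothetical names, and ex_phys_page() stands in
for cpu_get_phys_page_debug() with a single hard-coded mapping. It shows
only how the physical page base and the in-page offset of the virtual
address are combined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. In QEMU the real mask is target-dependent. */
#define EX_PAGE_BITS 12
#define EX_PAGE_MASK (~((UINT64_C(1) << EX_PAGE_BITS) - 1))

/*
 * Hypothetical stand-in for cpu_get_phys_page_debug(): maps the virtual
 * page at 0x7000 to the physical page at 0x42000 and fails (-1) for
 * every other page.
 */
static uint64_t ex_phys_page(uint64_t vaddr)
{
    if ((vaddr & EX_PAGE_MASK) == UINT64_C(0x7000)) {
        return UINT64_C(0x42000);
    }
    return (uint64_t)-1;
}

/* Same shape as the patch: page base OR'ed with the in-page offset. */
static bool ex_translate_vaddr(uint64_t vaddr, uint64_t *hwaddr)
{
    assert(hwaddr != NULL);

    uint64_t page = ex_phys_page(vaddr);
    if (page == (uint64_t)-1) {
        return false;
    }

    *hwaddr = page | (vaddr & ~EX_PAGE_MASK);
    return true;
}
```

With this mapping, translating 0x7abc yields 0x42abc, while any address
outside the mapped page returns false and leaves *hwaddr untouched.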
Re: [PATCH v12 5/7] plugins: Add memory hardware address read/write API
Posted by Alex Bennée 7 months, 3 weeks ago
Rowan Hart <rowanbhart@gmail.com> writes:

> From: novafacing <rowanbhart@gmail.com>
>
> This patch adds functions to the plugins API to allow plugins to read
> and write memory via hardware addresses. The functions use the current
> address space of the current CPU in order to avoid exposing address
> space information to users. A later patch may want to add a function to
> permit a specified address space, for example to facilitate
> architecture-specific plugins that want to operate on them, for example
> reading ARM secure memory.
>
> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> Signed-off-by: Rowan Hart <rowanbhart@gmail.com>
<snip>
> +/**
> + * qemu_plugin_write_memory_hwaddr() - write to memory using a hardware address
> + *
> + * @addr: A physical address to write to
> + * @data: A byte array containing the data to write
> + *
> + * The contents of @data will be written to memory starting at the hardware
> + * address @addr in the current address space for the current vCPU.
> + *
> + * This function does not guarantee consistency of writes, nor does it ensure
> + * that pending writes are flushed either before or after the write takes place,
> + * so callers should take care when calling this function in plugin callbacks to
> + * avoid depending on the existence of data written using this function which
> + * may be overwritten afterward. In addition, this function requires that the
> + * pages containing the address are not locked. Practically, this means that you
> + * should not write instruction memory in a current translation block inside a
> + * callback registered with qemu_plugin_register_vcpu_tb_trans_cb.
> + *
> + * You can, for example, write instruction memory in a current translation block
> + * in a callback registered with qemu_plugin_register_vcpu_tb_exec_cb, although
> + * be aware that the write will not be flushed until after the translation block
> + * has finished executing.  In general, this function should be used to write
> + * data memory or to patch code at a known address, not in a current translation
> + * block.

My main concern about the long list of caveats for writing memory is the
user will almost certainly cause weird things to happen which will then
be hard to debug. I can see the patcher example however it would be
useful to know what other practical uses this interface provides.

<snip>

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
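
One small aid for the debuggability concern above is to log the result
code by name whenever a hwaddr operation fails. The following is only a
sketch: the enum is copied from the patch so the snippet compiles
without qemu-plugin.h, and hwaddr_result_name() is a hypothetical helper
that is not part of the plugin API.

```c
/* Copied from the patch so this compiles without qemu-plugin.h. */
enum qemu_plugin_hwaddr_operation_result {
    QEMU_PLUGIN_HWADDR_OPERATION_OK,
    QEMU_PLUGIN_HWADDR_OPERATION_ERROR,
    QEMU_PLUGIN_HWADDR_OPERATION_DEVICE_ERROR,
    QEMU_PLUGIN_HWADDR_OPERATION_ACCESS_DENIED,
    QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS,
    QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS_SPACE,
};

/* Hypothetical helper: map a result code to a short name for logging. */
static const char *
hwaddr_result_name(enum qemu_plugin_hwaddr_operation_result res)
{
    switch (res) {
    case QEMU_PLUGIN_HWADDR_OPERATION_OK:
        return "ok";
    case QEMU_PLUGIN_HWADDR_OPERATION_ERROR:
        return "unexpected error";
    case QEMU_PLUGIN_HWADDR_OPERATION_DEVICE_ERROR:
        return "device error";
    case QEMU_PLUGIN_HWADDR_OPERATION_ACCESS_DENIED:
        return "access denied";
    case QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS:
        return "invalid address";
    case QEMU_PLUGIN_HWADDR_OPERATION_INVALID_ADDRESS_SPACE:
        return "invalid address space";
    }
    /* Values outside the enum, e.g. from a newer API version. */
    return "unknown";
}
```

A plugin could pass the returned string to its usual logging path next
to the address and length it attempted, which makes a shared bug report
easier to reproduce.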
Re: [PATCH v12 5/7] plugins: Add memory hardware address read/write API
Posted by Pierrick Bouvier 7 months, 3 weeks ago
On 6/17/25 3:24 AM, Alex Bennée wrote:
> Rowan Hart <rowanbhart@gmail.com> writes:
> 
>> From: novafacing <rowanbhart@gmail.com>
>>
>> This patch adds functions to the plugins API to allow plugins to read
>> and write memory via hardware addresses. The functions use the current
>> address space of the current CPU in order to avoid exposing address
>> space information to users. A later patch may want to add a function to
>> permit a specified address space, for example to facilitate
>> architecture-specific plugins that want to operate on them, for example
>> reading ARM secure memory.
>>
>> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> Signed-off-by: Rowan Hart <rowanbhart@gmail.com>
> <snip>
>> +/**
>> + * qemu_plugin_write_memory_hwaddr() - write to memory using a hardware address
>> + *
>> + * @addr: A physical address to write to
>> + * @data: A byte array containing the data to write
>> + *
>> + * The contents of @data will be written to memory starting at the hardware
>> + * address @addr in the current address space for the current vCPU.
>> + *
>> + * This function does not guarantee consistency of writes, nor does it ensure
>> + * that pending writes are flushed either before or after the write takes place,
>> + * so callers should take care when calling this function in plugin callbacks to
>> + * avoid depending on the existence of data written using this function which
>> + * may be overwritten afterward. In addition, this function requires that the
>> + * pages containing the address are not locked. Practically, this means that you
>> + * should not write instruction memory in a current translation block inside a
>> + * callback registered with qemu_plugin_register_vcpu_tb_trans_cb.
>> + *
>> + * You can, for example, write instruction memory in a current translation block
>> + * in a callback registered with qemu_plugin_register_vcpu_tb_exec_cb, although
>> + * be aware that the write will not be flushed until after the translation block
>> + * has finished executing.  In general, this function should be used to write
>> + * data memory or to patch code at a known address, not in a current translation
>> + * block.
> 
> My main concern about the long list of caveats for writing memory is the
> user will almost certainly cause weird things to happen which will then
> be hard to debug. I can see the patcher example however it would be
> useful to know what other practical uses this interface provides.
>

I understand the concern that allowing modification of execution state 
through plugins opens the door to possible bugs. However, it 
significantly expands what plugins can do, especially for 
security researchers, as Rowan listed in his answer.
For once, we have someone motivated to contribute upstream instead of 
creating yet another downstream fork, and that should be encouraged.

Also, if "weird things" happen and people file a bug report, 
they are free to share their plugin so we can reproduce and solve 
the problem. This should only concern users trying to modify execution 
state, though, so definitely not the majority of plugin users.

Pierrick

Re: [PATCH v12 5/7] plugins: Add memory hardware address read/write API
Posted by Rowan Hart 7 months, 3 weeks ago
> My main concern about the long list of caveats for writing memory is the
> user will almost certainly cause weird things to happen which will then
> be hard to debug. I can see the patcher example however it would be
> useful to know what other practical uses this interface provides.
>
Of course! My main personal intent here is to facilitate introspection 
and manipulation of guest state for security analysis. Some examples of 
why the memory/register R/W primitives are necessary here include:

Fuzzing:
- Read registers and memory for tracing control flow, comparison 
operands, and profiled values (e.g. memcmp arguments)
- Write memory to inject test cases into the guest (for me and other 
fuzzer developers, this is the biggest reason!)
- Write registers to reset execution or skip over complex checks like 
checksums
- Read and write memory, and read and write registers, to do basic 
snapshot/restore by tracking dirty pages and resetting them

Virtual Machine Introspection (for malware analysis and reverse 
engineering):
- Read memory and registers to find kernel, analyze kernel structures, 
and retrieve info like process lists, memory mappings
- Read memory and registers to quickly trace malware execution in VM
- Write memory and registers to test behavior under various conditions, 
like skipping into checks (motivating example: what happens if you skip 
into the kill switch statement for WannaCry)

Runtime patching (as in the example):
- Writing memory to patch critical legacy code in production that often 
can no longer be rebuilt or patched by any means other than applying 
binary patches (this is a real problem for e.g. the government, to the 
point where DARPA ran a program, 
https://www.darpa.mil/research/programs/assured-micropatching, to work on 
it!)
- Writing registers to skip over broken code, redirect to patch code, etc.

Ultimately, the caveats boil down to "don't modify stuff that's touched 
by currently executing code". I personally don't think that's 
unreasonable (as long as it's in the doc-strings) because for any plugin 
that needs to write memory, ensuring write consistency is probably 
the easiest problem to solve and people working in this space are used 
to having way worse and jankier workarounds. These plugin functions make 
life way easier for them. I have been in touch with 20+ people from 
various companies and projects (including Microsoft, where I work, as 
well as other big and small tech) all working on plugins that could be 
better if this feature existed, so there is definitely a user-base and 
appetite for it!

The last cool use-case is that this moves us a long way towards cleaning 
up the large number of QEMU forks out there designed for RE and security 
testing like QEMU-Nyx, qemuafl, symqemu, and many more. Instead of 
maintaining forks of QEMU (many of these are based on 4.2.0 or older!) 
folks can just maintain a plugin, which lets them take advantage of 
updates and fixes without giant rebases. My goal is to kill these forks 
and have these projects write small, maintainable plugins instead, and 
the authors are on board :)
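
The "snapshot/restore by tracking dirty pages" item in the fuzzing list
above can be sketched as a self-contained model. Everything here is an
assumption for illustration: struct ex_guest and the ex_* functions are
hypothetical, the 4 KiB page size is assumed, and a plain array stands
in for guest RAM. A real plugin would observe stores through the memory
callback API and copy pages back with qemu_plugin_write_memory_hwaddr()
instead of memcpy().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EX_PAGE_SIZE 4096u   /* assumed page size */
#define EX_NUM_PAGES 4u      /* tiny guest for the example */

/* Hypothetical stand-in for guest RAM plus the plugin's bookkeeping. */
struct ex_guest {
    uint8_t ram[EX_NUM_PAGES * EX_PAGE_SIZE];
    uint8_t snapshot[EX_NUM_PAGES * EX_PAGE_SIZE];
    bool dirty[EX_NUM_PAGES];
};

/* Save all of RAM once and clear the dirty set. */
static void ex_take_snapshot(struct ex_guest *g)
{
    memcpy(g->snapshot, g->ram, sizeof(g->ram));
    memset(g->dirty, 0, sizeof(g->dirty));
}

/* Model of a guest store: write the bytes and mark every touched page. */
static void ex_guest_store(struct ex_guest *g, uint32_t addr,
                           const uint8_t *src, uint32_t len)
{
    assert(len > 0 && addr < sizeof(g->ram) &&
           len <= sizeof(g->ram) - addr);

    memcpy(&g->ram[addr], src, len);

    uint32_t first = addr / EX_PAGE_SIZE;
    uint32_t last = (addr + len - 1) / EX_PAGE_SIZE;
    for (uint32_t p = first; p <= last; p++) {
        g->dirty[p] = true;
    }
}

/* Copy back only the dirty pages; returns how many were restored. */
static unsigned ex_restore(struct ex_guest *g)
{
    unsigned restored = 0;

    for (uint32_t p = 0; p < EX_NUM_PAGES; p++) {
        if (!g->dirty[p]) {
            continue;
        }
        memcpy(&g->ram[p * EX_PAGE_SIZE],
               &g->snapshot[p * EX_PAGE_SIZE], EX_PAGE_SIZE);
        g->dirty[p] = false;
        restored++;
    }
    return restored;
}
```

The point of tracking dirty pages is that each fuzzing iteration pays
only for the pages the test case actually touched. A store that
straddles a page boundary marks both pages, so both are restored.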
Re: [PATCH v12 5/7] plugins: Add memory hardware address read/write API
Posted by Alex Bennée 7 months, 3 weeks ago
Rowan Hart <rowanbhart@gmail.com> writes:

>> My main concern about the long list of caveats for writing memory is the
>> user will almost certainly cause weird things to happen which will then
>> be hard to debug. I can see the patcher example however it would be
>> useful to know what other practical uses this interface provides.
>>
> Of course! My main personal intent here is to facilitate introspection
> and manipulation of guest state for security analysis. Some examples
> of why the memory/register R/W primitives are necessary here include:
>
> Fuzzing:
> - Read registers and memory for tracing control flow, comparison
>   operands, and profiled values (e.g. memcmp arguments)
> - Write memory to inject test cases into the guest (for me and other
>   fuzzer developers, this is the biggest reason!)
> - Write registers to reset execution or skip over complex checks like
>   checksums
> - Read and write memory, and read and write registers, to do basic
>   snapshot/restore by tracking dirty pages and resetting them
>
> Virtual Machine Introspection (for malware analysis and reverse
> engineering):
> - Read memory and registers to find kernel, analyze kernel structures,
>   and retrieve info like process lists, memory mappings
> - Read memory and registers to quickly trace malware execution in VM
> - Write memory and registers to test behavior under various
>   conditions, like skipping into checks (motivating example: what
>   happens if you skip into the kill switch statement for WannaCry)
>
> Runtime patching (as in the example):
> - Writing memory to patch critical legacy code in production often can
>   no longer be built or patched via means other than by applying
>   binary patches (this is a real problem for e.g. the government, to
>   the point where DARPA ran a program
>   https://www.darpa.mil/research/programs/assured-micropatching to
>   work on it!)
> - Writing registers to skip over broken code, redirect to patch code, etc.
>
> Ultimately, the caveats boil down to "don't modify stuff that's
> touched by currently executing code". I personally don't think that's
> unreasonable (as long as it's in the doc-strings) because for any
> plugin that needs to write memory, ensuring the write consistency is
> probably the easiest problem to solve and people working in this space
> are used to having way worse and jankier workarounds.

I dread to think what the jankier workarounds are!

However, I accept that a doc string warning will do for now. I think if
we can strengthen the guarantee at a later date to make the feature more
bulletproof we should. For example, we could use start/end_exclusive to
halt the other threads while patching is taking place and then trigger a
full tb-flush to be safe. It depends on how often we expect to be
patching things.

Richard,

Do you have any view about this?

> These plugin
> functions make life way easier for them. I have been in touch with 20+
> people from various companies and projects (including Microsoft, where
> I work, as well as other big and small tech) all working on plugins
> that could be better if this feature existed, so there is definitely a
> user-base and appetite for it!
>
> The last cool use-case is that this moves us a long way towards
> cleaning up the large number of QEMU forks out there designed for RE
> and security testing like QEMU-Nyx, qemuafl, symqemu, and many more.
> Instead of maintaining forks of QEMU (many of these are based on 4.2.0
> or older!) folks can just maintain a plugin, which lets them take
> advantage of updates and fixes without giant rebases. My goal is to
> kill these forks and have these projects write small, maintainable
> plugins instead, and the authors are on board :)

Absolutely - I would like to see that too. The main reason those forks
haven't been upstreamable is because they have to make fairly invasive
changes to the frontends to do their instrumentation. I want to grow the
API to the point we can support these more advanced use cases. I am
however being conservative in adding new APIs so we take the time to get
each one right and minimise:

  - leaking internal details and constricting future evolution of the emulator
  - giving the users too many foot guns in the API

I'll have a look at the next version and see how we are doing.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro