[v2] Add additional plugin API functions to read and write memory and registers

[PATCH v2 0/3] Add additional plugin API functions to read and write memory and registers

Posted by Rowan Hart 4 months ago

This patch set follows a previous patch which added the
qemu_plugin_read_memory_vaddr function and adds a set of similar
functions to read and write registers, virtual memory, and
physical memory.

The use case I have in mind is using QEMU for program analysis and
testing. For example, a fuzzer which uses QEMU for emulation might wish to
inject test data into a program at runtime using qemu_plugin_write_memory_vaddr
(and likewise if testing an operating system or bare metal application using
qemu_plugin_write_memory_hwaddr). It may also wish to read the initial contents
of memory using qemu_plugin_read_memory_vaddr/hwaddr.

Similarly, a testing framework may wish to fake register values, for example
by using qemu_plugin_write_register to set a register to an error code in
order to simulate a device failure.

I think all this functionality works together to make QEMU
plugins more powerful and versatile, hopefully removing barriers
to using upstream QEMU for these tasks which have historically
required maintaining a QEMU fork downstream (like QEMUAFL
https://github.com/AFLplusplus/qemuafl), which is tedious, error
prone, and results in users missing out on enhancements to QEMU.

A test is provided. To compile it:

gcc -o tests/tcg/x86_64/inject-target tests/tcg/x86_64/inject-target.c

And run:

./build/qemu-x86_64 -d plugin --plugin build/tests/tcg/plugins/libinject.so tests/tcg/x86_64/inject-target

After a number of tries, the inject plugin should inject the right value
into the target program, leading to a victory message. The plugin handles
simple "hypercalls"; only one is implemented so far, and it injects data
into guest memory.

novafacing (3):
  Expose gdb_write_register function to consumers of gdbstub
  Add plugin API functions for register R/W, hwaddr R/W, vaddr W
  Add inject plugin and x86_64 target for the inject plugin

 gdbstub/gdbstub.c                |   2 +-
 include/exec/gdbstub.h           |  14 +++
 include/qemu/qemu-plugin.h       | 116 +++++++++++++++--
 plugins/api.c                    |  66 +++++++++-
 tests/tcg/plugins/inject.c       | 206 +++++++++++++++++++++++++++++++
 tests/tcg/plugins/meson.build    |   2 +-
 tests/tcg/x86_64/Makefile.target |   1 +
 tests/tcg/x86_64/inject-target.c |  27 ++++
 8 files changed, 418 insertions(+), 16 deletions(-)
 create mode 100644 tests/tcg/plugins/inject.c
 create mode 100644 tests/tcg/x86_64/inject-target.c

-- 
2.46.1

Re: [PATCH v2 0/3] Add additional plugin API functions to read and write memory and registers

Posted by Pierrick Bouvier 4 months ago

Hi Rowan,

thanks for this submission!

On 12/6/24 02:26, Rowan Hart wrote:
> This patch set follows a previous patch which added the
> qemu_plugin_read_memory_vaddr function and adds a set of similar
> functions to read and write registers, virtual memory, and
> physical memory.
> 
> The use case I have in mind is for use of QEMU for program analysis and
> testing. For example, a fuzzer which uses QEMU for emulation might wish to
> inject test data into a program at runtime using qemu_plugin_write_memory_vaddr
> (and likewise if testing an operating system or bare metal application using
> qemu_plugin_write_memory_hwaddr). It may also wish to read the initial contents
> of memory using qemu_plugin_read_memory_vaddr/hwaddr.
> 

I am personally in favor of adding such features to upstream QEMU, but
we should discuss it with the maintainers, because it would allow plugins
to change the state of execution, which is something QEMU plugins have
deliberately avoided so far. It's a real paradigm shift for plugins.

By writing to memory/registers, we can start replacing instructions and 
control flow, and there is a whole set of consequences to that.

> Similarly, a testing framework may wish to fake register values, perhaps to
> simulate a device failure, perhaps by using qemu_plugin_write_register to set a
> register value to an error code.
> 
> I think all this functionality works together to make QEMU
> plugins more powerful and versatile, hopefully removing barriers
> to using upstream QEMU for these tasks which have historically
> required maintaining a QEMU fork downstream (like QEMUAFL
> https://github.com/AFLplusplus/qemuafl), which is tedious, error
> prone, and results in users missing out on enhancements to QEMU.
> 
> A test is provided, compile:
> 
> gcc -o tests/tcg/x86_64/inject-target tests/tcg/x86_64/inject-target.c
> 
> And run:
> 
> ./build/qemu-x86_64 -d plugin --plugin build/tests/tcg/plugins/libinject.so tests/tcg/x86_64/inject-target
> 
> Hopefully after a number of tries, the inject plugin will inject the right
> value into the target program, leading to a victory message. This plugin
> handles simple "hypercalls", only one of which is implemented and injects
> data into guest memory.
> 

The hypercall functionality would be useful for plugins as a whole. And 
I think it definitely deserves to be worked on, if maintainers are open 
to that as well.

> novafacing (3):
>    Expose gdb_write_register function to consumers of gdbstub
>    Add plugin API functions for register R/W, hwaddr R/W, vaddr W
>    Add inject plugin and x86_64 target for the inject plugin
> 
>   gdbstub/gdbstub.c                |   2 +-
>   include/exec/gdbstub.h           |  14 +++
>   include/qemu/qemu-plugin.h       | 116 +++++++++++++++--
>   plugins/api.c                    |  66 +++++++++-
>   tests/tcg/plugins/inject.c       | 206 +++++++++++++++++++++++++++++++
>   tests/tcg/plugins/meson.build    |   2 +-
>   tests/tcg/x86_64/Makefile.target |   1 +
>   tests/tcg/x86_64/inject-target.c |  27 ++++
>   8 files changed, 418 insertions(+), 16 deletions(-)
>   create mode 100644 tests/tcg/plugins/inject.c
>   create mode 100644 tests/tcg/x86_64/inject-target.c
> 

Regards,
Pierrick

Re: [PATCH v2 0/3] Add additional plugin API functions to read and write memory and registers

Posted by Rowan Hart 4 months ago

> I am personally in favor to adding such features in upstream QEMU, but we should discuss it with the maintainers, because it would allow to change the state of execution, which is something qemu plugins actively didn't try to do. It's a real paradigm shift for plugins.
> 
> By writing to memory/registers, we can start replacing instructions and control flow, and there is a whole set of consequences to that.
> 

Totally agree! As much as I really want this functionality for plugins, I think
alignment on it is quite important. I can see very cool use cases for being
able to replace instructions and control flow to allow hooking functions,
hotpatching, and so forth.

I don't really know the edge cases here so your expertise will be helpful. In
the worst case I can imagine: what would happen if a callback overwrote the
next insn? I'm not sure what behavior I would expect if that insn has already
been translated as part of the same tb. That said, the plugin is aware of which
insns have already been translated, so maybe it is not unreasonable to just
document this as a "don't do that". Let me know what you think.
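
To make the concern concrete, here is a toy model of the situation. It is
not QEMU code: ToyCPU, ToyTB and the toy_* functions are hypothetical names
invented for illustration, and the "translation" is just a snapshot of the
guest bytes. It only shows why a write to an already-translated insn is not
observed until the cached block is dropped and rebuilt.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names are QEMU APIs. */
enum { OP_INC = 1, OP_DEC = 2 };
#define TB_LEN 2

typedef struct {
    uint8_t ops[TB_LEN]; /* snapshot of guest code taken at translation time */
    int valid;
} ToyTB;

typedef struct {
    uint8_t code[TB_LEN]; /* "guest memory" holding the instructions */
    int acc;              /* a single guest register */
    ToyTB tb;
} ToyCPU;

/* Callback invoked after each executed insn, loosely like a plugin callback. */
typedef void (*toy_cb)(ToyCPU *cpu, int insn_index);

static void toy_translate(ToyCPU *cpu)
{
    memcpy(cpu->tb.ops, cpu->code, TB_LEN);
    cpu->tb.valid = 1;
}

static void toy_exec_block(ToyCPU *cpu, toy_cb cb)
{
    if (!cpu->tb.valid) {
        toy_translate(cpu);
    }
    assert(cpu->tb.valid);
    for (int i = 0; i < TB_LEN; i++) {
        /* Executes the cached op, not whatever is in cpu->code right now. */
        cpu->acc += (cpu->tb.ops[i] == OP_INC) ? 1 : -1;
        if (cb) {
            cb(cpu, i);
        }
    }
}

/* Patches the *next* insn while the block is still running. */
static void toy_patch_next_insn(ToyCPU *cpu, int insn_index)
{
    if (insn_index == 0) {
        cpu->code[1] = OP_DEC;
    }
}

/* The patched insn is ignored: the stale snapshot still says INC, INC. */
int toy_run_stale_demo(void)
{
    ToyCPU cpu = { .code = { OP_INC, OP_INC } };
    toy_exec_block(&cpu, toy_patch_next_insn);
    return cpu.acc;
}

/* After an explicit flush the block is rebuilt and the patch takes effect. */
int toy_run_after_flush(void)
{
    ToyCPU cpu = { .code = { OP_INC, OP_INC } };
    toy_exec_block(&cpu, toy_patch_next_insn);
    cpu.acc = 0;
    cpu.tb.valid = 0; /* the "flush" */
    toy_exec_block(&cpu, NULL);
    return cpu.acc;
}
```

In this model the first run executes INC, INC even though guest memory
already holds INC, DEC; only the run after the flush sees the new insn.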

> The hypercall functionality would be useful for plugins as a whole. And I think it definitely deserves to be worked on, if maintainers are open to that as well.

Sure, I'd be happy to work on this some more. At least on the fuzzing side of
things, the way hypercalls are done across hypervisors (QEMU, Bochs, etc) is
pretty consistent so I think we could provide a useful common set of
functionality. The reason I did the bare minimum here is I'm not familiar with
every architecture, and a good NOP needs to be chosen for each one along with a
reasonable way to pass some arguments -- I don't know if I'm the right person
to make that call.
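
As a rough illustration of the shape being discussed (a marker insn plus a
register convention for arguments), here is a minimal sketch. Everything in
it is an assumption made up for the example: the marker value, the register
assignment, and the toy_* names do not come from the inject plugin or from
any existing hypervisor interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical values, chosen only for this sketch. */
#define TOY_MAGIC_NOP 0x90c0ffeeu /* marker encoding standing in for a real NOP */
#define TOY_HC_INJECT 1u          /* the single implemented hypercall */
#define TOY_MEM_SIZE  64

typedef struct {
    /* Assumed convention: regs[0] = hypercall number,
     * regs[1] = destination address, regs[2] = length. */
    uint64_t regs[3];
    uint8_t mem[TOY_MEM_SIZE];
} ToyGuest;

/*
 * Returns 0 if insn is not the marker, 1 if a hypercall was handled,
 * and -1 if the request was malformed or unknown.
 */
int toy_handle_insn(ToyGuest *g, uint32_t insn,
                    const uint8_t *payload, size_t payload_len)
{
    if (insn != TOY_MAGIC_NOP) {
        return 0;
    }
    if (g->regs[0] != TOY_HC_INJECT) {
        return -1;
    }
    uint64_t dst = g->regs[1];
    uint64_t len = g->regs[2];
    /* Reject out-of-range requests before touching guest memory. */
    if (dst > TOY_MEM_SIZE || len > TOY_MEM_SIZE - dst || len > payload_len) {
        return -1;
    }
    memcpy(&g->mem[dst], payload, (size_t)len);
    return 1;
}

/* Injects four bytes at guest address 8 and checks where they landed. */
int toy_hypercall_demo(void)
{
    ToyGuest g = { .regs = { TOY_HC_INJECT, 8, 4 } };
    const uint8_t data[4] = { 0xde, 0xad, 0xbe, 0xef };
    int rc = toy_handle_insn(&g, TOY_MAGIC_NOP, data, sizeof(data));
    return rc == 1 && g.mem[8] == 0xde && g.mem[11] == 0xef && g.mem[12] == 0;
}

/* Any other insn is left alone. */
int toy_plain_insn_demo(void)
{
    ToyGuest g = { .regs = { TOY_HC_INJECT, 8, 4 } };
    const uint8_t data[4] = { 1, 2, 3, 4 };
    return toy_handle_insn(&g, 0x12345678u, data, sizeof(data));
}
```

The per-architecture work mentioned above would amount to picking a real
encoding for the marker and a real register convention in place of these
placeholders.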

Glad to hear you're for this idea!

-Rowan

Re: [PATCH v2 0/3] Add additional plugin API functions to read and write memory and registers

Posted by Pierrick Bouvier 4 months ago

On 12/6/24 16:57, Rowan Hart wrote:
>> I am personally in favor to adding such features in upstream QEMU, but we should discuss it with the maintainers, because it would allow to change the state of execution, which is something qemu plugins actively didn't try to do. It's a real paradigm shift for plugins.
>>
>> By writing to memory/registers, we can start replacing instructions and control flow, and there is a whole set of consequences to that.
>>
> 
> Totally agree! As much as I really want this functionality for plugins, I think
> alignment on it is quite important. I can see very cool use cases for being
> able to replace instructions and control flow to allow hooking functions,
> hotpatching, and so forth.
> 
> I don't really know the edge cases here so your expertise will be helpful. In
> the worst case I can imagine: what would happen if a callback overwrote the
> next insn? I'm not sure what behavior I would expect if that insn has already
> been translated as part of the same tb. That said, the plugin is aware of which
> insns have already been translated, so maybe it is not unreasonable to just
> document this as a "don't do that". Let me know what you think.
> 

In the end, if we implement something to modify running code, we should 
make sure it works as expected (flushing the related tb). I am not sure 
about the current status, and all the changes that would be needed, but 
it's something we should discuss before implementing.

More globally, let's wait to hear feedback from maintainers to see if 
they are open to the idea or not. A "hard" no would end it there.

>> The hypercall functionality would be useful for plugins as a whole. And I think it definitely deserves to be worked on, if maintainers are open to that as well.
> 
> Sure, I'd be happy to work on this some more. At least on the fuzzing side of
> things, the way hypercalls are done across hypervisors (QEMU, Bochs, etc) is
> pretty consistent so I think we could provide a useful common set of
> functionality. The reason I did the bare minimum here is I'm not familiar with
> every architecture, and a good NOP needs to be chosen for each one along with a
> reasonable way to pass some arguments -- I don't know if I'm the right person
> to make that call.
> 

We have been discussing something like that for system mode recently, so 
it would definitely be useful.

IMHO, anyone is welcome to contribute this; the plugins area is not a
private garden where only chosen ones can change things. Just be
prepared for change requests, and multiple versions before the final one.

Same on this one, we'll see if maintainers are ok with the idea.

> Glad to hear you're for this idea!
> 
> -Rowan

Thanks,
Pierrick

Re: [PATCH v2 0/3] Add additional plugin API functions to read and write memory and registers

Posted by Alex Bennée 3 months, 4 weeks ago

Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:

> On 12/6/24 16:57, Rowan Hart wrote:
>>> I am personally in favor to adding such features in upstream QEMU,
>>> but we should discuss it with the maintainers, because it would
>>> allow to change the state of execution, which is something qemu
>>> plugins actively didn't try to do. It's a real paradigm shift for
>>> plugins.
>>>
>>> By writing to memory/registers, we can start replacing instructions and control flow, and there is a whole set of consequences to that.
>>>
>> Totally agree! As much as I really want this functionality for
>> plugins, I think
>> alignment on it is quite important. I can see very cool use cases for being
>> able to replace instructions and control flow to allow hooking functions,
>> hotpatching, and so forth.

I think currently that makes quite a lot of demands on the translator to
make sure things are kept consistent.

We have been talking about maybe enabling hypercalls of some sort to
allow for hooking explicit function boundaries in linux-user. A natural
extension of that would be for host library bypass functions. I'm unsure
of how that would apply in system emulation mode though where things are
handled at a much more granular level.

>> I don't really know the edge cases here so your expertise will be
>> helpful. In
>> the worst case I can imagine: what would happen if a callback overwrote the
>> next insn? I'm not sure what behavior I would expect if that insn has already
>> been translated as part of the same tb. That said, the plugin is aware of which
>> insns have already been translated, so maybe it is not unreasonable to just
>> document this as a "don't do that". Let me know what you think.
>> 
>
> In the end, if we implement something to modify running code, we
> should make sure it works as expected (flushing the related tb). I am
> not sure about the current status, and all the changes that would be
> needed, but it's something we should discuss before implementing.

If the right access helpers are used we eventually end up in cputlb and
the code modification detection code will kick in. But that detection
mechanism relies on the guest controlled page tables marking executable
code and honouring rw permissions. If plugins don't honour those
permissions you'll come unstuck quite quickly.
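
The difference between a write that goes through a checked helper and one
that bypasses it can be sketched with a toy model. This is not cputlb and
these are not QEMU names: ToyPage, ToyMem and the toy_* functions are
hypothetical, and the per-page flags are a simplification assumed for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not QEMU internals. */
#define TOY_PAGE_SIZE 16
#define TOY_NUM_PAGES 2

typedef struct {
    uint8_t data[TOY_PAGE_SIZE];
    bool writable;        /* permission as set by the guest page tables */
    bool has_translation; /* a cached translation was built from this page */
} ToyPage;

typedef struct {
    ToyPage pages[TOY_NUM_PAGES];
    unsigned invalidations;
} ToyMem;

/* Checked path: honours permissions and drops any stale translation. */
bool toy_write_checked(ToyMem *m, size_t addr, uint8_t val)
{
    assert(addr < (size_t)TOY_PAGE_SIZE * TOY_NUM_PAGES);
    ToyPage *p = &m->pages[addr / TOY_PAGE_SIZE];
    if (!p->writable) {
        return false;
    }
    if (p->has_translation) {
        p->has_translation = false;
        m->invalidations++;
    }
    p->data[addr % TOY_PAGE_SIZE] = val;
    return true;
}

/* Unchecked path: the bytes change but the cached translation survives. */
void toy_write_raw(ToyMem *m, size_t addr, uint8_t val)
{
    assert(addr < (size_t)TOY_PAGE_SIZE * TOY_NUM_PAGES);
    m->pages[addr / TOY_PAGE_SIZE].data[addr % TOY_PAGE_SIZE] = val;
}

static ToyMem toy_mem_with_code_page(bool writable)
{
    ToyMem m = { 0 };
    m.pages[0].writable = writable;
    m.pages[0].has_translation = true;
    return m;
}

/* A checked write to a writable code page invalidates exactly once. */
unsigned toy_demo_checked_write(void)
{
    ToyMem m = toy_mem_with_code_page(true);
    bool ok = toy_write_checked(&m, 3, 0xcc);
    return ok ? m.invalidations : 0;
}

/* A checked write to a read-only page is refused and changes nothing. */
bool toy_demo_readonly_rejected(void)
{
    ToyMem m = toy_mem_with_code_page(false);
    bool ok = toy_write_checked(&m, 3, 0xcc);
    return !ok && m.pages[0].data[3] == 0 && m.pages[0].has_translation;
}

/* A raw write lands in memory but leaves the stale translation in place. */
bool toy_demo_raw_write_leaves_stale(void)
{
    ToyMem m = toy_mem_with_code_page(false);
    toy_write_raw(&m, 3, 0xcc);
    return m.pages[0].data[3] == 0xcc && m.pages[0].has_translation;
}
```

The last case is the problematic one: memory and the cached translation
disagree, and nothing in the model notices.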

> More globally, let's wait to hear feedback from maintainers to see if
> they are open to the idea or not. A "hard" no would end it there.

It's not a hard no - but I think any such patching ability would need
quite a bit of thought to make sure edge cases are covered. However I do
expect there will be downstream forks that go further than the upstream
is currently comfortable with and if the code is structured in the right
way we can minimise the pain of re-basing.

>>> The hypercall functionality would be useful for plugins as a whole. And I think it definitely deserves to be worked on, if maintainers are open to that as well.
>> Sure, I'd be happy to work on this some more. At least on the
>> fuzzing side of
>> things, the way hypercalls are done across hypervisors (QEMU, Bochs, etc) is
>> pretty consistent so I think we could provide a useful common set of
>> functionality. The reason I did the bare minimum here is I'm not familiar with
>> every architecture, and a good NOP needs to be chosen for each one along with a
>> reasonable way to pass some arguments -- I don't know if I'm the right person
>> to make that call.
>> 
>
> We have been discussing something like that for system mode recently,
> so it would definitely be useful.
>
> IMHO, it's open for anyone to contribute this, the plugins area is not
> a private garden where only chosen ones can change things. Just be
> prepared for change requests, and multiple versions before the final
> one.
>
> Same on this one, we'll see if maintainers are ok with the idea.
>
>> Glad to hear you're for this idea!
>> -Rowan
>
> Thanks,
> Pierrick

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v2 0/3] Add additional plugin API functions to read and write memory and registers

Posted by Pierrick Bouvier 3 months, 4 weeks ago

On 12/10/24 03:38, Alex Bennée wrote:
> Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> 
>> On 12/6/24 16:57, Rowan Hart wrote:
>>>> I am personally in favor to adding such features in upstream QEMU,
>>>> but we should discuss it with the maintainers, because it would
>>>> allow to change the state of execution, which is something qemu
>>>> plugins actively didn't try to do. It's a real paradigm shift for
>>>> plugins.
>>>>
>>>> By writing to memory/registers, we can start replacing instructions and control flow, and there is a whole set of consequences to that.
>>>>
>>> Totally agree! As much as I really want this functionality for
>>> plugins, I think
>>> alignment on it is quite important. I can see very cool use cases for being
>>> able to replace instructions and control flow to allow hooking functions,
>>> hotpatching, and so forth.
> 
> I think currently that makes quite a lot of demands on the translator to
> make sure things are kept consistent.
> 
> We have been talking about maybe enabling hypercalls of some sort to
> allow for hooking explicit function boundaries in linux-user. A natural
> extension of that would be for host library bypass functions. I'm unsure
> of how that would apply in system emulation mode though where things are
> handled on a lot more granular level.
>

If we are talking about replacing library function calls with host 
variants, I'm not sure it's connected to the hypercalls we are talking 
about here.

>>> I don't really know the edge cases here so your expertise will be
>>> helpful. In
>>> the worst case I can imagine: what would happen if a callback overwrote the
>>> next insn? I'm not sure what behavior I would expect if that insn has already
>>> been translated as part of the same tb. That said, the plugin is aware of which
>>> insns have already been translated, so maybe it is not unreasonable to just
>>> document this as a "don't do that". Let me know what you think.
>>>
>>
>> In the end, if we implement something to modify running code, we
>> should make sure it works as expected (flushing the related tb). I am
>> not sure about the current status, and all the changes that would be
>> needed, but it's something we should discuss before implementing.
> 
> If the right access helpers are used we eventually end up in cputlb and
> the code modification detection code will kick in. But that detection
> mechanism relies on the guest controlled page tables marking executable
> code and honouring rw permissions. If plugins don't honour those
> permissions you'll become unstuck quite quickly.
> 
>> More globally, let's wait to hear feedback from maintainers to see if
>> they are open to the idea or not. A "hard" no would end it there.
> 
> It's not a hard no - but I think any such patching ability would need a
> quite a bit of thought to make sure edge cases are covered. However I do
> expect there will be downstream forks that go further than the upstream
> is currently comfortable with and if the code is structured in the right
> way we can minimise the pain of re-basing.
> 

To put it more directly, does that mean you are open to plugins changing
the state of execution?

Setting aside the technical aspects and getting all the corner cases
right, we should first decide whether we want to go there, or whether this
does not have its place upstream and belongs in downstream forks instead.

>>>> The hypercall functionality would be useful for plugins as a whole. And I think it definitely deserves to be worked on, if maintainers are open to that as well.
>>> Sure, I'd be happy to work on this some more. At least on the
>>> fuzzing side of
>>> things, the way hypercalls are done across hypervisors (QEMU, Bochs, etc) is
>>> pretty consistent so I think we could provide a useful common set of
>>> functionality. The reason I did the bare minimum here is I'm not familiar with
>>> every architecture, and a good NOP needs to be chosen for each one along with a
>>> reasonable way to pass some arguments -- I don't know if I'm the right person
>>> to make that call.
>>>
>>
>> We have been discussing something like that for system mode recently,
>> so it would definitely be useful.
>>
>> IMHO, it's open for anyone to contribute this, the plugins area is not
>> a private garden where only chosen ones can change things. Just be
>> prepared for change requests, and multiple versions before the final
>> one.
>>
>> Same on this one, we'll see if maintainers are ok with the idea.
>>
>>> Glad to hear you're for this idea!
>>> -Rowan
>>
>> Thanks,
>> Pierrick
>