[PATCH v6 02/39] system/memory: Restrict eventfd dispatch_write() to emulators

Philippe Mathieu-Daudé posted 39 patches 5 months, 2 weeks ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Cameron Esfahani <dirty@apple.com>, Roman Bolshakov <rbolshakov@ddn.com>, Phil Dennis-Jordan <phil@philjordan.eu>, Mads Ynddal <mads@ynddal.dk>, Fabiano Rosas <farosas@suse.de>, Laurent Vivier <lvivier@redhat.com>, Stefano Stabellini <sstabellini@kernel.org>, Anthony PERARD <anthony@xenproject.org>, Paul Durrant <paul@xen.org>, "Edgar E. Iglesias" <edgar.iglesias@gmail.com>, Warner Losh <imp@bsdimp.com>, Kyle Evans <kevans@freebsd.org>, "Alex Bennée" <alex.bennee@linaro.org>, "Dr. David Alan Gilbert" <dave@treblig.org>, Eduardo Habkost <eduardo@habkost.net>, Marcel Apfelbaum <marcel.apfelbaum@gmail.com>, Yanan Wang <wangyanan55@huawei.com>, Zhao Liu <zhao1.liu@intel.com>, Reinoud Zandijk <reinoud@netbsd.org>, Sunil Muthuswamy <sunilmut@microsoft.com>, Eric Blake <eblake@redhat.com>, Markus Armbruster <armbru@redhat.com>, Peter Xu <peterx@redhat.com>, David Hildenbrand <david@redhat.com>, Alexander Graf <agraf@csgraf.de>, Peter Maydell <peter.maydell@linaro.org>
[PATCH v6 02/39] system/memory: Restrict eventfd dispatch_write() to emulators
Posted by Philippe Mathieu-Daudé 5 months, 2 weeks ago
Commit 8c56c1a592b ("memory: emulate ioeventfd") added a !KVM
check because the only accelerator available back then were TCG,
QTest and KVM. Then commit 126e7f78036 ("kvm: require
KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTH") suggested
'!KVM' check should be '(TCG || QTest)'. Later more accelerator
were added. Implement the suggestion as a safety measure, not
dispatching to eventfd when hardware accelerator is used.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 system/memory.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/system/memory.c b/system/memory.c
index 76b44b8220f..4f713889a8e 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -25,7 +25,7 @@
 #include "qom/object.h"
 #include "trace.h"
 #include "system/ram_addr.h"
-#include "system/kvm.h"
+#include "system/qtest.h"
 #include "system/runstate.h"
 #include "system/tcg.h"
 #include "qemu/accel.h"
@@ -1530,12 +1530,7 @@ MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
 
     adjust_endianness(mr, &data, op);
 
-    /*
-     * FIXME: it's not clear why under KVM the write would be processed
-     * directly, instead of going through eventfd.  This probably should
-     * test "tcg_enabled() || qtest_enabled()", or should just go away.
-     */
-    if (!kvm_enabled() &&
+    if ((tcg_enabled() || qtest_enabled()) &&
         memory_region_dispatch_write_eventfds(mr, addr, data, size, attrs)) {
         return MEMTX_OK;
     }
-- 
2.49.0


Re: [PATCH v6 02/39] system/memory: Restrict eventfd dispatch_write() to emulators
Posted by Xiaoyao Li 5 months, 2 weeks ago
On 7/4/2025 1:32 AM, Philippe Mathieu-Daudé wrote:
> Commit 8c56c1a592b ("memory: emulate ioeventfd") added a !KVM
> check because the only accelerator available back then were TCG,
> QTest and KVM. Then commit 126e7f78036 ("kvm: require
> KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTH") suggested
> '!KVM' check should be '(TCG || QTest)'. Later more accelerator
> were added. Implement the suggestion as a safety measure, not
> dispatching to eventfd when hardware accelerator is used.
> 
> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   system/memory.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/system/memory.c b/system/memory.c
> index 76b44b8220f..4f713889a8e 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -25,7 +25,7 @@
>   #include "qom/object.h"
>   #include "trace.h"
>   #include "system/ram_addr.h"
> -#include "system/kvm.h"
> +#include "system/qtest.h"
>   #include "system/runstate.h"
>   #include "system/tcg.h"
>   #include "qemu/accel.h"
> @@ -1530,12 +1530,7 @@ MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
>   
>       adjust_endianness(mr, &data, op);
>   
> -    /*
> -     * FIXME: it's not clear why under KVM the write would be processed
> -     * directly, instead of going through eventfd.  This probably should
> -     * test "tcg_enabled() || qtest_enabled()", or should just go away.
> -     */
> -    if (!kvm_enabled() &&
> +    if ((tcg_enabled() || qtest_enabled()) &&

The FIXME provides two options:
1. change to "tcg_enabled() || qtest_enabled()"
2. remove !kvm_enabled()

And as the FIXME said, it's not clear why under KVM the write would be 
processed directly. Now, the question becomes why under hardware 
accelerator is used the write would be processed directly instead of 
going through eventfd. I think it needs to answer this question when we 
do such change, and it's better to put the answer as the comment in the 
code.

>           memory_region_dispatch_write_eventfds(mr, addr, data, size, attrs)) {
>           return MEMTX_OK;
>       }


Re: [PATCH v6 02/39] system/memory: Restrict eventfd dispatch_write() to emulators
Posted by Alex Bennée 5 months, 2 weeks ago
Xiaoyao Li <xiaoyao.li@intel.com> writes:

> On 7/4/2025 1:32 AM, Philippe Mathieu-Daudé wrote:
>> Commit 8c56c1a592b ("memory: emulate ioeventfd") added a !KVM
>> check because the only accelerator available back then were TCG,
>> QTest and KVM. Then commit 126e7f78036 ("kvm: require
>> KVM_CAP_IOEVENTFD and KVM_CAP_IOEVENTFD_ANY_LENGTH") suggested
>> '!KVM' check should be '(TCG || QTest)'. Later more accelerator
>> were added. Implement the suggestion as a safety measure, not
>> dispatching to eventfd when hardware accelerator is used.
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
>> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   system/memory.c | 9 ++-------
>>   1 file changed, 2 insertions(+), 7 deletions(-)
>> diff --git a/system/memory.c b/system/memory.c
>> index 76b44b8220f..4f713889a8e 100644
>> --- a/system/memory.c
>> +++ b/system/memory.c
>> @@ -25,7 +25,7 @@
>>   #include "qom/object.h"
>>   #include "trace.h"
>>   #include "system/ram_addr.h"
>> -#include "system/kvm.h"
>> +#include "system/qtest.h"
>>   #include "system/runstate.h"
>>   #include "system/tcg.h"
>>   #include "qemu/accel.h"
>> @@ -1530,12 +1530,7 @@ MemTxResult memory_region_dispatch_write(MemoryRegion *mr,
>>         adjust_endianness(mr, &data, op);
>>   -    /*
>> -     * FIXME: it's not clear why under KVM the write would be processed
>> -     * directly, instead of going through eventfd.  This probably should
>> -     * test "tcg_enabled() || qtest_enabled()", or should just go away.
>> -     */
>> -    if (!kvm_enabled() &&
>> +    if ((tcg_enabled() || qtest_enabled()) &&
>
> The FIXME provides two options:
> 1. change to "tcg_enabled() || qtest_enabled()"
> 2. remove !kvm_enabled()
>
> And as the FIXME said, it's not clear why under KVM the write would be
> processed directly. Now, the question becomes why under hardware
> accelerator is used the write would be processed directly instead of
> going through eventfd. I think it needs to answer this question when
> we do such change, and it's better to put the answer as the comment in
> the code.

Under KVM the eventfd notifications are sent directly from the kernel to
the relevant fd. There is no reason why under KVM you couldn't inject
the eventfds from QEMU but it would be a weird and sub-optimal setup.
KVM is perfectly capable of trapping the MMIO accesses in kernel.

I don't think eventfd's can be supported for HVF because I don't think
it has such a concept. For vhost-user devices they would then rely on
VHOST_USER_VRING_KICK over the socket instead.

>
>>           memory_region_dispatch_write_eventfds(mr, addr, data, size, attrs)) {
>>           return MEMTX_OK;
>>       }

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro