net/net.c | 30 ++++++++++++++++++++---------- net/vhost-vdpa.c | 8 -------- 2 files changed, 20 insertions(+), 18 deletions(-)
Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present") effectively delayed the backend cleanup, allowing the frontend or the guest to access it resources as long as the frontend NIC is still visible to the guest. However it does not clean up the resources until the qemu process is over. This causes an effective leak if the device is deleted with device_del, as there is no way to close the vdpa device. This makes impossible to re-add that device to this or other QEMU instances until the first instance of QEMU is finished. Move the cleanup from qemu_cleanup to the NIC deletion. Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present") Acked-by: Jason Wang <jasowang@redhat.com> Reported-by: Lei Yang <leiyang@redhat.com> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Eugenio Pérez (2): net: parameterize the removing client from nc list net: move backend cleanup to NIC cleanup net/net.c | 30 ++++++++++++++++++++---------- net/vhost-vdpa.c | 8 -------- 2 files changed, 20 insertions(+), 18 deletions(-) -- 2.39.3
On Mon, Jan 29, 2024 at 9:24 PM Eugenio Pérez <eperezma@redhat.com> wrote: > > Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net > structures if peer nic is present") effectively delayed the backend > cleanup, allowing the frontend or the guest to access it resources as > long as the frontend NIC is still visible to the guest. > > However it does not clean up the resources until the qemu process is > over. This causes an effective leak if the device is deleted with > device_del, as there is no way to close the vdpa device. This makes > impossible to re-add that device to this or other QEMU instances until > the first instance of QEMU is finished. > > Move the cleanup from qemu_cleanup to the NIC deletion. > > Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present") > Acked-by: Jason Wang <jasowang@redhat.com> > Reported-by: Lei Yang <leiyang@redhat.com> > Signed-off-by: Eugenio Pérez <eperezma@redhat.com> > > Eugenio Pérez (2): > net: parameterize the removing client from nc list > net: move backend cleanup to NIC cleanup > > net/net.c | 30 ++++++++++++++++++++---------- > net/vhost-vdpa.c | 8 -------- > 2 files changed, 20 insertions(+), 18 deletions(-) > > -- Queued. Thanks
Hi Jason, It seems this series wasn't applied successfully, I still cannot see it from the latest tree. Any idea? In any case the fix LGTM. Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Thanks, -Siwei On 1/31/2024 9:43 PM, Jason Wang wrote: > On Mon, Jan 29, 2024 at 9:24 PM Eugenio Pérez <eperezma@redhat.com> wrote: >> Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net >> structures if peer nic is present") effectively delayed the backend >> cleanup, allowing the frontend or the guest to access it resources as >> long as the frontend NIC is still visible to the guest. >> >> However it does not clean up the resources until the qemu process is >> over. This causes an effective leak if the device is deleted with >> device_del, as there is no way to close the vdpa device. This makes >> impossible to re-add that device to this or other QEMU instances until >> the first instance of QEMU is finished. >> >> Move the cleanup from qemu_cleanup to the NIC deletion. >> >> Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present") >> Acked-by: Jason Wang <jasowang@redhat.com> >> Reported-by: Lei Yang <leiyang@redhat.com> >> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> >> >> Eugenio Pérez (2): >> net: parameterize the removing client from nc list >> net: move backend cleanup to NIC cleanup >> >> net/net.c | 30 ++++++++++++++++++++---------- >> net/vhost-vdpa.c | 8 -------- >> 2 files changed, 20 insertions(+), 18 deletions(-) >> >> -- > Queued. > > Thanks >
On Tue, Sep 10, 2024 at 11:41 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > Hi Jason, > > It seems this series wasn't applied successfully, I still cannot see it > from the latest tree. Any idea? It breaks make check. Eugenio, would you want to fix and resend the series? Thanks > > In any case the fix LGTM. > > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> > > Thanks, > -Siwei > > On 1/31/2024 9:43 PM, Jason Wang wrote: > > On Mon, Jan 29, 2024 at 9:24 PM Eugenio Pérez <eperezma@redhat.com> wrote: > >> Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net > >> structures if peer nic is present") effectively delayed the backend > >> cleanup, allowing the frontend or the guest to access it resources as > >> long as the frontend NIC is still visible to the guest. > >> > >> However it does not clean up the resources until the qemu process is > >> over. This causes an effective leak if the device is deleted with > >> device_del, as there is no way to close the vdpa device. This makes > >> impossible to re-add that device to this or other QEMU instances until > >> the first instance of QEMU is finished. > >> > >> Move the cleanup from qemu_cleanup to the NIC deletion. > >> > >> Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present") > >> Acked-by: Jason Wang <jasowang@redhat.com> > >> Reported-by: Lei Yang <leiyang@redhat.com> > >> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> > >> > >> Eugenio Pérez (2): > >> net: parameterize the removing client from nc list > >> net: move backend cleanup to NIC cleanup > >> > >> net/net.c | 30 ++++++++++++++++++++---------- > >> net/vhost-vdpa.c | 8 -------- > >> 2 files changed, 20 insertions(+), 18 deletions(-) > >> > >> -- > > Queued. > > > > Thanks > > >
On Tue, Sep 10, 2024 at 5:46 AM Jason Wang <jasowang@redhat.com> wrote: > > On Tue, Sep 10, 2024 at 11:41 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > Hi Jason, > > > > It seems this series wasn't applied successfully, I still cannot see it > > from the latest tree. Any idea? > > It breaks make check. > > Eugenio, would you want to fix and resend the series? > I'm trying to reproduce but with no luck :(. For the record this is the failed log. Is it possible to try to reproduce it again in the machine / env it crashed? ▶ 10/354 ERROR:../tests/qtest/qos-test. c:191:subprocess_run_one_test: child process (/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/migrate/subprocess [1494462]) failed unexpectedly ERROR 10/354 qemu:qtest+qtest-x86_64 / qtest-x86_64/qos-test ERROR 14.19s killed by signal 6 SIGABRT >>> PYTHON=/home/devel/git/qemu/build/pyvenv/bin/python3 G_TEST_DBUS_DAEMON=/home/devel/git/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_BINARY=./qemu-system-x86_64 MALLOC_PERTURB_=82 /home/devel/git/qemu/build/tests/qtest/qos-test --tap -k ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― stderr: Vhost user backend fails to broadcast fake RARP ../tests/qtest/libqtest.c:204: kill_qemu() detected QEMU death from signal 11 (Segmentation fault) (core dumped) ../tests/qtest/libqtest.c:204: kill_qemu() detected QEMU death from signal 11 (Segmentation fault) (core dumped) ** ERROR:../tests/qtest/qos-test.c:191:subprocess_run_one_test: child process (/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/migrate/subprocess [1494462]) failed unexpectedly > Thanks > > > > > In any case the fix LGTM. > > > > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> > > > > Thanks, > > -Siwei > > > > On 1/31/2024 9:43 PM, Jason Wang wrote: > > > On Mon, Jan 29, 2024 at 9:24 PM Eugenio Pérez <eperezma@redhat.com> wrote: > > >> Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net > > >> structures if peer nic is present") effectively delayed the backend > > >> cleanup, allowing the frontend or the guest to access it resources as > > >> long as the frontend NIC is still visible to the guest. > > >> > > >> However it does not clean up the resources until the qemu process is > > >> over. This causes an effective leak if the device is deleted with > > >> device_del, as there is no way to close the vdpa device. This makes > > >> impossible to re-add that device to this or other QEMU instances until > > >> the first instance of QEMU is finished. > > >> > > >> Move the cleanup from qemu_cleanup to the NIC deletion. > > >> > > >> Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present") > > >> Acked-by: Jason Wang <jasowang@redhat.com> > > >> Reported-by: Lei Yang <leiyang@redhat.com> > > >> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> > > >> > > >> Eugenio Pérez (2): > > >> net: parameterize the removing client from nc list > > >> net: move backend cleanup to NIC cleanup > > >> > > >> net/net.c | 30 ++++++++++++++++++++---------- > > >> net/vhost-vdpa.c | 8 -------- > > >> 2 files changed, 20 insertions(+), 18 deletions(-) > > >> > > >> -- > > > Queued. > > > > > > Thanks > > > > > >
On Wed, Sep 11, 2024 at 11:04 AM Eugenio Perez Martin <eperezma@redhat.com> wrote: > > On Tue, Sep 10, 2024 at 5:46 AM Jason Wang <jasowang@redhat.com> wrote: > > > > On Tue, Sep 10, 2024 at 11:41 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote: > > > > > > Hi Jason, > > > > > > It seems this series wasn't applied successfully, I still cannot see it > > > from the latest tree. Any idea? > > > > It breaks make check. > > > > Eugenio, would you want to fix and resend the series? > > > > I'm trying to reproduce but with no luck :(. > I'm able to reproduce consistently now. > For the record this is the failed log. Is it possible to try to > reproduce it again in the machine / env it crashed? > > ▶ 10/354 ERROR:../tests/qtest/qos-test. > c:191:subprocess_run_one_test: > child process (/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/migrate/subprocess > [1494462]) failed unexpectedly ERROR > 10/354 qemu:qtest+qtest-x86_64 / qtest-x86_64/qos-test > ERROR 14.19s killed by signal 6 SIGABRT > >>> PYTHON=/home/devel/git/qemu/build/pyvenv/bin/python3 G_TEST_DBUS_DAEMON=/home/devel/git/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon QTEST_QEMU_IMG=./qemu-img QTEST_QEMU_BINARY=./qemu-system-x86_64 MALLOC_PERTURB_=82 /home/devel/git/qemu/build/tests/qtest/qos-test --tap -k > ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― > ✀ ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――― > stderr: > Vhost user backend fails to broadcast fake RARP > ../tests/qtest/libqtest.c:204: kill_qemu() detected QEMU death from > signal 11 (Segmentation fault) (core dumped) > ../tests/qtest/libqtest.c:204: kill_qemu() detected QEMU death from > signal 11 (Segmentation fault) (core dumped) > ** > ERROR:../tests/qtest/qos-test.c:191:subprocess_run_one_test: child > process (/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/migrate/subprocess > [1494462]) failed unexpectedly > > Thanks > > > > > > > > In any case the fix LGTM. > > > > > > Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> > > > > > > Thanks, > > > -Siwei > > > > > > On 1/31/2024 9:43 PM, Jason Wang wrote: > > > > On Mon, Jan 29, 2024 at 9:24 PM Eugenio Pérez <eperezma@redhat.com> wrote: > > > >> Commit a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net > > > >> structures if peer nic is present") effectively delayed the backend > > > >> cleanup, allowing the frontend or the guest to access it resources as > > > >> long as the frontend NIC is still visible to the guest. > > > >> > > > >> However it does not clean up the resources until the qemu process is > > > >> over. This causes an effective leak if the device is deleted with > > > >> device_del, as there is no way to close the vdpa device. This makes > > > >> impossible to re-add that device to this or other QEMU instances until > > > >> the first instance of QEMU is finished. > > > >> > > > >> Move the cleanup from qemu_cleanup to the NIC deletion. > > > >> > > > >> Fixes: a0d7215e33 ("vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present") > > > >> Acked-by: Jason Wang <jasowang@redhat.com> > > > >> Reported-by: Lei Yang <leiyang@redhat.com> > > > >> Signed-off-by: Eugenio Pérez <eperezma@redhat.com> > > > >> > > > >> Eugenio Pérez (2): > > > >> net: parameterize the removing client from nc list > > > >> net: move backend cleanup to NIC cleanup > > > >> > > > >> net/net.c | 30 ++++++++++++++++++++---------- > > > >> net/vhost-vdpa.c | 8 -------- > > > >> 2 files changed, 20 insertions(+), 18 deletions(-) > > > >> > > > >> -- > > > > Queued. > > > > > > > > Thanks > > > > > > > > >
© 2016 - 2024 Red Hat, Inc.