[PATCH 0/2] virtio-balloon: make it spec compliant

Michael S. Tsirkin posted 2 patches 1 year, 7 months ago
There is a newer version of this series
arch/um/drivers/virtio_uml.c           |  4 ++--
drivers/remoteproc/remoteproc_virtio.c |  4 ++--
drivers/s390/virtio/virtio_ccw.c       |  4 ++--
drivers/virtio/virtio_balloon.c        | 19 +++++++++++++++++--
drivers/virtio/virtio_mmio.c           |  4 ++--
drivers/virtio/virtio_pci_common.c     |  8 ++++----
drivers/virtio/virtio_vdpa.c           |  4 ++--
7 files changed, 31 insertions(+), 16 deletions(-)
[PATCH 0/2] virtio-balloon: make it spec compliant
Posted by Michael S. Tsirkin 1 year, 7 months ago
Currently, if VIRTIO_BALLOON_F_FREE_PAGE_HINT is off but
VIRTIO_BALLOON_F_REPORTING is on, then the reporting vq
gets number 3 while spec says it's number 4.
It happens to work because the qemu virtio pci driver
is *also* out of spec.

To fix:
1. add vq4 as per spec
2. to help out the buggy qemu driver, if finding vqs fail,
try with vq3 as reporting.

Fixes: b0c504f15471 ("virtio-balloon: add support for providing free page reports to host")
Cc: "Alexander Duyck" <alexander.h.duyck@linux.intel.com>
Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Michael S. Tsirkin (2):
  virtio_balloon: add work around for out of spec QEMU
  virtio: fix vq # for balloon

 arch/um/drivers/virtio_uml.c           |  4 ++--
 drivers/remoteproc/remoteproc_virtio.c |  4 ++--
 drivers/s390/virtio/virtio_ccw.c       |  4 ++--
 drivers/virtio/virtio_balloon.c        | 19 +++++++++++++++++--
 drivers/virtio/virtio_mmio.c           |  4 ++--
 drivers/virtio/virtio_pci_common.c     |  8 ++++----
 drivers/virtio/virtio_vdpa.c           |  4 ++--
 7 files changed, 31 insertions(+), 16 deletions(-)

-- 
MST
Re: [PATCH 0/2] virtio-balloon: make it spec compliant
Posted by David Hildenbrand 1 year, 7 months ago
On 05.07.24 12:08, Michael S. Tsirkin wrote:
> Currently, if VIRTIO_BALLOON_F_FREE_PAGE_HINT is off but
> VIRTIO_BALLOON_F_REPORTING is on, then the reporting vq
> gets number 3 while spec says it's number 4.
> It happens to work because the qemu virtio pci driver
> is *also* out of spec.

I have to ask the obvious: maybe the spec is wrong and we have to refine 
that?

-- 
Cheers,

David / dhildenb
Re: [PATCH 0/2] virtio-balloon: make it spec compliant
Posted by Michael S. Tsirkin 1 year, 7 months ago
On Fri, Jul 05, 2024 at 12:15:30PM +0200, David Hildenbrand wrote:
> On 05.07.24 12:08, Michael S. Tsirkin wrote:
> > Currently, if VIRTIO_BALLOON_F_FREE_PAGE_HINT is off but
> > VIRTIO_BALLOON_F_REPORTING is on, then the reporting vq
> > gets number 3 while spec says it's number 4.
> > It happens to work because the qemu virtio pci driver
> > is *also* out of spec.
> 
> I have to ask the obvious: maybe the spec is wrong and we have to refine
> that?

Well having vq function shift depending on features is certainly
messy ...
How do we know no one implemented the spec as written though?

-- 
MST
Re: [PATCH 0/2] virtio-balloon: make it spec compliant
Posted by David Hildenbrand 1 year, 7 months ago
On 05.07.24 12:19, Michael S. Tsirkin wrote:
> On Fri, Jul 05, 2024 at 12:15:30PM +0200, David Hildenbrand wrote:
>> On 05.07.24 12:08, Michael S. Tsirkin wrote:
>>> Currently, if VIRTIO_BALLOON_F_FREE_PAGE_HINT is off but
>>> VIRTIO_BALLOON_F_REPORTING is on, then the reporting vq
>>> gets number 3 while spec says it's number 4.
>>> It happens to work because the qemu virtio pci driver
>>> is *also* out of spec.
>>
>> I have to ask the obvious: maybe the spec is wrong and we have to refine
>> that?
> 
> Well having vq function shift depending on features is certainly
> messy ...

Right, but that's how all of this started from the beginning.

> How do we know no one implemented the spec as written though?

I understand that concern, IIUC it would imply that:

a) In case of a hypervisor, we never ran with a Linux guest
b) In case of a guest, we never ran under QEMU

It's certainly possible, although I would assume that most other 
implementation candidates (e.g., cloud-hypervisor) would have complained 
by now about Linux issues.

What's your experience: if someone would actually implement it according 
to the spec, would they watch out on the virtio mailing lists for 
changes (or even be able to vote) and would be able to comment that 
adjusting the spec to the real first implementation is wrong?

-- 
Cheers,

David / dhildenb
Re: [PATCH 0/2] virtio-balloon: make it spec compliant
Posted by Michael S. Tsirkin 1 year, 7 months ago
On Fri, Jul 05, 2024 at 01:00:50PM +0200, David Hildenbrand wrote:
> On 05.07.24 12:19, Michael S. Tsirkin wrote:
> > On Fri, Jul 05, 2024 at 12:15:30PM +0200, David Hildenbrand wrote:
> > > On 05.07.24 12:08, Michael S. Tsirkin wrote:
> > > > Currently, if VIRTIO_BALLOON_F_FREE_PAGE_HINT is off but
> > > > VIRTIO_BALLOON_F_REPORTING is on, then the reporting vq
> > > > gets number 3 while spec says it's number 4.
> > > > It happens to work because the qemu virtio pci driver
> > > > is *also* out of spec.
> > > 
> > > I have to ask the obvious: maybe the spec is wrong and we have to refine
> > > that?
> > 
> > Well having vq function shift depending on features is certainly
> > messy ...
> 
> Right, but that's how all of this started from the beginning.
> 
> > How do we know no one implemented the spec as written though?
> 
> I understand that concern, IIUC it would imply that:
> 
> a) In case of a hypervisor, we never ran with a Linux guest
> b) In case of a guest, we never ran under QEMU

Or maybe VIRTIO_BALLOON_F_FREE_PAGE_HINT is set.



> It's certainly possible, although I would assume that most other
> implementation candidates (e.g., cloud-hypervisor) would have complained by
> now about Linux issues.

They either set VIRTIO_BALLOON_F_FREE_PAGE_HINT or followed linux bug to
work around.





> What's your experience: if someone would actually implement it according to
> the spec, would they watch out on the virtio mailing lists for changes (or
> even be able to vote) and would be able to comment that adjusting the spec
> to the real first implementation is wrong?

Unfortunately my experience is that it's not that likely :(


Whatever we do, we need to take existing setups into account.

How would we do it in the spec without breaking working setups?  I guess
we could say that both behaviours are legal.  That would still mean we
need the qemu and linux patches, right?

-- 
MST
Re: [PATCH 0/2] virtio-balloon: make it spec compliant
Posted by David Hildenbrand 1 year, 7 months ago
Sorry for the late reply!

>> I understand that concern, IIUC it would imply that:
>>
>> a) In case of a hypervisor, we never ran with a Linux guest
>> b) In case of a guest, we never ran under QEMU
> 
> Or maybe VIRTIO_BALLOON_F_FREE_PAGE_HINT is set.

Right, in which case it would be according to the spec.

>> It's certainly possible, although I would assume that most other
>> implementation candidates (e.g., cloud-hypervisor) would have complained by
>> now about Linux issues.
> 
> They either set VIRTIO_BALLOON_F_FREE_PAGE_HINT or followed linux bug to
> work around.

Okay, in the latter case it would be the unofficial way of doing it.

>> What's your experience: if someone would actually implement it according to
>> the spec, would they watch out on the virtio mailing lists for changes (or
>> even be able to vote) and would be able to comment that adjusting the spec
>> to the real first implementation is wrong?
> 
> Unfortunately my experience is that it's not that likely :(

That's what I thought, unfortunately.

> 
> 
> Whatever we do, we need to take existing setups into account.
> 
> How would we do it in the spec without breaking working setups?  I guess
> we could say that both behaviours are legal.  That would still mean we
> need the qemu and linux patches, right?

That makes sense to me. Let me take a look at the patches.

-- 
Cheers,

David / dhildenb