[libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources

Zhenyu Wang posted 4 patches 5 years, 9 months ago
Failed in applying to current master (apply log)
Documentation/vfio-mediated-device.txt | 39 +++++++++++++++---
drivers/gpu/drm/i915/gvt/gvt.c         | 26 +++++++++---
drivers/gpu/drm/i915/gvt/gvt.h         | 14 ++++---
drivers/gpu/drm/i915/gvt/kvmgt.c       | 30 +++++++++++---
drivers/gpu/drm/i915/gvt/vgpu.c        | 56 ++++++++++++++++++++++----
drivers/vfio/mdev/mdev_core.c          | 19 +++++++--
drivers/vfio/mdev/mdev_private.h       |  6 ++-
drivers/vfio/mdev/mdev_sysfs.c         | 42 ++++++++++++++++---
include/linux/mdev.h                   | 10 +++++
9 files changed, 203 insertions(+), 39 deletions(-)
[libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Zhenyu Wang 5 years, 9 months ago
Current mdev device create interface depends on fixed mdev type, which get uuid
from user to create instance of mdev device. If user wants to use customized
number of resource for mdev device, then only can create new mdev type for that
which may not be flexible. This requirement comes not only from to be able to
allocate flexible resources for KVMGT, but also from Intel scalable IO
virtualization which would use vfio/mdev to be able to allocate arbitrary
resources on mdev instance. More info on [1] [2] [3].

To allow to create user defined resources for mdev, it trys to extend mdev
create interface by adding new "instances=xxx" parameter following uuid, for
target mdev type if aggregation is supported, it can create new mdev device
which contains resources combined by number of instances, e.g

    echo "<uuid>,instances=10" > create

VM manager e.g libvirt can check mdev type with "aggregation" attribute which
can support this setting. If no "aggregation" attribute found for mdev type,
previous behavior is still kept for one instance allocation. And new sysfs
attribute "instances" is created for each mdev device to show allocated number.

This trys to create new KVMGT type with minimal vGPU resources which can be
combined with "instances=x" setting to allocate for user wanted resources.

References:
[1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf  

v2:
  - Add new create_with_instances driver hook
  - Update doc for new attributes

Zhenyu Wang (4):
  vfio/mdev: Add new instances parameter for mdev create
  vfio/mdev: Add mdev device instances attribute
  drm/i915/gvt: Add new aggregation type support
  Documentation/vfio-mediated-device.txt: update for aggregation
    attribute

 Documentation/vfio-mediated-device.txt | 39 +++++++++++++++---
 drivers/gpu/drm/i915/gvt/gvt.c         | 26 +++++++++---
 drivers/gpu/drm/i915/gvt/gvt.h         | 14 ++++---
 drivers/gpu/drm/i915/gvt/kvmgt.c       | 30 +++++++++++---
 drivers/gpu/drm/i915/gvt/vgpu.c        | 56 ++++++++++++++++++++++----
 drivers/vfio/mdev/mdev_core.c          | 19 +++++++--
 drivers/vfio/mdev/mdev_private.h       |  6 ++-
 drivers/vfio/mdev/mdev_sysfs.c         | 42 ++++++++++++++++---
 include/linux/mdev.h                   | 10 +++++
 9 files changed, 203 insertions(+), 39 deletions(-)

-- 
2.18.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Alex Williamson 5 years, 8 months ago
On Fri, 20 Jul 2018 10:19:24 +0800
Zhenyu Wang <zhenyuw@linux.intel.com> wrote:

> Current mdev device create interface depends on fixed mdev type, which get uuid
> from user to create instance of mdev device. If user wants to use customized
> number of resource for mdev device, then only can create new mdev type for that
> which may not be flexible. This requirement comes not only from to be able to
> allocate flexible resources for KVMGT, but also from Intel scalable IO
> virtualization which would use vfio/mdev to be able to allocate arbitrary
> resources on mdev instance. More info on [1] [2] [3].
> 
> To allow to create user defined resources for mdev, it trys to extend mdev
> create interface by adding new "instances=xxx" parameter following uuid, for
> target mdev type if aggregation is supported, it can create new mdev device
> which contains resources combined by number of instances, e.g
> 
>     echo "<uuid>,instances=10" > create
> 
> VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> can support this setting. If no "aggregation" attribute found for mdev type,
> previous behavior is still kept for one instance allocation. And new sysfs
> attribute "instances" is created for each mdev device to show allocated number.
> 
> This trys to create new KVMGT type with minimal vGPU resources which can be
> combined with "instances=x" setting to allocate for user wanted resources.

"instances" makes me think this is arg helps to create multiple mdev
instances rather than consuming multiple instances for a single mdev.
You're already exposing the "aggregation" attribute, so doesn't
"aggregate" perhaps make more sense as the create option?  We're asking
the driver to aggregate $NUM instances into a single mdev.  The mdev
attribute should then perhaps also be "aggregated_instances".

The next user question for the interface might be what aspect of the
device gets multiplied by this aggregation?  In i915 I see you're
multiplying the memory sizes by the instance, but clearly the
resolution doesn't change.  I assume this is sort of like mdev types
themselves, ie. some degree of what a type means is buried in the
implementation and some degree of what some number of those types
aggregated together means is impossible to describe generically.

We're also going to need to add aggregation to the checklist for device
compatibility for migration, for example 1) is it the same mdev_type,
1a) are the aggregation counts the same (new), 2) is the host driver
compatible (TBD).

The count handling in create_store(), particularly MDEV_CREATE_OPT_LEN
looks a little suspicious.  I think we should just be validating that
the string before the ',' or the entire string if there is no comma is
UUID length.  Pass only that to uuid_le_to_bin().  We can then strncmp
as you have for "instances=" (or "aggregate=") but then let's be sure
to end the string we pass to kstrtouint(), ie. assume there could be
further args.

Documentation/ABI/testing/sysfs-bus-vfio-mdev also needs to be updated
with this series.

I'm curious what libvirt folks and Kirti think of this, it looks like
it has a nice degree of backwards compatibility, both in the sysfs
interface and the vendor driver interface.  Thanks,

Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Erik Skultety 5 years, 8 months ago
On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote:
> On Fri, 20 Jul 2018 10:19:24 +0800
> Zhenyu Wang <zhenyuw@linux.intel.com> wrote:
>
> > Current mdev device create interface depends on fixed mdev type, which get uuid
> > from user to create instance of mdev device. If user wants to use customized
> > number of resource for mdev device, then only can create new mdev type for that
> > which may not be flexible. This requirement comes not only from to be able to
> > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > resources on mdev instance. More info on [1] [2] [3].
> >
> > To allow to create user defined resources for mdev, it trys to extend mdev
> > create interface by adding new "instances=xxx" parameter following uuid, for
> > target mdev type if aggregation is supported, it can create new mdev device
> > which contains resources combined by number of instances, e.g
> >
> >     echo "<uuid>,instances=10" > create
> >
> > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > can support this setting. If no "aggregation" attribute found for mdev type,
> > previous behavior is still kept for one instance allocation. And new sysfs
> > attribute "instances" is created for each mdev device to show allocated number.
> >
> > This trys to create new KVMGT type with minimal vGPU resources which can be
> > combined with "instances=x" setting to allocate for user wanted resources.
>
> "instances" makes me think this is arg helps to create multiple mdev
> instances rather than consuming multiple instances for a single mdev.
> You're already exposing the "aggregation" attribute, so doesn't
> "aggregate" perhaps make more sense as the create option?  We're asking
> the driver to aggregate $NUM instances into a single mdev.  The mdev
> attribute should then perhaps also be "aggregated_instances".
>
> The next user question for the interface might be what aspect of the
> device gets multiplied by this aggregation?  In i915 I see you're
> multiplying the memory sizes by the instance, but clearly the
> resolution doesn't change.  I assume this is sort of like mdev types
> themselves, ie. some degree of what a type means is buried in the
> implementation and some degree of what some number of those types
> aggregated together means is impossible to describe generically.

I don't seem to clearly see the benefit here, so I have to ask, how is this
better and/or different from allowing a heterogeneous setup if one needs a more
performant instance in terms of more resources? Because to me, once you're able
to aggregate instances, I would assume that a simple "echo `uuid`" with a
different type should succeed as well and provide me (from user's perspective)
with the same results. Could you please clarify this to me, as well as what
resources/parameters are going to be impacted by aggregation?

...

>
> I'm curious what libvirt folks and Kirti think of this, it looks like
> it has a nice degree of backwards compatibility, both in the sysfs
> interface and the vendor driver interface.  Thanks,

Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an
issue for us at the moment. I see this adds new optional sysfs attributes which
we could expose within our device capabilities XML, provided it doesn't use a
free form text, like the description attribute does.

Erik

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Alex Williamson 5 years, 8 months ago
On Thu, 26 Jul 2018 15:50:28 +0200
Erik Skultety <eskultet@redhat.com> wrote:

> On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote:
> > On Fri, 20 Jul 2018 10:19:24 +0800
> > Zhenyu Wang <zhenyuw@linux.intel.com> wrote:
> >  
> > > Current mdev device create interface depends on fixed mdev type, which get uuid
> > > from user to create instance of mdev device. If user wants to use customized
> > > number of resource for mdev device, then only can create new mdev type for that
> > > which may not be flexible. This requirement comes not only from to be able to
> > > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > > resources on mdev instance. More info on [1] [2] [3].
> > >
> > > To allow to create user defined resources for mdev, it trys to extend mdev
> > > create interface by adding new "instances=xxx" parameter following uuid, for
> > > target mdev type if aggregation is supported, it can create new mdev device
> > > which contains resources combined by number of instances, e.g
> > >
> > >     echo "<uuid>,instances=10" > create
> > >
> > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > > can support this setting. If no "aggregation" attribute found for mdev type,
> > > previous behavior is still kept for one instance allocation. And new sysfs
> > > attribute "instances" is created for each mdev device to show allocated number.
> > >
> > > This trys to create new KVMGT type with minimal vGPU resources which can be
> > > combined with "instances=x" setting to allocate for user wanted resources.  
> >
> > "instances" makes me think this is arg helps to create multiple mdev
> > instances rather than consuming multiple instances for a single mdev.
> > You're already exposing the "aggregation" attribute, so doesn't
> > "aggregate" perhaps make more sense as the create option?  We're asking
> > the driver to aggregate $NUM instances into a single mdev.  The mdev
> > attribute should then perhaps also be "aggregated_instances".
> >
> > The next user question for the interface might be what aspect of the
> > device gets multiplied by this aggregation?  In i915 I see you're
> > multiplying the memory sizes by the instance, but clearly the
> > resolution doesn't change.  I assume this is sort of like mdev types
> > themselves, ie. some degree of what a type means is buried in the
> > implementation and some degree of what some number of those types
> > aggregated together means is impossible to describe generically.  
> 
> I don't seem to clearly see the benefit here, so I have to ask, how is this
> better and/or different from allowing a heterogeneous setup if one needs a more
> performant instance in terms of more resources? Because to me, once you're able
> to aggregate instances, I would assume that a simple "echo `uuid`" with a
> different type should succeed as well and provide me (from user's perspective)
> with the same results. Could you please clarify this to me, as well as what
> resources/parameters are going to be impacted by aggregation?

I think you're suggesting that we could simply define new mdev types to
account for these higher aggregate instances, for example we can
define discrete types that are 2x, 3x, 4x, etc. the resource count of a
single instance.  What I think we're trying to address with this
proposal is what happens when the resources available are exceptionally
large and they can be combined in arbitrary ways.  For example if a
parent device can expose 10,000 resources and the granularity with
which we can create and mdev instance is 1, is it practical to create
10,000 mdev types or does it make more sense to expose a granularity
and method of aggregation.

Using graphics here perhaps falls a little short of the intention of
the interface because the possible types are easily enumerable and it
would be entirely practical to create discrete types for each.  vGPUs
also have a lot of variables, so defining which attribute of the device
is multiplied by the number of instances is a little more fuzzy.
Thanks,

Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Erik Skultety 5 years, 8 months ago
On Thu, Jul 26, 2018 at 08:29:45AM -0600, Alex Williamson wrote:
> On Thu, 26 Jul 2018 15:50:28 +0200
> Erik Skultety <eskultet@redhat.com> wrote:
>
> > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote:
> > > On Fri, 20 Jul 2018 10:19:24 +0800
> > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote:
> > >
> > > > Current mdev device create interface depends on fixed mdev type, which get uuid
> > > > from user to create instance of mdev device. If user wants to use customized
> > > > number of resource for mdev device, then only can create new mdev type for that
> > > > which may not be flexible. This requirement comes not only from to be able to
> > > > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > > > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > > > resources on mdev instance. More info on [1] [2] [3].
> > > >
> > > > To allow to create user defined resources for mdev, it trys to extend mdev
> > > > create interface by adding new "instances=xxx" parameter following uuid, for
> > > > target mdev type if aggregation is supported, it can create new mdev device
> > > > which contains resources combined by number of instances, e.g
> > > >
> > > >     echo "<uuid>,instances=10" > create
> > > >
> > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > > > can support this setting. If no "aggregation" attribute found for mdev type,
> > > > previous behavior is still kept for one instance allocation. And new sysfs
> > > > attribute "instances" is created for each mdev device to show allocated number.
> > > >
> > > > This trys to create new KVMGT type with minimal vGPU resources which can be
> > > > combined with "instances=x" setting to allocate for user wanted resources.
> > >
> > > "instances" makes me think this is arg helps to create multiple mdev
> > > instances rather than consuming multiple instances for a single mdev.
> > > You're already exposing the "aggregation" attribute, so doesn't
> > > "aggregate" perhaps make more sense as the create option?  We're asking
> > > the driver to aggregate $NUM instances into a single mdev.  The mdev
> > > attribute should then perhaps also be "aggregated_instances".
> > >
> > > The next user question for the interface might be what aspect of the
> > > device gets multiplied by this aggregation?  In i915 I see you're
> > > multiplying the memory sizes by the instance, but clearly the
> > > resolution doesn't change.  I assume this is sort of like mdev types
> > > themselves, ie. some degree of what a type means is buried in the
> > > implementation and some degree of what some number of those types
> > > aggregated together means is impossible to describe generically.
> >
> > I don't seem to clearly see the benefit here, so I have to ask, how is this
> > better and/or different from allowing a heterogeneous setup if one needs a more
> > performant instance in terms of more resources? Because to me, once you're able
> > to aggregate instances, I would assume that a simple "echo `uuid`" with a
> > different type should succeed as well and provide me (from user's perspective)
> > with the same results. Could you please clarify this to me, as well as what
> > resources/parameters are going to be impacted by aggregation?
>
> I think you're suggesting that we could simply define new mdev types to
> account for these higher aggregate instances, for example we can
> define discrete types that are 2x, 3x, 4x, etc. the resource count of a
> single instance.  What I think we're trying to address with this
> proposal is what happens when the resources available are exceptionally
> large and they can be combined in arbitrary ways.  For example if a
> parent device can expose 10,000 resources and the granularity with
> which we can create and mdev instance is 1, is it practical to create
> 10,000 mdev types or does it make more sense to expose a granularity
> and method of aggregation.

Okay, I got a much better picture now, thanks for the clarification.
The granularity you mentioned definitely does give users more power and control
(in terms of provisioning) over the devices.

Erik

>
> Using graphics here perhaps falls a little short of the intention of
> the interface because the possible types are easily enumerable and it
> would be entirely practical to create discrete types for each.  vGPUs
> also have a lot of variables, so defining which attribute of the device
> is multiplied by the number of instances is a little more fuzzy.
> Thanks,
>
> Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Cornelia Huck 5 years, 8 months ago
On Thu, 26 Jul 2018 15:50:28 +0200
Erik Skultety <eskultet@redhat.com> wrote:

> On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote:
> > On Fri, 20 Jul 2018 10:19:24 +0800
> > Zhenyu Wang <zhenyuw@linux.intel.com> wrote:
> >  
> > > Current mdev device create interface depends on fixed mdev type, which get uuid
> > > from user to create instance of mdev device. If user wants to use customized
> > > number of resource for mdev device, then only can create new mdev type for that
> > > which may not be flexible. This requirement comes not only from to be able to
> > > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > > resources on mdev instance. More info on [1] [2] [3].
> > >
> > > To allow to create user defined resources for mdev, it trys to extend mdev
> > > create interface by adding new "instances=xxx" parameter following uuid, for
> > > target mdev type if aggregation is supported, it can create new mdev device
> > > which contains resources combined by number of instances, e.g
> > >
> > >     echo "<uuid>,instances=10" > create
> > >
> > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > > can support this setting. If no "aggregation" attribute found for mdev type,
> > > previous behavior is still kept for one instance allocation. And new sysfs
> > > attribute "instances" is created for each mdev device to show allocated number.
> > >
> > > This trys to create new KVMGT type with minimal vGPU resources which can be
> > > combined with "instances=x" setting to allocate for user wanted resources.  
> >
> > "instances" makes me think this is arg helps to create multiple mdev
> > instances rather than consuming multiple instances for a single mdev.
> > You're already exposing the "aggregation" attribute, so doesn't
> > "aggregate" perhaps make more sense as the create option?  We're asking
> > the driver to aggregate $NUM instances into a single mdev.  The mdev
> > attribute should then perhaps also be "aggregated_instances".

I agree, that seems like a better naming scheme.

(...)

> > I'm curious what libvirt folks and Kirti think of this, it looks like
> > it has a nice degree of backwards compatibility, both in the sysfs
> > interface and the vendor driver interface.  Thanks,  
> 
> Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an
> issue for us at the moment. I see this adds new optional sysfs attributes which
> we could expose within our device capabilities XML, provided it doesn't use a
> free form text, like the description attribute does.

One thing I noticed is that we have seem to have an optional (?)
vendor-driver created "aggregation" attribute (which always prints
"true" in the Intel driver). Would it be better or worse for libvirt if
it contained some kind of upper boundary or so? Additionally, would it
be easier if the "create" attribute always accepted
",instances=1" (calling the .create ops in that case?)

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Erik Skultety 5 years, 8 months ago
On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote:
> On Thu, 26 Jul 2018 15:50:28 +0200
> Erik Skultety <eskultet@redhat.com> wrote:
>
> > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote:
> > > On Fri, 20 Jul 2018 10:19:24 +0800
> > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote:
> > >
> > > > Current mdev device create interface depends on fixed mdev type, which get uuid
> > > > from user to create instance of mdev device. If user wants to use customized
> > > > number of resource for mdev device, then only can create new mdev type for that
> > > > which may not be flexible. This requirement comes not only from to be able to
> > > > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > > > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > > > resources on mdev instance. More info on [1] [2] [3].
> > > >
> > > > To allow to create user defined resources for mdev, it trys to extend mdev
> > > > create interface by adding new "instances=xxx" parameter following uuid, for
> > > > target mdev type if aggregation is supported, it can create new mdev device
> > > > which contains resources combined by number of instances, e.g
> > > >
> > > >     echo "<uuid>,instances=10" > create
> > > >
> > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > > > can support this setting. If no "aggregation" attribute found for mdev type,
> > > > previous behavior is still kept for one instance allocation. And new sysfs
> > > > attribute "instances" is created for each mdev device to show allocated number.
> > > >
> > > > This trys to create new KVMGT type with minimal vGPU resources which can be
> > > > combined with "instances=x" setting to allocate for user wanted resources.
> > >
> > > "instances" makes me think this is arg helps to create multiple mdev
> > > instances rather than consuming multiple instances for a single mdev.
> > > You're already exposing the "aggregation" attribute, so doesn't
> > > "aggregate" perhaps make more sense as the create option?  We're asking
> > > the driver to aggregate $NUM instances into a single mdev.  The mdev
> > > attribute should then perhaps also be "aggregated_instances".
>
> I agree, that seems like a better naming scheme.
>
> (...)
>
> > > I'm curious what libvirt folks and Kirti think of this, it looks like
> > > it has a nice degree of backwards compatibility, both in the sysfs
> > > interface and the vendor driver interface.  Thanks,
> >
> > Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an
> > issue for us at the moment. I see this adds new optional sysfs attributes which
> > we could expose within our device capabilities XML, provided it doesn't use a
> > free form text, like the description attribute does.
>
> One thing I noticed is that we have seem to have an optional (?)
> vendor-driver created "aggregation" attribute (which always prints
> "true" in the Intel driver). Would it be better or worse for libvirt if
> it contained some kind of upper boundary or so? Additionally, would it

Can you be more specific? Although, I wouldn't argue against data that conveys
some information, since I would treat the mere presence of the optional
attribute as a supported feature that we can expose. Therefore, additional
*structured* data which sets clear limits to a certain feature is only a plus
that we can expose to the users/management layer.

> be easier if the "create" attribute always accepted
> ",instances=1" (calling the .create ops in that case?)

Is ^this libvirt related question? If so, then by accepting such syntax you'll
definitely save us a few lines of code ;). However, until we have a clear
vision on how to tackle the mdev migration, I'd like to avoid proposing a
libvirt "create" API until then.

Erik

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Cornelia Huck 5 years, 8 months ago
On Thu, 26 Jul 2018 17:43:45 +0200
Erik Skultety <eskultet@redhat.com> wrote:

> On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote:
> > One thing I noticed is that we have seem to have an optional (?)
> > vendor-driver created "aggregation" attribute (which always prints
> > "true" in the Intel driver). Would it be better or worse for libvirt if
> > it contained some kind of upper boundary or so? Additionally, would it  
> 
> Can you be more specific? Although, I wouldn't argue against data that conveys
> some information, since I would treat the mere presence of the optional
> attribute as a supported feature that we can expose. Therefore, additional
> *structured* data which sets clear limits to a certain feature is only a plus
> that we can expose to the users/management layer.

My question is what would be easiest for libvirt:

- "aggregation" attribute only present when driver supports aggregation
  (possibly containing max number of resources to be aggregated)
- "aggregation" attribute always present; contains '1' if driver does
  not support aggregation and 'm' if driver can aggregate 'm' resources

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Erik Skultety 5 years, 8 months ago
On Thu, Jul 26, 2018 at 06:04:10PM +0200, Cornelia Huck wrote:
> On Thu, 26 Jul 2018 17:43:45 +0200
> Erik Skultety <eskultet@redhat.com> wrote:
>
> > On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote:
> > > One thing I noticed is that we have seem to have an optional (?)
> > > vendor-driver created "aggregation" attribute (which always prints
> > > "true" in the Intel driver). Would it be better or worse for libvirt if
> > > it contained some kind of upper boundary or so? Additionally, would it
> >
> > Can you be more specific? Although, I wouldn't argue against data that conveys
> > some information, since I would treat the mere presence of the optional
> > attribute as a supported feature that we can expose. Therefore, additional
> > *structured* data which sets clear limits to a certain feature is only a plus
> > that we can expose to the users/management layer.
>
> My question is what would be easiest for libvirt:
>
> - "aggregation" attribute only present when driver supports aggregation
>   (possibly containing max number of resources to be aggregated)
> - "aggregation" attribute always present; contains '1' if driver does
>   not support aggregation and 'm' if driver can aggregate 'm' resources

Both are fine from libvirt's POV, but IMHO the former makes a bit more sense
and I'm in favour of that one, IOW the presence of an attribute denotes a new
functionality which we can report, if it's missing, the feature is clearly
lacking- I don't think we (libvirt) should be reporting the value 1 explicitly
in the XML if the feature is missing, we can assume 1 as the default.

Erik

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Zhenyu Wang 5 years, 8 months ago
On 2018.07.27 16:45:55 +0200, Erik Skultety wrote:
> On Thu, Jul 26, 2018 at 06:04:10PM +0200, Cornelia Huck wrote:
> > On Thu, 26 Jul 2018 17:43:45 +0200
> > Erik Skultety <eskultet@redhat.com> wrote:
> >
> > > On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote:
> > > > One thing I noticed is that we have seem to have an optional (?)
> > > > vendor-driver created "aggregation" attribute (which always prints
> > > > "true" in the Intel driver). Would it be better or worse for libvirt if
> > > > it contained some kind of upper boundary or so? Additionally, would it
> > >
> > > Can you be more specific? Although, I wouldn't argue against data that conveys
> > > some information, since I would treat the mere presence of the optional
> > > attribute as a supported feature that we can expose. Therefore, additional
> > > *structured* data which sets clear limits to a certain feature is only a plus
> > > that we can expose to the users/management layer.
> >
> > My question is what would be easiest for libvirt:
> >
> > - "aggregation" attribute only present when driver supports aggregation
> >   (possibly containing max number of resources to be aggregated)
> > - "aggregation" attribute always present; contains '1' if driver does
> >   not support aggregation and 'm' if driver can aggregate 'm' resources
> 
> Both are fine from libvirt's POV, but IMHO the former makes a bit more sense
> and I'm in favour of that one, IOW the presence of an attribute denotes a new
> functionality which we can report, if it's missing, the feature is clearly
> lacking- I don't think we (libvirt) should be reporting the value 1 explicitly
> in the XML if the feature is missing, we can assume 1 as the default.
> 

Good I'll adhere to that, thanks!

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Alex Williamson 5 years, 8 months ago
On Thu, 26 Jul 2018 17:30:07 +0200
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu, 26 Jul 2018 15:50:28 +0200
> Erik Skultety <eskultet@redhat.com> wrote:
> 
> > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote:  
> > > On Fri, 20 Jul 2018 10:19:24 +0800
> > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote:
> > >    
> > > > Current mdev device create interface depends on fixed mdev type, which get uuid
> > > > from user to create instance of mdev device. If user wants to use customized
> > > > number of resource for mdev device, then only can create new mdev type for that
> > > > which may not be flexible. This requirement comes not only from to be able to
> > > > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > > > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > > > resources on mdev instance. More info on [1] [2] [3].
> > > >
> > > > To allow to create user defined resources for mdev, it trys to extend mdev
> > > > create interface by adding new "instances=xxx" parameter following uuid, for
> > > > target mdev type if aggregation is supported, it can create new mdev device
> > > > which contains resources combined by number of instances, e.g
> > > >
> > > >     echo "<uuid>,instances=10" > create
> > > >
> > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > > > can support this setting. If no "aggregation" attribute found for mdev type,
> > > > previous behavior is still kept for one instance allocation. And new sysfs
> > > > attribute "instances" is created for each mdev device to show allocated number.
> > > >
> > > > This trys to create new KVMGT type with minimal vGPU resources which can be
> > > > combined with "instances=x" setting to allocate for user wanted resources.    
> > >
> > > "instances" makes me think this is arg helps to create multiple mdev
> > > instances rather than consuming multiple instances for a single mdev.
> > > You're already exposing the "aggregation" attribute, so doesn't
> > > "aggregate" perhaps make more sense as the create option?  We're asking
> > > the driver to aggregate $NUM instances into a single mdev.  The mdev
> > > attribute should then perhaps also be "aggregated_instances".  
> 
> I agree, that seems like a better naming scheme.
> 
> (...)
> 
> > > I'm curious what libvirt folks and Kirti think of this, it looks like
> > > it has a nice degree of backwards compatibility, both in the sysfs
> > > interface and the vendor driver interface.  Thanks,    
> > 
> > Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an
> > issue for us at the moment. I see this adds new optional sysfs attributes which
> > we could expose within our device capabilities XML, provided it doesn't use a
> > free form text, like the description attribute does.  
> 
> One thing I noticed is that we have seem to have an optional (?)
> vendor-driver created "aggregation" attribute (which always prints
> "true" in the Intel driver). Would it be better or worse for libvirt if
> it contained some kind of upper boundary or so?

Ultimately the aggregation value should be fully specified in
Documentation/ABI, but doesn't the kernel generally use 'Y' or 'N' for
boolean attributes in sysfs?  Maybe mdev core can handle the attribute
since it should exist any time the create_with_instances callback is
provided.

> Additionally, would it
> be easier if the "create" attribute always accepted
> ",instances=1" (calling the .create ops in that case?)

Unless I misunderstand the code or the question, I think this is
exactly what happens:

+	if (instances > 1)
+		ret = parent->ops->create_with_instances(kobj, mdev, instances);
+	else
+		ret = parent->ops->create(kobj, mdev);

And elsewhere:

+	if (instances > 1 && !parent->ops->create_with_instances) {
+		ret = -EINVAL;
+		goto mdev_fail;
+	}

If the mdev core supplied the aggregation attribute, then the presence
of the attribute could indicate to userspace whether it can always
provide an instance (aggregate) count, even if limited to '1' when 'N',
for that mdev type. Thanks,

Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Cornelia Huck 5 years, 8 months ago
On Thu, 26 Jul 2018 09:51:26 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Thu, 26 Jul 2018 17:30:07 +0200
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Thu, 26 Jul 2018 15:50:28 +0200
> > Erik Skultety <eskultet@redhat.com> wrote:
> > > Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an
> > > issue for us at the moment. I see this adds new optional sysfs attributes which
> > > we could expose within our device capabilities XML, provided it doesn't use a
> > > free form text, like the description attribute does.    
> > 
> > One thing I noticed is that we have seem to have an optional (?)
> > vendor-driver created "aggregation" attribute (which always prints
> > "true" in the Intel driver). Would it be better or worse for libvirt if
> > it contained some kind of upper boundary or so?  
> 
> Ultimately the aggregation value should be fully specified in
> Documentation/ABI, but doesn't the kernel generally use 'Y' or 'N' for
> boolean attributes in sysfs?  Maybe mdev core can handle the attribute
> since it should exist any time the create_with_instances callback is
> provided.

It might make sense to print a number if the driver allows a number of
resources to be aggregated which is not the same as available_instances
(see my reply to the documentation patch).

> 
> > Additionally, would it
> > be easier if the "create" attribute always accepted
> > ",instances=1" (calling the .create ops in that case?)  
> 
> Unless I misunderstand the code or the question, I think this is
> exactly what happens:

Indeed, it does. Let me blame the weather ;)

> If the mdev core supplied the aggregation attribute, then the presence
> of the attribute could indicate to userspace whether it can always
> provide an instance (aggregate) count, even if limited to '1' when 'N',
> for that mdev type. Thanks,
> 
> Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Zhenyu Wang 5 years, 8 months ago
On 2018.07.24 11:44:40 -0600, Alex Williamson wrote:
> On Fri, 20 Jul 2018 10:19:24 +0800
> Zhenyu Wang <zhenyuw@linux.intel.com> wrote:
> 
> > Current mdev device create interface depends on fixed mdev type, which get uuid
> > from user to create instance of mdev device. If user wants to use customized
> > number of resource for mdev device, then only can create new mdev type for that
> > which may not be flexible. This requirement comes not only from to be able to
> > allocate flexible resources for KVMGT, but also from Intel scalable IO
> > virtualization which would use vfio/mdev to be able to allocate arbitrary
> > resources on mdev instance. More info on [1] [2] [3].
> > 
> > To allow to create user defined resources for mdev, it trys to extend mdev
> > create interface by adding new "instances=xxx" parameter following uuid, for
> > target mdev type if aggregation is supported, it can create new mdev device
> > which contains resources combined by number of instances, e.g
> > 
> >     echo "<uuid>,instances=10" > create
> > 
> > VM manager e.g libvirt can check mdev type with "aggregation" attribute which
> > can support this setting. If no "aggregation" attribute found for mdev type,
> > previous behavior is still kept for one instance allocation. And new sysfs
> > attribute "instances" is created for each mdev device to show allocated number.
> > 
> > This trys to create new KVMGT type with minimal vGPU resources which can be
> > combined with "instances=x" setting to allocate for user wanted resources.
> 
> "instances" makes me think this is arg helps to create multiple mdev
> instances rather than consuming multiple instances for a single mdev.
> You're already exposing the "aggregation" attribute, so doesn't
> "aggregate" perhaps make more sense as the create option?  We're asking
> the driver to aggregate $NUM instances into a single mdev.  The mdev
> attribute should then perhaps also be "aggregated_instances".

yeah that seems better.

> 
> The next user question for the interface might be what aspect of the
> device gets multiplied by this aggregation?  In i915 I see you're
> multiplying the memory sizes by the instance, but clearly the
> resolution doesn't change.  I assume this is sort of like mdev types
> themselves, ie. some degree of what a type means is buried in the
> implementation and some degree of what some number of those types
> aggregated together means is impossible to describe generically.
>

yeah, the purpose was to increase memory resource only, but due to current
free formatted 'description', can't have a meaningful expression for that,
and not sure if libvirt likes to understand vendor specific behavior e.g
for intel vGPU?

> We're also going to need to add aggregation to the checklist for device
> compatibility for migration, for example 1) is it the same mdev_type,
> 1a) are the aggregation counts the same (new), 2) is the host driver
> compatible (TBD).
>

Right, will check with Zhi on that.

> The count handling in create_store(), particularly MDEV_CREATE_OPT_LEN
> looks a little suspicious.  I think we should just be validating that
> the string before the ',' or the entire string if there is no comma is
> UUID length.  Pass only that to uuid_le_to_bin().  We can then strncmp
> as you have for "instances=" (or "aggregate=") but then let's be sure
> to end the string we pass to kstrtouint(), ie. assume there could be
> further args.

Original purpose was to limit the length of string to accept, but can take
this way without issue I think.

> 
> Documentation/ABI/testing/sysfs-bus-vfio-mdev also needs to be updated
> with this series.

Ok.

Thanks

> 
> I'm curious what libvirt folks and Kirti think of this, it looks like
> it has a nice degree of backwards compatibility, both in the sysfs
> interface and the vendor driver interface.  Thanks,
> 
> Alex

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH v3 0/6] VFIO mdev aggregated resources handling
Posted by Zhenyu Wang 5 years, 6 months ago
Current mdev device create interface depends on fixed mdev type, which get uuid
from user to create instance of mdev device. If user wants to use customized
number of resource for mdev device, then only can create new mdev type for that
which may not be flexible. This requirement comes not only from to be able to
allocate flexible resources for KVMGT, but also from Intel scalable IO
virtualization which would use vfio/mdev to be able to allocate arbitrary
resources on mdev instance. More info on [1] [2] [3].

To allow to create user defined resources for mdev, it trys to extend mdev
create interface by adding new "aggregate=xxx" parameter following UUID, for
target mdev type if aggregation is supported, it can create new mdev device
which contains resources combined by number of instances, e.g

    echo "<uuid>,aggregate=10" > create

VM manager e.g libvirt can check mdev type with "aggregation" attribute which
can support this setting. If no "aggregation" attribute found for mdev type,
previous behavior is still kept for one instance allocation. And new sysfs
attribute "aggregated_instances" is created for each mdev device to show allocated number.

References:
[1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
[2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification  
[3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf

v2:
  - Add new create_with_instances driver hook
  - Update doc for new attributes

v3:
  - Rename parameter and attribute names from review
  - Make "aggregation" attribute to show maxium resource number
    for aggregation
  - Check driver hooks for attribute create validation
  - Update doc and ABI spec

Zhenyu Wang (6):
  vfio/mdev: Add new "aggregate" parameter for mdev create
  vfio/mdev: Add "aggregation" attribute for supported mdev type
  vfio/mdev: Add "aggregated_instances" attribute for supported mdev
    device
  Documentation/vfio-mediated-device.txt: Update for vfio/mdev
    aggregation support
  Documentation/ABI/testing/sysfs-bus-vfio-mdev: Update for vfio/mdev
    aggregation support
  drm/i915/gvt: Add new type with aggregation support

 Documentation/ABI/testing/sysfs-bus-vfio-mdev | 25 +++++++
 Documentation/vfio-mediated-device.txt        | 44 +++++++++--
 drivers/gpu/drm/i915/gvt/gvt.h                | 11 ++-
 drivers/gpu/drm/i915/gvt/kvmgt.c              | 53 ++++++++++++-
 drivers/gpu/drm/i915/gvt/vgpu.c               | 56 +++++++++++++-
 drivers/vfio/mdev/mdev_core.c                 | 40 +++++++++-
 drivers/vfio/mdev/mdev_private.h              |  6 +-
 drivers/vfio/mdev/mdev_sysfs.c                | 74 ++++++++++++++++++-
 include/linux/mdev.h                          | 19 +++++
 9 files changed, 305 insertions(+), 23 deletions(-)

-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Tian, Kevin 5 years, 6 months ago
Hi, Zhenyu,

curious about the progress of this series. Is there still some open remaining
or a new version coming soon?

Thanks
Kevin

> From: Zhenyu Wang [mailto:zhenyuw@linux.intel.com]
> Sent: Friday, July 20, 2018 10:19 AM
> 
> Current mdev device create interface depends on fixed mdev type, which
> get uuid
> from user to create instance of mdev device. If user wants to use
> customized
> number of resource for mdev device, then only can create new mdev type
> for that
> which may not be flexible. This requirement comes not only from to be
> able to
> allocate flexible resources for KVMGT, but also from Intel scalable IO
> virtualization which would use vfio/mdev to be able to allocate arbitrary
> resources on mdev instance. More info on [1] [2] [3].
> 
> To allow to create user defined resources for mdev, it trys to extend mdev
> create interface by adding new "instances=xxx" parameter following uuid,
> for
> target mdev type if aggregation is supported, it can create new mdev device
> which contains resources combined by number of instances, e.g
> 
>     echo "<uuid>,instances=10" > create
> 
> VM manager e.g libvirt can check mdev type with "aggregation" attribute
> which
> can support this setting. If no "aggregation" attribute found for mdev type,
> previous behavior is still kept for one instance allocation. And new sysfs
> attribute "instances" is created for each mdev device to show allocated
> number.
> 
> This trys to create new KVMGT type with minimal vGPU resources which
> can be
> combined with "instances=x" setting to allocate for user wanted resources.
> 
> References:
> [1] https://software.intel.com/en-us/download/intel-virtualization-
> technology-for-directed-io-architecture-specification
> [2] https://software.intel.com/en-us/download/intel-scalable-io-
> virtualization-technical-specification
> [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf
> 
> v2:
>   - Add new create_with_instances driver hook
>   - Update doc for new attributes
> 
> Zhenyu Wang (4):
>   vfio/mdev: Add new instances parameter for mdev create
>   vfio/mdev: Add mdev device instances attribute
>   drm/i915/gvt: Add new aggregation type support
>   Documentation/vfio-mediated-device.txt: update for aggregation
>     attribute
> 
>  Documentation/vfio-mediated-device.txt | 39 +++++++++++++++---
>  drivers/gpu/drm/i915/gvt/gvt.c         | 26 +++++++++---
>  drivers/gpu/drm/i915/gvt/gvt.h         | 14 ++++---
>  drivers/gpu/drm/i915/gvt/kvmgt.c       | 30 +++++++++++---
>  drivers/gpu/drm/i915/gvt/vgpu.c        | 56 ++++++++++++++++++++++----
>  drivers/vfio/mdev/mdev_core.c          | 19 +++++++--
>  drivers/vfio/mdev/mdev_private.h       |  6 ++-
>  drivers/vfio/mdev/mdev_sysfs.c         | 42 ++++++++++++++++---
>  include/linux/mdev.h                   | 10 +++++
>  9 files changed, 203 insertions(+), 39 deletions(-)
> 
> --
> 2.18.0


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH v2 0/4] New mdev type handling for aggregated resources
Posted by Zhenyu Wang 5 years, 6 months ago
On 2018.10.08 03:19:25 +0000, Tian, Kevin wrote:
> Hi, Zhenyu,
> 
> curious about the progress of this series. Is there still some open remaining
> or a new version coming soon?
> 

I had new version almost ready before my vacation, planned to send after be back.
So will refresh this later. Sorry for any delay as was trying to close several
gvt issues.

Thanks.

-- 
Open Source Technology Center, Intel ltd.

$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list