Documentation/vfio-mediated-device.txt | 39 +++++++++++++++--- drivers/gpu/drm/i915/gvt/gvt.c | 26 +++++++++--- drivers/gpu/drm/i915/gvt/gvt.h | 14 ++++--- drivers/gpu/drm/i915/gvt/kvmgt.c | 30 +++++++++++--- drivers/gpu/drm/i915/gvt/vgpu.c | 56 ++++++++++++++++++++++---- drivers/vfio/mdev/mdev_core.c | 19 +++++++-- drivers/vfio/mdev/mdev_private.h | 6 ++- drivers/vfio/mdev/mdev_sysfs.c | 42 ++++++++++++++++--- include/linux/mdev.h | 10 +++++ 9 files changed, 203 insertions(+), 39 deletions(-)
Current mdev device create interface depends on fixed mdev type, which get uuid from user to create instance of mdev device. If user wants to use customized number of resource for mdev device, then only can create new mdev type for that which may not be flexible. This requirement comes not only from to be able to allocate flexible resources for KVMGT, but also from Intel scalable IO virtualization which would use vfio/mdev to be able to allocate arbitrary resources on mdev instance. More info on [1] [2] [3]. To allow to create user defined resources for mdev, it trys to extend mdev create interface by adding new "instances=xxx" parameter following uuid, for target mdev type if aggregation is supported, it can create new mdev device which contains resources combined by number of instances, e.g echo "<uuid>,instances=10" > create VM manager e.g libvirt can check mdev type with "aggregation" attribute which can support this setting. If no "aggregation" attribute found for mdev type, previous behavior is still kept for one instance allocation. And new sysfs attribute "instances" is created for each mdev device to show allocated number. This trys to create new KVMGT type with minimal vGPU resources which can be combined with "instances=x" setting to allocate for user wanted resources. References: [1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification [2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf v2: - Add new create_with_instances driver hook - Update doc for new attributes Zhenyu Wang (4): vfio/mdev: Add new instances parameter for mdev create vfio/mdev: Add mdev device instances attribute drm/i915/gvt: Add new aggregation type support Documentation/vfio-mediated-device.txt: update for aggregation attribute Documentation/vfio-mediated-device.txt | 39 +++++++++++++++--- drivers/gpu/drm/i915/gvt/gvt.c | 26 +++++++++--- drivers/gpu/drm/i915/gvt/gvt.h | 14 ++++--- drivers/gpu/drm/i915/gvt/kvmgt.c | 30 +++++++++++--- drivers/gpu/drm/i915/gvt/vgpu.c | 56 ++++++++++++++++++++++---- drivers/vfio/mdev/mdev_core.c | 19 +++++++-- drivers/vfio/mdev/mdev_private.h | 6 ++- drivers/vfio/mdev/mdev_sysfs.c | 42 ++++++++++++++++--- include/linux/mdev.h | 10 +++++ 9 files changed, 203 insertions(+), 39 deletions(-) -- 2.18.0 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Fri, 20 Jul 2018 10:19:24 +0800 Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > Current mdev device create interface depends on fixed mdev type, which get uuid > from user to create instance of mdev device. If user wants to use customized > number of resource for mdev device, then only can create new mdev type for that > which may not be flexible. This requirement comes not only from to be able to > allocate flexible resources for KVMGT, but also from Intel scalable IO > virtualization which would use vfio/mdev to be able to allocate arbitrary > resources on mdev instance. More info on [1] [2] [3]. > > To allow to create user defined resources for mdev, it trys to extend mdev > create interface by adding new "instances=xxx" parameter following uuid, for > target mdev type if aggregation is supported, it can create new mdev device > which contains resources combined by number of instances, e.g > > echo "<uuid>,instances=10" > create > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > can support this setting. If no "aggregation" attribute found for mdev type, > previous behavior is still kept for one instance allocation. And new sysfs > attribute "instances" is created for each mdev device to show allocated number. > > This trys to create new KVMGT type with minimal vGPU resources which can be > combined with "instances=x" setting to allocate for user wanted resources. "instances" makes me think this is arg helps to create multiple mdev instances rather than consuming multiple instances for a single mdev. You're already exposing the "aggregation" attribute, so doesn't "aggregate" perhaps make more sense as the create option? We're asking the driver to aggregate $NUM instances into a single mdev. The mdev attribute should then perhaps also be "aggregated_instances". The next user question for the interface might be what aspect of the device gets multiplied by this aggregation? In i915 I see you're multiplying the memory sizes by the instance, but clearly the resolution doesn't change. I assume this is sort of like mdev types themselves, ie. some degree of what a type means is buried in the implementation and some degree of what some number of those types aggregated together means is impossible to describe generically. We're also going to need to add aggregation to the checklist for device compatibility for migration, for example 1) is it the same mdev_type, 1a) are the aggregation counts the same (new), 2) is the host driver compatible (TBD). The count handling in create_store(), particularly MDEV_CREATE_OPT_LEN looks a little suspicious. I think we should just be validating that the string before the ',' or the entire string if there is no comma is UUID length. Pass only that to uuid_le_to_bin(). We can then strncmp as you have for "instances=" (or "aggregate=") but then let's be sure to end the string we pass to kstrtouint(), ie. assume there could be further args. Documentation/ABI/testing/sysfs-bus-vfio-mdev also needs to be updated with this series. I'm curious what libvirt folks and Kirti think of this, it looks like it has a nice degree of backwards compatibility, both in the sysfs interface and the vendor driver interface. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote: > On Fri, 20 Jul 2018 10:19:24 +0800 > Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > > > Current mdev device create interface depends on fixed mdev type, which get uuid > > from user to create instance of mdev device. If user wants to use customized > > number of resource for mdev device, then only can create new mdev type for that > > which may not be flexible. This requirement comes not only from to be able to > > allocate flexible resources for KVMGT, but also from Intel scalable IO > > virtualization which would use vfio/mdev to be able to allocate arbitrary > > resources on mdev instance. More info on [1] [2] [3]. > > > > To allow to create user defined resources for mdev, it trys to extend mdev > > create interface by adding new "instances=xxx" parameter following uuid, for > > target mdev type if aggregation is supported, it can create new mdev device > > which contains resources combined by number of instances, e.g > > > > echo "<uuid>,instances=10" > create > > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > > can support this setting. If no "aggregation" attribute found for mdev type, > > previous behavior is still kept for one instance allocation. And new sysfs > > attribute "instances" is created for each mdev device to show allocated number. > > > > This trys to create new KVMGT type with minimal vGPU resources which can be > > combined with "instances=x" setting to allocate for user wanted resources. > > "instances" makes me think this is arg helps to create multiple mdev > instances rather than consuming multiple instances for a single mdev. > You're already exposing the "aggregation" attribute, so doesn't > "aggregate" perhaps make more sense as the create option? We're asking > the driver to aggregate $NUM instances into a single mdev. The mdev > attribute should then perhaps also be "aggregated_instances". > > The next user question for the interface might be what aspect of the > device gets multiplied by this aggregation? In i915 I see you're > multiplying the memory sizes by the instance, but clearly the > resolution doesn't change. I assume this is sort of like mdev types > themselves, ie. some degree of what a type means is buried in the > implementation and some degree of what some number of those types > aggregated together means is impossible to describe generically. I don't seem to clearly see the benefit here, so I have to ask, how is this better and/or different from allowing a heterogeneous setup if one needs a more performant instance in terms of more resources? Because to me, once you're able to aggregate instances, I would assume that a simple "echo `uuid`" with a different type should succeed as well and provide me (from user's perspective) with the same results. Could you please clarify this to me, as well as what resources/parameters are going to be impacted by aggregation? ... > > I'm curious what libvirt folks and Kirti think of this, it looks like > it has a nice degree of backwards compatibility, both in the sysfs > interface and the vendor driver interface. Thanks, Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an issue for us at the moment. I see this adds new optional sysfs attributes which we could expose within our device capabilities XML, provided it doesn't use a free form text, like the description attribute does. Erik -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 26 Jul 2018 15:50:28 +0200 Erik Skultety <eskultet@redhat.com> wrote: > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote: > > On Fri, 20 Jul 2018 10:19:24 +0800 > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > > > > > Current mdev device create interface depends on fixed mdev type, which get uuid > > > from user to create instance of mdev device. If user wants to use customized > > > number of resource for mdev device, then only can create new mdev type for that > > > which may not be flexible. This requirement comes not only from to be able to > > > allocate flexible resources for KVMGT, but also from Intel scalable IO > > > virtualization which would use vfio/mdev to be able to allocate arbitrary > > > resources on mdev instance. More info on [1] [2] [3]. > > > > > > To allow to create user defined resources for mdev, it trys to extend mdev > > > create interface by adding new "instances=xxx" parameter following uuid, for > > > target mdev type if aggregation is supported, it can create new mdev device > > > which contains resources combined by number of instances, e.g > > > > > > echo "<uuid>,instances=10" > create > > > > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > > > can support this setting. If no "aggregation" attribute found for mdev type, > > > previous behavior is still kept for one instance allocation. And new sysfs > > > attribute "instances" is created for each mdev device to show allocated number. > > > > > > This trys to create new KVMGT type with minimal vGPU resources which can be > > > combined with "instances=x" setting to allocate for user wanted resources. > > > > "instances" makes me think this is arg helps to create multiple mdev > > instances rather than consuming multiple instances for a single mdev. > > You're already exposing the "aggregation" attribute, so doesn't > > "aggregate" perhaps make more sense as the create option? We're asking > > the driver to aggregate $NUM instances into a single mdev. The mdev > > attribute should then perhaps also be "aggregated_instances". > > > > The next user question for the interface might be what aspect of the > > device gets multiplied by this aggregation? In i915 I see you're > > multiplying the memory sizes by the instance, but clearly the > > resolution doesn't change. I assume this is sort of like mdev types > > themselves, ie. some degree of what a type means is buried in the > > implementation and some degree of what some number of those types > > aggregated together means is impossible to describe generically. > > I don't seem to clearly see the benefit here, so I have to ask, how is this > better and/or different from allowing a heterogeneous setup if one needs a more > performant instance in terms of more resources? Because to me, once you're able > to aggregate instances, I would assume that a simple "echo `uuid`" with a > different type should succeed as well and provide me (from user's perspective) > with the same results. Could you please clarify this to me, as well as what > resources/parameters are going to be impacted by aggregation? I think you're suggesting that we could simply define new mdev types to account for these higher aggregate instances, for example we can define discrete types that are 2x, 3x, 4x, etc. the resource count of a single instance. What I think we're trying to address with this proposal is what happens when the resources available are exceptionally large and they can be combined in arbitrary ways. For example if a parent device can expose 10,000 resources and the granularity with which we can create and mdev instance is 1, is it practical to create 10,000 mdev types or does it make more sense to expose a granularity and method of aggregation. Using graphics here perhaps falls a little short of the intention of the interface because the possible types are easily enumerable and it would be entirely practical to create discrete types for each. vGPUs also have a lot of variables, so defining which attribute of the device is multiplied by the number of instances is a little more fuzzy. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Jul 26, 2018 at 08:29:45AM -0600, Alex Williamson wrote: > On Thu, 26 Jul 2018 15:50:28 +0200 > Erik Skultety <eskultet@redhat.com> wrote: > > > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote: > > > On Fri, 20 Jul 2018 10:19:24 +0800 > > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > > > > > > > Current mdev device create interface depends on fixed mdev type, which get uuid > > > > from user to create instance of mdev device. If user wants to use customized > > > > number of resource for mdev device, then only can create new mdev type for that > > > > which may not be flexible. This requirement comes not only from to be able to > > > > allocate flexible resources for KVMGT, but also from Intel scalable IO > > > > virtualization which would use vfio/mdev to be able to allocate arbitrary > > > > resources on mdev instance. More info on [1] [2] [3]. > > > > > > > > To allow to create user defined resources for mdev, it trys to extend mdev > > > > create interface by adding new "instances=xxx" parameter following uuid, for > > > > target mdev type if aggregation is supported, it can create new mdev device > > > > which contains resources combined by number of instances, e.g > > > > > > > > echo "<uuid>,instances=10" > create > > > > > > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > > > > can support this setting. If no "aggregation" attribute found for mdev type, > > > > previous behavior is still kept for one instance allocation. And new sysfs > > > > attribute "instances" is created for each mdev device to show allocated number. > > > > > > > > This trys to create new KVMGT type with minimal vGPU resources which can be > > > > combined with "instances=x" setting to allocate for user wanted resources. > > > > > > "instances" makes me think this is arg helps to create multiple mdev > > > instances rather than consuming multiple instances for a single mdev. > > > You're already exposing the "aggregation" attribute, so doesn't > > > "aggregate" perhaps make more sense as the create option? We're asking > > > the driver to aggregate $NUM instances into a single mdev. The mdev > > > attribute should then perhaps also be "aggregated_instances". > > > > > > The next user question for the interface might be what aspect of the > > > device gets multiplied by this aggregation? In i915 I see you're > > > multiplying the memory sizes by the instance, but clearly the > > > resolution doesn't change. I assume this is sort of like mdev types > > > themselves, ie. some degree of what a type means is buried in the > > > implementation and some degree of what some number of those types > > > aggregated together means is impossible to describe generically. > > > > I don't seem to clearly see the benefit here, so I have to ask, how is this > > better and/or different from allowing a heterogeneous setup if one needs a more > > performant instance in terms of more resources? Because to me, once you're able > > to aggregate instances, I would assume that a simple "echo `uuid`" with a > > different type should succeed as well and provide me (from user's perspective) > > with the same results. Could you please clarify this to me, as well as what > > resources/parameters are going to be impacted by aggregation? > > I think you're suggesting that we could simply define new mdev types to > account for these higher aggregate instances, for example we can > define discrete types that are 2x, 3x, 4x, etc. the resource count of a > single instance. What I think we're trying to address with this > proposal is what happens when the resources available are exceptionally > large and they can be combined in arbitrary ways. For example if a > parent device can expose 10,000 resources and the granularity with > which we can create and mdev instance is 1, is it practical to create > 10,000 mdev types or does it make more sense to expose a granularity > and method of aggregation. Okay, I got a much better picture now, thanks for the clarification. The granularity you mentioned definitely does give users more power and control (in terms of provisioning) over the devices. Erik > > Using graphics here perhaps falls a little short of the intention of > the interface because the possible types are easily enumerable and it > would be entirely practical to create discrete types for each. vGPUs > also have a lot of variables, so defining which attribute of the device > is multiplied by the number of instances is a little more fuzzy. > Thanks, > > Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 26 Jul 2018 15:50:28 +0200 Erik Skultety <eskultet@redhat.com> wrote: > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote: > > On Fri, 20 Jul 2018 10:19:24 +0800 > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > > > > > Current mdev device create interface depends on fixed mdev type, which get uuid > > > from user to create instance of mdev device. If user wants to use customized > > > number of resource for mdev device, then only can create new mdev type for that > > > which may not be flexible. This requirement comes not only from to be able to > > > allocate flexible resources for KVMGT, but also from Intel scalable IO > > > virtualization which would use vfio/mdev to be able to allocate arbitrary > > > resources on mdev instance. More info on [1] [2] [3]. > > > > > > To allow to create user defined resources for mdev, it trys to extend mdev > > > create interface by adding new "instances=xxx" parameter following uuid, for > > > target mdev type if aggregation is supported, it can create new mdev device > > > which contains resources combined by number of instances, e.g > > > > > > echo "<uuid>,instances=10" > create > > > > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > > > can support this setting. If no "aggregation" attribute found for mdev type, > > > previous behavior is still kept for one instance allocation. And new sysfs > > > attribute "instances" is created for each mdev device to show allocated number. > > > > > > This trys to create new KVMGT type with minimal vGPU resources which can be > > > combined with "instances=x" setting to allocate for user wanted resources. > > > > "instances" makes me think this is arg helps to create multiple mdev > > instances rather than consuming multiple instances for a single mdev. > > You're already exposing the "aggregation" attribute, so doesn't > > "aggregate" perhaps make more sense as the create option? We're asking > > the driver to aggregate $NUM instances into a single mdev. The mdev > > attribute should then perhaps also be "aggregated_instances". I agree, that seems like a better naming scheme. (...) > > I'm curious what libvirt folks and Kirti think of this, it looks like > > it has a nice degree of backwards compatibility, both in the sysfs > > interface and the vendor driver interface. Thanks, > > Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an > issue for us at the moment. I see this adds new optional sysfs attributes which > we could expose within our device capabilities XML, provided it doesn't use a > free form text, like the description attribute does. One thing I noticed is that we have seem to have an optional (?) vendor-driver created "aggregation" attribute (which always prints "true" in the Intel driver). Would it be better or worse for libvirt if it contained some kind of upper boundary or so? Additionally, would it be easier if the "create" attribute always accepted ",instances=1" (calling the .create ops in that case?) -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote: > On Thu, 26 Jul 2018 15:50:28 +0200 > Erik Skultety <eskultet@redhat.com> wrote: > > > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote: > > > On Fri, 20 Jul 2018 10:19:24 +0800 > > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > > > > > > > Current mdev device create interface depends on fixed mdev type, which get uuid > > > > from user to create instance of mdev device. If user wants to use customized > > > > number of resource for mdev device, then only can create new mdev type for that > > > > which may not be flexible. This requirement comes not only from to be able to > > > > allocate flexible resources for KVMGT, but also from Intel scalable IO > > > > virtualization which would use vfio/mdev to be able to allocate arbitrary > > > > resources on mdev instance. More info on [1] [2] [3]. > > > > > > > > To allow to create user defined resources for mdev, it trys to extend mdev > > > > create interface by adding new "instances=xxx" parameter following uuid, for > > > > target mdev type if aggregation is supported, it can create new mdev device > > > > which contains resources combined by number of instances, e.g > > > > > > > > echo "<uuid>,instances=10" > create > > > > > > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > > > > can support this setting. If no "aggregation" attribute found for mdev type, > > > > previous behavior is still kept for one instance allocation. And new sysfs > > > > attribute "instances" is created for each mdev device to show allocated number. > > > > > > > > This trys to create new KVMGT type with minimal vGPU resources which can be > > > > combined with "instances=x" setting to allocate for user wanted resources. > > > > > > "instances" makes me think this is arg helps to create multiple mdev > > > instances rather than consuming multiple instances for a single mdev. > > > You're already exposing the "aggregation" attribute, so doesn't > > > "aggregate" perhaps make more sense as the create option? We're asking > > > the driver to aggregate $NUM instances into a single mdev. The mdev > > > attribute should then perhaps also be "aggregated_instances". > > I agree, that seems like a better naming scheme. > > (...) > > > > I'm curious what libvirt folks and Kirti think of this, it looks like > > > it has a nice degree of backwards compatibility, both in the sysfs > > > interface and the vendor driver interface. Thanks, > > > > Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an > > issue for us at the moment. I see this adds new optional sysfs attributes which > > we could expose within our device capabilities XML, provided it doesn't use a > > free form text, like the description attribute does. > > One thing I noticed is that we have seem to have an optional (?) > vendor-driver created "aggregation" attribute (which always prints > "true" in the Intel driver). Would it be better or worse for libvirt if > it contained some kind of upper boundary or so? Additionally, would it Can you be more specific? Although, I wouldn't argue against data that conveys some information, since I would treat the mere presence of the optional attribute as a supported feature that we can expose. Therefore, additional *structured* data which sets clear limits to a certain feature is only a plus that we can expose to the users/management layer. > be easier if the "create" attribute always accepted > ",instances=1" (calling the .create ops in that case?) Is ^this libvirt related question? If so, then by accepting such syntax you'll definitely save us a few lines of code ;). However, until we have a clear vision on how to tackle the mdev migration, I'd like to avoid proposing a libvirt "create" API until then. Erik -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 26 Jul 2018 17:43:45 +0200 Erik Skultety <eskultet@redhat.com> wrote: > On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote: > > One thing I noticed is that we have seem to have an optional (?) > > vendor-driver created "aggregation" attribute (which always prints > > "true" in the Intel driver). Would it be better or worse for libvirt if > > it contained some kind of upper boundary or so? Additionally, would it > > Can you be more specific? Although, I wouldn't argue against data that conveys > some information, since I would treat the mere presence of the optional > attribute as a supported feature that we can expose. Therefore, additional > *structured* data which sets clear limits to a certain feature is only a plus > that we can expose to the users/management layer. My question is what would be easiest for libvirt: - "aggregation" attribute only present when driver supports aggregation (possibly containing max number of resources to be aggregated) - "aggregation" attribute always present; contains '1' if driver does not support aggregation and 'm' if driver can aggregate 'm' resources -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, Jul 26, 2018 at 06:04:10PM +0200, Cornelia Huck wrote: > On Thu, 26 Jul 2018 17:43:45 +0200 > Erik Skultety <eskultet@redhat.com> wrote: > > > On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote: > > > One thing I noticed is that we have seem to have an optional (?) > > > vendor-driver created "aggregation" attribute (which always prints > > > "true" in the Intel driver). Would it be better or worse for libvirt if > > > it contained some kind of upper boundary or so? Additionally, would it > > > > Can you be more specific? Although, I wouldn't argue against data that conveys > > some information, since I would treat the mere presence of the optional > > attribute as a supported feature that we can expose. Therefore, additional > > *structured* data which sets clear limits to a certain feature is only a plus > > that we can expose to the users/management layer. > > My question is what would be easiest for libvirt: > > - "aggregation" attribute only present when driver supports aggregation > (possibly containing max number of resources to be aggregated) > - "aggregation" attribute always present; contains '1' if driver does > not support aggregation and 'm' if driver can aggregate 'm' resources Both are fine from libvirt's POV, but IMHO the former makes a bit more sense and I'm in favour of that one, IOW the presence of an attribute denotes a new functionality which we can report, if it's missing, the feature is clearly lacking- I don't think we (libvirt) should be reporting the value 1 explicitly in the XML if the feature is missing, we can assume 1 as the default. Erik -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 2018.07.27 16:45:55 +0200, Erik Skultety wrote: > On Thu, Jul 26, 2018 at 06:04:10PM +0200, Cornelia Huck wrote: > > On Thu, 26 Jul 2018 17:43:45 +0200 > > Erik Skultety <eskultet@redhat.com> wrote: > > > > > On Thu, Jul 26, 2018 at 05:30:07PM +0200, Cornelia Huck wrote: > > > > One thing I noticed is that we have seem to have an optional (?) > > > > vendor-driver created "aggregation" attribute (which always prints > > > > "true" in the Intel driver). Would it be better or worse for libvirt if > > > > it contained some kind of upper boundary or so? Additionally, would it > > > > > > Can you be more specific? Although, I wouldn't argue against data that conveys > > > some information, since I would treat the mere presence of the optional > > > attribute as a supported feature that we can expose. Therefore, additional > > > *structured* data which sets clear limits to a certain feature is only a plus > > > that we can expose to the users/management layer. > > > > My question is what would be easiest for libvirt: > > > > - "aggregation" attribute only present when driver supports aggregation > > (possibly containing max number of resources to be aggregated) > > - "aggregation" attribute always present; contains '1' if driver does > > not support aggregation and 'm' if driver can aggregate 'm' resources > > Both are fine from libvirt's POV, but IMHO the former makes a bit more sense > and I'm in favour of that one, IOW the presence of an attribute denotes a new > functionality which we can report, if it's missing, the feature is clearly > lacking- I don't think we (libvirt) should be reporting the value 1 explicitly > in the XML if the feature is missing, we can assume 1 as the default. > Good I'll adhere to that, thanks! -- Open Source Technology Center, Intel ltd. $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 26 Jul 2018 17:30:07 +0200 Cornelia Huck <cohuck@redhat.com> wrote: > On Thu, 26 Jul 2018 15:50:28 +0200 > Erik Skultety <eskultet@redhat.com> wrote: > > > On Tue, Jul 24, 2018 at 11:44:40AM -0600, Alex Williamson wrote: > > > On Fri, 20 Jul 2018 10:19:24 +0800 > > > Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > > > > > > > Current mdev device create interface depends on fixed mdev type, which get uuid > > > > from user to create instance of mdev device. If user wants to use customized > > > > number of resource for mdev device, then only can create new mdev type for that > > > > which may not be flexible. This requirement comes not only from to be able to > > > > allocate flexible resources for KVMGT, but also from Intel scalable IO > > > > virtualization which would use vfio/mdev to be able to allocate arbitrary > > > > resources on mdev instance. More info on [1] [2] [3]. > > > > > > > > To allow to create user defined resources for mdev, it trys to extend mdev > > > > create interface by adding new "instances=xxx" parameter following uuid, for > > > > target mdev type if aggregation is supported, it can create new mdev device > > > > which contains resources combined by number of instances, e.g > > > > > > > > echo "<uuid>,instances=10" > create > > > > > > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > > > > can support this setting. If no "aggregation" attribute found for mdev type, > > > > previous behavior is still kept for one instance allocation. And new sysfs > > > > attribute "instances" is created for each mdev device to show allocated number. > > > > > > > > This trys to create new KVMGT type with minimal vGPU resources which can be > > > > combined with "instances=x" setting to allocate for user wanted resources. > > > > > > "instances" makes me think this is arg helps to create multiple mdev > > > instances rather than consuming multiple instances for a single mdev. > > > You're already exposing the "aggregation" attribute, so doesn't > > > "aggregate" perhaps make more sense as the create option? We're asking > > > the driver to aggregate $NUM instances into a single mdev. The mdev > > > attribute should then perhaps also be "aggregated_instances". > > I agree, that seems like a better naming scheme. > > (...) > > > > I'm curious what libvirt folks and Kirti think of this, it looks like > > > it has a nice degree of backwards compatibility, both in the sysfs > > > interface and the vendor driver interface. Thanks, > > > > Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an > > issue for us at the moment. I see this adds new optional sysfs attributes which > > we could expose within our device capabilities XML, provided it doesn't use a > > free form text, like the description attribute does. > > One thing I noticed is that we have seem to have an optional (?) > vendor-driver created "aggregation" attribute (which always prints > "true" in the Intel driver). Would it be better or worse for libvirt if > it contained some kind of upper boundary or so? Ultimately the aggregation value should be fully specified in Documentation/ABI, but doesn't the kernel generally use 'Y' or 'N' for boolean attributes in sysfs? Maybe mdev core can handle the attribute since it should exist any time the create_with_instances callback is provided. > Additionally, would it > be easier if the "create" attribute always accepted > ",instances=1" (calling the .create ops in that case?) Unless I misunderstand the code or the question, I think this is exactly what happens: + if (instances > 1) + ret = parent->ops->create_with_instances(kobj, mdev, instances); + else + ret = parent->ops->create(kobj, mdev); And elsewhere: + if (instances > 1 && !parent->ops->create_with_instances) { + ret = -EINVAL; + goto mdev_fail; + } If the mdev core supplied the aggregation attribute, then the presence of the attribute could indicate to userspace whether it can always provide an instance (aggregate) count, even if limited to '1' when 'N', for that mdev type. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On Thu, 26 Jul 2018 09:51:26 -0600 Alex Williamson <alex.williamson@redhat.com> wrote: > On Thu, 26 Jul 2018 17:30:07 +0200 > Cornelia Huck <cohuck@redhat.com> wrote: > > > On Thu, 26 Jul 2018 15:50:28 +0200 > > Erik Skultety <eskultet@redhat.com> wrote: > > > Since libvirt doesn't have an API to create mdevs yet, this doesn't pose an > > > issue for us at the moment. I see this adds new optional sysfs attributes which > > > we could expose within our device capabilities XML, provided it doesn't use a > > > free form text, like the description attribute does. > > > > One thing I noticed is that we have seem to have an optional (?) > > vendor-driver created "aggregation" attribute (which always prints > > "true" in the Intel driver). Would it be better or worse for libvirt if > > it contained some kind of upper boundary or so? > > Ultimately the aggregation value should be fully specified in > Documentation/ABI, but doesn't the kernel generally use 'Y' or 'N' for > boolean attributes in sysfs? Maybe mdev core can handle the attribute > since it should exist any time the create_with_instances callback is > provided. It might make sense to print a number if the driver allows a number of resources to be aggregated which is not the same as available_instances (see my reply to the documentation patch). > > > Additionally, would it > > be easier if the "create" attribute always accepted > > ",instances=1" (calling the .create ops in that case?) > > Unless I misunderstand the code or the question, I think this is > exactly what happens: Indeed, it does. Let me blame the weather ;) > If the mdev core supplied the aggregation attribute, then the presence > of the attribute could indicate to userspace whether it can always > provide an instance (aggregate) count, even if limited to '1' when 'N', > for that mdev type. Thanks, > > Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 2018.07.24 11:44:40 -0600, Alex Williamson wrote: > On Fri, 20 Jul 2018 10:19:24 +0800 > Zhenyu Wang <zhenyuw@linux.intel.com> wrote: > > > Current mdev device create interface depends on fixed mdev type, which get uuid > > from user to create instance of mdev device. If user wants to use customized > > number of resource for mdev device, then only can create new mdev type for that > > which may not be flexible. This requirement comes not only from to be able to > > allocate flexible resources for KVMGT, but also from Intel scalable IO > > virtualization which would use vfio/mdev to be able to allocate arbitrary > > resources on mdev instance. More info on [1] [2] [3]. > > > > To allow to create user defined resources for mdev, it trys to extend mdev > > create interface by adding new "instances=xxx" parameter following uuid, for > > target mdev type if aggregation is supported, it can create new mdev device > > which contains resources combined by number of instances, e.g > > > > echo "<uuid>,instances=10" > create > > > > VM manager e.g libvirt can check mdev type with "aggregation" attribute which > > can support this setting. If no "aggregation" attribute found for mdev type, > > previous behavior is still kept for one instance allocation. And new sysfs > > attribute "instances" is created for each mdev device to show allocated number. > > > > This trys to create new KVMGT type with minimal vGPU resources which can be > > combined with "instances=x" setting to allocate for user wanted resources. > > "instances" makes me think this is arg helps to create multiple mdev > instances rather than consuming multiple instances for a single mdev. > You're already exposing the "aggregation" attribute, so doesn't > "aggregate" perhaps make more sense as the create option? We're asking > the driver to aggregate $NUM instances into a single mdev. The mdev > attribute should then perhaps also be "aggregated_instances". yeah that seems better. > > The next user question for the interface might be what aspect of the > device gets multiplied by this aggregation? In i915 I see you're > multiplying the memory sizes by the instance, but clearly the > resolution doesn't change. I assume this is sort of like mdev types > themselves, ie. some degree of what a type means is buried in the > implementation and some degree of what some number of those types > aggregated together means is impossible to describe generically. > yeah, the purpose was to increase memory resource only, but due to current free formatted 'description', can't have a meaningful expression for that, and not sure if libvirt likes to understand vendor specific behavior e.g for intel vGPU? > We're also going to need to add aggregation to the checklist for device > compatibility for migration, for example 1) is it the same mdev_type, > 1a) are the aggregation counts the same (new), 2) is the host driver > compatible (TBD). > Right, will check with Zhi on that. > The count handling in create_store(), particularly MDEV_CREATE_OPT_LEN > looks a little suspicious. I think we should just be validating that > the string before the ',' or the entire string if there is no comma is > UUID length. Pass only that to uuid_le_to_bin(). We can then strncmp > as you have for "instances=" (or "aggregate=") but then let's be sure > to end the string we pass to kstrtouint(), ie. assume there could be > further args. Original purpose was to limit the length of string to accept, but can take this way without issue I think. > > Documentation/ABI/testing/sysfs-bus-vfio-mdev also needs to be updated > with this series. Ok. Thanks > > I'm curious what libvirt folks and Kirti think of this, it looks like > it has a nice degree of backwards compatibility, both in the sysfs > interface and the vendor driver interface. Thanks, > > Alex -- Open Source Technology Center, Intel ltd. $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Current mdev device create interface depends on fixed mdev type, which get uuid from user to create instance of mdev device. If user wants to use customized number of resource for mdev device, then only can create new mdev type for that which may not be flexible. This requirement comes not only from to be able to allocate flexible resources for KVMGT, but also from Intel scalable IO virtualization which would use vfio/mdev to be able to allocate arbitrary resources on mdev instance. More info on [1] [2] [3]. To allow to create user defined resources for mdev, it trys to extend mdev create interface by adding new "aggregate=xxx" parameter following UUID, for target mdev type if aggregation is supported, it can create new mdev device which contains resources combined by number of instances, e.g echo "<uuid>,aggregate=10" > create VM manager e.g libvirt can check mdev type with "aggregation" attribute which can support this setting. If no "aggregation" attribute found for mdev type, previous behavior is still kept for one instance allocation. And new sysfs attribute "aggregated_instances" is created for each mdev device to show allocated number. References: [1] https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification [2] https://software.intel.com/en-us/download/intel-scalable-io-virtualization-technical-specification [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf v2: - Add new create_with_instances driver hook - Update doc for new attributes v3: - Rename parameter and attribute names from review - Make "aggregation" attribute to show maxium resource number for aggregation - Check driver hooks for attribute create validation - Update doc and ABI spec Zhenyu Wang (6): vfio/mdev: Add new "aggregate" parameter for mdev create vfio/mdev: Add "aggregation" attribute for supported mdev type vfio/mdev: Add "aggregated_instances" attribute for supported mdev device Documentation/vfio-mediated-device.txt: Update for vfio/mdev aggregation support Documentation/ABI/testing/sysfs-bus-vfio-mdev: Update for vfio/mdev aggregation support drm/i915/gvt: Add new type with aggregation support Documentation/ABI/testing/sysfs-bus-vfio-mdev | 25 +++++++ Documentation/vfio-mediated-device.txt | 44 +++++++++-- drivers/gpu/drm/i915/gvt/gvt.h | 11 ++- drivers/gpu/drm/i915/gvt/kvmgt.c | 53 ++++++++++++- drivers/gpu/drm/i915/gvt/vgpu.c | 56 +++++++++++++- drivers/vfio/mdev/mdev_core.c | 40 +++++++++- drivers/vfio/mdev/mdev_private.h | 6 +- drivers/vfio/mdev/mdev_sysfs.c | 74 ++++++++++++++++++- include/linux/mdev.h | 19 +++++ 9 files changed, 305 insertions(+), 23 deletions(-) -- 2.19.1 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Hi, Zhenyu, curious about the progress of this series. Is there still some open remaining or a new version coming soon? Thanks Kevin > From: Zhenyu Wang [mailto:zhenyuw@linux.intel.com] > Sent: Friday, July 20, 2018 10:19 AM > > Current mdev device create interface depends on fixed mdev type, which > get uuid > from user to create instance of mdev device. If user wants to use > customized > number of resource for mdev device, then only can create new mdev type > for that > which may not be flexible. This requirement comes not only from to be > able to > allocate flexible resources for KVMGT, but also from Intel scalable IO > virtualization which would use vfio/mdev to be able to allocate arbitrary > resources on mdev instance. More info on [1] [2] [3]. > > To allow to create user defined resources for mdev, it trys to extend mdev > create interface by adding new "instances=xxx" parameter following uuid, > for > target mdev type if aggregation is supported, it can create new mdev device > which contains resources combined by number of instances, e.g > > echo "<uuid>,instances=10" > create > > VM manager e.g libvirt can check mdev type with "aggregation" attribute > which > can support this setting. If no "aggregation" attribute found for mdev type, > previous behavior is still kept for one instance allocation. And new sysfs > attribute "instances" is created for each mdev device to show allocated > number. > > This trys to create new KVMGT type with minimal vGPU resources which > can be > combined with "instances=x" setting to allocate for user wanted resources. > > References: > [1] https://software.intel.com/en-us/download/intel-virtualization- > technology-for-directed-io-architecture-specification > [2] https://software.intel.com/en-us/download/intel-scalable-io- > virtualization-technical-specification > [3] https://schd.ws/hosted_files/lc32018/00/LC3-SIOV-final.pdf > > v2: > - Add new create_with_instances driver hook > - Update doc for new attributes > > Zhenyu Wang (4): > vfio/mdev: Add new instances parameter for mdev create > vfio/mdev: Add mdev device instances attribute > drm/i915/gvt: Add new aggregation type support > Documentation/vfio-mediated-device.txt: update for aggregation > attribute > > Documentation/vfio-mediated-device.txt | 39 +++++++++++++++--- > drivers/gpu/drm/i915/gvt/gvt.c | 26 +++++++++--- > drivers/gpu/drm/i915/gvt/gvt.h | 14 ++++--- > drivers/gpu/drm/i915/gvt/kvmgt.c | 30 +++++++++++--- > drivers/gpu/drm/i915/gvt/vgpu.c | 56 ++++++++++++++++++++++---- > drivers/vfio/mdev/mdev_core.c | 19 +++++++-- > drivers/vfio/mdev/mdev_private.h | 6 ++- > drivers/vfio/mdev/mdev_sysfs.c | 42 ++++++++++++++++--- > include/linux/mdev.h | 10 +++++ > 9 files changed, 203 insertions(+), 39 deletions(-) > > -- > 2.18.0 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
On 2018.10.08 03:19:25 +0000, Tian, Kevin wrote: > Hi, Zhenyu, > > curious about the progress of this series. Is there still some open remaining > or a new version coming soon? > I had new version almost ready before my vacation, planned to send after be back. So will refresh this later. Sorry for any delay as was trying to close several gvt issues. Thanks. -- Open Source Technology Center, Intel ltd. $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
© 2016 - 2024 Red Hat, Inc.