From: Chun Feng Wu <wucf@linux.ibm.com>
* Add new element '<throttlefilters>'
* <throttlefilters> can include multiple throttle group references to form a filter chain in qemu
* The chained throttle filters feature in qemu is described at https://github.com/qemu/qemu/blob/master/docs/throttle.txt
Signed-off-by: Chun Feng Wu <wucf@linux.ibm.com>
---
docs/formatdomain.rst | 22 ++++++++++++++++++++++
src/conf/schemas/domaincommon.rng | 19 ++++++++++++++++++-
2 files changed, 40 insertions(+), 1 deletion(-)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index b7e1f9cc83..0fa8f1267c 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -2736,6 +2736,15 @@ paravirtualized driver is specified via the ``disk`` element.
<source dev='/dev/vhost-vdpa-0' />
<target dev='vdg' bus='virtio'/>
</disk>
+ <disk type='file' device='disk'>
+ <driver name='qemu' type='qcow2' />
+ <source file='/var/lib/libvirt/images/disk.qcow2'/>
+ <target dev='vdh' bus='virtio'/>
+ <throttlefilters>
+ <throttlefilter group='limit2'/>
+ <throttlefilter group='limit012'/>
+ </throttlefilters>
+ </disk>
</devices>
...
@@ -3217,6 +3226,19 @@ paravirtualized driver is specified via the ``disk`` element.
:since:`since after 0.4.4`; "sata" attribute value :since:`since 0.9.7`;
"removable" attribute value :since:`since 1.1.3`;
"rotation_rate" attribute value :since:`since 7.3.0`
+``throttlefilters``
+ The optional ``throttlefilters`` element provides the ability to provide additional
+ per-device throttle chain :since:`Since 10.5.0`
+ For example, if we have four different disks and we want to limit I/O for each one
+ and we also want to limit combined I/O of all four disks, we can leverage
+ ``throttlefilters`` to achieve this goal by setting two ``throttlefilter`` for
+ each disk: disk's own filter(e.g. limit2) and combined filter(e.g. limit012).
+ The nodes in qemu shape a chain like libvirt-4-filter(node name of "limit012") ->
+ libvirt-3-filter(node name of "limit2") -> libvirt-2-format -> libvirt-1-storage.
+ ``throttlefilters`` and ``iotune`` should be used exclusively.
+
+ ``throttlefilter``
+ The optional ``throttlefilter`` element is to reference defined throttle group.
``iotune``
The optional ``iotune`` element provides the ability to provide additional
per-device I/O tuning, with values that can vary for each device (contrast
diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng
index 08c520e222..7ceb8c0be2 100644
--- a/src/conf/schemas/domaincommon.rng
+++ b/src/conf/schemas/domaincommon.rng
@@ -1578,7 +1578,10 @@
<ref name="encryption"/>
</optional>
<optional>
- <ref name="diskIoTune"/>
+ <choice>
+ <ref name="throttlefilters"/>
+ <ref name="diskIoTune"/>
+ </choice>
</optional>
<optional>
<ref name="alias"/>
@@ -6671,6 +6674,20 @@
</element>
</optional>
</define>
+ <!--
+ A set of throttlefilters to reference throttlegroups
+ -->
+ <define name="throttlefilters">
+ <element name="throttlefilters">
+ <zeroOrMore>
+ <element name="throttlefilter">
+ <attribute name="group">
+ <data type="string"/>
+ </attribute>
+ </element>
+ </zeroOrMore>
+ </element>
+ </define>
<!--
A set of optional features: PAE, APIC, ACPI, GIC, TCG,
HyperV Enlightenment, KVM features, paravirtual spinlocks and HAP support
--
2.34.1
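
The `<choice>` added to domaincommon.rng above means a `<disk>` may carry either `<throttlefilters>` or `<iotune>`, but not both. As a rough illustration only, the same rule can be checked outside libvirt with Python's standard XML parser. The helper name `referenced_groups` is hypothetical and not part of libvirt or of this patch; the sample disk is the one added to docs/formatdomain.rst.

```python
import xml.etree.ElementTree as ET

# Disk definition taken from the formatdomain.rst example in this patch.
DISK = """
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/disk.qcow2'/>
  <target dev='vdh' bus='virtio'/>
  <throttlefilters>
    <throttlefilter group='limit2'/>
    <throttlefilter group='limit012'/>
  </throttlefilters>
</disk>
"""


def referenced_groups(disk_xml):
    """Return the throttle group names a <disk> references, in document order.

    Hypothetical helper: raises ValueError when the disk has both
    <throttlefilters> and <iotune>, mirroring the schema's <choice>.
    """
    disk = ET.fromstring(disk_xml)
    filters = disk.find("throttlefilters")
    if filters is not None and disk.find("iotune") is not None:
        raise ValueError("<throttlefilters> and <iotune> are mutually exclusive")
    if filters is None:
        return []
    groups = []
    for node in filters.findall("throttlefilter"):
        group = node.get("group")
        if group is None:
            # The schema makes the 'group' attribute mandatory.
            raise ValueError("<throttlefilter> is missing the 'group' attribute")
        groups.append(group)
    return groups


if __name__ == "__main__":
    print(referenced_groups(DISK))
```

This only mirrors the schema constraint; whether each referenced group is actually defined in the domain would need a separate check.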
On Wed, Jun 12, 2024 at 03:02:10 -0700, wucf@linux.ibm.com wrote:

[...]

> +``throttlefilters``
> + The optional ``throttlefilters`` element provides the ability to provide additional
> + per-device throttle chain :since:`Since 10.5.0`
> + For example, if we have four different disks and we want to limit I/O for each one
> + and we also want to limit combined I/O of all four disks, we can leverage
> + ``throttlefilters`` to achieve this goal by setting two ``throttlefilter`` for
> + each disk: disk's own filter(e.g. limit2) and combined filter(e.g. limit012).
> + The nodes in qemu shape a chain like libvirt-4-filter(node name of "limit012") ->
> + libvirt-3-filter(node name of "limit2") -> libvirt-2-format -> libvirt-1-storage.
> + ``throttlefilters`` and ``iotune`` should be used exclusively.

Node names are a qemu driver internal implementation detail and thus must not be noted in documentation.

[...]
On Tue, Jul 02, 2024 at 16:11:03 +0200, Peter Krempa wrote:
> On Wed, Jun 12, 2024 at 03:02:10 -0700, wucf@linux.ibm.com wrote:

[...]

> > +``throttlefilters``
> > + The optional ``throttlefilters`` element provides the ability to provide additional
> > + per-device throttle chain :since:`Since 10.5.0`
> > + For example, if we have four different disks and we want to limit I/O for each one
> > + and we also want to limit combined I/O of all four disks, we can leverage
> > + ``throttlefilters`` to achieve this goal by setting two ``throttlefilter`` for
> > + each disk: disk's own filter(e.g. limit2) and combined filter(e.g. limit012).
>
> > + The nodes in qemu shape a chain like libvirt-4-filter(node name of "limit012") ->
> > + libvirt-3-filter(node name of "limit2") -> libvirt-2-format -> libvirt-1-storage.
> > + ``throttlefilters`` and ``iotune`` should be used exclusively.
>
> Node names are a qemu driver internal implementation detail and thus
> must not be noted in documentation.

I'm not exactly sure how the internals in qemu work here, but you also might want to document how the order of the filters impacts things (or that it does not impact things).
The order of such ``throttlefilter`` doesn't matter within ``throttlefilters``.

I will put the above statement into the doc.
On Tue, Aug 06, 2024 at 00:27:58 -0000, Chun Feng Wu wrote:

Please keep the context in the reply. I had to check back what I've asked.

> The order of such ``throttlefilter`` doesn't matter within ``throttlefilters``.

So IIUC, re-ordering of the filters doesn't have any guest-OS visible impact? I'm trying to understand whether one disk can exhaust one layer while be blocked on the next, in which case a different disk which has only one layer (equivalent to the first disk's first layer) would be starved, but if the filters were ordered the other way around at the first disk it would not.

If the above can happen you'll need to document how it's supposed to behave.
My original conclusion is based on the following test XML:

  <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
    ...
    <throttlegroups>
      <throttlegroup>
        <total_iops_sec>200</total_iops_sec>
        <total_iops_sec_max>200</total_iops_sec_max>
        <group_name>limit0</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
      <throttlegroup>
        <total_iops_sec>250</total_iops_sec>
        <total_iops_sec_max>250</total_iops_sec_max>
        <group_name>limit1</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
      <throttlegroup>
        <total_iops_sec>300</total_iops_sec>
        <total_iops_sec_max>300</total_iops_sec_max>
        <group_name>limit2</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
      <throttlegroup>
        <total_iops_sec>400</total_iops_sec>
        <total_iops_sec_max>400</total_iops_sec_max>
        <group_name>limit012</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
    </throttlegroups>
    ...
    <devices>
      <!-- Disk for the operating system -->
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/virt/images/jammy-server-cloudimg-amd64.img'/>
        <target dev='vda' bus='virtio'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/virt/disks/vm1_disk_1.qcow2'/>
        <target dev='vdb' bus='virtio'/>
        <throttlefilters>
          <throttlefilter group='limit0'/>
          <throttlefilter group='limit012'/>
        </throttlefilters>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/virt/disks/vm1_disk_2.qcow2'/>
        <target dev='vdc' bus='virtio'/>
        <throttlefilters>
          <throttlefilter group='limit1'/>
          <throttlefilter group='limit012'/>
        </throttlefilters>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/virt/disks/vm1_disk_3.qcow2'/>
        <target dev='vdd' bus='virtio'/>
        <throttlefilters>
          <throttlefilter group='limit2'/>
          <throttlefilter group='limit012'/>
        </throttlefilters>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
      </disk>
      ...
    </devices>
  </domain>

If I re-order the filters in vdc as below, fio tests (randwrite) show the same result for both the concurrent test (400 iops in total, around 133 (400/3) for each disk) and the individual disk tests (200 for vdb, 250 for vdc, 300 for vdd).

  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/virt/disks/vm1_disk_2.qcow2'/>
    <target dev='vdc' bus='virtio'/>
    <throttlefilters>
      <throttlefilter group='limit012'/>
      <throttlefilter group='limit1'/>
    </throttlefilters>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
  </disk>

Now back to your case (vdb and vdc in the following XML):

  <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
    ...
    <throttlegroups>
      <throttlegroup>
        <total_iops_sec>200</total_iops_sec>
        <total_iops_sec_max>200</total_iops_sec_max>
        <group_name>limit0</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
      <throttlegroup>
        <total_iops_sec>250</total_iops_sec>
        <total_iops_sec_max>250</total_iops_sec_max>
        <group_name>limit1</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
      <throttlegroup>
        <total_iops_sec>300</total_iops_sec>
        <total_iops_sec_max>300</total_iops_sec_max>
        <group_name>limit2</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
      <throttlegroup>
        <total_iops_sec>400</total_iops_sec>
        <total_iops_sec_max>400</total_iops_sec_max>
        <group_name>limit012</group_name>
        <total_iops_sec_max_length>1</total_iops_sec_max_length>
      </throttlegroup>
    </throttlegroups>
    ...
    <devices>
      <!-- Disk for the operating system -->
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/virt/images/jammy-server-cloudimg-amd64.img'/>
        <target dev='vda' bus='virtio'/>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/virt/disks/vm1_disk_1.qcow2'/>
        <target dev='vdb' bus='virtio'/>
        <throttlefilters>
          <throttlefilter group='limit012'/>
          <throttlefilter group='limit0'/>
        </throttlefilters>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
      </disk>
      <disk type='file' device='disk'>
        <driver name='qemu' type='qcow2'/>
        <source file='/virt/disks/vm1_disk_2.qcow2'/>
        <target dev='vdc' bus='virtio'/>
        <throttlefilters>
          <throttlefilter group='limit012'/>
        </throttlefilters>
        <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
      </disk>
      ...
    </devices>
  </domain>

With the above XML, fio tests (randwrite) show:
- concurrent: 400 iops in total, around 200 (400/2) for each disk
- individual disk test: 200 for vdb, 400 for vdc

After I re-order the filters of the vdb disk as below, the tests have the same result:
- concurrent: 400 iops in total, around 200 (400/2) for each disk
- individual disk test: 200 for vdb, 400 for vdc

  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2'/>
    <source file='/virt/disks/vm1_disk_1.qcow2'/>
    <target dev='vdb' bus='virtio'/>
    <throttlefilters>
      <throttlefilter group='limit0'/>
      <throttlefilter group='limit012'/>
    </throttlefilters>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
  </disk>

Let me know if I understand your case correctly, thanks!

On 2024/8/6 15:36, Peter Krempa wrote:
> On Tue, Aug 06, 2024 at 00:27:58 -0000, Chun Feng Wu wrote:
>
> Please keep the context in the reply. I had to check back what I've
> asked.
>
>> The order of such ``throttlefilter`` doesn't matter within ``throttlefilters``.
> So IIUC, re-ordering of the filters doesn't have any guest-OS visible
> impact? I'm trying to understand whether one disk can exhaust one layer
> while be blocked on the next, in which case a different disk which has
> only one layer (equivalent to the first disk's first layer) would be
> starved, but if the filters were ordered the other way around at the
> first disk it would not.
>
> If the above can happen you'll need to document how it's supposed to
> behave.
>

--
Thanks and Regards,

Wu
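
The fio numbers reported above are consistent with a simple model, sketched below. The model is an assumption made for illustration, not a description of QEMU's scheduler: a disk tested alone is bound by the tightest group in its chain, and under concurrent load the shared group's budget is split evenly, with each disk additionally capped by its own limit. The function names `solo_iops` and `concurrent_iops` are hypothetical; the limits and disk setups are copied from the test XML.

```python
# IOPS limits copied from the <throttlegroups> in the test XML above.
LIMITS = {"limit0": 200, "limit1": 250, "limit2": 300, "limit012": 400}

# First test setup: three disks, each with its own group plus the shared one.
FIRST_SETUP = {
    "vdb": ["limit0", "limit012"],
    "vdc": ["limit1", "limit012"],
    "vdd": ["limit2", "limit012"],
}

# Second test setup: vdb has two filters, vdc only the shared one.
SECOND_SETUP = {
    "vdb": ["limit012", "limit0"],
    "vdc": ["limit012"],
}


def solo_iops(groups):
    """Ceiling for a disk tested alone: the tightest group in its chain.

    min() ignores ordering, so in this model the filter order is irrelevant.
    """
    return min(LIMITS[g] for g in groups)


def concurrent_iops(disks, shared="limit012"):
    """Split the shared budget evenly, capping each disk at its solo ceiling.

    Assumes (for illustration) that every disk references `shared` and that
    budget left unused by a capped disk is redistributed to the others.
    """
    caps = {name: solo_iops(groups) for name, groups in disks.items()}
    alloc = {name: 0.0 for name in disks}
    active = set(disks)
    remaining = float(LIMITS[shared])
    while active:
        share = remaining / len(active)
        capped = {d for d in active if caps[d] <= share}
        if not capped:
            # Nobody hits a private cap: everyone gets an equal share.
            for d in active:
                alloc[d] = share
            break
        for d in capped:
            alloc[d] = float(caps[d])
            remaining -= caps[d]
        active -= capped
    return alloc


if __name__ == "__main__":
    for setup in (FIRST_SETUP, SECOND_SETUP):
        print({d: solo_iops(g) for d, g in setup.items()})
        print(concurrent_iops(setup))
```

Under these assumptions the model gives the same figures as the measurements for both setups, in either filter order. It does not address the starvation scenario raised in the review, which depends on how QEMU actually schedules requests across stacked filters.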