[libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread

Pavel Hrdina posted 12 patches 8 years, 11 months ago
[libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Pavel Hrdina 8 years, 11 months ago
QEMU 2.9.0 will introduce a polling feature for AioContext that,
instead of using blocking syscalls, polls for events without
blocking.  This means that polling can in most cases be faster, but
it also increases CPU utilization.

To address this issue QEMU implements a self-tuning algorithm that
adjusts the current polling time to adapt to different workloads,
and it can also fall back to blocking syscalls.

For each IOThread this is all controlled by three parameters:
poll-max-ns, poll-grow and poll-shrink.  If poll-max-ns is set to 0,
polling is disabled; if it is omitted, the default behavior is used;
any value greater than 0 enables polling.  Parameters poll-grow and
poll-shrink configure how the self-tuning algorithm will adapt the
current polling time.  If they are omitted or set to 0, default
values are used.
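
For illustration, an iothread configured like this (using the XML form
introduced by this patch):

  <iothreadids>
    <iothread id='1'>
      <polling enabled='yes' max_ns='32768' grow='2' shrink='4'/>
    </iothread>
  </iothreadids>

would be expected to map to a QEMU command line along these lines (a
sketch; the exact mapping is up to the QEMU driver patches later in
this series, and the values are only examples):

  -object iothread,id=iothread1,poll-max-ns=32768,poll-grow=2,poll-shrink=4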

Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
---
 docs/formatdomain.html.in                          |  22 +++-
 docs/schemas/domaincommon.rng                      |  24 +++
 src/conf/domain_conf.c                             | 169 ++++++++++++++++++++-
 src/conf/domain_conf.h                             |   8 +
 src/libvirt_private.syms                           |   1 +
 .../generic-iothreads-no-polling.xml               |  22 +++
 .../generic-iothreads-polling-disabled.xml         |  24 +++
 .../generic-iothreads-polling-enabled-fail.xml     |  24 +++
 .../generic-iothreads-polling-enabled.xml          |  24 +++
 tests/genericxml2xmltest.c                         |   6 +
 tests/testutils.c                                  |   3 +-
 11 files changed, 320 insertions(+), 7 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/generic-iothreads-no-polling.xml
 create mode 100644 tests/genericxml2xmlindata/generic-iothreads-polling-disabled.xml
 create mode 100644 tests/genericxml2xmlindata/generic-iothreads-polling-enabled-fail.xml
 create mode 100644 tests/genericxml2xmlindata/generic-iothreads-polling-enabled.xml

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index b69bd4c44c..f01a11cf93 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -611,6 +611,8 @@
  ...
  &lt;iothreadids&gt;
-    &lt;iothread id="2"/&gt;
+    &lt;iothread id="2"&gt;
+      &lt;polling enabled="yes" max_ns="4000" grow="2" shrink="4"/&gt;
+    &lt;/iothread&gt;
     &lt;iothread id="4"/&gt;
     &lt;iothread id="6"/&gt;
     &lt;iothread id="8"/&gt;
@@ -645,7 +647,23 @@
         defined for the domain, then the <code>iothreads</code> value
         will be adjusted accordingly.
         <span class="since">Since 1.2.15</span>
-       </dd>
+      </dd>
+      <dt><code>polling</code></dt>
+      <dd>
+        The optional <code>polling</code> element provides the ability to
+        enable, disable and configure the polling mechanism for IOThreads.
+        The attribute <code>max_ns</code> specifies the maximum polling
+        time in nanoseconds and is mandatory if polling is explicitly
+        enabled.  The attributes <code>grow</code> and <code>shrink</code>
+        configure how the self-tuning algorithm adapts the current
+        polling time to the workload.  Setting any of <code>max_ns</code>,
+        <code>grow</code> and <code>shrink</code> to <code>0</code> is the
+        same as not providing it at all.  If this element is omitted, the
+        default behavior and values are chosen by the hypervisor.
+        Possible values for <code>enabled</code> are <code>yes</code> and
+        <code>no</code>.  Available only for the QEMU driver.
+        <span class="since">Since 3.1.0</span>
+      </dd>
     </dl>
 
     <h3><a name="elementsCPUTuning">CPU Tuning</a></h3>
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index c5f101325e..11aecbaa6e 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -661,6 +661,30 @@
               <attribute name="id">
                 <ref name="unsignedInt"/>
               </attribute>
+              <optional>
+                <element name="polling">
+                  <optional>
+                    <attribute name="enabled">
+                      <ref name="virYesNo"/>
+                    </attribute>
+                  </optional>
+                  <optional>
+                    <attribute name="max_ns">
+                      <ref name="unsignedInt"/>
+                    </attribute>
+                  </optional>
+                  <optional>
+                    <attribute name="grow">
+                      <ref name="unsignedInt"/>
+                    </attribute>
+                  </optional>
+                  <optional>
+                    <attribute name="shrink">
+                      <ref name="unsignedInt"/>
+                    </attribute>
+                  </optional>
+                </element>
+              </optional>
             </element>
           </zeroOrMore>
         </element>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 79bdbdf50c..4b552a9175 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -1228,6 +1228,30 @@ virDomainDeviceDefCheckUnsupportedMemoryDevice(virDomainDeviceDefPtr dev)
 }
 
 
+/**
+ * virDomainDefCheckUnsupportedIOThreadPolling:
+ * @def: domain definition
+ *
+ * Reports an error and returns -1 if the domain definition configures
+ * IOThread polling; otherwise returns 0.
+ */
+static int
+virDomainDefCheckUnsupportedIOThreadPolling(virDomainDefPtr def)
+{
+    size_t i;
+
+    for (i = 0; i < def->niothreadids; i++) {
+        if (def->iothreadids[i]->poll_enabled != VIR_TRISTATE_BOOL_ABSENT) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                           _("IOThread polling is not supported by this driver"));
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+
 bool virDomainObjTaint(virDomainObjPtr obj,
                        virDomainTaintFlags taint)
 {
@@ -4467,6 +4491,10 @@ virDomainDefPostParseCheckFeatures(virDomainDefPtr def,
         return -1;
     }
 
+    if (UNSUPPORTED(VIR_DOMAIN_DEF_FEATURE_IOTHREAD_POLLING) &&
+        virDomainDefCheckUnsupportedIOThreadPolling(def) < 0)
+        return -1;
+
     return 0;
 }
 
@@ -4585,6 +4613,60 @@ virDomainVcpuDefPostParse(virDomainDefPtr def)
 }
 
 
+int
+virDomainIOThreadDefPostParse(virDomainIOThreadIDDefPtr iothread)
+{
+    if ((iothread->poll_grow > 0 || iothread->poll_shrink > 0) &&
+        iothread->poll_max_ns == 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                       _("polling grow or shrink is set for iothread id "
+                         "'%u' but max_ns is not set"),
+                       iothread->iothread_id);
+        return -1;
+    }
+
+    switch (iothread->poll_enabled) {
+    case VIR_TRISTATE_BOOL_ABSENT:
+        if (iothread->poll_max_ns > 0)
+            iothread->poll_enabled = VIR_TRISTATE_BOOL_YES;
+        break;
+
+    case VIR_TRISTATE_BOOL_NO:
+        iothread->poll_max_ns = 0;
+        iothread->poll_grow = 0;
+        iothread->poll_shrink = 0;
+        break;
+
+    case VIR_TRISTATE_BOOL_YES:
+        if (iothread->poll_max_ns == 0) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                           _("polling is enabled for iothread id '%u' but "
+                             "max_ns is not set"), iothread->iothread_id);
+            return -1;
+        }
+        break;
+
+    case VIR_TRISTATE_BOOL_LAST:
+        break;
+    }
+
+    return 0;
+}
+
+
+static int
+virDomainDefPostParseIOThreads(virDomainDefPtr def)
+{
+    size_t i;
+
+    for (i = 0; i < def->niothreadids; i++)
+        if (virDomainIOThreadDefPostParse(def->iothreadids[i]) < 0)
+            return -1;
+
+    return 0;
+}
+
+
 static int
 virDomainDefPostParseInternal(virDomainDefPtr def,
                               struct virDomainDefPostParseDeviceIteratorData *data)
@@ -4599,6 +4681,9 @@ virDomainDefPostParseInternal(virDomainDefPtr def,
     if (virDomainVcpuDefPostParse(def) < 0)
         return -1;
 
+    if (virDomainDefPostParseIOThreads(def) < 0)
+        return -1;
+
     if (virDomainDefPostParseMemory(def, data->parseFlags) < 0)
         return -1;
 
@@ -15574,7 +15659,9 @@ virDomainIdmapDefParseXML(xmlXPathContextPtr ctxt,
  *
  *     <iothreads>4</iothreads>
  *     <iothreadids>
- *       <iothread id='1'/>
+ *       <iothread id='1'>
+ *         <polling enabled='yes' max_ns='4000' grow='2' shrink='4'/>
+ *       </iothread>
  *       <iothread id='3'/>
  *       <iothread id='5'/>
  *       <iothread id='7'/>
@@ -15587,6 +15674,7 @@ virDomainIOThreadIDDefParseXML(xmlNodePtr node,
     virDomainIOThreadIDDefPtr iothrid;
     xmlNodePtr oldnode = ctxt->node;
     char *tmp = NULL;
+    int npoll = 0;
 
     if (VIR_ALLOC(iothrid) < 0)
         return NULL;
@@ -15604,6 +15692,58 @@ virDomainIOThreadIDDefParseXML(xmlNodePtr node,
                        _("invalid iothread 'id' value '%s'"), tmp);
         goto error;
     }
+    VIR_FREE(tmp);
+
+    if ((npoll = virXPathNodeSet("./polling", ctxt, NULL)) < 0)
+        goto error;
+
+    if (npoll > 1) {
+        virReportError(VIR_ERR_XML_ERROR, "%s",
+                       _("only one polling element is allowed for each "
+                         "<iothread> element"));
+        goto error;
+    }
+
+    if (npoll > 0) {
+        if ((tmp = virXPathString("string(./polling/@enabled)", ctxt))) {
+            int enabled = virTristateBoolTypeFromString(tmp);
+            if (enabled < 0) {
+                virReportError(VIR_ERR_XML_ERROR,
+                               _("invalid polling 'enabled' value '%s'"), tmp);
+                goto error;
+            }
+            iothrid->poll_enabled = enabled;
+        } else {
+            virReportError(VIR_ERR_XML_ERROR, "%s",
+                           _("missing 'enabled' attribute in <polling> element"));
+            goto error;
+        }
+        VIR_FREE(tmp);
+
+        if ((tmp = virXPathString("string(./polling/@max_ns)", ctxt)) &&
+            virStrToLong_uip(tmp, NULL, 10, &iothrid->poll_max_ns) < 0) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("invalid polling 'max_ns' value '%s'"), tmp);
+            goto error;
+        }
+        VIR_FREE(tmp);
+
+        if ((tmp = virXPathString("string(./polling/@grow)", ctxt)) &&
+            virStrToLong_uip(tmp, NULL, 10, &iothrid->poll_grow) < 0) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("invalid polling 'grow' value '%s'"), tmp);
+            goto error;
+        }
+        VIR_FREE(tmp);
+
+        if ((tmp = virXPathString("string(./polling/@shrink)", ctxt)) &&
+            virStrToLong_uip(tmp, NULL, 10, &iothrid->poll_shrink) < 0) {
+            virReportError(VIR_ERR_XML_ERROR,
+                           _("invalid polling 'shrink' value '%s'"), tmp);
+            goto error;
+        }
+        VIR_FREE(tmp);
+    }
 
  cleanup:
     VIR_FREE(tmp);
@@ -23713,7 +23853,8 @@ virDomainDefIothreadShouldFormat(virDomainDefPtr def)
     size_t i;
 
     for (i = 0; i < def->niothreadids; i++) {
-        if (!def->iothreadids[i]->autofill)
+        if (!def->iothreadids[i]->autofill ||
+            def->iothreadids[i]->poll_enabled != VIR_TRISTATE_BOOL_ABSENT)
             return true;
     }
 
@@ -23918,8 +24059,28 @@ virDomainDefFormatInternal(virDomainDefPtr def,
             virBufferAddLit(buf, "<iothreadids>\n");
             virBufferAdjustIndent(buf, 2);
             for (i = 0; i < def->niothreadids; i++) {
-                virBufferAsprintf(buf, "<iothread id='%u'/>\n",
-                                  def->iothreadids[i]->iothread_id);
+                virDomainIOThreadIDDefPtr iothread = def->iothreadids[i];
+                virBufferAsprintf(buf, "<iothread id='%u'", iothread->iothread_id);
+                if (iothread->poll_enabled != VIR_TRISTATE_BOOL_ABSENT) {
+                    virBufferAddLit(buf, ">\n");
+                    virBufferAdjustIndent(buf, 2);
+                    virBufferAsprintf(buf, "<polling enabled='%s'",
+                                      virTristateBoolTypeToString(iothread->poll_enabled));
+                    if (iothread->poll_max_ns)
+                        virBufferAsprintf(buf, " max_ns='%u'",
+                                          iothread->poll_max_ns);
+                    if (iothread->poll_grow)
+                        virBufferAsprintf(buf, " grow='%u'",
+                                          iothread->poll_grow);
+                    if (iothread->poll_shrink)
+                        virBufferAsprintf(buf, " shrink='%u'",
+                                          iothread->poll_shrink);
+                    virBufferAddLit(buf, "/>\n");
+                    virBufferAdjustIndent(buf, -2);
+                    virBufferAddLit(buf, "</iothread>\n");
+                } else {
+                    virBufferAddLit(buf, "/>\n");
+                }
             }
             virBufferAdjustIndent(buf, -2);
             virBufferAddLit(buf, "</iothreadids>\n");
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 1e53cc3280..8ac1d8a409 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2074,11 +2074,18 @@ struct _virDomainIOThreadIDDef {
     int thread_id;
     virBitmapPtr cpumask;
 
+    virTristateBool poll_enabled;
+    unsigned int poll_max_ns;
+    unsigned int poll_grow;
+    unsigned int poll_shrink;
+
     virDomainThreadSchedParam sched;
 };
 
 void virDomainIOThreadIDDefFree(virDomainIOThreadIDDefPtr def);
 
+int virDomainIOThreadDefPostParse(virDomainIOThreadIDDefPtr iothread);
+
 
 typedef struct _virDomainCputune virDomainCputune;
 typedef virDomainCputune *virDomainCputunePtr;
@@ -2410,6 +2417,7 @@ typedef enum {
     VIR_DOMAIN_DEF_FEATURE_OFFLINE_VCPUPIN = (1 << 2),
     VIR_DOMAIN_DEF_FEATURE_NAME_SLASH = (1 << 3),
     VIR_DOMAIN_DEF_FEATURE_INDIVIDUAL_VCPUS = (1 << 4),
+    VIR_DOMAIN_DEF_FEATURE_IOTHREAD_POLLING = (1 << 5),
 } virDomainDefFeatures;
 
 
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index e6ccd697d2..97aee9c0e3 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -371,6 +371,7 @@ virDomainHypervTypeToString;
 virDomainInputDefFree;
 virDomainIOMMUModelTypeFromString;
 virDomainIOMMUModelTypeToString;
+virDomainIOThreadDefPostParse;
 virDomainIOThreadIDAdd;
 virDomainIOThreadIDDefFree;
 virDomainIOThreadIDDel;
diff --git a/tests/genericxml2xmlindata/generic-iothreads-no-polling.xml b/tests/genericxml2xmlindata/generic-iothreads-no-polling.xml
new file mode 100644
index 0000000000..b7e5a11c06
--- /dev/null
+++ b/tests/genericxml2xmlindata/generic-iothreads-no-polling.xml
@@ -0,0 +1,22 @@
+<domain type='qemu'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <iothreads>2</iothreads>
+  <iothreadids>
+    <iothread id='1'/>
+    <iothread id='2'/>
+  </iothreadids>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmlindata/generic-iothreads-polling-disabled.xml b/tests/genericxml2xmlindata/generic-iothreads-polling-disabled.xml
new file mode 100644
index 0000000000..0352a80900
--- /dev/null
+++ b/tests/genericxml2xmlindata/generic-iothreads-polling-disabled.xml
@@ -0,0 +1,24 @@
+<domain type='qemu'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <iothreads>2</iothreads>
+  <iothreadids>
+    <iothread id='1'>
+      <polling enabled='no'/>
+    </iothread>
+    <iothread id='2'/>
+  </iothreadids>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmlindata/generic-iothreads-polling-enabled-fail.xml b/tests/genericxml2xmlindata/generic-iothreads-polling-enabled-fail.xml
new file mode 100644
index 0000000000..6f77922e7a
--- /dev/null
+++ b/tests/genericxml2xmlindata/generic-iothreads-polling-enabled-fail.xml
@@ -0,0 +1,24 @@
+<domain type='qemu'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <iothreads>2</iothreads>
+  <iothreadids>
+    <iothread id='1'>
+      <polling enabled='yes'/>
+    </iothread>
+    <iothread id='2'/>
+  </iothreadids>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmlindata/generic-iothreads-polling-enabled.xml b/tests/genericxml2xmlindata/generic-iothreads-polling-enabled.xml
new file mode 100644
index 0000000000..ef9161f367
--- /dev/null
+++ b/tests/genericxml2xmlindata/generic-iothreads-polling-enabled.xml
@@ -0,0 +1,24 @@
+<domain type='qemu'>
+  <name>foo</name>
+  <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+  <memory unit='KiB'>219136</memory>
+  <currentMemory unit='KiB'>219136</currentMemory>
+  <vcpu placement='static'>1</vcpu>
+  <iothreads>2</iothreads>
+  <iothreadids>
+    <iothread id='1'>
+      <polling enabled='yes' max_ns='4000' grow='50'/>
+    </iothread>
+    <iothread id='2'/>
+  </iothreadids>
+  <os>
+    <type arch='x86_64' machine='pc'>hvm</type>
+    <boot dev='hd'/>
+  </os>
+  <clock offset='utc'/>
+  <on_poweroff>destroy</on_poweroff>
+  <on_reboot>restart</on_reboot>
+  <on_crash>destroy</on_crash>
+  <devices>
+  </devices>
+</domain>
diff --git a/tests/genericxml2xmltest.c b/tests/genericxml2xmltest.c
index 488190270f..c3a9586eab 100644
--- a/tests/genericxml2xmltest.c
+++ b/tests/genericxml2xmltest.c
@@ -100,6 +100,12 @@ mymain(void)
 
     DO_TEST("vcpus-individual");
 
+    DO_TEST("iothreads-no-polling");
+    DO_TEST("iothreads-polling-enabled");
+    DO_TEST("iothreads-polling-disabled");
+    DO_TEST_FULL("iothreads-polling-enabled-fail", 0, false,
+                 TEST_COMPARE_DOM_XML2XML_RESULT_FAIL_PARSE);
+
     virObjectUnref(caps);
     virObjectUnref(xmlopt);
 
diff --git a/tests/testutils.c b/tests/testutils.c
index a596a83a96..93a12bcc76 100644
--- a/tests/testutils.c
+++ b/tests/testutils.c
@@ -1101,7 +1101,8 @@ virCapsPtr virTestGenericCapsInit(void)
 }
 
 static virDomainDefParserConfig virTestGenericDomainDefParserConfig = {
-    .features = VIR_DOMAIN_DEF_FEATURE_INDIVIDUAL_VCPUS,
+    .features = VIR_DOMAIN_DEF_FEATURE_INDIVIDUAL_VCPUS |
+                VIR_DOMAIN_DEF_FEATURE_IOTHREAD_POLLING,
 };
 static virDomainXMLPrivateDataCallbacks virTestGenericPrivateDataCallbacks;
 
-- 
2.11.1

Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Daniel P. Berrange 8 years, 11 months ago
On Tue, Feb 21, 2017 at 01:14:57PM +0100, Pavel Hrdina wrote:
> QEMU 2.9.0 will introduce a polling feature for AioContext that,
> instead of using blocking syscalls, polls for events without
> blocking.  This means that polling can in most cases be faster, but
> it also increases CPU utilization.
> 
> To address this issue QEMU implements a self-tuning algorithm that
> adjusts the current polling time to adapt to different workloads,
> and it can also fall back to blocking syscalls.
> 
> For each IOThread this is all controlled by three parameters:
> poll-max-ns, poll-grow and poll-shrink.  If poll-max-ns is set to 0,
> polling is disabled; if it is omitted, the default behavior is used;
> any value greater than 0 enables polling.  Parameters poll-grow and
> poll-shrink configure how the self-tuning algorithm will adapt the
> current polling time.  If they are omitted or set to 0, default
> values are used.

With my app developer hat on I have to wonder how an app is supposed
to figure out what to set these parameters to?  It has been difficult
enough figuring out existing QEMU block tunables, but at least most
of those can be set dependent on the type of storage used on the host
side.  Tunables whose use depends on the guest workload are harder to
use since it largely involves predicting the unknown.  IOW, is there
a compelling reason to add these low-level parameters that are tightly
coupled to the specific algorithm that QEMU happens to use today?

The QEMU commits say the tunables all default to sane parameters so
I'm inclined to say we ignore them at the libvirt level entirely.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Pavel Hrdina 8 years, 11 months ago
On Tue, Feb 21, 2017 at 12:26:02PM +0000, Daniel P. Berrange wrote:
> With my app developer hat on I have to wonder how an app is supposed
> to figure out what to set these parameters to?  It has been difficult
> enough figuring out existing QEMU block tunables, but at least most
> of those can be set dependent on the type of storage used on the host
> side.  Tunables whose use depends on the guest workload are harder to
> use since it largely involves predicting the unknown.  IOW, is there
> a compelling reason to add these low-level parameters that are tightly
> coupled to the specific algorithm that QEMU happens to use today?

I agree that it's probably way too complicated for management applications,
but there is a small issue with QEMU.  Currently, if you don't specify
anything, polling is enabled with some reasonable default value, and based
on experience with QEMU I'm not planning to count on them not changing
the default behavior in the future.  To explicitly enable polling you need
to set poll-max-ns to some value greater than 0.  We would have to check
the QEMU source code and define the default value in our own code in order
to let users explicitly enable polling.
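
For example, explicitly enabling polling today means picking a concrete
value yourself, something like:

  -object iothread,id=iothread0,poll-max-ns=32768

(the 32768 here is only an illustrative value).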

> The QEMU commits say the tunables all default to sane parameters so
> I'm inclined to say we ignore them at the libvirt level entirely.

Yes, it would be way better to have only <polling enabled='yes|no'/> and
let QEMU deal with the sane values for all parameters, but that would mean
coming up with the sane values ourselves or modifying QEMU to add another
property that would simply control whether polling is enabled or not.

Pavel

Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Daniel P. Berrange 8 years, 11 months ago
On Tue, Feb 21, 2017 at 01:48:15PM +0100, Pavel Hrdina wrote:
> I agree that it's probably way too complicated for management applications,
> but there is a small issue with QEMU.  Currently, if you don't specify
> anything, polling is enabled with some reasonable default value, and based
> on experience with QEMU I'm not planning to count on them not changing
> the default behavior in the future.  To explicitly enable polling you need
> to set poll-max-ns to some value greater than 0.  We would have to check
> the QEMU source code and define the default value in our own code in order
> to let users explicitly enable polling.

The QEMU commit says polling is now enabled by default without needing
to set poll-max-ns AFAICT

  commit cdd7abfdba9287a289c404dfdcb02316f9ffee7d
  Author: Stefan Hajnoczi <stefanha@redhat.com>
  Date:   Thu Jan 26 17:01:19 2017 +0000

    iothread: enable AioContext polling by default
    
    IOThread AioContexts are likely to consist only of event sources like
    virtqueue ioeventfds and LinuxAIO completion eventfds that are pollable
    from userspace (without system calls).
    
    We recently merged the AioContext polling feature but didn't enable it
    by default yet.  I have gone back over the performance data on the
    mailing list and picked a default polling value that gave good results.
    
    Let's enable AioContext polling by default so users don't have another
    switch they need to set manually.  If performance regressions are found
    we can still disable this for the QEMU 2.9 release.
    
> > The QEMU commits say the tunables all default to sane parameters so
> > I'm inclined to say we ignore them at the libvirt level entirely.
> 
> Yes, it would be way better to have only <polling enabled='yes|no'/> and
> let QEMU deal with the sane values for all parameters, but that would mean
> coming up with the sane values ourselves or modifying QEMU to add another
> property that would simply control whether polling is enabled or not.

I'm saying don't even add that.

Do exactly nothing and just rely on the QEMU defaults here. This is not
affecting guest ABI at all so it doesn't matter if QEMU changes its
defaults later. In fact if QEMU changes defaults based on newer performance
measurements, it is a good thing if libvirt hasn't hardcoded all its VM
configs to the old default.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Pavel Hrdina 8 years, 11 months ago
On Tue, Feb 21, 2017 at 12:55:51PM +0000, Daniel P. Berrange wrote:
> I'm saying don't even add that.
> 
> Do exactly nothing and just rely on the QEMU defaults here. This is not
> affecting guest ABI at all so it doesn't matter if QEMU changes its
> defaults later. In fact if QEMU changes defaults based on newer performance
> measurements, it is a good thing if libvirt hasn't hardcoded all its VM
> configs to the old default.

What if someone would like to disable it even if QEMU thinks that the
performance is good?  This patch series doesn't hardcode anything into
the VM config.  If you don't set the polling element at all, libvirt will
follow the QEMU defaults and only the live XML will contain the current
state of polling, with the default values loaded from QEMU.
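
For instance, under this series the live XML for an untouched iothread
might end up showing something like this, with the values read back
from QEMU (the numbers here are only illustrative):

  <iothread id='1'>
    <polling enabled='yes' max_ns='32768'/>
  </iothread>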

This patch series adds the possibility to explicitly configure polling if
someone wants to do that for some reason, but it also preserves the
behavior where you just don't care about it and want to use the default.

If you still think that we should not export this feature at all, well, we
don't have to.  The use-case that you've described is still possible with
this series; it only adds extra functionality on top of that.

Pavel
Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Daniel P. Berrange 8 years, 11 months ago
On Tue, Feb 21, 2017 at 02:14:44PM +0100, Pavel Hrdina wrote:
> What if someone would like to disable it even if QEMU thinks that the
> performance is good?  This patch series doesn't hardcode anything into
> the VM config.  If you don't set the polling element at all, libvirt will
> follow the QEMU defaults and only the live XML will contain the current
> state of polling, with the default values loaded from QEMU.
> 
> This patch series adds the possibility to explicitly configure polling if
> someone wants to do that for some reason, but it also preserves the
> behavior where you just don't care about it and want to use the default.
> 
> If you still think that we should not export this feature at all, well, we
> don't have to.  The use-case that you've described is still possible with
> this series; it only adds extra functionality on top of that.

I'm very wary of adding config parameters in libvirt just because they
exist in QEMU, particularly when the parameters are totally specific to
an algorithm that just happens to be the one implemented right now. We've
no idea if QEMU will stick with this algorithm or change it entirely, and
if the latter, then any config parameters will likely be meaningless for
any other algorithm. I can understand why QEMU would expose them on its
CLI, but I don't feel they are a good fit for exposing up the stack,
particularly given a lack of any guidance as to how people might consider
changing the values, other than random uninformed guesswork.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|

Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Pavel Hrdina 8 years, 11 months ago
On Tue, Feb 21, 2017 at 01:26:25PM +0000, Daniel P. Berrange wrote:
> I'm very wary of adding config parameters in libvirt just because they
> exist in QEMU, particularly when the parameters are totally specific to
> an algorithm that just happens to be the one implemented right now. We've
> no idea if QEMU will stick with this algorithm or change it entirely, and
> if the latter, then any config parameters will likely be meaningless for
> any other algorithm. I can understand why QEMU would expose them on its
> CLI, but I don't feel they are a good fit for exposing up the stack,
> particularly given a lack of any guidance as to how people might consider
> changing the values, other than random uninformed guesswork.

Yes, that's true, and I had the same worry about the parameters being
specifically tied to the QEMU algorithm, but I thought that it would be
nice to export them anyway.  Let's ignore this series for now and if
someone asks for a feature to disable polling we can revive it.

Pavel

Re: [libvirt] [PATCH 01/12] conf: introduce domain XML element <polling> for iothread
Posted by Stefan Hajnoczi 8 years, 11 months ago
On Tue, Feb 21, 2017 at 1:39 PM, Pavel Hrdina <phrdina@redhat.com> wrote:
> Yes, that's true, and I had the same worry about the parameters being
> specifically tied to the QEMU algorithm, but I thought that it would be
> nice to export them anyway.  Let's ignore this series for now and if
> someone asks for a feature to disable polling we can revive it.

libvirt doesn't need a dedicated API for <iothread> polling parameters
but the QEMU command-line passthrough feature must make it possible
using <qemu:arg value='-newarg'/>.  I have tested the following QEMU
command line:

# assume libvirt defines the iothreads
-object iothread,id=iothread0 \
-object iothread,id=iothread1 \
-set object.iothread0.poll-max-ns=0   # override poll-max-ns using -set

This disables polling in iothread0 and leaves the default value in iothread1.
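
In libvirt domain XML the equivalent passthrough would presumably look
something like this (a sketch; note that the <domain> element needs the
qemu namespace declaration):

  <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
    ...
    <qemu:commandline>
      <qemu:arg value='-set'/>
      <qemu:arg value='object.iothread0.poll-max-ns=0'/>
    </qemu:commandline>
  </domain>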

I'm fine if libvirt doesn't add a dedicated API for setting <iothread>
polling parameters.  It's unlikely that users will need to change the
setting.  In an emergency (e.g. disabling it due to a performance
regression) they can use <qemu:arg value='-newarg'/>.

Stefan
