[libvirt] [RFC] docs: Discourage usage of cache mode=passthrough

Eduardo Habkost posted 1 patch 6 years, 6 months ago
git fetch https://github.com/patchew-project/libvirt tags/patchew/20170919193741.6921-1-ehabkost@redhat.com
docs/formatdomain.html.in | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Posted by Eduardo Habkost 6 years, 6 months ago
Cache mode=passthrough can result in a broken cache topology if
the domain topology is not exactly the same as the host topology.
Warn about that in the documentation.

Bug report for reference:
https://bugzilla.redhat.com/show_bug.cgi?id=1184125

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
 docs/formatdomain.html.in | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 57ec2ff34..9c21892f3 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -1478,7 +1478,9 @@
 
               <dt><code>passthrough</code></dt>
               <dd>The real CPU cache data reported by the host CPU will be
-                passed through to the virtual CPU.</dd>
+                passed through to the virtual CPU.  Using this mode is not
+                recommended unless the domain CPU and NUMA topology is exactly
+                the same as the host CPU and NUMA topology.</dd>
 
               <dt><code>disable</code></dt>
               <dd>The virtual CPU will report no CPU cache of the specified
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Laine Stump 6 years, 6 months ago
On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> Cache mode=passthrough can result in a broken cache topology if
> the domain topology is not exactly the same as the host topology.
> Warn about that in the documentation.
>
> Bug report for reference:
> https://bugzilla.redhat.com/show_bug.cgi?id=1184125
>
> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> ---
>  docs/formatdomain.html.in | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 57ec2ff34..9c21892f3 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -1478,7 +1478,9 @@
>  
>                <dt><code>passthrough</code></dt>
>                <dd>The real CPU cache data reported by the host CPU will be
> -                passed through to the virtual CPU.</dd>
> +                passed through to the virtual CPU.  Using this mode is not
> +                recommended unless the domain CPU and NUMA topology is exactly
> +                the same as the host CPU and NUMA topology.</dd>

To me this sounds like it should be forbidden by libvirt, rather than
just documented as "bad". (I haven't followed any previous discussion on
the topic though, so maybe I'm over-reacting).

Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Eduardo Habkost 6 years, 6 months ago
On Thu, Sep 21, 2017 at 01:14:04PM -0400, Laine Stump wrote:
> On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> > Cache mode=passthrough can result in a broken cache topology if
> > the domain topology is not exactly the same as the host topology.
> > Warn about that in the documentation.
> >
> > Bug report for reference:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1184125
> >
> > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > ---
> >  docs/formatdomain.html.in | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > index 57ec2ff34..9c21892f3 100644
> > --- a/docs/formatdomain.html.in
> > +++ b/docs/formatdomain.html.in
> > @@ -1478,7 +1478,9 @@
> >  
> >                <dt><code>passthrough</code></dt>
> >                <dd>The real CPU cache data reported by the host CPU will be
> > -                passed through to the virtual CPU.</dd>
> > +                passed through to the virtual CPU.  Using this mode is not
> > +                recommended unless the domain CPU and NUMA topology is exactly
> > +                the same as the host CPU and NUMA topology.</dd>
> 
> To me this sounds like it should be forbidden by libvirt, rather than
> just documented as "bad". (I haven't followed any previous discussion on
> the topic though, so maybe I'm over-reacting).

mode=passthrough is a bad idea most times, and most people don't
really need it.  But if libvirt already supports it, won't its
removal be a regression for people that are already relying on
it?

I will check later if we can make host-cache-info safer in QEMU,
by fixing up the socket/core/thread counts in CPUID instead of
copying it as-is from the host.

-- 
Eduardo

Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Daniel P. Berrange 6 years, 6 months ago
On Thu, Sep 21, 2017 at 01:14:04PM -0400, Laine Stump wrote:
> On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> > Cache mode=passthrough can result in a broken cache topology if
> > the domain topology is not exactly the same as the host topology.
> > Warn about that in the documentation.
> >
> > Bug report for reference:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1184125
> >
> > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > ---
> >  docs/formatdomain.html.in | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > index 57ec2ff34..9c21892f3 100644
> > --- a/docs/formatdomain.html.in
> > +++ b/docs/formatdomain.html.in
> > @@ -1478,7 +1478,9 @@
> >  
> >                <dt><code>passthrough</code></dt>
> >                <dd>The real CPU cache data reported by the host CPU will be
> > -                passed through to the virtual CPU.</dd>
> > +                passed through to the virtual CPU.  Using this mode is not
> > +                recommended unless the domain CPU and NUMA topology is exactly
> > +                the same as the host CPU and NUMA topology.</dd>
> 
> To me this sounds like it should be forbidden by libvirt, rather than
> just documented as "bad". (I haven't followed any previous discussion on
> the topic though, so maybe I'm over-reacting).

In high performance setups, people pin guest vCPUs to host pCPUs and
set the vCPU topology to match the host pCPU topology they've pinned
to. So having a cache mode that matches this topology is just fine.
It simply isn't something you want as a default for the more typical
floating vCPUs scenarios.
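
As an illustration of the pinned setup described above, here is a minimal sketch in Python. The domain XML fragment is invented for this example (the host CPU numbers and the 1 socket x 2 cores x 2 threads layout are assumptions, not taken from the thread or the bug report), and the helper names `pin_map` and `topology_vcpus` are hypothetical. It only shows how the `<vcpupin>`, `<topology>` and `<cache>` elements sit together; it does not compare anything against the real host.

```python
import xml.etree.ElementTree as ET

# Illustrative domain XML fragment; all values are made up for this sketch.
DOMAIN_XML = """
<domain type='kvm'>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='2'/>
    <vcpupin vcpu='1' cpuset='3'/>
    <vcpupin vcpu='2' cpuset='10'/>
    <vcpupin vcpu='3' cpuset='11'/>
  </cputune>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='2' threads='2'/>
    <cache mode='passthrough'/>
  </cpu>
</domain>
"""


def pin_map(domain_xml):
    """Return {vcpu number: cpuset string} for every <vcpupin> element."""
    root = ET.fromstring(domain_xml)
    return {
        int(pin.get("vcpu")): pin.get("cpuset")
        for pin in root.findall("./cputune/vcpupin")
    }


def topology_vcpus(domain_xml):
    """Return sockets * cores * threads from the guest <topology> element."""
    root = ET.fromstring(domain_xml)
    topo = root.find("./cpu/topology")
    return (
        int(topo.get("sockets"))
        * int(topo.get("cores"))
        * int(topo.get("threads"))
    )


if __name__ == "__main__":
    print(pin_map(DOMAIN_XML))
    print(topology_vcpus(DOMAIN_XML))
```

Whether the guest topology really mirrors the host CPUs it is pinned to still has to be checked against the host itself, for example with `virsh capabilities`; the sketch does not attempt that.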

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Eduardo Habkost 6 years, 5 months ago
On Thu, Sep 28, 2017 at 09:21:41AM +0100, Daniel P. Berrange wrote:
> On Thu, Sep 21, 2017 at 01:14:04PM -0400, Laine Stump wrote:
> > On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> > > Cache mode=passthrough can result in a broken cache topology if
> > > the domain topology is not exactly the same as the host topology.
> > > Warn about that in the documentation.
> > >
> > > Bug report for reference:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1184125
> > >
> > > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > > ---
> > >  docs/formatdomain.html.in | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > > index 57ec2ff34..9c21892f3 100644
> > > --- a/docs/formatdomain.html.in
> > > +++ b/docs/formatdomain.html.in
> > > @@ -1478,7 +1478,9 @@
> > >  
> > >                <dt><code>passthrough</code></dt>
> > >                <dd>The real CPU cache data reported by the host CPU will be
> > > -                passed through to the virtual CPU.</dd>
> > > +                passed through to the virtual CPU.  Using this mode is not
> > > +                recommended unless the domain CPU and NUMA topology is exactly
> > > +                the same as the host CPU and NUMA topology.</dd>
> > 
> > To me this sounds like it should be forbidden by libvirt, rather than
> > just documented as "bad". (I haven't followed any previous discussion on
> > the topic though, so maybe I'm over-reacting).
> 
> In high performance setups, people pin guest vCPUs to host pCPUs and
> set the vCPU topology to match the host pCPU topology they've pinned
> to. So having a cache mode that matches this topology is just fine.
> It simply isn't something you want as a default for the more typical
> floating vCPUs scenarios.

So, should this patch be applied?

-- 
Eduardo

Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Daniel P. Berrange 6 years, 5 months ago
On Mon, Nov 06, 2017 at 11:10:00AM -0200, Eduardo Habkost wrote:
> On Thu, Sep 28, 2017 at 09:21:41AM +0100, Daniel P. Berrange wrote:
> > On Thu, Sep 21, 2017 at 01:14:04PM -0400, Laine Stump wrote:
> > > On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> > > > Cache mode=passthrough can result in a broken cache topology if
> > > > the domain topology is not exactly the same as the host topology.
> > > > Warn about that in the documentation.
> > > >
> > > > Bug report for reference:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1184125
> > > >
> > > > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > > > ---
> > > >  docs/formatdomain.html.in | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > > > index 57ec2ff34..9c21892f3 100644
> > > > --- a/docs/formatdomain.html.in
> > > > +++ b/docs/formatdomain.html.in
> > > > @@ -1478,7 +1478,9 @@
> > > >  
> > > >                <dt><code>passthrough</code></dt>
> > > >                <dd>The real CPU cache data reported by the host CPU will be
> > > > -                passed through to the virtual CPU.</dd>
> > > > +                passed through to the virtual CPU.  Using this mode is not
> > > > +                recommended unless the domain CPU and NUMA topology is exactly
> > > > +                the same as the host CPU and NUMA topology.</dd>
> > > 
> > > To me this sounds like it should be forbidden by libvirt, rather than
> > > just documented as "bad". (I haven't followed any previous discussion on
> > > the topic though, so maybe I'm over-reacting).
> > 
> > In high performance setups, people pin guest vCPUs to host pCPUs and
> > set the vCPU topology to match the host pCPU topology they've pinned
> > to. So having a cache mode that matches this topology is just fine.
> > It simply isn't something you want as a default for the more typical
> > floating vCPUs scenarios.
> 
> So, should this patch be applied?

We could take a patch that describes more clearly when it is reasonable
to use the passthrough mode.


Regards,
Daniel

Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Eduardo Habkost 6 years, 5 months ago
On Mon, Nov 06, 2017 at 01:17:02PM +0000, Daniel P. Berrange wrote:
> On Mon, Nov 06, 2017 at 11:10:00AM -0200, Eduardo Habkost wrote:
> > On Thu, Sep 28, 2017 at 09:21:41AM +0100, Daniel P. Berrange wrote:
> > > On Thu, Sep 21, 2017 at 01:14:04PM -0400, Laine Stump wrote:
> > > > On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> > > > > Cache mode=passthrough can result in a broken cache topology if
> > > > > the domain topology is not exactly the same as the host topology.
> > > > > Warn about that in the documentation.
> > > > >
> > > > > Bug report for reference:
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1184125
> > > > >
> > > > > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > > > > ---
> > > > >  docs/formatdomain.html.in | 4 +++-
> > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > > > > index 57ec2ff34..9c21892f3 100644
> > > > > --- a/docs/formatdomain.html.in
> > > > > +++ b/docs/formatdomain.html.in
> > > > > @@ -1478,7 +1478,9 @@
> > > > >  
> > > > >                <dt><code>passthrough</code></dt>
> > > > >                <dd>The real CPU cache data reported by the host CPU will be
> > > > > -                passed through to the virtual CPU.</dd>
> > > > > +                passed through to the virtual CPU.  Using this mode is not
> > > > > +                recommended unless the domain CPU and NUMA topology is exactly
> > > > > +                the same as the host CPU and NUMA topology.</dd>
> > > > 
> > > > To me this sounds like it should be forbidden by libvirt, rather than
> > > > just documented as "bad". (I haven't followed any previous discussion on
> > > > the topic though, so maybe I'm over-reacting).
> > > 
> > > In high performance setups, people pin guest vCPUs to host pCPUs and
> > > set the vCPU topology to match the host pCPU topology they've pinned
> > > to. So having a cache mode that matches this topology is just fine.
> > > It simply isn't something you want as a default for the more typical
> > > floating vCPUs scenarios.
> > 
> > So, should this patch be applied?
> 
> We could take a patch that describes more clearly when it is reasonable
> to use the passthrough mode.

Why isn't "unless the domain CPU and NUMA topology is exactly the
same as the host CPU and NUMA topology" a clear description?

-- 
Eduardo

Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Daniel P. Berrange 6 years, 5 months ago
On Mon, Nov 06, 2017 at 11:43:49AM -0200, Eduardo Habkost wrote:
> On Mon, Nov 06, 2017 at 01:17:02PM +0000, Daniel P. Berrange wrote:
> > On Mon, Nov 06, 2017 at 11:10:00AM -0200, Eduardo Habkost wrote:
> > > On Thu, Sep 28, 2017 at 09:21:41AM +0100, Daniel P. Berrange wrote:
> > > > On Thu, Sep 21, 2017 at 01:14:04PM -0400, Laine Stump wrote:
> > > > > On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> > > > > > Cache mode=passthrough can result in a broken cache topology if
> > > > > > the domain topology is not exactly the same as the host topology.
> > > > > > Warn about that in the documentation.
> > > > > >
> > > > > > Bug report for reference:
> > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1184125
> > > > > >
> > > > > > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > > > > > ---
> > > > > >  docs/formatdomain.html.in | 4 +++-
> > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > > > > > index 57ec2ff34..9c21892f3 100644
> > > > > > --- a/docs/formatdomain.html.in
> > > > > > +++ b/docs/formatdomain.html.in
> > > > > > @@ -1478,7 +1478,9 @@
> > > > > >  
> > > > > >                <dt><code>passthrough</code></dt>
> > > > > >                <dd>The real CPU cache data reported by the host CPU will be
> > > > > > -                passed through to the virtual CPU.</dd>
> > > > > > +                passed through to the virtual CPU.  Using this mode is not
> > > > > > +                recommended unless the domain CPU and NUMA topology is exactly
> > > > > > +                the same as the host CPU and NUMA topology.</dd>
> > > > > 
> > > > > To me this sounds like it should be forbidden by libvirt, rather than
> > > > > just documented as "bad". (I haven't followed any previous discussion on
> > > > > the topic though, so maybe I'm over-reacting).
> > > > 
> > > > In high performance setups, people pin guest vCPUs to host pCPUs and
> > > > set the vCPU topology to match the host pCPU topology they've pinned
> > > > to. So having a cache mode that matches this topology is just fine.
> > > > It simply isn't something you want as a default for the more typical
> > > > floating vCPUs scenarios.
> > > 
> > > So, should this patch be applied?
> > 
> > We could take a patch that describes more clearly when it is reasonable
> > to use the passthrough mode.
> 
> Why isn't "unless the domain CPU and NUMA topology is exactly the
> same as the host CPU and NUMA topology" a clear description?

Just matching topology is not useful unless you've also pinned the
guest CPUs to host CPUs. So I think it'd be clearer to say something
like

  "If using 'passthrough' mode, it is recommended to explicitly pin each
   virtual CPU to a dedicated host CPU, and set up the guest CPU and NUMA
   topology to match that of the host. Mismatched topology or freely
   floating CPUs will result in unpredictable performance, so should be
   avoided."
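
The pinning half of that recommendation can be checked mechanically. The following is a minimal sketch in Python, not an existing libvirt feature: the function name `passthrough_warnings` and its warning strings are hypothetical, and it looks only at the domain XML. Comparing the guest topology with the real host topology is left out, because that needs host data the XML does not contain.

```python
import xml.etree.ElementTree as ET


def passthrough_warnings(domain_xml):
    """Return warnings for a domain that uses <cache mode='passthrough'/>
    while leaving vCPUs floating or not pinned to a dedicated host CPU."""
    root = ET.fromstring(domain_xml)
    cache = root.find("./cpu/cache")
    if cache is None or cache.get("mode") != "passthrough":
        return []  # the recommendation only concerns passthrough mode

    warnings = []
    vcpu_count = int(root.findtext("./vcpu"))
    pins = {
        int(pin.get("vcpu")): pin.get("cpuset")
        for pin in root.findall("./cputune/vcpupin")
    }

    for vcpu in range(vcpu_count):
        cpuset = pins.get(vcpu)
        if cpuset is None:
            warnings.append(f"vCPU {vcpu} is floating (no vcpupin)")
        elif not cpuset.isdigit():
            # A range or list such as '0-3' or '1,5' is not a single CPU.
            warnings.append(
                f"vCPU {vcpu} is pinned to a set ({cpuset}), not one host CPU"
            )

    single = [c for c in pins.values() if c.isdigit()]
    if len(single) != len(set(single)):
        warnings.append("two or more vCPUs share the same host CPU")
    return warnings


if __name__ == "__main__":
    example = (
        "<domain><vcpu>2</vcpu>"
        "<cputune><vcpupin vcpu='0' cpuset='4'/></cputune>"
        "<cpu mode='host-passthrough'><cache mode='passthrough'/></cpu>"
        "</domain>"
    )
    for message in passthrough_warnings(example):
        print(message)
```

The check returns a list of messages rather than raising an error, in line with the thread's position that passthrough mode should be discouraged in unsuitable setups, not forbidden.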

Regards,
Daniel

Re: [libvirt] [RFC] docs: Discourage usage of cache mode=passthrough
Posted by Eduardo Habkost 6 years, 5 months ago
On Mon, Nov 06, 2017 at 02:08:31PM +0000, Daniel P. Berrange wrote:
> On Mon, Nov 06, 2017 at 11:43:49AM -0200, Eduardo Habkost wrote:
> > On Mon, Nov 06, 2017 at 01:17:02PM +0000, Daniel P. Berrange wrote:
> > > On Mon, Nov 06, 2017 at 11:10:00AM -0200, Eduardo Habkost wrote:
> > > > On Thu, Sep 28, 2017 at 09:21:41AM +0100, Daniel P. Berrange wrote:
> > > > > On Thu, Sep 21, 2017 at 01:14:04PM -0400, Laine Stump wrote:
> > > > > > On 09/19/2017 03:37 PM, Eduardo Habkost wrote:
> > > > > > > Cache mode=passthrough can result in a broken cache topology if
> > > > > > > the domain topology is not exactly the same as the host topology.
> > > > > > > Warn about that in the documentation.
> > > > > > >
> > > > > > > Bug report for reference:
> > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1184125
> > > > > > >
> > > > > > > Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
> > > > > > > ---
> > > > > > >  docs/formatdomain.html.in | 4 +++-
> > > > > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > > > > > > index 57ec2ff34..9c21892f3 100644
> > > > > > > --- a/docs/formatdomain.html.in
> > > > > > > +++ b/docs/formatdomain.html.in
> > > > > > > @@ -1478,7 +1478,9 @@
> > > > > > >  
> > > > > > >                <dt><code>passthrough</code></dt>
> > > > > > >                <dd>The real CPU cache data reported by the host CPU will be
> > > > > > > -                passed through to the virtual CPU.</dd>
> > > > > > > +                passed through to the virtual CPU.  Using this mode is not
> > > > > > > +                recommended unless the domain CPU and NUMA topology is exactly
> > > > > > > +                the same as the host CPU and NUMA topology.</dd>
> > > > > > 
> > > > > > To me this sounds like it should be forbidden by libvirt, rather than
> > > > > > just documented as "bad". (I haven't followed any previous discussion on
> > > > > > the topic though, so maybe I'm over-reacting).
> > > > > 
> > > > > In high performance setups, people pin guest vCPUs to host pCPUs and
> > > > > set the vCPU topology to match the host pCPU topology they've pinned
> > > > > to. So having a cache mode that matches this topology is just fine.
> > > > > It simply isn't something you want as a default for the more typical
> > > > > floating vCPUs scenarios.
> > > > 
> > > > So, should this patch be applied?
> > > 
> > > We could take a patch that describes more clearly when it is reasonable
> > > to use the passthrough mode.
> > 
> > Why isn't "unless the domain CPU and NUMA topology is exactly the
> > same as the host CPU and NUMA topology" a clear description?
> 
> Just matching topology is not useful unless you've also pinned the
> guest CPUs to host CPUs. So I think it'd be clearer to say something
> like
> 
>   "If using 'passthrough' mode, it is recommended to explicitly pin each
>    virtual CPU to a dedicated host CPU, and set up the guest CPU and NUMA
>    topology to match that of the host. Mismatched topology or freely
>    floating CPUs will result in unpredictable performance, so should be
>    avoided."

Performance of VMs with more complex topologies can be unpredictable
even if not using cache passthrough mode.  I believe this explanation
belongs in the documentation of the cpu/topology or cpu/numa
elements.

-- 
Eduardo
