Changeset
include/sysemu/sysemu.h |  1 +
target/i386/cpu.h       |  9 +++++++++
target/i386/cpu.c       | 19 ++++++++++++++-----
target/i386/kvm.c       | 30 ++++++++++++++++++++++++++++++
vl.c                    |  6 ++++++
qemu-options.hx         |  9 +++++++--
6 files changed, 67 insertions(+), 7 deletions(-)
Git apply log
Switched to a new branch '20180612184616.90838-1-mst@redhat.com'
Applying: kvm: support -realtime cpu-pm=on|off
Applying: i386/cpu: make -cpu host support monitor/mwait
To https://github.com/patchew-project/qemu
 + b928938...7d519f2 patchew/20180612184616.90838-1-mst@redhat.com -> patchew/20180612184616.90838-1-mst@redhat.com (forced update)
Test failed: checkpatch

loading

Test passed: docker-mingw@fedora

loading

Test passed: s390x

loading

Test passed: docker-quick@centos7

loading

[Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Michael S. Tsirkin, 1 week ago
This adds ability to expose host CPU power management capabilities to
guests. For intel guests, this is sufficient for guest to enable
low power CPU power management. For AMD guests it isn't sufficient,
deeper C-states are entered using System-IO.

mwait based power management is tied closely to specifics of CPUID,
making migration challenging. At this point only the non-migrateable
-cpu host is supported.

With this patch applied, VM latency is within the noise of
baremetal for some benchmarks.

perf bench sched pipe results:
Before:
    6.452 sec
After:
    4.382 sec
Baremetal:
    4.136 sec

Michael S. Tsirkin (2):
  kvm: support -realtime cpu-pm=on|off
  i386/cpu: make -cpu host support monitor/mwait

 include/sysemu/sysemu.h |  1 +
 target/i386/cpu.h       |  9 +++++++++
 target/i386/cpu.c       | 19 ++++++++++++++-----
 target/i386/kvm.c       | 30 ++++++++++++++++++++++++++++++
 vl.c                    |  6 ++++++
 qemu-options.hx         |  9 +++++++--
 6 files changed, 67 insertions(+), 7 deletions(-)

-- 
MST


Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by no-reply@patchew.org, 1 week ago
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180612184616.90838-1-mst@redhat.com
Subject: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20180612184616.90838-1-mst@redhat.com -> patchew/20180612184616.90838-1-mst@redhat.com
Switched to a new branch 'test'
b92893898a i386/cpu: make -cpu host support monitor/mwait
26a0f862a2 kvm: support -realtime cpu-pm=on|off

=== OUTPUT BEGIN ===
Checking PATCH 1/2: kvm: support -realtime cpu-pm=on|off...
WARNING: line over 80 characters
#85: FILE: target/i386/kvm.c:1401:
+            error_report("kvm: guest stopping CPU not supported: %s", strerror(-ret));

ERROR: do not initialise globals to 0 or NULL
#100: FILE: vl.c:145:
+bool enable_cpu_pm = false;

total: 1 errors, 1 warnings, 82 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 2/2: i386/cpu: make -cpu host support monitor/mwait...
WARNING: line over 80 characters
#82: FILE: target/i386/kvm.c:371:
+            int disable_exits = kvm_check_extension(s, KVM_CAP_X86_DISABLE_EXITS);

total: 0 errors, 1 warnings, 60 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Michael S. Tsirkin, 1 week ago
On Tue, Jun 12, 2018 at 12:18:03PM -0700, no-reply@patchew.org wrote:
> Checking PATCH 1/2: kvm: support -realtime cpu-pm=on|off...
> WARNING: line over 80 characters
> #85: FILE: target/i386/kvm.c:1401:
> +            error_report("kvm: guest stopping CPU not supported: %s", strerror(-ret));
> 
> ERROR: do not initialise globals to 0 or NULL
> #100: FILE: vl.c:145:
> +bool enable_cpu_pm = false;
> 
> total: 1 errors, 1 warnings, 82 lines checked
> 
> Your patch has style problems, please review.  If any of these errors
> are false positives report them to the maintainer, see
> CHECKPATCH in MAINTAINERS.

We seem to violate this left and right. Let's drop this?

-- 
MST

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Marcelo Tosatti, 1 week ago
On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> This adds ability to expose host CPU power management capabilities to
> guests. For intel guests, this is sufficient for guest to enable
> low power CPU power management. For AMD guests it isn't sufficient,
> deeper C-states are entered using System-IO.
> 
> mwait based power management is tied closely to specifics of CPUID,
> making migration challenging. At this point only the non-migrateable
> -cpu host is supported.
> 
> With this patch applied, VM latency is within the noise of
> baremetal for some benchmarks.
> 
> perf bench sched pipe results:
> Before:
>     6.452 sec
> After:
>     4.382 sec
> Baremetal:
>     4.136 sec
> 
> Michael S. Tsirkin (2):
>   kvm: support -realtime cpu-pm=on|off
>   i386/cpu: make -cpu host support monitor/mwait
> 
>  include/sysemu/sysemu.h |  1 +
>  target/i386/cpu.h       |  9 +++++++++
>  target/i386/cpu.c       | 19 ++++++++++++++-----
>  target/i386/kvm.c       | 30 ++++++++++++++++++++++++++++++
>  vl.c                    |  6 ++++++
>  qemu-options.hx         |  9 +++++++--
>  6 files changed, 67 insertions(+), 7 deletions(-)
> 
> -- 
> MST

Hi Michael,

1) Command line option interface

Why is this not an optional cpu feature such as the other features? 


-cpu CPU,+mwait  

rather than a separate, architecture independent "-realtime cpu-pm=on|off" 
command line option?

2) Migration

Isnt it sufficient to check that both CPUID leafs are the same, 
to allow migration ?

1. Check that the processor supports MONITOR and MWAIT. If
CPUID.01H:ECX.MONITOR[bit 3] = 1, MONITOR and MWAIT are available at
ring 0.

2. Query the smallest and largest line size that MONITOR uses.
Use CPUID.05H:EAX.smallest[bits 15:0];EBX.largest[bits15:0]. 



Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Michael S. Tsirkin, 1 week ago
On Wed, Jun 13, 2018 at 07:34:53PM -0300, Marcelo Tosatti wrote:
> On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> > This adds ability to expose host CPU power management capabilities to
> > guests. For intel guests, this is sufficient for guest to enable
> > low power CPU power management. For AMD guests it isn't sufficient,
> > deeper C-states are entered using System-IO.
> > 
> > mwait based power management is tied closely to specifics of CPUID,
> > making migration challenging. At this point only the non-migrateable
> > -cpu host is supported.
> > 
> > With this patch applied, VM latency is within the noise of
> > baremetal for some benchmarks.
> > 
> > perf bench sched pipe results:
> > Before:
> >     6.452 sec
> > After:
> >     4.382 sec
> > Baremetal:
> >     4.136 sec
> > 
> > Michael S. Tsirkin (2):
> >   kvm: support -realtime cpu-pm=on|off
> >   i386/cpu: make -cpu host support monitor/mwait
> > 
> >  include/sysemu/sysemu.h |  1 +
> >  target/i386/cpu.h       |  9 +++++++++
> >  target/i386/cpu.c       | 19 ++++++++++++++-----
> >  target/i386/kvm.c       | 30 ++++++++++++++++++++++++++++++
> >  vl.c                    |  6 ++++++
> >  qemu-options.hx         |  9 +++++++--
> >  6 files changed, 67 insertions(+), 7 deletions(-)
> > 
> > -- 
> > MST
> 
> Hi Michael,
> 
> 1) Command line option interface
> 
> Why is this not an optional cpu feature such as the other features? 
> 
> 
> -cpu CPU,+mwait  
> 
> rather than a separate, architecture independent "-realtime cpu-pm=on|off" 
> command line option?

Because it's not just a guest flag. With guest pm on, one guest
can severely affect the latency of others on the same host CPU.

> 2) Migration
> 
> Isnt it sufficient to check that both CPUID leafs are the same, 
> to allow migration ?

Not at the moment since linux guests use mwait hints and latency values
from a table in intel_idle.  If the host and guest models do not match,
mwait will get a wrong hint.

It will not do the right thing then!

You want exactly the same host CPU for it to work.

This isn't different from how -host cpu works generally.

> 1. Check that the processor supports MONITOR and MWAIT. If
> CPUID.01H:ECX.MONITOR[bit 3] = 1, MONITOR and MWAIT are available at
> ring 0.
> 
> 2. Query the smallest and largest line size that MONITOR uses.
> Use CPUID.05H:EAX.smallest[bits 15:0];EBX.largest[bits15:0]. 
> 

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Marcelo Tosatti, 1 week ago
On Thu, Jun 14, 2018 at 02:37:28AM +0300, Michael S. Tsirkin wrote:
> On Wed, Jun 13, 2018 at 07:34:53PM -0300, Marcelo Tosatti wrote:
> > On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> > > This adds ability to expose host CPU power management capabilities to
> > > guests. For intel guests, this is sufficient for guest to enable
> > > low power CPU power management. For AMD guests it isn't sufficient,
> > > deeper C-states are entered using System-IO.
> > > 
> > > mwait based power management is tied closely to specifics of CPUID,
> > > making migration challenging. At this point only the non-migrateable
> > > -cpu host is supported.
> > > 
> > > With this patch applied, VM latency is within the noise of
> > > baremetal for some benchmarks.
> > > 
> > > perf bench sched pipe results:
> > > Before:
> > >     6.452 sec
> > > After:
> > >     4.382 sec
> > > Baremetal:
> > >     4.136 sec
> > > 
> > > Michael S. Tsirkin (2):
> > >   kvm: support -realtime cpu-pm=on|off
> > >   i386/cpu: make -cpu host support monitor/mwait
> > > 
> > >  include/sysemu/sysemu.h |  1 +
> > >  target/i386/cpu.h       |  9 +++++++++
> > >  target/i386/cpu.c       | 19 ++++++++++++++-----
> > >  target/i386/kvm.c       | 30 ++++++++++++++++++++++++++++++
> > >  vl.c                    |  6 ++++++
> > >  qemu-options.hx         |  9 +++++++--
> > >  6 files changed, 67 insertions(+), 7 deletions(-)
> > > 
> > > -- 
> > > MST
> > 
> > Hi Michael,
> > 
> > 1) Command line option interface
> > 
> > Why is this not an optional cpu feature such as the other features? 
> > 
> > 
> > -cpu CPU,+mwait  
> > 
> > rather than a separate, architecture independent "-realtime cpu-pm=on|off" 
> > command line option?
> 
> Because it's not just a guest flag. With guest pm on, one guest
> can severely affect the latency of others on the same host CPU.

How so ?

> > 2) Migration
> > 
> > Isnt it sufficient to check that both CPUID leafs are the same, 
> > to allow migration ?
> 
> Not at the moment since linux guests use mwait hints and latency values
> from a table in intel_idle.  If the host and guest models do not match,
> mwait will get a wrong hint.
> 
> It will not do the right thing then!
> 
> You want exactly the same host CPU for it to work.
> 
> This isn't different from how -host cpu works generally.

Ok, makes sense.

> > 1. Check that the processor supports MONITOR and MWAIT. If
> > CPUID.01H:ECX.MONITOR[bit 3] = 1, MONITOR and MWAIT are available at
> > ring 0.
> > 
> > 2. Query the smallest and largest line size that MONITOR uses.
> > Use CPUID.05H:EAX.smallest[bits 15:0];EBX.largest[bits15:0]. 
> > 

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Michael S. Tsirkin, 1 week ago
On Fri, Jun 15, 2018 at 03:32:27PM -0300, Marcelo Tosatti wrote:
> On Thu, Jun 14, 2018 at 02:37:28AM +0300, Michael S. Tsirkin wrote:
> > On Wed, Jun 13, 2018 at 07:34:53PM -0300, Marcelo Tosatti wrote:
> > > On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> > > > This adds ability to expose host CPU power management capabilities to
> > > > guests. For intel guests, this is sufficient for guest to enable
> > > > low power CPU power management. For AMD guests it isn't sufficient,
> > > > deeper C-states are entered using System-IO.
> > > > 
> > > > mwait based power management is tied closely to specifics of CPUID,
> > > > making migration challenging. At this point only the non-migrateable
> > > > -cpu host is supported.
> > > > 
> > > > With this patch applied, VM latency is within the noise of
> > > > baremetal for some benchmarks.
> > > > 
> > > > perf bench sched pipe results:
> > > > Before:
> > > >     6.452 sec
> > > > After:
> > > >     4.382 sec
> > > > Baremetal:
> > > >     4.136 sec
> > > > 
> > > > Michael S. Tsirkin (2):
> > > >   kvm: support -realtime cpu-pm=on|off
> > > >   i386/cpu: make -cpu host support monitor/mwait
> > > > 
> > > >  include/sysemu/sysemu.h |  1 +
> > > >  target/i386/cpu.h       |  9 +++++++++
> > > >  target/i386/cpu.c       | 19 ++++++++++++++-----
> > > >  target/i386/kvm.c       | 30 ++++++++++++++++++++++++++++++
> > > >  vl.c                    |  6 ++++++
> > > >  qemu-options.hx         |  9 +++++++--
> > > >  6 files changed, 67 insertions(+), 7 deletions(-)
> > > > 
> > > > -- 
> > > > MST
> > > 
> > > Hi Michael,
> > > 
> > > 1) Command line option interface
> > > 
> > > Why is this not an optional cpu feature such as the other features? 
> > > 
> > > 
> > > -cpu CPU,+mwait  
> > > 
> > > rather than a separate, architecture independent "-realtime cpu-pm=on|off" 
> > > command line option?
> > 
> > Because it's not just a guest flag. With guest pm on, one guest
> > can severely affect the latency of others on the same host CPU.
> 
> How so ?

Look at drivers/idle/intel_idle.c
There are states with exit latencies of 
10000

> > > 2) Migration
> > > 
> > > Isnt it sufficient to check that both CPUID leafs are the same, 
> > > to allow migration ?
> > 
> > Not at the moment since linux guests use mwait hints and latency values
> > from a table in intel_idle.  If the host and guest models do not match,
> > mwait will get a wrong hint.
> > 
> > It will not do the right thing then!
> > 
> > You want exactly the same host CPU for it to work.
> > 
> > This isn't different from how -host cpu works generally.
> 
> Ok, makes sense.
> 
> > > 1. Check that the processor supports MONITOR and MWAIT. If
> > > CPUID.01H:ECX.MONITOR[bit 3] = 1, MONITOR and MWAIT are available at
> > > ring 0.
> > > 
> > > 2. Query the smallest and largest line size that MONITOR uses.
> > > Use CPUID.05H:EAX.smallest[bits 15:0];EBX.largest[bits15:0]. 
> > > 

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Daniel P. Berrangé, 1 week ago
On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> This adds ability to expose host CPU power management capabilities to
> guests. For intel guests, this is sufficient for guest to enable
> low power CPU power management. For AMD guests it isn't sufficient,
> deeper C-states are entered using System-IO.
> 
> mwait based power management is tied closely to specifics of CPUID,
> making migration challenging. At this point only the non-migrateable
> -cpu host is supported.
> 
> With this patch applied, VM latency is within the noise of
> baremetal for some benchmarks.
> 
> perf bench sched pipe results:
> Before:
>     6.452 sec
> After:
>     4.382 sec
> Baremetal:
>     4.136 sec
> 
> Michael S. Tsirkin (2):
>   kvm: support -realtime cpu-pm=on|off

IMHO this really shouldn't be under the -realtime flag. I don't think
the -realtime flag should ever have been introduced, and we certainly
shouldn't add more stuff under it.

"-realtime" is referring to a very specific use case, while the
properties listed under it are all general purpose features. Real
time guests just happen to be one possible use case, but it is
valid to use them for non-real time guests.

IOW, I think we should just have this as an option under -cpu or
some other *functionally* named option, not a option named after
a specific usage scenario.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Paolo Bonzini, 1 week ago
On 14/06/2018 10:18, Daniel P. Berrangé wrote:
> I don't think
> the -realtime flag should ever have been introduced, and we certainly
> shouldn't add more stuff under it.
> 
> "-realtime" is referring to a very specific use case, while the
> properties listed under it are all general purpose features. Real
> time guests just happen to be one possible use case, but it is
> valid to use them for non-real time guests.
> 
> IOW, I think we should just have this as an option under -cpu or
> some other *functionally* named option, not a option named after
> a specific usage scenario.

"-cpu" is certainly wrong for KVM_CAP_X86_DISABLE_EXITS. "-cpu" is a
device option, while this is about host behavior.  "-realtime"'s name is
awful, but I still think it's the best place for this option.  Maybe we
could call it "-realtime power-mgmt={host|guest}".

A separate issue is whether the same flag should control both
KVM_CAP_X86_DISABLE_EXITS and the monitor/mwait CPUID leaf.  Eduardo,
what do you think?

Paolo

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Daniel P. Berrangé, 1 week ago
On Thu, Jun 14, 2018 at 05:40:41PM +0200, Paolo Bonzini wrote:
> On 14/06/2018 10:18, Daniel P. Berrangé wrote:
> > I don't think
> > the -realtime flag should ever have been introduced, and we certainly
> > shouldn't add more stuff under it.
> > 
> > "-realtime" is referring to a very specific use case, while the
> > properties listed under it are all general purpose features. Real
> > time guests just happen to be one possible use case, but it is
> > valid to use them for non-real time guests.
> > 
> > IOW, I think we should just have this as an option under -cpu or
> > some other *functionally* named option, not a option named after
> > a specific usage scenario.
> 
> "-cpu" is certainly wrong for KVM_CAP_X86_DISABLE_EXITS. "-cpu" is a
> device option, while this is about host behavior.  "-realtime"'s name is
> awful, but I still think it's the best place for this option.  Maybe we
> could call it "-realtime power-mgmt={host|guest}".

If none of the existing ones are a suitable fit, then we should just
introduce a new CLI arg instead stuffing it into somewhere odd.

IOW, why not just "--power-mgmt host|guest"

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Paolo Bonzini, 1 week ago
On 14/06/2018 17:44, Daniel P. Berrangé wrote:
>> "-cpu" is certainly wrong for KVM_CAP_X86_DISABLE_EXITS. "-cpu" is a
>> device option, while this is about host behavior.  "-realtime"'s name is
>> awful, but I still think it's the best place for this option.  Maybe we
>> could call it "-realtime power-mgmt={host|guest}".
> If none of the existing ones are a suitable fit, then we should just
> introduce a new CLI arg instead stuffing it into somewhere odd.

It is related to mlock in my opinion.  Both options are about applying
settings that reduce latency but may not be applicable in general.
Maybe "-realtime" could be renamed to "-perftune" or something like
that, but I'd like to keep the grouping.

Paolo

> IOW, why not just "--power-mgmt host|guest"


Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Michael S. Tsirkin, 1 week ago
On Thu, Jun 14, 2018 at 10:32:00PM +0200, Paolo Bonzini wrote:
> On 14/06/2018 17:44, Daniel P. Berrangé wrote:
> >> "-cpu" is certainly wrong for KVM_CAP_X86_DISABLE_EXITS. "-cpu" is a
> >> device option, while this is about host behavior.  "-realtime"'s name is
> >> awful, but I still think it's the best place for this option.  Maybe we
> >> could call it "-realtime power-mgmt={host|guest}".
> > If none of the existing ones are a suitable fit, then we should just
> > introduce a new CLI arg instead stuffing it into somewhere odd.
> 
> It is related to mlock in my opinion.  Both options are about applying
> settings that reduce latency but may not be applicable in general.

I would add, it's also about host resource handling.

With mlock guest memory is locked so there's no memory overcommit.
With cpu-pm guest does not give up host cpu which mostly disables
host cpu overcommit.


> Maybe "-realtime" could be renamed to "-perftune" or something like
> that, but I'd like to keep the grouping.
> 
> Paolo

-overcommit ?

> > IOW, why not just "--power-mgmt host|guest"

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Eduardo Habkost, 1 week ago
On Thu, Jun 14, 2018 at 05:40:41PM +0200, Paolo Bonzini wrote:
> On 14/06/2018 10:18, Daniel P. Berrangé wrote:
> > I don't think
> > the -realtime flag should ever have been introduced, and we certainly
> > shouldn't add more stuff under it.
> > 
> > "-realtime" is referring to a very specific use case, while the
> > properties listed under it are all general purpose features. Real
> > time guests just happen to be one possible use case, but it is
> > valid to use them for non-real time guests.
> > 
> > IOW, I think we should just have this as an option under -cpu or
> > some other *functionally* named option, not a option named after
> > a specific usage scenario.
> 
> "-cpu" is certainly wrong for KVM_CAP_X86_DISABLE_EXITS. "-cpu" is a
> device option, while this is about host behavior.  "-realtime"'s name is
> awful, but I still think it's the best place for this option.  Maybe we
> could call it "-realtime power-mgmt={host|guest}".
> 
> A separate issue is whether the same flag should control both
> KVM_CAP_X86_DISABLE_EXITS and the monitor/mwait CPUID leaf.  Eduardo,
> what do you think?

Making "-cpu host" be affected by a host-side option is
acceptable to me.  A "-cpu" option would be more appropriate if
we decide to allow monitor/mwait be enabled for other CPU models
too.

-- 
Eduardo

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Paolo Bonzini, 1 week ago
On 14/06/2018 18:53, Eduardo Habkost wrote:
>>
>> A separate issue is whether the same flag should control both
>> KVM_CAP_X86_DISABLE_EXITS and the monitor/mwait CPUID leaf.  Eduardo,
>> what do you think?
> Making "-cpu host" be affected by a host-side option is
> acceptable to me.  A "-cpu" option would be more appropriate if
> we decide to allow monitor/mwait be enabled for other CPU models
> too.

Sounds fair enough, we should probably add a note in the documentation
of "-cpu host" though.

Paolo

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Michael S. Tsirkin, 1 week ago
On Thu, Jun 14, 2018 at 05:40:41PM +0200, Paolo Bonzini wrote:
> On 14/06/2018 10:18, Daniel P. Berrangé wrote:
> > I don't think
> > the -realtime flag should ever have been introduced, and we certainly
> > shouldn't add more stuff under it.
> > 
> > "-realtime" is referring to a very specific use case, while the
> > properties listed under it are all general purpose features. Real
> > time guests just happen to be one possible use case, but it is
> > valid to use them for non-real time guests.
> > 
> > IOW, I think we should just have this as an option under -cpu or
> > some other *functionally* named option, not a option named after
> > a specific usage scenario.
> 
> "-cpu" is certainly wrong for KVM_CAP_X86_DISABLE_EXITS. "-cpu" is a
> device option, while this is about host behavior.  "-realtime"'s name is
> awful, but I still think it's the best place for this option.  Maybe we
> could call it "-realtime power-mgmt={host|guest}".

I don't feel it's a good name.  The new flag allows guest to put CPU
into a low power state without host knowing about it.  This will affect
other guests on the same host CPU.  It doesn't however limit power
management to guest only, things like speedstep remain under host
control.

> A separate issue is whether the same flag should control both
> KVM_CAP_X86_DISABLE_EXITS and the monitor/mwait CPUID leaf.  Eduardo,
> what do you think?
> 
> Paolo

It's just for -cpu host.

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Kashyap Chamarthy, 2 days ago
(Cc: Thomas Huth)

On Thu, Jun 14, 2018 at 09:18:00AM +0100, Daniel P. Berrangé wrote:

[...]

> IMHO this really shouldn't be under the -realtime flag. I don't think
> the -realtime flag should ever have been introduced, and we certainly
> shouldn't add more stuff under it.
> 
> "-realtime" is referring to a very specific use case, while the
> properties listed under it are all general purpose features. Real
> time guests just happen to be one possible use case, but it is
> valid to use them for non-real time guests.

Given what you say above, then it sounds like the "-realtime" QEMU
option should be deprecated, and removed in a future release.

[...]


-- 
/kashyap

Re: [Qemu-devel] [PATCH v2 0/2] kvm: x86 CPU power management
Posted by Michael S. Tsirkin, 1 day ago
On Fri, Jun 22, 2018 at 03:06:59PM +0200, Kashyap Chamarthy wrote:
> (Cc: Thomas Huth)
> 
> On Thu, Jun 14, 2018 at 09:18:00AM +0100, Daniel P. Berrangé wrote:
> 
> [...]
> 
> > IMHO this really shouldn't be under the -realtime flag. I don't think
> > the -realtime flag should ever have been introduced, and we certainly
> > shouldn't add more stuff under it.
> > 
> > "-realtime" is referring to a very specific use case, while the
> > properties listed under it are all general purpose features. Real
> > time guests just happen to be one possible use case, but it is
> > valid to use them for non-real time guests.
> 
> Given what you say above, then it sounds like the "-realtime" QEMU
> option should be deprecated, and removed in a future release.
> 
> [...]
> 

As a patch on top,  sure.

> -- 
> /kashyap

[Qemu-devel] [PATCH v2 1/2] kvm: support -realtime cpu-pm=on|off
Posted by Michael S. Tsirkin, 1 week ago
With this flag, kvm allows guest to control host CPU power state.  This
increases latency for other processes using same host CPU in an
unpredictable way, but if decreases idle entry/exit times for the
running VCPU.

Follow-up patches will expose this capability to guest
(using mwait leaf).

Based on a patch by Wanpeng Li <kernellwp@gmail.com> .

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 include/sysemu/sysemu.h |  1 +
 target/i386/kvm.c       | 22 ++++++++++++++++++++++
 vl.c                    |  6 ++++++
 qemu-options.hx         |  9 +++++++--
 4 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index e893f72f3b..b921c6f3b7 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -128,6 +128,7 @@ extern bool boot_strict;
 extern uint8_t *boot_splash_filedata;
 extern size_t boot_splash_filedata_size;
 extern bool enable_mlock;
+extern bool enable_cpu_pm;
 extern uint8_t qemu_extra_params_fw[2];
 extern QEMUClockType rtc_clock;
 extern const char *mem_path;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 44f70733e7..f093d55209 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -1357,6 +1357,28 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
         smram_machine_done.notify = register_smram_listener;
         qemu_add_machine_init_done_notifier(&smram_machine_done);
     }
+
+    if (enable_cpu_pm) {
+        int disable_exits = kvm_check_extension(s, KVM_CAP_X86_DISABLE_EXITS);
+        int ret;
+
+/* Work around for kernel header with a typo. TODO: fix header and drop. */
+#if defined(KVM_X86_DISABLE_EXITS_HTL) && !defined(KVM_X86_DISABLE_EXITS_HLT)
+#define KVM_X86_DISABLE_EXITS_HLT KVM_X86_DISABLE_EXITS_HTL
+#endif
+        if (disable_exits) {
+            disable_exits &= (KVM_X86_DISABLE_EXITS_MWAIT |
+                              KVM_X86_DISABLE_EXITS_HLT |
+                              KVM_X86_DISABLE_EXITS_PAUSE);
+        }
+
+        ret = kvm_vm_enable_cap(s, KVM_CAP_X86_DISABLE_EXITS, 0,
+                                disable_exits);
+        if (ret < 0) {
+            error_report("kvm: guest stopping CPU not supported: %s", strerror(-ret));
+        }
+    }
+
     return 0;
 }
 
diff --git a/vl.c b/vl.c
index 06031715ac..7bea9c2177 100644
--- a/vl.c
+++ b/vl.c
@@ -142,6 +142,7 @@ ram_addr_t ram_size;
 const char *mem_path = NULL;
 int mem_prealloc = 0; /* force preallocation of physical target memory */
 bool enable_mlock = false;
+bool enable_cpu_pm = false;
 int nb_nics;
 NICInfo nd_table[MAX_NICS];
 int autostart;
@@ -386,6 +387,10 @@ static QemuOptsList qemu_realtime_opts = {
             .name = "mlock",
             .type = QEMU_OPT_BOOL,
         },
+        {
+            .name = "cpu-pm",
+            .type = QEMU_OPT_BOOL,
+        },
         { /* end of list */ }
     },
 };
@@ -3904,6 +3909,7 @@ int main(int argc, char **argv, char **envp)
                     exit(1);
                 }
                 enable_mlock = qemu_opt_get_bool(opts, "mlock", true);
+                enable_cpu_pm = qemu_opt_get_bool(opts, "cpu-pm", false);
                 break;
             case QEMU_OPTION_msg:
                 opts = qemu_opts_parse_noisily(qemu_find_opts("msg"), optarg,
diff --git a/qemu-options.hx b/qemu-options.hx
index c0d3951e9f..e6f31071ce 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3325,16 +3325,21 @@ Do not start CPU at startup (you must type 'c' in the monitor).
 ETEXI
 
 DEF("realtime", HAS_ARG, QEMU_OPTION_realtime,
-    "-realtime [mlock=on|off]\n"
+    "-realtime [mlock=on|off][cpu-halt=on|off[\n"
     "                run qemu with realtime features\n"
-    "                mlock=on|off controls mlock support (default: on)\n",
+    "                mlock=on|off controls mlock support (default: on)\n"
+    "                cpu-pm=on|off controls cpu power management (default: off)\n",
     QEMU_ARCH_ALL)
 STEXI
 @item -realtime mlock=on|off
+@item -realtime cpu-pm=on|off
 @findex -realtime
 Run qemu with realtime features.
 mlocking qemu and guest memory can be enabled via @option{mlock=on}
 (enabled by default).
+guest ability to manage power state of host cpus (increasing latency for other
+processes on the same host cpu, but decreasing latency for guest)
+can be enabled via @option{cpu-pm=on} (disabled by default).
 ETEXI
 
 DEF("gdb", HAS_ARG, QEMU_OPTION_gdb, \
-- 
MST


Re: [Qemu-devel] [PATCH v2 1/2] kvm: support -realtime cpu-pm=on|off
Posted by Eduardo Habkost, 1 week ago
On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> With this flag, kvm allows guest to control host CPU power state.  This
> increases latency for other processes using same host CPU in an
> unpredictable way, but if decreases idle entry/exit times for the
> running VCPU.
> 
> Follow-up patches will expose this capability to guest
> (using mwait leaf).
> 
> Based on a patch by Wanpeng Li <kernellwp@gmail.com> .
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

The interface makes sense to me, but:

> +extern bool enable_cpu_pm;

Why do we need a global variable if initialization code can call
qemu_opt_get_bool() directly?

-- 
Eduardo

Re: [Qemu-devel] [PATCH v2 1/2] kvm: support -realtime cpu-pm=on|off
Posted by Eduardo Habkost, 1 week ago
On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> With this flag, kvm allows guest to control host CPU power state.  This
> increases latency for other processes using same host CPU in an
> unpredictable way, but if decreases idle entry/exit times for the
> running VCPU.
> 
> Follow-up patches will expose this capability to guest
> (using mwait leaf).
> 
> Based on a patch by Wanpeng Li <kernellwp@gmail.com> .
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  include/sysemu/sysemu.h |  1 +
>  target/i386/kvm.c       | 22 ++++++++++++++++++++++
>  vl.c                    |  6 ++++++
>  qemu-options.hx         |  9 +++++++--
>  4 files changed, 36 insertions(+), 2 deletions(-)
> 
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index e893f72f3b..b921c6f3b7 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -128,6 +128,7 @@ extern bool boot_strict;
>  extern uint8_t *boot_splash_filedata;
>  extern size_t boot_splash_filedata_size;
>  extern bool enable_mlock;
> +extern bool enable_cpu_pm;

After looking at patch 2/2, I see that the global variable is
useful, and it's consistent with the existing enable_mlock
variable.

>  extern uint8_t qemu_extra_params_fw[2];
>  extern QEMUClockType rtc_clock;
>  extern const char *mem_path;
> diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> index 44f70733e7..f093d55209 100644
> --- a/target/i386/kvm.c
> +++ b/target/i386/kvm.c
> @@ -1357,6 +1357,28 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>          smram_machine_done.notify = register_smram_listener;
>          qemu_add_machine_init_done_notifier(&smram_machine_done);
>      }
> +
> +    if (enable_cpu_pm) {
> +        int disable_exits = kvm_check_extension(s, KVM_CAP_X86_DISABLE_EXITS);
> +        int ret;
> +
> +/* Work around for kernel header with a typo. TODO: fix header and drop. */
> +#if defined(KVM_X86_DISABLE_EXITS_HTL) && !defined(KVM_X86_DISABLE_EXITS_HLT)
> +#define KVM_X86_DISABLE_EXITS_HLT KVM_X86_DISABLE_EXITS_HTL
> +#endif
> +        if (disable_exits) {
> +            disable_exits &= (KVM_X86_DISABLE_EXITS_MWAIT |
> +                              KVM_X86_DISABLE_EXITS_HLT |
> +                              KVM_X86_DISABLE_EXITS_PAUSE);
> +        }
> +
> +        ret = kvm_vm_enable_cap(s, KVM_CAP_X86_DISABLE_EXITS, 0,
> +                                disable_exits);

Isn't the kvm_vm_enable_cap() call supposed to be inside the "if
(disable_exits)" block?

> +        if (ret < 0) {
> +            error_report("kvm: guest stopping CPU not supported: %s", strerror(-ret));
> +        }
> +    }
> +
>      return 0;
>  }
[...]

-- 
Eduardo

Re: [Qemu-devel] [PATCH v2 1/2] kvm: support -realtime cpu-pm=on|off
Posted by Michael S. Tsirkin, 1 week ago
On Wed, Jun 13, 2018 at 06:00:22PM -0300, Eduardo Habkost wrote:
> On Tue, Jun 12, 2018 at 09:47:11PM +0300, Michael S. Tsirkin wrote:
> > With this flag, kvm allows guest to control host CPU power state.  This
> > increases latency for other processes using same host CPU in an
> > unpredictable way, but if decreases idle entry/exit times for the
> > running VCPU.
> > 
> > Follow-up patches will expose this capability to guest
> > (using mwait leaf).
> > 
> > Based on a patch by Wanpeng Li <kernellwp@gmail.com> .
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> >  include/sysemu/sysemu.h |  1 +
> >  target/i386/kvm.c       | 22 ++++++++++++++++++++++
> >  vl.c                    |  6 ++++++
> >  qemu-options.hx         |  9 +++++++--
> >  4 files changed, 36 insertions(+), 2 deletions(-)
> > 
> > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > index e893f72f3b..b921c6f3b7 100644
> > --- a/include/sysemu/sysemu.h
> > +++ b/include/sysemu/sysemu.h
> > @@ -128,6 +128,7 @@ extern bool boot_strict;
> >  extern uint8_t *boot_splash_filedata;
> >  extern size_t boot_splash_filedata_size;
> >  extern bool enable_mlock;
> > +extern bool enable_cpu_pm;
> 
> After looking at patch 2/2, I see that the global variable is
> useful, and it's consistent with the existing enable_mlock
> variable.
> 
> >  extern uint8_t qemu_extra_params_fw[2];
> >  extern QEMUClockType rtc_clock;
> >  extern const char *mem_path;
> > diff --git a/target/i386/kvm.c b/target/i386/kvm.c
> > index 44f70733e7..f093d55209 100644
> > --- a/target/i386/kvm.c
> > +++ b/target/i386/kvm.c
> > @@ -1357,6 +1357,28 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
> >          smram_machine_done.notify = register_smram_listener;
> >          qemu_add_machine_init_done_notifier(&smram_machine_done);
> >      }
> > +
> > +    if (enable_cpu_pm) {
> > +        int disable_exits = kvm_check_extension(s, KVM_CAP_X86_DISABLE_EXITS);
> > +        int ret;
> > +
> > +/* Work around for kernel header with a typo. TODO: fix header and drop. */
> > +#if defined(KVM_X86_DISABLE_EXITS_HTL) && !defined(KVM_X86_DISABLE_EXITS_HLT)
> > +#define KVM_X86_DISABLE_EXITS_HLT KVM_X86_DISABLE_EXITS_HTL
> > +#endif
> > +        if (disable_exits) {
> > +            disable_exits &= (KVM_X86_DISABLE_EXITS_MWAIT |
> > +                              KVM_X86_DISABLE_EXITS_HLT |
> > +                              KVM_X86_DISABLE_EXITS_PAUSE);
> > +        }
> > +
> > +        ret = kvm_vm_enable_cap(s, KVM_CAP_X86_DISABLE_EXITS, 0,
> > +                                disable_exits);
> 
> Isn't the kvm_vm_enable_cap() call supposed to be inside the "if
> (disable_exits)" block?

Doing it like this causes a warning if pm is requested but
disable halt is not supported.

But I can move it, sure - let me know.

> > +        if (ret < 0) {
> > +            error_report("kvm: guest stopping CPU not supported: %s", strerror(-ret));
> > +        }
> > +    }
> > +
> >      return 0;
> >  }
> [...]
> 
> -- 
> Eduardo

[Qemu-devel] [PATCH v2 2/2] i386/cpu: make -cpu host support monitor/mwait
Posted by Michael S. Tsirkin, 1 week ago
When guest CPU PM is enabled, and with -cpu host, expose the host CPU
MWAIT leaf in the CPUID so guest can make good PM decisions.

Note: the result is 100% CPU utilization reported by host as host
no longer knows that the CPU is halted.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 target/i386/cpu.h |  9 +++++++++
 target/i386/cpu.c | 19 ++++++++++++++-----
 target/i386/kvm.c |  8 ++++++++
 3 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 664504610e..309f804573 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1378,6 +1378,15 @@ struct X86CPU {
     /* if true the CPUID code directly forward host cache leaves to the guest */
     bool cache_info_passthrough;
 
+    /* if true the CPUID code directly forwards
+     * host monitor/mwait leaves to the guest */
+    struct {
+        uint32_t eax;
+        uint32_t ebx;
+        uint32_t ecx;
+        uint32_t edx;
+    } mwait;
+
     /* Features that were filtered out because of missing host capabilities */
     uint32_t filtered_features[FEATURE_WORDS];
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 94260412e2..a4fb856d58 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3760,11 +3760,11 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         }
         break;
     case 5:
-        /* mwait info: needed for Core compatibility */
-        *eax = 0; /* Smallest monitor-line size in bytes */
-        *ebx = 0; /* Largest monitor-line size in bytes */
-        *ecx = CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
-        *edx = 0;
+        /* MONITOR/MWAIT Leaf */
+        *eax = cpu->mwait.eax; /* Smallest monitor-line size in bytes */
+        *ebx = cpu->mwait.ebx; /* Largest monitor-line size in bytes */
+        *ecx = cpu->mwait.ecx; /* flags */
+        *edx = cpu->mwait.edx; /* mwait substates */
         break;
     case 6:
         /* Thermal and Power Leaf */
@@ -4595,6 +4595,15 @@ static void x86_cpu_realizefn(DeviceState *dev, Error **errp)
         goto out;
     }
 
+    if (xcc->host_cpuid_required && enable_cpu_pm) {
+        host_cpuid(5, 0, &cpu->mwait.eax, &cpu->mwait.ebx,
+                   &cpu->mwait.ecx, &cpu->mwait.edx);
+        env->features[FEAT_1_ECX] |= CPUID_EXT_MONITOR;
+    }
+    /* mwait extended info: needed for Core compatibility */
+    /* We always wake on interrupt even if host does not have the capability */
+    cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;
+
     if (cpu->apic_id == UNASSIGNED_APIC_ID) {
         error_setg(errp, "apic-id property was not initialized properly");
         return;
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index f093d55209..7cf9661159 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -366,6 +366,14 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
         if (!kvm_irqchip_in_kernel()) {
             ret &= ~CPUID_EXT_X2APIC;
         }
+
+        if (enable_cpu_pm) {
+            int disable_exits = kvm_check_extension(s, KVM_CAP_X86_DISABLE_EXITS);
+
+            if (disable_exits & KVM_X86_DISABLE_EXITS_MWAIT) {
+                ret |= CPUID_EXT_MONITOR;
+            }
+        }
     } else if (function == 6 && reg == R_EAX) {
         ret |= CPUID_6_EAX_ARAT; /* safe to allow because of emulated APIC */
     } else if (function == 7 && index == 0 && reg == R_EBX) {
-- 
MST


Re: [Qemu-devel] [PATCH v2 2/2] i386/cpu: make -cpu host support monitor/mwait
Posted by Eduardo Habkost, 1 week ago
On Tue, Jun 12, 2018 at 09:47:13PM +0300, Michael S. Tsirkin wrote:
> When guest CPU PM is enabled, and with -cpu host, expose the host CPU
> MWAIT leaf in the CPUID so guest can make good PM decisions.
> 
> Note: the result is 100% CPU utilization reported by host as host
> no longer knows that the CPU is halted.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>

-- 
Eduardo