[PATCH 0/2] KVM: enable halt poll shrink parameter

Parshuram Sangle posted 2 patches 2 years, 1 month ago
Documentation/virt/kvm/halt-polling.rst | 26 +++++++++++++------------
virt/kvm/kvm_main.c                     |  4 ++--
2 files changed, 16 insertions(+), 14 deletions(-)
Posted by Parshuram Sangle 2 years, 1 month ago
KVM's halt polling grow and shrink behavior has evolved since its inception.
The current mechanism adjusts the polling interval, using the grow and shrink
parameter values, based on whether a vCPU wakeup was received during the
polling interval. Although the grow parameter is sensibly set to 2 by default,
the shrink parameter is left disabled (set to 0).

Disabled shrink has two issues:
1) It resets the polling interval to 0 on every unsuccessful poll, on the
assumption that a vCPU wakeup is less likely to arrive within a further
shrunk interval.
2) Even on a successful poll, if the total block time is greater than or equal
to the current poll_ns value, the polling interval is reset to 0 instead of
shrinking gradually.

These aspects reduce the chances of receiving a valid wakeup during polling
and forfeit potential performance benefits for VM workloads.
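
To make the difference concrete, here is a small user-space model of the
shrink step. It is only a sketch, not the kernel code: the function names are
made up for illustration, and the 200000 ns starting interval and 10000 ns
floor (meant to mirror halt_poll_ns_grow_start) are assumed values.

```python
def shrink_poll_ns(poll_ns, shrink, grow_start=10_000):
    """Return the next polling interval (ns) after one shrink step.

    Illustrative model only. `grow_start` is an assumed floor below
    which the interval drops to 0.
    """
    if shrink == 0:
        new_ns = 0                  # shrink disabled: reset to 0 at once
    else:
        new_ns = poll_ns // shrink  # shrink enabled: divide gradually
    if new_ns < grow_start:
        new_ns = 0                  # too small to be worth polling
    return new_ns


def trace(poll_ns, shrink, steps, grow_start=10_000):
    """Apply `steps` consecutive shrink steps and return every interval."""
    history = [poll_ns]
    for _ in range(steps):
        poll_ns = shrink_poll_ns(poll_ns, shrink, grow_start)
        history.append(poll_ns)
    return history


if __name__ == "__main__":
    print(trace(200_000, 0, 3))  # shrink disabled
    print(trace(200_000, 2, 5))  # shrink set to 2
```

With shrink disabled the modeled interval falls from 200000 straight to 0,
while with shrink set to 2 it is halved step by step and only reaches 0 once
it drops below the assumed floor.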

Below is a summary of the experiments conducted to assess the performance and
power impact of enabling the halt_poll_ns_shrink parameter (value set to 2).

Performance Test Summary: (Higher is better)
--------------------------------------------
Platform Details: Chrome Brya platform
CPU - Alder Lake (12th Gen Intel CPU i7-1255U)
Host kernel version - 5.15.127-20371-g710a1611ad33

Android VM workload (Score)   Base      Shrink Enabled (value 2)    Delta
---------------------------------------------------------------------------
GeekBench Multi-core(CPU)     5754      5856                        2%
3D Mark Slingshot(CPU+GPU)    15486     15885                       3%
Stream (handopt)(Memory)      20566     21594                       5%
fio seq-read (Storage)        727       747                         3%
fio seq-write (Storage)       331       343                         3%
fio rand-read (Storage)       690       732                         6%
fio rand-write (Storage)      299       300                         1%

Steam Gaming VM (Avg FPS)     Base      Shrink Enabled (value 2)    Delta
---------------------------------------------------------------------------
Metro Redux (OpenGL)          54.80     59.60                       9%
Dota 2 (OpenGL)               48.74     51.40                       5%
Dota 2 (Vulkan)               20.80     21.10                       1%
SpaceShip (Vulkan)            20.40     21.52                       6%

With shrink enabled, the majority of workloads show a higher percentage of
successful polls. The reduced latency in returning control to the VM and the
avoided vm_exit overhead contribute to these performance gains.

Power Impact Assessment Summary: (Lower is better)
--------------------------------------------------
Method: DAQ measurements of CPU and Memory rails

CPU+Memory (Watt)             Base      Shrink Enabled (value 2)    Delta
---------------------------------------------------------------------------
Idle* (Host)                  0.636     0.631                       -0.8%
Video Playback (Host)         2.225     2.210                       -0.7%
Tomb Raider (VM)              17.261    17.175                      -0.5%
SpaceShip Benchmark(VM)       17.079    17.123                       0.3%

*Idle power: idle system with no applications running; Android and Borealis
VMs enabled but running no workload. Duration: 180 sec.
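
For reference, the Delta column of the power table can be re-derived from the
Base and Shrink Enabled columns as a percentage change relative to Base. The
sketch below assumes rounding to one decimal place; the helper name pct_delta
is made up for illustration.

```python
def pct_delta(base, new):
    """Percentage change of `new` relative to `base`, to one decimal."""
    return round((new - base) / base * 100, 1)


# (Base, Shrink Enabled) in watts, copied from the power table.
power_watts = {
    "Idle (Host)": (0.636, 0.631),
    "Video Playback (Host)": (2.225, 2.210),
    "Tomb Raider (VM)": (17.261, 17.175),
    "SpaceShip Benchmark (VM)": (17.079, 17.123),
}

deltas = {name: pct_delta(base, new)
          for name, (base, new) in power_watts.items()}

if __name__ == "__main__":
    for name, delta in deltas.items():
        print(f"{name:28s}{delta:+.1f}%")
```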

Power measurements for the Chrome idle scenario and the active gaming VM
workload show negligible power overhead, since the additional polling occurs
in very short bursts during which the CPU is unlikely to have entered a
complete idle state anyway.

NOTE: No tests were conducted on non-x86 platforms with this changed config.

The default values of the grow and shrink parameters are commonly used by
VM deployments unless they are specifically tuned for performance. Hence,
based on the performance and power measurements shown above, it is
recommended to enable shrink (with value 2) by default so that there is no
need to set this parameter explicitly through the kernel cmdline or by
other means.
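
As an illustration of the explicit tuning this would make unnecessary: the
value can be passed on the kernel command line as kvm.halt_poll_ns_shrink=2,
and the value in effect can be inspected through sysfs. The sketch below reads
it from /sys/module/kvm/parameters; the helper name read_kvm_param is made up
for illustration, and the path assumes the kvm module is present on the host.

```python
from pathlib import Path

# Standard sysfs location for kvm module parameters (assumes kvm is
# built in or loaded on the host running this script).
KVM_PARAM_DIR = Path("/sys/module/kvm/parameters")


def read_kvm_param(name, param_dir=KVM_PARAM_DIR):
    """Return the integer value of a kvm module parameter, or None.

    None means the parameter file is missing or unreadable, e.g. on a
    host without kvm.
    """
    path = Path(param_dir) / name
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None


if __name__ == "__main__":
    for param in ("halt_poll_ns_grow", "halt_poll_ns_shrink"):
        print(param, "=", read_kvm_param(param))
```

On a kernel without this series the second line would be expected to report 0
unless the parameter has been tuned.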

Parshuram Sangle (2):
  KVM: enable halt polling shrink parameter by default
  KVM: documentation update to halt polling

 Documentation/virt/kvm/halt-polling.rst | 26 +++++++++++++------------
 virt/kvm/kvm_main.c                     |  4 ++--
 2 files changed, 16 insertions(+), 14 deletions(-)


base-commit: 2b3f2325e71f09098723727d665e2e8003d455dc
-- 
2.17.1
Re: [PATCH 0/2] KVM: enable halt poll shrink parameter
Posted by Sean Christopherson 1 year, 6 months ago
On Thu, 02 Nov 2023 21:16:26 +0530, Parshuram Sangle wrote:
> KVM halt polling interval growth and shrink behavior has evolved since its
> inception. The current mechanism adjusts the polling interval based on whether
> vcpu wakeup was received or not during polling interval using grow and shrink
> parameter values. Though grow parameter is logically set to 2 by default,
> shrink parameter is kept disabled (set to 0).
> 
> Disabled shrink has two issues:
> 1) Resets polling interval to 0 on every un-successful poll assuming it is
> less likely to receive a vcpu wakeup in further shrunk intervals.
> 2) Even on successful poll, if total block time is greater or equal to current
> poll_ns value, polling interval is reset to 0 instead shrinking gradually.
> 
> [...]

Applied to kvm-x86 generic, with a reduced version of the doc update as
described in response to patch 2.  Thanks!

[1/2] KVM: enable halt polling shrink parameter by default
      https://github.com/kvm-x86/linux/commit/aeb1b22a3ac8
[2/2] KVM: documentation update to halt polling
      https://github.com/kvm-x86/linux/commit/f8aadead1971

--
https://github.com/kvm-x86/linux/tree/next
Re: [PATCH 0/2] KVM: enable halt poll shrink parameter
Posted by Sean Christopherson 1 year, 8 months ago
On Thu, Nov 02, 2023, Parshuram Sangle wrote:
> KVM halt polling interval growth and shrink behavior has evolved since its
> inception. The current mechanism adjusts the polling interval based on whether
> vcpu wakeup was received or not during polling interval using grow and shrink
> parameter values. Though grow parameter is logically set to 2 by default,
> shrink parameter is kept disabled (set to 0).
> 
> Disabled shrink has two issues:
> 1) Resets polling interval to 0 on every un-successful poll assuming it is
> less likely to receive a vcpu wakeup in further shrunk intervals.
> 2) Even on successful poll, if total block time is greater or equal to current
> poll_ns value, polling interval is reset to 0 instead shrinking gradually.
> 
> [...]

I am by no means an expert on halt polling or power management, but all of this
seems like a reasonable tradeoff.  And even without the numbers you provided,
starting from scratch after a single failure is rather odd.

So unless someone objects, I'll plan on applying this for 6.11 in a few weeks
(after the 6.10 merge window closes).
Re: [PATCH 0/2] KVM: enable halt poll shrink parameter
Posted by Parshuram Sangle 2 years ago
Soft reminder for patch review