From: David Woodhouse <dwmw@amazon.co.uk>
Most of the time there's no need to kick the vCPU and deliver the timer
event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
directly from the timer callback, and only fall back to the slow path
when it's necessary to do so.
This gives a significant improvement in timer latency testing (using
nanosleep() for various periods and then measuring the actual time
elapsed).
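
For illustration, a measurement loop along those lines might look like the
following. This is a hypothetical userspace probe, not part of the patch, and
the periods chosen are arbitrary:

/* Hypothetical latency probe: sleep for a requested period, then
 * report how far past the deadline the wakeup actually landed. */
#include <stdio.h>
#include <time.h>

static long long elapsed_ns(const struct timespec *a, const struct timespec *b)
{
	return (b->tv_sec - a->tv_sec) * 1000000000LL +
	       (b->tv_nsec - a->tv_nsec);
}

int main(void)
{
	static const long periods_ns[] = { 100000, 1000000, 10000000 };

	for (unsigned int i = 0; i < sizeof(periods_ns) / sizeof(periods_ns[0]); i++) {
		struct timespec req = { 0, periods_ns[i] }, start, end;

		clock_gettime(CLOCK_MONOTONIC, &start);
		nanosleep(&req, NULL);
		clock_gettime(CLOCK_MONOTONIC, &end);

		printf("asked %ld ns, slept %lld ns (overshoot %lld ns)\n",
		       periods_ns[i], elapsed_ns(&start, &end),
		       elapsed_ns(&start, &end) - periods_ns[i]);
	}
	return 0;
}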
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
arch/x86/kvm/xen.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 40edf4d1974c..66c4cf93a55c 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -134,12 +134,23 @@ static enum hrtimer_restart xen_timer_callback(struct hrtimer *timer)
{
struct kvm_vcpu *vcpu = container_of(timer, struct kvm_vcpu,
arch.xen.timer);
+ struct kvm_xen_evtchn e;
+ int rc;
+
if (atomic_read(&vcpu->arch.xen.timer_pending))
return HRTIMER_NORESTART;

- atomic_inc(&vcpu->arch.xen.timer_pending);
- kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
- kvm_vcpu_kick(vcpu);
+ e.vcpu_id = vcpu->vcpu_id;
+ e.vcpu_idx = vcpu->vcpu_idx;
+ e.port = vcpu->arch.xen.timer_virq;
+ e.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL;
+
+ rc = kvm_xen_set_evtchn_fast(&e, vcpu->kvm);
+ if (rc == -EWOULDBLOCK) {
+ atomic_inc(&vcpu->arch.xen.timer_pending);
+ kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
+ kvm_vcpu_kick(vcpu);
+ }

return HRTIMER_NORESTART;
}
--
2.34.1
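For context, the slow path that the -EWOULDBLOCK fallback defers to is
kvm_xen_inject_timer_irqs(), which runs later in vCPU context where sleeping
is permitted. Paraphrased from memory of the kernel sources, so treat as
approximate:

void kvm_xen_inject_timer_irqs(struct kvm_vcpu *vcpu)
{
	if (atomic_read(&vcpu->arch.xen.timer_pending) > 0) {
		struct kvm_xen_evtchn e;

		e.vcpu_id = vcpu->vcpu_id;
		e.vcpu_idx = vcpu->vcpu_idx;
		e.port = vcpu->arch.xen.timer_virq;
		e.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL;

		/* The full-fat version, which is allowed to sleep,
		 * e.g. to fault in the shared info page. */
		kvm_xen_set_evtchn(&e, vcpu->kvm);

		vcpu->arch.xen.timer_expires = 0;
		atomic_set(&vcpu->arch.xen.timer_pending, 0);
	}
}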
On Fri, 2023-09-22 at 10:20 +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> Most of the time there's no need to kick the vCPU and deliver the timer
> event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
> directly from the timer callback, and only fall back to the slow path
> when it's necessary to do so.
>
> This gives a significant improvement in timer latency testing (using
> nanosleep() for various periods and then measuring the actual time
> elapsed).
>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>

Hm, scratch that. It brings back the race condition Paolo talks about in
https://lore.kernel.org/kvm/0709ac62f664c0f3123fcdeabed3b79038cef3b6.camel@infradead.org/T/

Well, not *quite*... there's no race with clearing timer_expires,
because I forgot to clear it at all :)

I think I'll be able to fix it up with an hrtimer_cancel() in the rare
ioctl attr_get path to avoid the race.
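A sketch of the kind of fix suggested there, in the
KVM_XEN_VCPU_ATTR_TYPE_TIMER branch of the get_attr handler. This is
hypothetical, not the posted fix; the field and helper names are assumed from
the existing xen.c code:

case KVM_XEN_VCPU_ATTR_TYPE_TIMER:
	/*
	 * Cancel the hrtimer so its callback cannot race with the
	 * snapshot of the timer state, and deliver any event that
	 * was already left pending by the callback...
	 */
	if (vcpu->arch.xen.timer_expires) {
		hrtimer_cancel(&vcpu->arch.xen.timer);
		kvm_xen_inject_timer_irqs(vcpu);
	}

	data->u.timer.port = vcpu->arch.xen.timer_virq;
	data->u.timer.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL;
	data->u.timer.expires_ns = vcpu->arch.xen.timer_expires;

	/* ... then re-arm the timer if it was still outstanding. */
	if (vcpu->arch.xen.timer_expires)
		hrtimer_start_expires(&vcpu->arch.xen.timer,
				      HRTIMER_MODE_ABS_HARD);
	r = 0;
	break;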
On 22/09/2023 10:20, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
>
> Most of the time there's no need to kick the vCPU and deliver the timer
> event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
> directly from the timer callback, and only fall back to the slow path
> when it's necessary to do so.
>
> This gives a significant improvement in timer latency testing (using
> nanosleep() for various periods and then measuring the actual time
> elapsed).
>
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> arch/x86/kvm/xen.c | 17 ++++++++++++++---
> 1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index 40edf4d1974c..66c4cf93a55c 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -134,12 +134,23 @@ static enum hrtimer_restart xen_timer_callback(struct hrtimer *timer)
> {
> struct kvm_vcpu *vcpu = container_of(timer, struct kvm_vcpu,
> arch.xen.timer);
> + struct kvm_xen_evtchn e;
> + int rc;
> +
> if (atomic_read(&vcpu->arch.xen.timer_pending))
> return HRTIMER_NORESTART;
>
> - atomic_inc(&vcpu->arch.xen.timer_pending);
> - kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
> - kvm_vcpu_kick(vcpu);
> + e.vcpu_id = vcpu->vcpu_id;
> + e.vcpu_idx = vcpu->vcpu_idx;
> + e.port = vcpu->arch.xen.timer_virq;
Don't you need to check the port for validity? The VMM may not have set
it (and hence be delivering timer VIRQ events itself).
Paul
> + e.priority = KVM_IRQ_ROUTING_XEN_EVTCHN_PRIO_2LEVEL;
> +
> + rc = kvm_xen_set_evtchn_fast(&e, vcpu->kvm);
> + if (rc == -EWOULDBLOCK) {
> + atomic_inc(&vcpu->arch.xen.timer_pending);
> + kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
> + kvm_vcpu_kick(vcpu);
> + }
>
> return HRTIMER_NORESTART;
> }
On Fri, 2023-09-22 at 10:22 +0100, Paul Durrant wrote:
> On 22/09/2023 10:20, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> >
> > Most of the time there's no need to kick the vCPU and deliver the timer
> > event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
> > directly from the timer callback, and only fall back to the slow path
> > when it's necessary to do so.
> >
> > This gives a significant improvement in timer latency testing (using
> > nanosleep() for various periods and then measuring the actual time
> > elapsed).
> >
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > ---
> > arch/x86/kvm/xen.c | 17 ++++++++++++++---
> > 1 file changed, 14 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> > index 40edf4d1974c..66c4cf93a55c 100644
> > --- a/arch/x86/kvm/xen.c
> > +++ b/arch/x86/kvm/xen.c
> > @@ -134,12 +134,23 @@ static enum hrtimer_restart xen_timer_callback(struct hrtimer *timer)
> > {
> > struct kvm_vcpu *vcpu = container_of(timer, struct kvm_vcpu,
> > arch.xen.timer);
> > + struct kvm_xen_evtchn e;
> > + int rc;
> > +
> > if (atomic_read(&vcpu->arch.xen.timer_pending))
> > return HRTIMER_NORESTART;
> >
> > - atomic_inc(&vcpu->arch.xen.timer_pending);
> > - kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
> > - kvm_vcpu_kick(vcpu);
> > + e.vcpu_id = vcpu->vcpu_id;
> > + e.vcpu_idx = vcpu->vcpu_idx;
> > + e.port = vcpu->arch.xen.timer_virq;
>
> Don't you need to check the port for validity? The VMM may not have set
> it (and hence be delivering timer VIRQ events itself).
Nah, the kvm_xen_timer_enabled() check which gates all the kernel
interception of timer hypercalls is already testing precisely that.
If the TIMER_VIRQ isn't set, none of this code runs. And when userspace
tears down TIMER_VIRQ, the hrtimer is cancelled. So none of this code
(then) runs.
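For reference, the check in question is a one-liner in arch/x86/kvm/xen.h
(quoted from memory, so treat as approximate):

static inline bool kvm_xen_timer_enabled(struct kvm_vcpu *vcpu)
{
	return !!vcpu->arch.xen.timer_virq;
}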
On 22/09/2023 10:29, David Woodhouse wrote:
> On Fri, 2023-09-22 at 10:22 +0100, Paul Durrant wrote:
>> On 22/09/2023 10:20, David Woodhouse wrote:
>>> From: David Woodhouse <dwmw@amazon.co.uk>
>>>
>>> Most of the time there's no need to kick the vCPU and deliver the timer
>>> event through kvm_xen_inject_timer_irqs(). Use kvm_xen_set_evtchn_fast()
>>> directly from the timer callback, and only fall back to the slow path
>>> when it's necessary to do so.
>>>
>>> This gives a significant improvement in timer latency testing (using
>>> nanosleep() for various periods and then measuring the actual time
>>> elapsed).
>>>
>>> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
>>> ---
>>> arch/x86/kvm/xen.c | 17 ++++++++++++++---
>>> 1 file changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
>>> index 40edf4d1974c..66c4cf93a55c 100644
>>> --- a/arch/x86/kvm/xen.c
>>> +++ b/arch/x86/kvm/xen.c
>>> @@ -134,12 +134,23 @@ static enum hrtimer_restart xen_timer_callback(struct hrtimer *timer)
>>> {
>>> struct kvm_vcpu *vcpu = container_of(timer, struct kvm_vcpu,
>>> arch.xen.timer);
>>> + struct kvm_xen_evtchn e;
>>> + int rc;
>>> +
>>> if (atomic_read(&vcpu->arch.xen.timer_pending))
>>> return HRTIMER_NORESTART;
>>>
>>> - atomic_inc(&vcpu->arch.xen.timer_pending);
>>> - kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
>>> - kvm_vcpu_kick(vcpu);
>>> + e.vcpu_id = vcpu->vcpu_id;
>>> + e.vcpu_idx = vcpu->vcpu_idx;
>>> + e.port = vcpu->arch.xen.timer_virq;
>>
>> Don't you need to check the port for validity? The VMM may not have set
>> it (and hence be delivering timer VIRQ events itself).
>
> Nah, the kvm_xen_timer_enabled() check which gates all the kernel
> interception of timer hypercalls is already testing precisely that.
>
> If the TIMER_VIRQ isn't set, none of this code runs. And when userspace
> tears down TIMER_VIRQ, the hrtimer is cancelled. So none of this code
> (then) runs.
Ok. Good :-)
Reviewed-by: Paul Durrant <paul@xen.org>