[v5] Add SBI v3.0 PMU enhancements

[PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memslot

Posted by Atish Patra 5 months, 2 weeks ago

The arch-specific code may need to know whether a particular gpa is valid and
writable for the shared memory between the host and the guest. Currently,
there are a few places where this check is used in the RISC-V implementation.
Given the nature of the function, it may be useful for other architectures too.
Hence, a common helper function is added.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 include/linux/kvm_host.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 15656b7fba6c..eec5cbbcb4b3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1892,6 +1892,14 @@ static inline bool kvm_is_gpa_in_memslot(struct kvm *kvm, gpa_t gpa)
 	return !kvm_is_error_hva(hva);
 }
 
+static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
+{
+	bool writable;
+	unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
+
+	return !kvm_is_error_hva(hva) && writable;
+}
+
 static inline void kvm_gpc_mark_dirty_in_slot(struct gfn_to_pfn_cache *gpc)
 {
 	lockdep_assert_held(&gpc->lock);

-- 
2.43.0

Re: [PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memslot

Posted by Sean Christopherson 5 months, 2 weeks ago

On Fri, Aug 29, 2025, Atish Patra wrote:
> The arch specific code may need to know if a particular gpa is valid and
> writable for the shared memory between the host and the guest. Currently,
> there are few places where it is used in RISC-V implementation. Given the
> nature of the function it may be used for other architectures.
> Hence, a common helper function is added.
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  include/linux/kvm_host.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 15656b7fba6c..eec5cbbcb4b3 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1892,6 +1892,14 @@ static inline bool kvm_is_gpa_in_memslot(struct kvm *kvm, gpa_t gpa)
>  	return !kvm_is_error_hva(hva);
>  }
>  
> +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> +{
> +	bool writable;
> +	unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> +
> +	return !kvm_is_error_hva(hva) && writable;

I don't hate this API, but I don't love it either.  Because knowing that the
_memslot_ is writable doesn't mean all that much.  E.g. in this usage:

	hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
	if (kvm_is_error_hva(hva) || !writable)
		return SBI_ERR_INVALID_ADDRESS;

	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
	if (ret)
		return SBI_ERR_FAILURE;

the error code returned to the guest will be different if the memslot is read-only
versus if the VMA is read-only (or not even mapped!).  Unless every read-only
memslot is explicitly communicated as such to the guest, I don't see how the guest
can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
but not when the underlying VMA isn't writable seems odd.

It's also entirely possible the memslot could be replaced with a read-only memslot
after the check, or vice versa, i.e. become writable after being rejected.  Is it
*really* a problem to return FAILURE if the guest attempts to set up steal-time in
a read-only memslot?  I.e. why not do this and call it good?

	if (!kvm_is_gpa_in_memslot(vcpu->kvm, shmem))
		return SBI_ERR_INVALID_ADDRESS;

	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
	if (ret)
		return SBI_ERR_FAILURE;
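
To make the difference between the two variants concrete, here is a small
user-space model. It is only a sketch and not KVM code: struct mock_page,
mock_write_guest() and the two set_shmem_*() functions are hypothetical
stand-ins, and the model assumes that a write fails whenever the memslot or
the backing VMA is read-only.

```c
#include <assert.h>
#include <stdbool.h>

/* SBI error codes, using the values from the SBI specification. */
#define SBI_SUCCESS              0
#define SBI_ERR_FAILURE         (-1)
#define SBI_ERR_INVALID_ADDRESS (-5)

/* Hypothetical model of one guest page; none of this is a KVM type. */
struct mock_page {
	bool in_memslot;    /* some memslot covers the gpa */
	bool slot_writable; /* that memslot is not read-only */
	bool vma_writable;  /* the host mapping behind it accepts writes */
};

/* Stand-in for kvm_vcpu_write_guest(): nonzero on any failure. */
static int mock_write_guest(const struct mock_page *p)
{
	assert(p);
	return (p->in_memslot && p->slot_writable && p->vma_writable) ? 0 : -1;
}

/* Variant with the writable-memslot pre-check. */
static long set_shmem_writable_check(const struct mock_page *p)
{
	if (!p->in_memslot || !p->slot_writable)
		return SBI_ERR_INVALID_ADDRESS;
	if (mock_write_guest(p))
		return SBI_ERR_FAILURE;
	return SBI_SUCCESS;
}

/* Variant that only checks that a memslot covers the address. */
static long set_shmem_presence_check(const struct mock_page *p)
{
	if (!p->in_memslot)
		return SBI_ERR_INVALID_ADDRESS;
	if (mock_write_guest(p))
		return SBI_ERR_FAILURE;
	return SBI_SUCCESS;
}
```

In this model, the first variant reports INVALID_ADDRESS for a read-only
memslot but FAILURE for a read-only VMA, while the second variant reports
FAILURE for both. The model deliberately ignores the race with memslot
updates, which no pre-check can close.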

Re: [PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memslot

Posted by Atish Kumar Patra 5 months, 1 week ago

On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Aug 29, 2025, Atish Patra wrote:
> > The arch specific code may need to know if a particular gpa is valid and
> > writable for the shared memory between the host and the guest. Currently,
> > there are few places where it is used in RISC-V implementation. Given the
> > nature of the function it may be used for other architectures.
> > Hence, a common helper function is added.
> >
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  include/linux/kvm_host.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 15656b7fba6c..eec5cbbcb4b3 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1892,6 +1892,14 @@ static inline bool kvm_is_gpa_in_memslot(struct kvm *kvm, gpa_t gpa)
> >       return !kvm_is_error_hva(hva);
> >  }
> >
> > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> > +{
> > +     bool writable;
> > +     unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> > +
> > +     return !kvm_is_error_hva(hva) && writable;
>
> I don't hate this API, but I don't love it either.  Because knowing that the
> _memslot_ is writable doesn't mean all that much.  E.g. in this usage:
>
>         hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
>         if (kvm_is_error_hva(hva) || !writable)
>                 return SBI_ERR_INVALID_ADDRESS;
>
>         ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
>         if (ret)
>                 return SBI_ERR_FAILURE;
>
> the error code returned to the guest will be different if the memslot is read-only
> versus if the VMA is read-only (or not even mapped!).  Unless every read-only
> memslot is explicitly communicated as such to the guest, I don't see how the guest
> can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
> but not when the underlying VMA isn't writable seems odd.
>
> It's also entirely possible the memslot could be replaced with a read-only memslot
> after the check, or vice versa, i.e. become writable after being rejected.  Is it
> *really* a problem to return FAILURE if the guest attempts to setup steal-time in
> a read-only memslot?  I.e. why not do this and call it good?
>

Reposting the response because Gmail converted my previous reply to
HTML. Sorry for the spam.

From a functionality point of view, that should be fine. However, we have
explicit error conditions for read-only memory defined in the SBI STA
specification[1].
Technically, we will violate the spec if we return FAILURE instead of
INVALID_ADDRESS for a read-only memslot.

TBH, I don't save much duplicate code with the new generic API now.
If you don't think the generic API will be useful in other cases, I
can drop that patch and the changes in the steal-time code.

[1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-steal-time.adoc#table_sta_steal_time_set_shmem_errors
>         if (!kvm_is_gpa_in_memslot(vcpu->kvm, shmem))
>                 return SBI_ERR_INVALID_ADDRESS;
>
>         ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
>         if (ret)
>                 return SBI_ERR_FAILURE;

Re: [PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memslot

Posted by Sean Christopherson 5 months, 1 week ago

On Wed, Sep 03, 2025, Atish Kumar Patra wrote:
> On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Aug 29, 2025, Atish Patra wrote:
> > > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> > > +{
> > > +     bool writable;
> > > +     unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> > > +
> > > +     return !kvm_is_error_hva(hva) && writable;
> >
> > I don't hate this API, but I don't love it either.  Because knowing that the
> > _memslot_ is writable doesn't mean all that much.  E.g. in this usage:
> >
> >         hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
> >         if (kvm_is_error_hva(hva) || !writable)
> >                 return SBI_ERR_INVALID_ADDRESS;
> >
> >         ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
> >         if (ret)
> >                 return SBI_ERR_FAILURE;
> >
> > the error code returned to the guest will be different if the memslot is read-only
> > versus if the VMA is read-only (or not even mapped!).  Unless every read-only
> > memslot is explicitly communicated as such to the guest, I don't see how the guest
> > can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
> > but not when the underlying VMA isn't writable seems odd.
> >
> > It's also entirely possible the memslot could be replaced with a read-only memslot
> > after the check, or vice versa, i.e. become writable after being rejected.  Is it
> > *really* a problem to return FAILURE if the guest attempts to setup steal-time in
> > a read-only memslot?  I.e. why not do this and call it good?
> >
> 
> Reposting the response as gmail converted my previous response as
> html. Sorry for the spam.
> 
> From a functionality pov, that should be fine. However, we have
> explicit error conditions for read only memory defined in the SBI STA
> specification[1].
> Technically, we will violate the spec if we return FAILURE instead of
> INVALID_ADDRESS for read only memslot.

But KVM is already violating the spec, as kvm_vcpu_write_guest() redoes the
memslot lookup and so could encounter a read-only memslot (if it races with
a memslot update), and because the underlying memory could be read-only even if
the memslot is writable.

Why not simply return SBI_ERR_INVALID_ADDRESS on kvm_vcpu_write_guest() failure?
The only downside of that is KVM will also return SBI_ERR_INVALID_ADDRESS if the
userspace mapping is completely missing, but AFAICT that doesn't seem to be an
outright spec violation.

Re: [PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memslot

Posted by Atish Kumar Patra 5 months, 1 week ago

On Fri, Sep 5, 2025 at 1:23 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Sep 03, 2025, Atish Kumar Patra wrote:
> > On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Fri, Aug 29, 2025, Atish Patra wrote:
> > > > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> > > > +{
> > > > +     bool writable;
> > > > +     unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> > > > +
> > > > +     return !kvm_is_error_hva(hva) && writable;
> > >
> > > I don't hate this API, but I don't love it either.  Because knowing that the
> > > _memslot_ is writable doesn't mean all that much.  E.g. in this usage:
> > >
> > >         hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
> > >         if (kvm_is_error_hva(hva) || !writable)
> > >                 return SBI_ERR_INVALID_ADDRESS;
> > >
> > >         ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
> > >         if (ret)
> > >                 return SBI_ERR_FAILURE;
> > >
> > > the error code returned to the guest will be different if the memslot is read-only
> > > versus if the VMA is read-only (or not even mapped!).  Unless every read-only
> > > memslot is explicitly communicated as such to the guest, I don't see how the guest
> > > can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
> > > but not when the underlying VMA isn't writable seems odd.
> > >
> > > It's also entirely possible the memslot could be replaced with a read-only memslot
> > > after the check, or vice versa, i.e. become writable after being rejected.  Is it
> > > *really* a problem to return FAILURE if the guest attempts to setup steal-time in
> > > a read-only memslot?  I.e. why not do this and call it good?
> > >
> >
> > Reposting the response as gmail converted my previous response as
> > html. Sorry for the spam.
> >
> > From a functionality pov, that should be fine. However, we have
> > explicit error conditions for read only memory defined in the SBI STA
> > specification[1].
> > Technically, we will violate the spec if we return FAILURE instead of
> > INVALID_ADDRESS for read only memslot.
>
> But KVM is already violating the spec, as kvm_vcpu_write_guest() redoes the
> memslot lookup and so could encounter a read-only memslot (if it races with
> a memslot update), and because the underlying memory could be read-only even if
> the memslot is writable.
>

Ahh. Thanks for clarifying that.

> Why not simply return SBI_ERR_INVALID_ADDRESS on kvm_vcpu_write_guest() failure?
> The only downside of that is KVM will also return SBI_ERR_INVALID_ADDRESS if the
> userspace mapping is completely missing, but AFAICT that doesn't seem to be an
> outright spec violation.

Yes, that's correct. That can still be considered an invalid address.
I will revise the patch accordingly.
Thanks for the suggestions.
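
For reference, a rough user-space sketch of the direction agreed on above:
drop the pre-check and map any write failure to INVALID_ADDRESS. This is not
the revised patch. struct mock_region, mock_write_guest() and
mock_sta_set_shmem() are hypothetical names, and the 64-byte zeroed buffer is
only a placeholder for the steal-time structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* SBI error codes, using the values from the SBI specification. */
#define SBI_SUCCESS              0
#define SBI_ERR_INVALID_ADDRESS (-5)

/* Hypothetical stand-in for a guest memory region; not a KVM type. */
struct mock_region {
	uint64_t base;      /* guest physical base address */
	size_t size;        /* bytes covered by the region */
	bool slot_writable; /* the memslot is not read-only */
	bool vma_writable;  /* the host mapping accepts writes */
	uint8_t *backing;   /* NULL models a missing userspace mapping */
};

/* Stand-in for kvm_vcpu_write_guest(): 0 on success, nonzero otherwise. */
static int mock_write_guest(struct mock_region *r, uint64_t gpa,
			    const void *data, size_t len)
{
	assert(r && data);

	/* Reject ranges that are not fully inside the region. */
	if (gpa < r->base || len > r->size || gpa - r->base > r->size - len)
		return -1;

	/* Any reason the write cannot land is reported the same way. */
	if (!r->slot_writable || !r->vma_writable || !r->backing)
		return -1;

	memcpy(r->backing + (gpa - r->base), data, len);
	return 0;
}

/* No pre-check: the write itself decides, and every failure looks alike. */
static long mock_sta_set_shmem(struct mock_region *r, uint64_t shmem)
{
	uint8_t zero_sta[64] = {0}; /* placeholder for the zeroed STA struct */

	if (mock_write_guest(r, shmem, zero_sta, sizeof(zero_sta)))
		return SBI_ERR_INVALID_ADDRESS;

	return SBI_SUCCESS;
}
```

With this shape the guest sees the same error for a read-only memslot, a
read-only VMA, a missing mapping, and an address outside any region, and
there is no window between a check and the write.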

[PATCH v5 1/9] drivers/perf: riscv: Add SBI v3.0 flag
[PATCH v5 2/9] drivers/perf: riscv: Add raw event v2 support
[PATCH v5 3/9] RISC-V: KVM: Add support for Raw event v2
[PATCH v5 4/9] drivers/perf: riscv: Implement PMU event info function
[PATCH v5 5/9] drivers/perf: riscv: Export PMU event info function
[PATCH v5 6/9] KVM: Add a helper function to check if a gpa is in writable memslot
[PATCH v5 7/9] RISC-V: KVM: Use the new gpa validate helper function
[PATCH v5 8/9] RISC-V: KVM: Implement get event info function
[PATCH v5 9/9] RISC-V: KVM: Upgrade the supported SBI version to 3.0