The arch specific code may need to know if a particular gpa is valid and
writable for the shared memory between the host and the guest. Currently,
there are few places where it is used in RISC-V implementation. Given the
nature of the function it may be used for other architectures.
Hence, a common helper function is added.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
---
 include/linux/kvm_host.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 15656b7fba6c..eec5cbbcb4b3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1892,6 +1892,14 @@ static inline bool kvm_is_gpa_in_memslot(struct kvm *kvm, gpa_t gpa)
 	return !kvm_is_error_hva(hva);
 }
 
+static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
+{
+	bool writable;
+	unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
+
+	return !kvm_is_error_hva(hva) && writable;
+}
+
 static inline void kvm_gpc_mark_dirty_in_slot(struct gfn_to_pfn_cache *gpc)
 {
 	lockdep_assert_held(&gpc->lock);
-- 
2.43.0

On Fri, Aug 29, 2025, Atish Patra wrote:
> The arch specific code may need to know if a particular gpa is valid and
> writable for the shared memory between the host and the guest. Currently,
> there are few places where it is used in RISC-V implementation. Given the
> nature of the function it may be used for other architectures.
> Hence, a common helper function is added.
>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Atish Patra <atishp@rivosinc.com>
> ---
>  include/linux/kvm_host.h | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 15656b7fba6c..eec5cbbcb4b3 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1892,6 +1892,14 @@ static inline bool kvm_is_gpa_in_memslot(struct kvm *kvm, gpa_t gpa)
>  	return !kvm_is_error_hva(hva);
>  }
>
> +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> +{
> +	bool writable;
> +	unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> +
> +	return !kvm_is_error_hva(hva) && writable;

I don't hate this API, but I don't love it either. Because knowing that the
_memslot_ is writable doesn't mean all that much. E.g. in this usage:

	hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
	if (kvm_is_error_hva(hva) || !writable)
		return SBI_ERR_INVALID_ADDRESS;

	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
	if (ret)
		return SBI_ERR_FAILURE;

the error code returned to the guest will be different if the memslot is read-only
versus if the VMA is read-only (or not even mapped!). Unless every read-only
memslot is explicitly communicated as such to the guest, I don't see how the guest
can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
but not when the underlying VMA isn't writable seems odd.

It's also entirely possible the memslot could be replaced with a read-only memslot
after the check, or vice versa, i.e. become writable after being rejected. Is it
*really* a problem to return FAILURE if the guest attempts to setup steal-time in
a read-only memslot? I.e. why not do this and call it good?

	if (!kvm_is_gpa_in_memslot(vcpu->kvm, shmem >> PAGE_SHIFT))
		return SBI_ERR_INVALID_ADDRESS;

	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
	if (ret)
		return SBI_ERR_FAILURE;

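The error-code asymmetry described in the reply above can be illustrated with a
small userspace model. This is only a sketch, not kernel code: struct mock_page,
mock_write_guest() and sta_set_shmem_precheck() are hypothetical names invented
for illustration, and the model assumes that a guest write fails whenever either
the memslot or the host mapping is not writable. The SBI error values are the
ones defined by the SBI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* SBI error codes, with the values defined by the SBI specification. */
#define SBI_SUCCESS              0
#define SBI_ERR_FAILURE         -1
#define SBI_ERR_INVALID_ADDRESS -5

/* Hypothetical stand-in for the state backing one guest page. */
struct mock_page {
	bool in_memslot;       /* some memslot covers the gpa */
	bool memslot_writable; /* that memslot is not flagged read-only */
	bool vma_writable;     /* the host userspace mapping allows writes */
};

/* Models a guest write: it fails if any layer rejects the store. */
static int mock_write_guest(const struct mock_page *p)
{
	assert(p != NULL);
	if (!p->in_memslot || !p->memslot_writable || !p->vma_writable)
		return -1;
	return 0;
}

/* Models the check-then-write sequence from the quoted usage. */
static int sta_set_shmem_precheck(const struct mock_page *p)
{
	assert(p != NULL);
	/* The pre-check only sees memslot state, not the host mapping. */
	if (!p->in_memslot || !p->memslot_writable)
		return SBI_ERR_INVALID_ADDRESS;

	if (mock_write_guest(p))
		return SBI_ERR_FAILURE;

	return SBI_SUCCESS;
}
```

In this model a read-only memslot yields SBI_ERR_INVALID_ADDRESS while a
read-only host mapping yields SBI_ERR_FAILURE, even though the guest cannot
tell the two situations apart.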
On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Aug 29, 2025, Atish Patra wrote:
> > The arch specific code may need to know if a particular gpa is valid and
> > writable for the shared memory between the host and the guest. Currently,
> > there are few places where it is used in RISC-V implementation. Given the
> > nature of the function it may be used for other architectures.
> > Hence, a common helper function is added.
> >
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Atish Patra <atishp@rivosinc.com>
> > ---
> >  include/linux/kvm_host.h | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 15656b7fba6c..eec5cbbcb4b3 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1892,6 +1892,14 @@ static inline bool kvm_is_gpa_in_memslot(struct kvm *kvm, gpa_t gpa)
> >  	return !kvm_is_error_hva(hva);
> >  }
> >
> > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> > +{
> > +	bool writable;
> > +	unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> > +
> > +	return !kvm_is_error_hva(hva) && writable;
>
> I don't hate this API, but I don't love it either. Because knowing that the
> _memslot_ is writable doesn't mean all that much. E.g. in this usage:
>
>	hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
>	if (kvm_is_error_hva(hva) || !writable)
>		return SBI_ERR_INVALID_ADDRESS;
>
>	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
>	if (ret)
>		return SBI_ERR_FAILURE;
>
> the error code returned to the guest will be different if the memslot is read-only
> versus if the VMA is read-only (or not even mapped!). Unless every read-only
> memslot is explicitly communicated as such to the guest, I don't see how the guest
> can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
> but not when the underlying VMA isn't writable seems odd.
>
> It's also entirely possible the memslot could be replaced with a read-only memslot
> after the check, or vice versa, i.e. become writable after being rejected. Is it
> *really* a problem to return FAILURE if the guest attempts to setup steal-time in
> a read-only memslot? I.e. why not do this and call it good?
>

Reposting the response as gmail converted my previous response as html.
Sorry for the spam.

From a functionality pov, that should be fine. However, we have
explicit error conditions for read only memory defined in the SBI STA
specification[1].
Technically, we will violate the spec if we return FAILURE instead of
INVALID_ADDRESS for read only memslot.

TBH, I don't save much duplicate code with the new generic API now.
If you don't think the generic API will be useful in other cases, I can
drop that patch and the changes in the steal time code.

[1] https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-steal-time.adoc#table_sta_steal_time_set_shmem_errors

>	if (!kvm_is_gpa_in_memslot(vcpu->kvm, shmem >> PAGE_SHIFT))
>		return SBI_ERR_INVALID_ADDRESS;
>
>	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
>	if (ret)
>		return SBI_ERR_FAILURE;

On Wed, Sep 03, 2025, Atish Kumar Patra wrote:
> On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Aug 29, 2025, Atish Patra wrote:
> > > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> > > +{
> > > +	bool writable;
> > > +	unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> > > +
> > > +	return !kvm_is_error_hva(hva) && writable;
> >
> > I don't hate this API, but I don't love it either. Because knowing that the
> > _memslot_ is writable doesn't mean all that much. E.g. in this usage:
> >
> >	hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
> >	if (kvm_is_error_hva(hva) || !writable)
> >		return SBI_ERR_INVALID_ADDRESS;
> >
> >	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
> >	if (ret)
> >		return SBI_ERR_FAILURE;
> >
> > the error code returned to the guest will be different if the memslot is read-only
> > versus if the VMA is read-only (or not even mapped!). Unless every read-only
> > memslot is explicitly communicated as such to the guest, I don't see how the guest
> > can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
> > but not when the underlying VMA isn't writable seems odd.
> >
> > It's also entirely possible the memslot could be replaced with a read-only memslot
> > after the check, or vice versa, i.e. become writable after being rejected. Is it
> > *really* a problem to return FAILURE if the guest attempts to setup steal-time in
> > a read-only memslot? I.e. why not do this and call it good?
> >
>
> Reposting the response as gmail converted my previous response as
> html. Sorry for the spam.
>
> From a functionality pov, that should be fine. However, we have
> explicit error conditions for read only memory defined in the SBI STA
> specification[1].
> Technically, we will violate the spec if we return FAILURE instead of
> INVALID_ADDRESS for read only memslot.

But KVM is already violating the spec, as kvm_vcpu_write_guest() redoes the
memslot lookup and so could encounter a read-only memslot (if it races with
a memslot update), and because the underlying memory could be read-only even if
the memslot is writable.

Why not simply return SBI_ERR_INVALID_ADDRESS on kvm_vcpu_write_guest() failure?
The only downside of that is KVM will also return SBI_ERR_INVALID_ADDRESS if the
userspace mapping is completely missing, but AFAICT that doesn't seem to be an
outright spec violation.

On Fri, Sep 5, 2025 at 1:23 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Sep 03, 2025, Atish Kumar Patra wrote:
> > On Fri, Aug 29, 2025 at 1:47 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Fri, Aug 29, 2025, Atish Patra wrote:
> > > > +static inline bool kvm_is_gpa_in_writable_memslot(struct kvm *kvm, gpa_t gpa)
> > > > +{
> > > > +	bool writable;
> > > > +	unsigned long hva = gfn_to_hva_prot(kvm, gpa_to_gfn(gpa), &writable);
> > > > +
> > > > +	return !kvm_is_error_hva(hva) && writable;
> > >
> > > I don't hate this API, but I don't love it either. Because knowing that the
> > > _memslot_ is writable doesn't mean all that much. E.g. in this usage:
> > >
> > >	hva = kvm_vcpu_gfn_to_hva_prot(vcpu, shmem >> PAGE_SHIFT, &writable);
> > >	if (kvm_is_error_hva(hva) || !writable)
> > >		return SBI_ERR_INVALID_ADDRESS;
> > >
> > >	ret = kvm_vcpu_write_guest(vcpu, shmem, &zero_sta, sizeof(zero_sta));
> > >	if (ret)
> > >		return SBI_ERR_FAILURE;
> > >
> > > the error code returned to the guest will be different if the memslot is read-only
> > > versus if the VMA is read-only (or not even mapped!). Unless every read-only
> > > memslot is explicitly communicated as such to the guest, I don't see how the guest
> > > can *know* that a memslot is read-only, so returning INVALID_ADDRESS in that case
> > > but not when the underlying VMA isn't writable seems odd.
> > >
> > > It's also entirely possible the memslot could be replaced with a read-only memslot
> > > after the check, or vice versa, i.e. become writable after being rejected. Is it
> > > *really* a problem to return FAILURE if the guest attempts to setup steal-time in
> > > a read-only memslot? I.e. why not do this and call it good?
> > >
> >
> > Reposting the response as gmail converted my previous response as
> > html. Sorry for the spam.
> >
> > From a functionality pov, that should be fine. However, we have
> > explicit error conditions for read only memory defined in the SBI STA
> > specification[1].
> > Technically, we will violate the spec if we return FAILURE instead of
> > INVALID_ADDRESS for read only memslot.
>
> But KVM is already violating the spec, as kvm_vcpu_write_guest() redoes the
> memslot lookup and so could encounter a read-only memslot (if it races with
> a memslot update), and because the underlying memory could be read-only even if
> the memslot is writable.
>

Ahh. Thanks for clarifying that.

> Why not simply return SBI_ERR_INVALID_ADDRESS on kvm_vcpu_write_guest() failure?
> The only downside of that is KVM will also return SBI_ERR_INVALID_ADDRESS if the
> userspace mapping is completely missing, but AFAICT that doesn't seem to be an
> outright spec violation.

Yes. That's correct. That can still be considered as invalid address.
I will revise the patch according to this. Thanks for the suggestions.

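For readers following the thread, the approach agreed on above (drop the
pre-check and map any write failure to SBI_ERR_INVALID_ADDRESS) might look
roughly like the userspace model below. This is a sketch under stated
assumptions, not the revised patch: enum mock_backing, mock_write_guest() and
sta_set_shmem() are hypothetical names made up for illustration, and the real
kvm_vcpu_write_guest() is replaced by a stub that fails for every non-writable
backing state. The SBI error values are those defined by the SBI specification.

```c
#include <assert.h>

/* SBI error codes, with the values defined by the SBI specification. */
#define SBI_SUCCESS              0
#define SBI_ERR_INVALID_ADDRESS -5

/* Hypothetical reasons why a guest physical address may not be writable. */
enum mock_backing {
	BACKING_OK,         /* writable memslot, writable host mapping */
	BACKING_NO_MEMSLOT, /* no memslot covers the gpa */
	BACKING_RO_MEMSLOT, /* memslot is read-only (possibly after a race) */
	BACKING_RO_VMA,     /* host mapping is read-only */
	BACKING_UNMAPPED    /* host mapping is missing entirely */
};

/* Stub for the guest write: succeeds only when every layer allows it. */
static int mock_write_guest(enum mock_backing b)
{
	assert(b >= BACKING_OK && b <= BACKING_UNMAPPED);
	return b == BACKING_OK ? 0 : -1;
}

/*
 * No separate writability check: the write itself is the check, so there
 * is no window between check and use, and every failure reason maps to
 * the same error code.
 */
static int sta_set_shmem(enum mock_backing b)
{
	if (mock_write_guest(b))
		return SBI_ERR_INVALID_ADDRESS;

	return SBI_SUCCESS;
}
```

The design point is that the guest sees one error code regardless of which
layer rejected the write, including the missing-mapping case that was noted
above as the only downside.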