From: David Stevens
To: Sean Christopherson
Cc: Yu Zhang, Isaku Yamahata, Marc Zyngier, Michael Ellerman, Peter Xu,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	kvm@vger.kernel.org, David Stevens
Subject: [PATCH v8 2/8] KVM: mmu: Introduce __kvm_follow_pfn function
Date: Thu, 24 Aug 2023 17:04:02 +0900
Message-ID: <20230824080408.2933205-3-stevensd@google.com>
In-Reply-To: <20230824080408.2933205-1-stevensd@google.com>
References: <20230824080408.2933205-1-stevensd@google.com>
X-Mailer: git-send-email 2.42.0.rc1.204.g551eb34607-goog

From: David Stevens

Introduce __kvm_follow_pfn, which will replace __gfn_to_pfn_memslot.
__kvm_follow_pfn refactors the old API's arguments into a struct and,
where possible, combines the boolean arguments into a single flags
argument.

Signed-off-by: David Stevens
---
 include/linux/kvm_host.h |  16 ++++
 virt/kvm/kvm_main.c      | 171 ++++++++++++++++++++++-----------------
 virt/kvm/kvm_mm.h        |   3 +-
 virt/kvm/pfncache.c      |  10 ++-
 4 files changed, 123 insertions(+), 77 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9d3ac7720da9..59d9b5e5db33 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -97,6 +97,7 @@
 #define KVM_PFN_ERR_HWPOISON	(KVM_PFN_ERR_MASK + 1)
 #define KVM_PFN_ERR_RO_FAULT	(KVM_PFN_ERR_MASK + 2)
 #define KVM_PFN_ERR_SIGPENDING	(KVM_PFN_ERR_MASK + 3)
+#define KVM_PFN_ERR_NEEDS_IO	(KVM_PFN_ERR_MASK + 4)
 
 /*
  * error pfns indicate that the gfn is in slot but faild to
@@ -1156,6 +1157,21 @@ unsigned long gfn_to_hva_memslot_prot(struct kvm_memory_slot *slot, gfn_t gfn,
 void kvm_release_page_clean(struct page *page);
 void kvm_release_page_dirty(struct page *page);
 
+struct kvm_follow_pfn {
+	const struct kvm_memory_slot *slot;
+	gfn_t gfn;
+	unsigned int flags;
+	bool atomic;
+	/* Try to create a writable mapping even for a read fault */
+	bool try_map_writable;
+
+	/* Outputs of __kvm_follow_pfn */
+	hva_t hva;
+	bool writable;
+};
+
+kvm_pfn_t __kvm_follow_pfn(struct kvm_follow_pfn *foll);
+
 kvm_pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
 kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		   bool *writable);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1e4586aaa6cb..5fde46f05117 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2486,8 +2486,7 @@ static inline int check_user_page_hwpoison(unsigned long addr)
  * true indicates success, otherwise false is returned. It's also the
  * only part that runs if we can in atomic context.
  */
-static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
-			    bool *writable, kvm_pfn_t *pfn)
+static bool hva_to_pfn_fast(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
 {
 	struct page *page[1];
 
@@ -2496,14 +2495,12 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
 	 * or the caller allows to map a writable pfn for a read fault
 	 * request.
 	 */
-	if (!(write_fault || writable))
+	if (!((foll->flags & FOLL_WRITE) || foll->try_map_writable))
 		return false;
 
-	if (get_user_page_fast_only(addr, FOLL_WRITE, page)) {
+	if (get_user_page_fast_only(foll->hva, FOLL_WRITE, page)) {
 		*pfn = page_to_pfn(page[0]);
-
-		if (writable)
-			*writable = true;
+		foll->writable = true;
 		return true;
 	}
 
@@ -2514,35 +2511,26 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
  * The slow path to get the pfn of the specified host virtual address,
  * 1 indicates success, -errno is returned if error is detected.
  */
-static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
-			   bool interruptible, bool *writable, kvm_pfn_t *pfn)
+static int hva_to_pfn_slow(struct kvm_follow_pfn *foll, kvm_pfn_t *pfn)
 {
-	unsigned int flags = FOLL_HWPOISON;
+	unsigned int flags = FOLL_HWPOISON | foll->flags;
 	struct page *page;
 	int npages;
 
 	might_sleep();
 
-	if (writable)
-		*writable = write_fault;
-
-	if (write_fault)
-		flags |= FOLL_WRITE;
-	if (async)
-		flags |= FOLL_NOWAIT;
-	if (interruptible)
-		flags |= FOLL_INTERRUPTIBLE;
-
-	npages = get_user_pages_unlocked(addr, 1, &page, flags);
+	npages = get_user_pages_unlocked(foll->hva, 1, &page, flags);
 	if (npages != 1)
 		return npages;
 
-	/* map read fault as writable if possible */
-	if (unlikely(!write_fault) && writable) {
+	if (foll->flags & FOLL_WRITE) {
+		foll->writable = true;
+	} else if (foll->try_map_writable) {
 		struct page *wpage;
 
-		if (get_user_page_fast_only(addr, FOLL_WRITE, &wpage)) {
-			*writable = true;
+		/* map read fault as writable if possible */
+		if (get_user_page_fast_only(foll->hva, FOLL_WRITE, &wpage)) {
+			foll->writable = true;
 			put_page(page);
 			page = wpage;
 		}
@@ -2573,23 +2561,23 @@ static int kvm_try_get_pfn(kvm_pfn_t pfn)
 }
 
 static int hva_to_pfn_remapped(struct vm_area_struct *vma,
-			       unsigned long addr, bool write_fault,
-			       bool *writable, kvm_pfn_t *p_pfn)
+			       struct kvm_follow_pfn *foll, kvm_pfn_t *p_pfn)
 {
 	kvm_pfn_t pfn;
 	pte_t *ptep;
 	pte_t pte;
 	spinlock_t *ptl;
+	bool write_fault = foll->flags & FOLL_WRITE;
 	int r;
 
-	r = follow_pte(vma->vm_mm, addr, &ptep, &ptl);
+	r = follow_pte(vma->vm_mm, foll->hva, &ptep, &ptl);
 	if (r) {
 		/*
 		 * get_user_pages fails for VM_IO and VM_PFNMAP vmas and does
 		 * not call the fault handler, so do it here.
 		 */
 		bool unlocked = false;
-		r = fixup_user_fault(current->mm, addr,
+		r = fixup_user_fault(current->mm, foll->hva,
 				     (write_fault ? FAULT_FLAG_WRITE : 0),
 				     &unlocked);
 		if (unlocked)
@@ -2597,7 +2585,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 		if (r)
 			return r;
 
-		r = follow_pte(vma->vm_mm, addr, &ptep, &ptl);
+		r = follow_pte(vma->vm_mm, foll->hva, &ptep, &ptl);
 		if (r)
 			return r;
 	}
@@ -2609,8 +2597,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
 		goto out;
 	}
 
-	if (writable)
-		*writable = pte_write(pte);
+	foll->writable = pte_write(*ptep);
 	pfn = pte_pfn(pte);
 
 	/*
@@ -2655,24 +2642,22 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
  * 2): @write_fault = false && @writable, @writable will tell the caller
  *     whether the mapping is writable.
  */
-kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
-		     bool *async, bool write_fault, bool *writable)
+kvm_pfn_t hva_to_pfn(struct kvm_follow_pfn *foll)
 {
 	struct vm_area_struct *vma;
 	kvm_pfn_t pfn;
 	int npages, r;
 
 	/* we can do it either atomically or asynchronously, not both */
-	BUG_ON(atomic && async);
+	BUG_ON(foll->atomic && (foll->flags & FOLL_NOWAIT));
 
-	if (hva_to_pfn_fast(addr, write_fault, writable, &pfn))
+	if (hva_to_pfn_fast(foll, &pfn))
 		return pfn;
 
-	if (atomic)
+	if (foll->atomic)
 		return KVM_PFN_ERR_FAULT;
 
-	npages = hva_to_pfn_slow(addr, async, write_fault, interruptible,
-				 writable, &pfn);
+	npages = hva_to_pfn_slow(foll, &pfn);
 	if (npages == 1)
 		return pfn;
 	if (npages == -EINTR)
@@ -2680,83 +2665,123 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
 
 	mmap_read_lock(current->mm);
 	if (npages == -EHWPOISON ||
-	    (!async && check_user_page_hwpoison(addr))) {
+	    (!(foll->flags & FOLL_NOWAIT) && check_user_page_hwpoison(foll->hva))) {
 		pfn = KVM_PFN_ERR_HWPOISON;
 		goto exit;
 	}
 
 retry:
-	vma = vma_lookup(current->mm, addr);
+	vma = vma_lookup(current->mm, foll->hva);
 
 	if (vma == NULL)
 		pfn = KVM_PFN_ERR_FAULT;
 	else if (vma->vm_flags & (VM_IO | VM_PFNMAP)) {
-		r = hva_to_pfn_remapped(vma, addr, write_fault, writable, &pfn);
+		r = hva_to_pfn_remapped(vma, foll, &pfn);
 		if (r == -EAGAIN)
 			goto retry;
 		if (r < 0)
 			pfn = KVM_PFN_ERR_FAULT;
 	} else {
-		if (async && vma_is_valid(vma, write_fault))
-			*async = true;
-		pfn = KVM_PFN_ERR_FAULT;
+		if ((foll->flags & FOLL_NOWAIT) &&
+		    vma_is_valid(vma, foll->flags & FOLL_WRITE))
+			pfn = KVM_PFN_ERR_NEEDS_IO;
+		else
+			pfn = KVM_PFN_ERR_FAULT;
 	}
 exit:
 	mmap_read_unlock(current->mm);
 	return pfn;
 }
 
-kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
-			       bool atomic, bool interruptible, bool *async,
-			       bool write_fault, bool *writable, hva_t *hva)
+kvm_pfn_t __kvm_follow_pfn(struct kvm_follow_pfn *foll)
 {
-	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);
+	foll->writable = false;
+	foll->hva = __gfn_to_hva_many(foll->slot, foll->gfn, NULL,
+				      foll->flags & FOLL_WRITE);
 
-	if (hva)
-		*hva = addr;
-
-	if (addr == KVM_HVA_ERR_RO_BAD) {
-		if (writable)
-			*writable = false;
+	if (foll->hva == KVM_HVA_ERR_RO_BAD)
 		return KVM_PFN_ERR_RO_FAULT;
-	}
 
-	if (kvm_is_error_hva(addr)) {
-		if (writable)
-			*writable = false;
+	if (kvm_is_error_hva(foll->hva))
 		return KVM_PFN_NOSLOT;
-	}
 
-	/* Do not map writable pfn in the readonly memslot. */
-	if (writable && memslot_is_readonly(slot)) {
-		*writable = false;
-		writable = NULL;
-	}
+	if (memslot_is_readonly(foll->slot))
+		foll->try_map_writable = false;
 
-	return hva_to_pfn(addr, atomic, interruptible, async, write_fault,
-			  writable);
+	return hva_to_pfn(foll);
+}
+EXPORT_SYMBOL_GPL(__kvm_follow_pfn);
+
+kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
+			       bool atomic, bool interruptible, bool *async,
+			       bool write_fault, bool *writable, hva_t *hva)
+{
+	kvm_pfn_t pfn;
+	struct kvm_follow_pfn foll = {
+		.slot = slot,
+		.gfn = gfn,
+		.flags = 0,
+		.atomic = atomic,
+		.try_map_writable = !!writable,
+	};
+
+	if (write_fault)
+		foll.flags |= FOLL_WRITE;
+	if (async)
+		foll.flags |= FOLL_NOWAIT;
+	if (interruptible)
+		foll.flags |= FOLL_INTERRUPTIBLE;
+
+	pfn = __kvm_follow_pfn(&foll);
+	if (pfn == KVM_PFN_ERR_NEEDS_IO) {
+		*async = true;
+		pfn = KVM_PFN_ERR_FAULT;
+	}
+	if (hva)
+		*hva = foll.hva;
+	if (writable)
+		*writable = foll.writable;
+	return pfn;
 }
 EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot);
 
 kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable)
 {
-	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, false,
-				    NULL, write_fault, writable, NULL);
+	kvm_pfn_t pfn;
+	struct kvm_follow_pfn foll = {
+		.slot = gfn_to_memslot(kvm, gfn),
+		.gfn = gfn,
+		.flags = write_fault ? FOLL_WRITE : 0,
+		.try_map_writable = !!writable,
+	};
+	pfn = __kvm_follow_pfn(&foll);
+	if (writable)
+		*writable = foll.writable;
+	return pfn;
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);
 
 kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	return __gfn_to_pfn_memslot(slot, gfn, false, false, NULL, true,
-				    NULL, NULL);
+	struct kvm_follow_pfn foll = {
+		.slot = slot,
+		.gfn = gfn,
+		.flags = FOLL_WRITE,
+	};
+	return __kvm_follow_pfn(&foll);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot);
 
 kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	return __gfn_to_pfn_memslot(slot, gfn, true, false, NULL, true,
-				    NULL, NULL);
+	struct kvm_follow_pfn foll = {
+		.slot = slot,
+		.gfn = gfn,
+		.flags = FOLL_WRITE,
+		.atomic = true,
+	};
+	return __kvm_follow_pfn(&foll);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
 
diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index 180f1a09e6ba..ed896aee5396 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -20,8 +20,7 @@
 #define KVM_MMU_UNLOCK(kvm)		spin_unlock(&(kvm)->mmu_lock)
 #endif /* KVM_HAVE_MMU_RWLOCK */
 
-kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
-		     bool *async, bool write_fault, bool *writable);
+kvm_pfn_t hva_to_pfn(struct kvm_follow_pfn *foll);
 
 #ifdef CONFIG_HAVE_KVM_PFNCACHE
 void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm,
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 2d6aba677830..86cd40acad11 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -144,6 +144,12 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 	kvm_pfn_t new_pfn = KVM_PFN_ERR_FAULT;
 	void *new_khva = NULL;
 	unsigned long mmu_seq;
+	struct kvm_follow_pfn foll = {
+		.slot = gpc->memslot,
+		.gfn = gpa_to_gfn(gpc->gpa),
+		.flags = FOLL_WRITE,
+		.hva = gpc->uhva,
+	};
 
 	lockdep_assert_held(&gpc->refresh_lock);
 
@@ -182,8 +188,8 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 		cond_resched();
 	}
 
-	/* We always request a writeable mapping */
-	new_pfn = hva_to_pfn(gpc->uhva, false, false, NULL, true, NULL);
+	/* We always request a writable mapping */
+	new_pfn = hva_to_pfn(&foll);
 	if (is_error_noslot_pfn(new_pfn))
 		goto out_error;
 
-- 
2.42.0.rc1.204.g551eb34607-goog
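
Not part of the patch: a minimal sketch of how a converted call site is
expected to use the new struct-based API, modeled on the
__gfn_to_pfn_memslot() wrapper above. The kvm/gfn variables here are
illustrative placeholders, not code from this series.

	/*
	 * Read fault on gfn; opportunistically ask for a writable mapping
	 * and do not wait for IO (mirrors the async case of the old API).
	 */
	struct kvm_follow_pfn foll = {
		.slot = gfn_to_memslot(kvm, gfn),
		.gfn = gfn,
		.flags = FOLL_NOWAIT,	/* no FOLL_WRITE: read fault */
		.try_map_writable = true,
	};
	kvm_pfn_t pfn = __kvm_follow_pfn(&foll);

	if (pfn == KVM_PFN_ERR_NEEDS_IO) {
		/* page needs to be faulted in; retry from a sleepable context */
	} else if (!is_error_noslot_pfn(pfn)) {
		/* success: foll.hva and foll.writable hold the outputs */
	}

With this layout the caller fills in the inputs once and reads hva and
writable back from the same struct, instead of threading separate bool
and out-pointer arguments through every level of the pfn lookup.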