From nobody Tue Apr 7 04:59:04 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA9C3C433F5 for ; Tue, 11 Oct 2022 19:58:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229864AbiJKT63 (ORCPT ); Tue, 11 Oct 2022 15:58:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55198 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229799AbiJKT6U (ORCPT ); Tue, 11 Oct 2022 15:58:20 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3DD9B7268C for ; Tue, 11 Oct 2022 12:58:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665518297; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=V/+ripU4Cn0e2jw5c/awHk5ywDVBelMr8KNke92yFbw=; b=BJlBumdMC90tSTQcPnvFdxQrqPIKjEEinPGnUtDHJqjEgYh/wjpCJTWm7nr8uGm78iIa6l IhejkiCqmtA9lavq4T6Hz/V3ywABuNdxNSy6CUw1zWzbG91D/cJOhQ29hjfQ/UBmIZ2Re6 890aS71bz6I5+lgNfuOPqzQkBuOMv/M= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-175-WbixH5IFMjKwo6so3DxWtQ-1; Tue, 11 Oct 2022 15:58:13 -0400 X-MC-Unique: WbixH5IFMjKwo6so3DxWtQ-1 Received: by mail-qk1-f198.google.com with SMTP id h7-20020a05620a400700b006cebec84734so12689459qko.23 for ; Tue, 11 Oct 2022 12:58:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=V/+ripU4Cn0e2jw5c/awHk5ywDVBelMr8KNke92yFbw=; b=TFVc3ZqkrJANKgJWsrFH/eovMJYihgd/+QfPw2zlHom4Y+zvgsBQsNKYMlEpEbJmPX RBWMEZSvwcRcWhejarzwJlskQzrmwzjQF3ksuT+b58yY+dokX8xCEXkGWu/X6CaY/C0a dxDUOOkKsgglFzZoaUgFwLtL3vzRuo/Gzy49kStTyc5jTcaHjTZIjRGWIPwDvhocfM58 7798NpU9ru0Vsb/G3MVMV7CXdaVn34A5nSsZKzfDSHUAFq8g91zVkJ5j3r8ULT33F+kX R+vIMIdJ9RQ9FH8Uo0tDCDZoq/utlHVUilvtuisuLUlpwRMrAau2XIYW3mgdV66SwYuf Vs7Q== X-Gm-Message-State: ACrzQf37N3LOIM31cPGPTF0T/UTZR8sV57Rh3QeDBqugRig0i47+rfUD SDUKwD6DAWK9oXoJq6XIzzzcUhoeCfG6+iCxLxjVtd/aNL0yNhYjBmS60CRcauJrqwfKe43nbLj /UGBm1epcAhRezGJnjMd5v39/ X-Received: by 2002:a0c:9a0d:0:b0:4b1:982e:96d4 with SMTP id p13-20020a0c9a0d000000b004b1982e96d4mr19788277qvd.114.1665518293325; Tue, 11 Oct 2022 12:58:13 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6PixulCqGkG8rkh18fizY4hjzT3AYfYhVR7T8bvGGt8538uBuPPVxwiplaUYFmIz1hwWZjbA== X-Received: by 2002:a0c:9a0d:0:b0:4b1:982e:96d4 with SMTP id p13-20020a0c9a0d000000b004b1982e96d4mr19788255qvd.114.1665518293119; Tue, 11 Oct 2022 12:58:13 -0700 (PDT) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id az31-20020a05620a171f00b006ce9e880c6fsm13648837qkb.111.2022.10.11.12.58.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Oct 2022 12:58:12 -0700 (PDT) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Sean Christopherson , peterx@redhat.com, John Hubbard , Paolo Bonzini , David Matlack , Andrew Morton , Andrea Arcangeli , "Dr . 
David Alan Gilbert" , David Hildenbrand , Linux MM Mailing List , Mike Kravetz
Subject: [PATCH v4 1/4] mm/gup: Add FOLL_INTERRUPTIBLE
Date: Tue, 11 Oct 2022 15:58:06 -0400
Message-Id: <20221011195809.557016-2-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221011195809.557016-1-peterx@redhat.com>
References: <20221011195809.557016-1-peterx@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

We already have FAULT_FLAG_INTERRUPTIBLE, but it has never been applied to
GUP. One issue is that not all GUP paths are able to handle the delivery
of signals other than SIGKILL.

That's not ideal for GUP users that are actually able to handle these
cases, like KVM.

KVM uses GUP extensively when faulting in guest pages, and it already has
the infrastructure to retry a page fault at a later time. Allowing GUP to
be interrupted by generic signals can make KVM-related threads more
responsive. For example:

  (1) SIGUSR1: which QEMU/KVM uses to deliver an inter-process IPI. E.g.
      when the admin issues a vm_stop QMP command, SIGUSR1 can be
      generated to kick the vcpus out of kernel context immediately,

  (2) SIGINT: which lets interactive hypervisor users stop a virtual
      machine with Ctrl-C without any delays/hangs,

  (3) SIGTRAP: which grants GDB capability even during page faults that
      are stuck for a long time.

Normally the hypervisor is able to receive these signals properly, but
not if we're stuck in a GUP for a long time for whatever reason. That
happens easily with a stuck postcopy migration, e.g. when a temporary
network failure occurs: some vcpu threads can then hang indefinitely
waiting for the pages. With the new FOLL_INTERRUPTIBLE, we can allow GUP
users like KVM to selectively enable the ability to trap these signals.
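
For illustration only (not part of the patch): a minimal userspace sketch
of the retry policy described above. It assumes a simplified model in
which the kernel's fatal_signal_pending() and signal_pending() are
replaced by two booleans; struct task_model, model_signal_pending(),
model_gup_retry() and the max_tries bound are hypothetical names made up
for this sketch. Only the FOLL_INTERRUPTIBLE value is taken from the
patch.

```c
#include <errno.h>
#include <stdbool.h>

#define FOLL_INTERRUPTIBLE 0x100000u	/* same value as in the patch */

/* Hypothetical stand-in for the signal state of "current". */
struct task_model {
	bool fatal_pending;	/* models fatal_signal_pending(current) */
	bool any_pending;	/* models signal_pending(current) */
};

/* Same decision order as the patch's gup_signal_pending(). */
static bool model_signal_pending(const struct task_model *t,
				 unsigned int flags)
{
	if (t->fatal_pending)
		return true;
	if (!(flags & FOLL_INTERRUPTIBLE))
		return false;
	return t->any_pending;
}

/*
 * Models the retry loop: the page shows up on attempt 'ready_at'
 * (1-based, 0 means never).  Signals are only checked before retrying.
 * 'max_tries' exists only so that the model terminates.
 * Returns 1 on success, -EINTR when interrupted, 0 when the model gives up.
 */
static long model_gup_retry(const struct task_model *t, unsigned int flags,
			    int ready_at, int max_tries)
{
	for (int attempt = 1; attempt <= max_tries; attempt++) {
		if (attempt == ready_at)
			return 1;
		if (model_signal_pending(t, flags))
			return -EINTR;
	}
	return 0;
}
```

In this model a caller that does not pass FOLL_INTERRUPTIBLE keeps
retrying through a pending SIGUSR1, while a caller that opts in gets
-EINTR back and has to handle the interruption itself.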

Reviewed-by: John Hubbard
Reviewed-by: David Hildenbrand
Signed-off-by: Peter Xu
---
 include/linux/mm.h |  1 +
 mm/gup.c           | 33 +++++++++++++++++++++++++++++----
 mm/hugetlb.c       |  5 ++++-
 3 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 21f8b27bd9fd..488a9f4cce07 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2897,6 +2897,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
 #define FOLL_PIN	0x40000	/* pages must be released via unpin_user_page */
 #define FOLL_FAST_ONLY	0x80000	/* gup_fast: prevent fall-back to slow gup */
+#define FOLL_INTERRUPTIBLE  0x100000 /* allow interrupts from generic signals */
 
 /*
  * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
diff --git a/mm/gup.c b/mm/gup.c
index 5abdaf487460..d51e7ccaef32 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -970,8 +970,17 @@ static int faultin_page(struct vm_area_struct *vma,
 		fault_flags |= FAULT_FLAG_WRITE;
 	if (*flags & FOLL_REMOTE)
 		fault_flags |= FAULT_FLAG_REMOTE;
-	if (locked)
+	if (locked) {
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+		/*
+		 * FAULT_FLAG_INTERRUPTIBLE is opt-in. GUP callers must set
+		 * FOLL_INTERRUPTIBLE to enable FAULT_FLAG_INTERRUPTIBLE.
+		 * That's because some callers may not be prepared to
+		 * handle early exits caused by non-fatal signals.
+		 */
+		if (*flags & FOLL_INTERRUPTIBLE)
+			fault_flags |= FAULT_FLAG_INTERRUPTIBLE;
+	}
 	if (*flags & FOLL_NOWAIT)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_RETRY_NOWAIT;
 	if (*flags & FOLL_TRIED) {
@@ -1380,6 +1389,22 @@ int fixup_user_fault(struct mm_struct *mm,
 }
 EXPORT_SYMBOL_GPL(fixup_user_fault);
 
+/*
+ * GUP always responds to fatal signals. When FOLL_INTERRUPTIBLE is
+ * specified, it'll also respond to generic signals. The caller of GUP
+ * that has FOLL_INTERRUPTIBLE should take care of the GUP interruption.
+ */
+static bool gup_signal_pending(unsigned int flags)
+{
+	if (fatal_signal_pending(current))
+		return true;
+
+	if (!(flags & FOLL_INTERRUPTIBLE))
+		return false;
+
+	return signal_pending(current);
+}
+
 /*
  * Please note that this function, unlike __get_user_pages will not
  * return 0 for nr_pages > 0 without FOLL_NOWAIT
@@ -1461,11 +1486,11 @@ static __always_inline long __get_user_pages_locked(struct mm_struct *mm,
 		 * Repeat on the address that fired VM_FAULT_RETRY
 		 * with both FAULT_FLAG_ALLOW_RETRY and
 		 * FAULT_FLAG_TRIED. Note that GUP can be interrupted
-		 * by fatal signals, so we need to check it before we
+		 * by fatal signals or even common signals, depending on
+		 * the caller's request. So we need to check it before we
 		 * start trying again otherwise it can loop forever.
 		 */
-
-		if (fatal_signal_pending(current)) {
+		if (gup_signal_pending(flags)) {
 			if (!pages_done)
 				pages_done = -EINTR;
 			break;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e070b8593b37..202f3ad7f35c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6206,9 +6206,12 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 				fault_flags |= FAULT_FLAG_WRITE;
 			else if (unshare)
 				fault_flags |= FAULT_FLAG_UNSHARE;
-			if (locked)
+			if (locked) {
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
 					FAULT_FLAG_KILLABLE;
+				if (flags & FOLL_INTERRUPTIBLE)
+					fault_flags |= FAULT_FLAG_INTERRUPTIBLE;
+			}
 			if (flags & FOLL_NOWAIT)
 				fault_flags |= FAULT_FLAG_ALLOW_RETRY |
 					FAULT_FLAG_RETRY_NOWAIT;
-- 
2.37.3

From nobody Tue Apr 7 04:59:04 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05C14C433FE for ; Tue, 11 Oct 2022 19:58:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229871AbiJKT6e (ORCPT ); Tue, 11 Oct 2022 15:58:34 -0400 Received: from
lindbergh.monkeyblade.net ([23.128.96.19]:55304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229827AbiJKT6Z (ORCPT ); Tue, 11 Oct 2022 15:58:25 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA7A098CA5 for ; Tue, 11 Oct 2022 12:58:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665518301; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SAlECgJKwNknwoioVUeF/uXHuNVkO1eljRPrSr7m9O4=; b=fo08+YB45cnpbux1+hEFkSz9J6CLtFH1MUK9znQJX3YTOdL3Jh5uH9LJ93w/5UQRlGW36N lZOfkI17AhbSBm8BawgTJz14nqaaUHF4hfY4A2d4yJi4fG9pDWrWjeUkk1sziyXvtXKzwH eocd8xRRP6+/ga/S0QqaW3Ya1ByAz98= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-115-YCRiPx4NOwy7AgDO5Mp_2g-1; Tue, 11 Oct 2022 15:58:15 -0400 X-MC-Unique: YCRiPx4NOwy7AgDO5Mp_2g-1 Received: by mail-qk1-f197.google.com with SMTP id k2-20020a05620a414200b006ceec443c8bso12577513qko.14 for ; Tue, 11 Oct 2022 12:58:15 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SAlECgJKwNknwoioVUeF/uXHuNVkO1eljRPrSr7m9O4=; b=GgkZIDD1L9l/y72N93qDa6+0GLeur7mmOlWEFBaUCWC5lydCWTpW2C+wwmGYN1rSCd MRgTtkHdhI0NvpdSxRwm3AzuZlzJz7Jh6U+DJn6rLgQWRkWa0MJBa/n7G2x/nPVfIigd gae+eDCfXFvaiNvHITYl0iQkOLezwE5QF6annboS7oCO5VuVA4d56vpwJhTwHeIbzeTD 6oLs8inplEJvTX+pvw+2ME6xNsLMP0sbLOfRbZxA1OQi7Aq9AHo+FEvko5a1pnhl1TlI 
fWf+tJtvVkKIMMbMIpuJaOCN8YIyrGrRSysVRgg5DozBJ3DowF/qh4jmi2Tkmcv7j3KQ jvqQ== X-Gm-Message-State: ACrzQf3BIY1pSR1H7O6Ie0p4IRIgzcYi6IsQdzlpOrDfJJYynXm0j129 eqMcrl6o0r/56jA3wCmrETsgQO5zxZdJlGAi/uY7eR+xal506orJ0XbT9B326Ak2kKLDE7udbVr C+phK02yjZEEn5MZxpAjtNP9U X-Received: by 2002:ac8:574a:0:b0:394:3388:9fc3 with SMTP id 10-20020ac8574a000000b0039433889fc3mr20682007qtx.292.1665518294789; Tue, 11 Oct 2022 12:58:14 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5lzV/hWT/COel7yjUmDe/h42rtZvKalYrmBS4qOaAdmcwzGvcmWe1PC252299lW5tk7hr3pg== X-Received: by 2002:ac8:574a:0:b0:394:3388:9fc3 with SMTP id 10-20020ac8574a000000b0039433889fc3mr20681987qtx.292.1665518294584; Tue, 11 Oct 2022 12:58:14 -0700 (PDT) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id az31-20020a05620a171f00b006ce9e880c6fsm13648837qkb.111.2022.10.11.12.58.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Oct 2022 12:58:14 -0700 (PDT) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Sean Christopherson , peterx@redhat.com, John Hubbard , Paolo Bonzini , David Matlack , Andrew Morton , Andrea Arcangeli , "Dr . David Alan Gilbert" , David Hildenbrand , Linux MM Mailing List , Mike Kravetz Subject: [PATCH v4 2/4] kvm: Add KVM_PFN_ERR_SIGPENDING Date: Tue, 11 Oct 2022 15:58:07 -0400 Message-Id: <20221011195809.557016-3-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221011195809.557016-1-peterx@redhat.com> References: <20221011195809.557016-1-peterx@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add a new pfn error to show that we've got a pending signal to handle during hva_to_pfn_slow() procedure (of -EINTR retval). 
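
For illustration only (not part of the patch): a minimal userspace sketch
of how an -EINTR return from the slow path could be folded into an error
pfn. The value of KVM_PFN_ERR_MASK is not shown in this patch; the one
below is an assumption about the mainline kvm_host.h of this era.
model_slow_result_to_pfn() is a hypothetical helper that only models the
first two checks on npages and collapses every other error into
KVM_PFN_ERR_FAULT, which the real hva_to_pfn() does not do.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t kvm_pfn_t;

/* Assumed value; the patch only shows the constants derived from it. */
#define KVM_PFN_ERR_MASK	(0x7ffULL << 52)
#define KVM_PFN_ERR_FAULT	(KVM_PFN_ERR_MASK)
#define KVM_PFN_ERR_SIGPENDING	(KVM_PFN_ERR_MASK + 3)

static bool is_error_pfn(kvm_pfn_t pfn)
{
	return !!(pfn & KVM_PFN_ERR_MASK);
}

static bool is_sigpending_pfn(kvm_pfn_t pfn)
{
	return pfn == KVM_PFN_ERR_SIGPENDING;
}

/*
 * Hypothetical model of the checks right after hva_to_pfn_slow():
 * one pinned page is a success, -EINTR becomes the new error pfn.
 * All other failures are simplified to KVM_PFN_ERR_FAULT here.
 */
static kvm_pfn_t model_slow_result_to_pfn(int npages, kvm_pfn_t pfn)
{
	if (npages == 1)
		return pfn;
	if (npages == -EINTR)
		return KVM_PFN_ERR_SIGPENDING;
	return KVM_PFN_ERR_FAULT;
}
```

Because KVM_PFN_ERR_SIGPENDING still has the error mask bits set, callers
that only test is_error_pfn() keep treating it as a failure, and only
callers that care need to check is_sigpending_pfn().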

Signed-off-by: Peter Xu
Reviewed-by: Sean Christopherson
---
 include/linux/kvm_host.h | 10 ++++++++++
 virt/kvm/kvm_main.c      |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 32f259fa5801..92baa930b891 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -96,6 +96,7 @@
 #define KVM_PFN_ERR_FAULT	(KVM_PFN_ERR_MASK)
 #define KVM_PFN_ERR_HWPOISON	(KVM_PFN_ERR_MASK + 1)
 #define KVM_PFN_ERR_RO_FAULT	(KVM_PFN_ERR_MASK + 2)
+#define KVM_PFN_ERR_SIGPENDING	(KVM_PFN_ERR_MASK + 3)
 
 /*
  * error pfns indicate that the gfn is in slot but faild to
@@ -106,6 +107,15 @@ static inline bool is_error_pfn(kvm_pfn_t pfn)
 	return !!(pfn & KVM_PFN_ERR_MASK);
 }
 
+/*
+ * KVM_PFN_ERR_SIGPENDING indicates that fetching the PFN was interrupted
+ * by a pending signal. Note, the signal may or may not be fatal.
+ */
+static inline bool is_sigpending_pfn(kvm_pfn_t pfn)
+{
+	return pfn == KVM_PFN_ERR_SIGPENDING;
+}
+
 /*
  * error_noslot pfns indicate that the gfn can not be
  * translated to pfn - it is not in slot or failed to
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e30f1b4ecfa5..e20a59dcda32 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2667,6 +2667,8 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 	npages = hva_to_pfn_slow(addr, async, write_fault, writable, &pfn);
 	if (npages == 1)
 		return pfn;
+	if (npages == -EINTR)
+		return KVM_PFN_ERR_SIGPENDING;
 
 	mmap_read_lock(current->mm);
 	if (npages == -EHWPOISON ||
-- 
2.37.3

From nobody Tue Apr 7 04:59:04 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B24A0C4332F for ; Tue, 11 Oct 2022 19:58:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229483AbiJKT61 (ORCPT ); Tue,
11 Oct 2022 15:58:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55184 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229731AbiJKT6U (ORCPT ); Tue, 11 Oct 2022 15:58:20 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A943B98C9D for ; Tue, 11 Oct 2022 12:58:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665518297; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PgkB5amVUPRR3INxfCSynSvTKOcc2dw225AQHYbN8r0=; b=SHlUCObeyAykUCgNCT1NLdBpm6PDUS74F6JiPqVqopMzGoBVl7Nxf1/YZel8p7lC8Fy55X GaxeIZilkdoWgPiSxuDaCDXY9S4pMMA3anRGb/ie75RZ7sxbuxNmm/u+l46L4K5L6iKyOr JLyuK1ot+Ok4OxpYbHmIsP6jCTHrTMA= Received: from mail-qk1-f200.google.com (mail-qk1-f200.google.com [209.85.222.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-300-XhASqH0AOcqQMG3VoSz_CA-1; Tue, 11 Oct 2022 15:58:16 -0400 X-MC-Unique: XhASqH0AOcqQMG3VoSz_CA-1 Received: by mail-qk1-f200.google.com with SMTP id k2-20020a05620a414200b006ceec443c8bso12577568qko.14 for ; Tue, 11 Oct 2022 12:58:16 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PgkB5amVUPRR3INxfCSynSvTKOcc2dw225AQHYbN8r0=; b=f9v0ABxYAu/PIqBq6NBXJo0aD+lfyOJdCYcoBRlkiy26dvVvFQzsHKNf5uwsWoinRY 7YUW5u4k3XcHfHn67c8tFCbd8uEH6GvHyIHy4YK2SN3bvir7f+g+39kGhCgsTTFYDZKG y/I8cv7WL573BuJvTwoYfxQHu5UJzMu9FEutdhthgL//IuAgGlzswQZcdjxlsqeNshnV 
fEP0rLBsSKzHClG+W2RORhbaUnD7srAaKzKbo8sIit/9bKJ/enGs+O5FAnPeXvsCnTKC 5J+v2vLZC3iqH0Mk02J15yILmqdoaAXMOs2WfS6lzyQXNWnU6R19dxwdiA30lAe4f53f BxCQ== X-Gm-Message-State: ACrzQf3qhp0Ids+U+n5qT9aciF3iZJSwsFVYa9iD+twMaTHLJBcjdUkh Vikso7uqihx8wev12Ss3upnkcn02wtJWdCTpynzEC/P+A3Uoh6TtwGOfpE2pN3LmfPAFzv8LV2w 5gyQqr3N8Mf3Z2XB54V/z6KTT X-Received: by 2002:ac8:7c54:0:b0:35c:ebfd:32b4 with SMTP id o20-20020ac87c54000000b0035cebfd32b4mr20412162qtv.30.1665518296186; Tue, 11 Oct 2022 12:58:16 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6eeX3q/UT6wqtgDWH/uPjS+MLsxyPYFVtszRACLZWTMTJNfYFrayyjfrXgbM3yPOUwQH7YEQ== X-Received: by 2002:ac8:7c54:0:b0:35c:ebfd:32b4 with SMTP id o20-20020ac87c54000000b0035cebfd32b4mr20412140qtv.30.1665518295920; Tue, 11 Oct 2022 12:58:15 -0700 (PDT) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id az31-20020a05620a171f00b006ce9e880c6fsm13648837qkb.111.2022.10.11.12.58.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Oct 2022 12:58:15 -0700 (PDT) From: Peter Xu To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Sean Christopherson , peterx@redhat.com, John Hubbard , Paolo Bonzini , David Matlack , Andrew Morton , Andrea Arcangeli , "Dr . David Alan Gilbert" , David Hildenbrand , Linux MM Mailing List , Mike Kravetz Subject: [PATCH v4 3/4] kvm: Add interruptible flag to __gfn_to_pfn_memslot() Date: Tue, 11 Oct 2022 15:58:08 -0400 Message-Id: <20221011195809.557016-4-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221011195809.557016-1-peterx@redhat.com> References: <20221011195809.557016-1-peterx@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add a new "interruptible" flag showing that the caller is willing to be interrupted by signals during the __gfn_to_pfn_memslot() request. 
Wire it up with the FOLL_INTERRUPTIBLE flag that we've just introduced.

This prepares KVM to respond to SIGUSR1 (for QEMU that's the SIGIPI) even
while e.g. handling a userfaultfd page fault.

No functional change intended.

Signed-off-by: Peter Xu
Reviewed-by: Sean Christopherson
---
 arch/arm64/kvm/mmu.c                   |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/kvm/mmu/mmu.c                 |  4 ++--
 include/linux/kvm_host.h               |  4 ++--
 virt/kvm/kvm_main.c                    | 28 ++++++++++++++++----------
 virt/kvm/kvm_mm.h                      |  4 ++--
 virt/kvm/pfncache.c                    |  2 +-
 8 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 34c5feed9dc1..7b990b33b337 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1232,7 +1232,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 */
 	smp_rmb();
 
-	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
+	pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
 				   write_fault, &writable, NULL);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index e9744b41a226..4939f57b6f6a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -598,7 +598,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_vcpu *vcpu,
 		write_ok = true;
 	} else {
 		/* Call KVM generic code to do the slow-path check */
-		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
+		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
 					   writing, &write_ok, NULL);
 		if (is_error_noslot_pfn(pfn))
 			return -EFAULT;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 5d5e12f3bf86..9d3743ca16d5 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -846,7 +846,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 		unsigned long pfn;
 
 		/* Call KVM generic code to do the slow-path check */
-		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
+		pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL,
 					   writing, upgrade_p, NULL);
 		if (is_error_noslot_pfn(pfn))
 			return -EFAULT;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6f81539061d6..cc26f425f41c 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4169,7 +4169,7 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	}
 
 	async = false;
-	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, &async,
+	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, &async,
 					  fault->write, &fault->map_writable,
 					  &fault->hva);
 	if (!async)
@@ -4186,7 +4186,7 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		}
 	}
 
-	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, NULL,
+	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, NULL,
 					  fault->write, &fault->map_writable,
 					  &fault->hva);
 	return RET_PF_CONTINUE;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 92baa930b891..1904162a041d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1150,8 +1150,8 @@ kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn);
 kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn);
 kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
-			       bool atomic, bool *async, bool write_fault,
-			       bool *writable, hva_t *hva);
+			       bool atomic, bool interruptible, bool *async,
+			       bool write_fault, bool *writable, hva_t *hva);
 
 void kvm_release_pfn_clean(kvm_pfn_t pfn);
 void kvm_release_pfn_dirty(kvm_pfn_t pfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e20a59dcda32..903ec86c4d54 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2514,7 +2514,7 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
  * 1 indicates success, -errno is returned if error is detected.
  */
 static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
-			   bool *writable, kvm_pfn_t *pfn)
+			   bool interruptible, bool *writable, kvm_pfn_t *pfn)
 {
 	unsigned int flags = FOLL_HWPOISON;
 	struct page *page;
@@ -2529,6 +2529,8 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
 		flags |= FOLL_WRITE;
 	if (async)
 		flags |= FOLL_NOWAIT;
+	if (interruptible)
+		flags |= FOLL_INTERRUPTIBLE;
 
 	npages = get_user_pages_unlocked(addr, 1, &page, flags);
 	if (npages != 1)
@@ -2638,6 +2640,7 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
  * Pin guest page in memory and return its pfn.
  * @addr: host virtual address which maps memory to the guest
  * @atomic: whether this function can sleep
+ * @interruptible: whether the process can be interrupted by non-fatal signals
  * @async: whether this function need to wait IO complete if the
  *         host page is not in the memory
  * @write_fault: whether we should get a writable host page
@@ -2648,8 +2651,8 @@ static int hva_to_pfn_remapped(struct vm_area_struct *vma,
  * 2): @write_fault = false && @writable, @writable will tell the caller
  *     whether the mapping is writable.
  */
-kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
-		     bool write_fault, bool *writable)
+kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
+		     bool *async, bool write_fault, bool *writable)
 {
 	struct vm_area_struct *vma;
 	kvm_pfn_t pfn;
@@ -2664,7 +2667,8 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 	if (atomic)
 		return KVM_PFN_ERR_FAULT;
 
-	npages = hva_to_pfn_slow(addr, async, write_fault, writable, &pfn);
+	npages = hva_to_pfn_slow(addr, async, write_fault, interruptible,
+				 writable, &pfn);
 	if (npages == 1)
 		return pfn;
 	if (npages == -EINTR)
@@ -2699,8 +2703,8 @@ kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
 }
 
 kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
-			       bool atomic, bool *async, bool write_fault,
-			       bool *writable, hva_t *hva)
+			       bool atomic, bool interruptible, bool *async,
+			       bool write_fault, bool *writable, hva_t *hva)
 {
 	unsigned long addr = __gfn_to_hva_many(slot, gfn, NULL, write_fault);
 
@@ -2725,7 +2729,7 @@ kvm_pfn_t __gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn,
 		writable = NULL;
 	}
 
-	return hva_to_pfn(addr, atomic, async, write_fault,
+	return hva_to_pfn(addr, atomic, interruptible, async, write_fault,
 			  writable);
 }
 EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot);
@@ -2733,20 +2737,22 @@ EXPORT_SYMBOL_GPL(__gfn_to_pfn_memslot);
 kvm_pfn_t gfn_to_pfn_prot(struct kvm *kvm, gfn_t gfn, bool write_fault,
 		      bool *writable)
 {
-	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, NULL,
-				    write_fault, writable, NULL);
+	return __gfn_to_pfn_memslot(gfn_to_memslot(kvm, gfn), gfn, false, false,
+				    NULL, write_fault, writable, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_prot);
 
 kvm_pfn_t gfn_to_pfn_memslot(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	return __gfn_to_pfn_memslot(slot, gfn, false, NULL, true, NULL, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, false, false, NULL, true,
+				    NULL, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot);
 
 kvm_pfn_t gfn_to_pfn_memslot_atomic(const struct kvm_memory_slot *slot, gfn_t gfn)
 {
-	return __gfn_to_pfn_memslot(slot, gfn, true, NULL, true, NULL, NULL);
+	return __gfn_to_pfn_memslot(slot, gfn, true, false, NULL, true,
+				    NULL, NULL);
 }
 EXPORT_SYMBOL_GPL(gfn_to_pfn_memslot_atomic);
 
diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index 41da467d99c9..a1ab15006af3 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -24,8 +24,8 @@
 #define KVM_MMU_READ_UNLOCK(kvm)	spin_unlock(&(kvm)->mmu_lock)
 #endif /* KVM_HAVE_MMU_RWLOCK */
 
-kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool *async,
-		     bool write_fault, bool *writable);
+kvm_pfn_t hva_to_pfn(unsigned long addr, bool atomic, bool interruptible,
+		     bool *async, bool write_fault, bool *writable);
 
 #ifdef CONFIG_HAVE_KVM_PFNCACHE
 void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm,
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 68ff41d39545..6f66808d7793 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -182,7 +182,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct kvm *kvm, struct gfn_to_pfn_cache *gpc)
 		}
 
 		/* We always request a writeable mapping */
-		new_pfn = hva_to_pfn(gpc->uhva, false, NULL, true, NULL);
+		new_pfn = hva_to_pfn(gpc->uhva, false, false, NULL, true, NULL);
 		if (is_error_noslot_pfn(new_pfn))
 			goto out_error;
 
-- 
2.37.3

From nobody Tue Apr 7 04:59:04 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CD01C433FE for ; Tue, 11 Oct 2022 20:00:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229838AbiJKUAG (ORCPT ); Tue, 11 Oct 2022 16:00:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by
vger.kernel.org with ESMTP id S229884AbiJKUAA (ORCPT ); Tue, 11 Oct 2022 16:00:00 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE56513DD4 for ; Tue, 11 Oct 2022 12:59:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1665518393; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8Yevp9gp3Kfu1CI9DtUubsDde9/ctCFvT8SlIVvPvCA=; b=MIHqIE8z3IlJ7y7xX5LM5pGoLURjjlFEWazDqRnu39Z/jUWga2R7p0/rGQr+RXfp9FSLos 1N0j4Lp9KX7EH54NI9KhQGnG9lpBZl632QngraLJrzlqwC3OVzH4MbhGLOHsEJ7QjpcUdW 2Uf/Z6tXxq/rAXi8kThQz/yU3mc7z+0= Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-480-V4snW-lWMD26fUzWvcbFHw-1; Tue, 11 Oct 2022 15:59:50 -0400 X-MC-Unique: V4snW-lWMD26fUzWvcbFHw-1 Received: by mail-qv1-f72.google.com with SMTP id y14-20020a0cf14e000000b004afb3c6984bso8698536qvl.21 for ; Tue, 11 Oct 2022 12:59:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8Yevp9gp3Kfu1CI9DtUubsDde9/ctCFvT8SlIVvPvCA=; b=yBZF8uuAVocB9WQqvV5zf0B9knuQ9sraPtnk5jKoqo8SvsEkUJWUShvAdGLYQkQL6B BJZ7pCQYd7HbIEsOYl0Xnco+66BShoIKti000jgl6EuQ4cB24F2DIiNMGcoERB31+ANt QxP7LVutcY6I+ejVNRuxV+gqujLY+Nl6OjAxXdEEEHNmiTrMWeJPHLToz3FAbNdMs2eD On3MS/YrgDKUyndOwBSYR/qugfzwx33Xz8izN0nN5FU9MJ1hD7xVBGuCPPLCFx4pZsPi ycGiysyuIqt6uVXi4Tbqg3evwiZtZY3rdtQg+qeFJ7SEOFCtgFuJp1q1+KeK+lW0oRcC Fn9w== X-Gm-Message-State: 
ACrzQf0nG+7WYbTvTphV6Eq2tF1G/OVechKqTNrmwdjm5T6G18N54hYH kregqrWabLij3s1+/ZIGvY+x4rtUHNO3uBsvy3julhnveCZl7mHRxl40aMfaA64oDph++6VzIiJ kCnF02k6/hRbVCto/crEDrZTcjRsM7xSbWF1w8E3M9r8QDgC43beF2xI3Ab5SGmLV6hgPBKBcfA == X-Received: by 2002:a37:64cd:0:b0:6ec:545f:9095 with SMTP id y196-20020a3764cd000000b006ec545f9095mr10080155qkb.133.1665518389292; Tue, 11 Oct 2022 12:59:49 -0700 (PDT) X-Google-Smtp-Source: AMsMyM49LBnh/R6pXZ8ZDj9dNiopP/LCrCnUzgsu2WgP4jJEuB7vGoS/dPhB9OGhtxh5BKvDICMIgQ== X-Received: by 2002:a37:64cd:0:b0:6ec:545f:9095 with SMTP id y196-20020a3764cd000000b006ec545f9095mr10080132qkb.133.1665518389031; Tue, 11 Oct 2022 12:59:49 -0700 (PDT) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id v4-20020a05622a014400b00343057845f7sm4477368qtw.20.2022.10.11.12.59.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Oct 2022 12:59:48 -0700 (PDT) From: Peter Xu To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Cc: peterx@redhat.com, David Matlack , "Dr . David Alan Gilbert" , David Hildenbrand , John Hubbard , Andrew Morton , Linux MM Mailing List , Mike Kravetz , Paolo Bonzini , Andrea Arcangeli , Sean Christopherson Subject: [PATCH v4 4/4] kvm: x86: Allow to respond to generic signals during slow PF Date: Tue, 11 Oct 2022 15:59:47 -0400 Message-Id: <20221011195947.557281-1-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221011195809.557016-1-peterx@redhat.com> References: <20221011195809.557016-1-peterx@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Enable x86 slow page faults to be able to respond to non-fatal signals, returning -EINTR properly when it happens. 
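
For illustration only (not part of the patch): a minimal userspace sketch
of the early return this patch adds to the error-pfn handler. It assumes
that kvm_handle_signal_exit() records a "signal" exit reason and bumps a
per-vcpu counter; that helper's body is not shown in this series, so
struct model_vcpu, enum model_exit_reason and model_handle_error_pfn() are
hypothetical stand-ins. Every other error is collapsed into -EFAULT,
which is a simplification of the real handler.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical exit reasons; not the real KVM_EXIT_* values. */
enum model_exit_reason {
	MODEL_EXIT_NONE = 0,
	MODEL_EXIT_INTR,
};

/* Hypothetical stand-in for the vcpu state touched on a signal exit. */
struct model_vcpu {
	enum model_exit_reason exit_reason;
	unsigned long signal_exits;
};

/*
 * Models the new check at the top of kvm_handle_error_pfn(): a
 * "signal pending" pfn turns into a signal exit plus -EINTR.
 * Everything else is simplified to -EFAULT in this sketch.
 */
static int model_handle_error_pfn(struct model_vcpu *vcpu, bool sigpending)
{
	if (sigpending) {
		vcpu->exit_reason = MODEL_EXIT_INTR;
		vcpu->signal_exits++;
		return -EINTR;
	}

	return -EFAULT;
}
```

The point of the model is that the interrupted fault is reported as an
interruption rather than as a bad page, so the vcpu thread can go back to
userspace, handle the signal, and fault on the same address again later.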

Signed-off-by: Peter Xu
Reviewed-by: Sean Christopherson
---
 arch/x86/kvm/mmu/mmu.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index cc26f425f41c..83b9c034313d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3148,8 +3148,13 @@ static void kvm_send_hwpoison_signal(unsigned long address, struct task_struct *
 	send_sig_mceerr(BUS_MCEERR_AR, (void __user *)address, PAGE_SHIFT, tsk);
 }
 
-static int kvm_handle_bad_page(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
+static int kvm_handle_error_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
 {
+	if (is_sigpending_pfn(pfn)) {
+		kvm_handle_signal_exit(vcpu);
+		return -EINTR;
+	}
+
 	/*
 	 * Do not cache the mmio info caused by writing the readonly gfn
 	 * into the spte otherwise read access on readonly gfn also can
@@ -3171,7 +3176,7 @@ static int handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fau
 {
 	/* The pfn is invalid, report the error! */
 	if (unlikely(is_error_pfn(fault->pfn)))
-		return kvm_handle_bad_page(vcpu, fault->gfn, fault->pfn);
+		return kvm_handle_error_pfn(vcpu, fault->gfn, fault->pfn);
 
 	if (unlikely(!fault->slot)) {
 		gva_t gva = fault->is_tdp ? 0 : fault->addr;
@@ -4186,7 +4191,12 @@ static int kvm_faultin_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		}
 	}
 
-	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, false, NULL,
+	/*
+	 * Allow gup to bail on pending non-fatal signals when it's also allowed
+	 * to wait for IO. Note, gup always bails if it is unable to quickly
+	 * get a page and a fatal signal, i.e. SIGKILL, is pending.
+	 */
+	fault->pfn = __gfn_to_pfn_memslot(slot, fault->gfn, false, true, NULL,
 					  fault->write, &fault->map_writable,
 					  &fault->hva);
 	return RET_PF_CONTINUE;
-- 
2.37.3