From: Christian Borntraeger
To: "Dr. David Alan Gilbert", Andrea Arcangeli
Cc: linux-s390, Thomas Huth, KVM, Juan Quintela, Cornelia Huck, qemu-devel, Christian Borntraeger
Date: Mon, 28 Aug 2017 10:28:07 +0200
Subject: [Qemu-devel] [PATCH 1/1] s390/mm: avoid empty zero pages for KVM guests to avoid postcopy hangs
Message-Id: <1503908887-67060-2-git-send-email-borntraeger@de.ibm.com>
In-Reply-To: <1503908887-67060-1-git-send-email-borntraeger@de.ibm.com>
References: <1503908887-67060-1-git-send-email-borntraeger@de.ibm.com>

Right now there is a potential hang situation for postcopy migrations
if the guest enables storage keys on the target system during the
postcopy process.

For storage key virtualization, we have to forbid the empty zero page, as
the storage key is a property of the physical page frame. As we enable
storage key handling lazily, we then drop all mappings for empty zero
pages for lazy refaulting later on.

This does not work with postcopy migration, which relies on the
empty zero page never triggering a fault again in the future. The reason
is that postcopy migration will simply read a page on the target system
if that page is a known zero page, to fault in an empty zero page. At
the same time, postcopy remembers that this page was already transferred,
so any future userfault on that page will NOT be retransmitted, to
avoid races.

If the guest now enters storage key mode while in postcopy, we will
break this assumption of postcopy.

The solution is to disable the empty zero page for KVM guests early on
and not during storage key enablement. With this change, the postcopy
migration process is guaranteed to start after no zero pages are left.

As guest pages are very likely not empty zero pages anyway, the memory
overhead is also pretty small.

While at it, this also adds proper page table locking to the zero page
removal.

Signed-off-by: Christian Borntraeger
Cc: stable@vger.kernel.org
---
 arch/s390/include/asm/pgtable.h |  2 +-
 arch/s390/mm/gmap.c             | 41 ++++++++++++++++++++++++++++++++++-------
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index bb59a0a..9d7f62a 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -506,7 +506,7 @@ static inline int mm_alloc_pgste(struct mm_struct *mm)
  * In the case that a guest uses storage keys
  * faults should no longer be backed by zero pages
  */
-#define mm_forbids_zeropage mm_use_skey
+#define mm_forbids_zeropage mm_has_pgste
 static inline int mm_use_skey(struct mm_struct *mm)
 {
 #ifdef CONFIG_PGSTE
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 4fb3d3c..ae8bd57 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2121,6 +2121,39 @@ static inline void thp_split_mm(struct mm_struct *mm)
 }
 
 /*
+ * Remove all empty zero pages from the mapping for lazy refaulting
+ * - This must be called after mm->context.has_pgste is set, to avoid
+ *   future creation of zero pages
+ * - This must be called after THP was enabled
+ */
+static int __zap_zero_pages(pmd_t *pmd, unsigned long start,
+			   unsigned long end, struct mm_walk *walk)
+{
+	unsigned long addr;
+
+	for (addr = start; addr != end; addr += PAGE_SIZE) {
+		pte_t *ptep;
+		spinlock_t *ptl;
+
+		ptep = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+		if (is_zero_pfn(pte_pfn(*ptep)))
+			ptep_xchg_direct(walk->mm, addr, ptep, __pte(_PAGE_INVALID));
+		pte_unmap_unlock(ptep, ptl);
+	}
+
+	return 0;
+
+}
+
+static inline void zap_zero_pages(struct mm_struct *mm)
+{
+	struct mm_walk walk = { .pmd_entry = __zap_zero_pages };
+
+	walk.mm = mm;
+	walk_page_range(0, TASK_SIZE, &walk);
+}
+
+/*
  * switch on pgstes for its userspace process (for kvm)
  */
 int s390_enable_sie(void)
@@ -2137,6 +2170,7 @@ int s390_enable_sie(void)
 	mm->context.has_pgste = 1;
 	/* split thp mappings and disable thp for future mappings */
 	thp_split_mm(mm);
+	zap_zero_pages(mm);
 	up_write(&mm->mmap_sem);
 	return 0;
 }
@@ -2149,13 +2183,6 @@ EXPORT_SYMBOL_GPL(s390_enable_sie);
 static int __s390_enable_skey(pte_t *pte, unsigned long addr,
 			      unsigned long next, struct mm_walk *walk)
 {
-	/*
-	 * Remove all zero page mappings,
-	 * after establishing a policy to forbid zero page mappings
-	 * following faults for that page will get fresh anonymous pages
-	 */
-	if (is_zero_pfn(pte_pfn(*pte)))
-		ptep_xchg_direct(walk->mm, addr, pte, __pte(_PAGE_INVALID));
 	/* Clear storage key */
 	ptep_zap_key(walk->mm, addr, pte);
 	return 0;
-- 
2.7.4
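
The ordering problem described in the commit message can be illustrated with a small user-space toy model. This is only a sketch and not kernel or QEMU code: every name prefixed with toy_ is hypothetical, the page table is reduced to an array, and a "hang" is modeled as a guest access that returns false because postcopy ignores a fault on a page it already marked as received.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 4

/* Hypothetical, simplified PTE states; PTE_NONE must be 0 for {0} init. */
enum toy_pte { TOY_PTE_NONE = 0, TOY_PTE_ZERO, TOY_PTE_ANON };

struct toy_guest {
	enum toy_pte pte[TOY_NPAGES];	/* toy page table on the target */
	bool received[TOY_NPAGES];	/* postcopy: page already transferred */
	bool forbid_zeropage;		/* stands in for mm_forbids_zeropage() */
};

/* Postcopy places a known zero page by reading it and records it. */
void toy_place_zero_page(struct toy_guest *g, size_t i)
{
	assert(i < TOY_NPAGES);
	/* With the zero page forbidden, a read fault gets a fresh page. */
	g->pte[i] = g->forbid_zeropage ? TOY_PTE_ANON : TOY_PTE_ZERO;
	g->received[i] = true;
}

/* Drop every zero page mapping so that it refaults lazily. */
size_t toy_zap_zero_pages(struct toy_guest *g)
{
	size_t zapped = 0;

	for (size_t i = 0; i < TOY_NPAGES; i++) {
		if (g->pte[i] == TOY_PTE_ZERO) {
			g->pte[i] = TOY_PTE_NONE;
			zapped++;
		}
	}
	return zapped;
}

/* Returns true if the access completes, false if it would hang. */
bool toy_guest_access(struct toy_guest *g, size_t i)
{
	assert(i < TOY_NPAGES);
	if (g->pte[i] != TOY_PTE_NONE)
		return true;
	/* Userfault: pages already marked received are never resent. */
	if (g->received[i])
		return false;
	g->pte[i] = TOY_PTE_ANON;
	g->received[i] = true;
	return true;
}

/* Old ordering: zero pages are zapped at storage key enablement. */
bool toy_old_ordering(void)
{
	struct toy_guest g = {0};

	toy_place_zero_page(&g, 0);	/* postcopy is already running */
	g.forbid_zeropage = true;	/* guest enables storage keys */
	toy_zap_zero_pages(&g);
	return toy_guest_access(&g, 0);
}

/* New ordering: forbid and zap once, before postcopy can start. */
bool toy_new_ordering(void)
{
	struct toy_guest g = {0};

	g.forbid_zeropage = true;	/* done in s390_enable_sie() */
	toy_zap_zero_pages(&g);
	toy_place_zero_page(&g, 0);	/* postcopy starts afterwards */
	return toy_guest_access(&g, 0);	/* skey enablement zaps nothing */
}
```

In this model toy_old_ordering() ends with an access that cannot complete, while toy_new_ordering() completes because no zero page mapping exists that could be zapped after postcopy has recorded the page.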