From: Andrew Cooper
To: Xen-devel
Cc: Andrew Cooper, Edwin Török, Wei Liu, Jan Beulich, Roger Pau Monné
Date: Mon, 17 Jun 2019 20:49:59 +0100
Message-ID: <1560800999-11592-1-git-send-email-andrew.cooper3@citrix.com>
Subject: [Xen-devel] [PATCH] x86/clear_page: Update clear_page_sse2() after dropping 32bit Xen

This code was never updated when the 32bit build of Xen was dropped.

 * Expand the now-redundant ptr_reg macro.
 * The number of iterations in the loop can be halved by using 64bit writes,
   without consuming any extra execution resources in the pipeline.  Adjust
   all numbers/offsets appropriately.
 * Replace dec with sub to avoid an eflags stall, and position it to be
   macro-fused with the following jnz.
 * With no need to preserve eflags across the body of the loop, replace lea
   with add, which has 1/3 of the latency on basically all 64bit hardware.

A quick userspace perf test on my Haswell dev box indicates that the old
version takes ~1385 cycles on average (ignoring outliers), and the new version
takes ~1060 cycles, or about 77% of the time.

Reported-by: Edwin Török
Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich
---
CC: Jan Beulich
CC: Wei Liu
CC: Roger Pau Monné
CC: Edwin Török

There is almost certainly more room for improvement, especially now that we
have alternatives, but this is a substantial improvement which is very safe to
backport.
---
 xen/arch/x86/clear_page.S | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/clear_page.S b/xen/arch/x86/clear_page.S
index 243a767..0817610 100644
--- a/xen/arch/x86/clear_page.S
+++ b/xen/arch/x86/clear_page.S
@@ -2,18 +2,16 @@
 
 #include
 
-#define ptr_reg %rdi
-
 ENTRY(clear_page_sse2)
-        mov     $PAGE_SIZE/16, %ecx
+        mov     $PAGE_SIZE/32, %ecx
         xor     %eax,%eax
 
-0:      dec     %ecx
-        movnti  %eax, (ptr_reg)
-        movnti  %eax, 4(ptr_reg)
-        movnti  %eax, 8(ptr_reg)
-        movnti  %eax, 12(ptr_reg)
-        lea     16(ptr_reg), ptr_reg
+0:      movnti  %rax, 0(%rdi)
+        movnti  %rax, 8(%rdi)
+        movnti  %rax, 16(%rdi)
+        movnti  %rax, 24(%rdi)
+        add     $32, %rdi
+        sub     $1, %ecx
         jnz     0b
 
         sfence
-- 
2.1.4
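
For anyone wanting to experiment with the new loop shape outside of Xen, here
is a rough userspace sketch in C.  It is not the test harness behind the
numbers in the commit message, and the compiler is free to schedule it
differently from the hand-written assembly.  The names clear_page_sketch and
SKETCH_PAGE_SIZE are made up for illustration, and a 4KiB page is assumed
rather than taken from the Xen headers.  On x86-64 it uses the
_mm_stream_si64() intrinsic, which maps to a 64bit movnti; elsewhere it falls
back to plain stores so that it still builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>          /* _mm_stream_si64(), _mm_sfence() */
#define SKETCH_HAVE_MOVNTI 1
#else
#define SKETCH_HAVE_MOVNTI 0
#endif

/* Assumption: 4KiB pages.  Hypothetical name, not Xen's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical userspace analogue of the patched loop: four 8-byte
 * non-temporal stores per iteration, SKETCH_PAGE_SIZE/32 iterations,
 * followed by a single store fence.
 */
static void clear_page_sketch(void *page)
{
    long long *p = page;
    unsigned int n = SKETCH_PAGE_SIZE / 32;

    /* Keep every 8-byte store naturally aligned. */
    assert(((uintptr_t)page & 7) == 0);

    do {
#if SKETCH_HAVE_MOVNTI
        _mm_stream_si64(&p[0], 0);
        _mm_stream_si64(&p[1], 0);
        _mm_stream_si64(&p[2], 0);
        _mm_stream_si64(&p[3], 0);
#else
        /* Portable fallback: ordinary (cached) stores. */
        p[0] = 0;
        p[1] = 0;
        p[2] = 0;
        p[3] = 0;
#endif
        p += 4;                 /* advance by 32 bytes */
    } while (--n);

#if SKETCH_HAVE_MOVNTI
    _mm_sfence();               /* order the non-temporal stores */
#endif
}
```

Calling clear_page_sketch() on an 8-byte aligned, 4096-byte buffer should
leave every byte zero.  The old loop corresponds to 256 iterations of four
4-byte stores, and the new one to 128 iterations of four 8-byte stores.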