From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Peter Maydell, Eduardo Habkost, Peter Crosthwaite, Stefan Weil, Claudio Fontana, Alexander Graf, alex.bennee@linaro.org, qemu-arm@nongnu.org, Pranith Kumar, Paolo Bonzini, Aurelien Jarno, Richard Henderson
Date: Tue, 11 Apr 2017 21:17:21 -0400
Message-Id: <1491959850-30756-2-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 01/10] exec-all: add tb_from_jmp_cache

Add tb_from_jmp_cache(), which returns the TB cached in tb_jmp_cache
for a given vaddr if its pc, cs_base and flags match the current CPU
state, or NULL if there is no match or an exit has been requested.
This paves the way for upcoming changes.

Signed-off-by: Emilio G. Cota
---
 cpu-exec.c              | 19 +++++++++++++++++++
 include/exec/exec-all.h |  2 +-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 748cb66..ce9750a 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -309,6 +309,25 @@ static bool tb_cmp(const void *p, const void *d)
     return false;
 }
 
+TranslationBlock *tb_from_jmp_cache(CPUArchState *env, target_ulong vaddr)
+{
+    CPUState *cpu = ENV_GET_CPU(env);
+    TranslationBlock *tb;
+    target_ulong cs_base, pc;
+    uint32_t flags;
+
+    if (unlikely(atomic_read(&cpu->exit_request))) {
+        return NULL;
+    }
+    cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
+    tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(vaddr)]);
+    if (likely(tb && tb->pc == vaddr && tb->cs_base == cs_base &&
+               tb->flags == flags)) {
+        return tb;
+    }
+    return NULL;
+}
+
 static TranslationBlock *tb_htable_lookup(CPUState *cpu,
                                           target_ulong pc,
                                           target_ulong cs_base,
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index bcde1e6..18b80bc 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -56,7 +56,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
                               target_ulong pc, target_ulong cs_base,
                               uint32_t flags,
                               int cflags);
-
 void QEMU_NORETURN cpu_loop_exit(CPUState *cpu);
 void QEMU_NORETURN cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc);
 void QEMU_NORETURN cpu_loop_exit_atomic(CPUState *cpu, uintptr_t pc);
@@ -368,6 +367,7 @@ struct TranslationBlock {
 void tb_free(TranslationBlock *tb);
 void tb_flush(CPUState *cpu);
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
+TranslationBlock *tb_from_jmp_cache(CPUArchState *env, target_ulong vaddr);
 
 #if defined(USE_DIRECT_JUMP)
 
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:22 -0400
Message-Id: <1491959850-30756-3-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 02/10] exec-all: inline tb_from_jmp_cache

Inlining the function improves performance, as shown in the logs of
subsequent commits.
This commit is kept separate to ease review, since the inclusion
of tb-hash.h might be controversial. The problem here, which was
introduced before this commit, is that tb_hash_func() depends on
tb_page_addr_t: this defeats the original purpose of tb-hash.h,
which was to be self-contained and CPU-agnostic.

Signed-off-by: Emilio G. Cota
---
 cpu-exec.c              | 19 -------------------
 include/exec/exec-all.h | 24 +++++++++++++++++++++++-
 2 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index ce9750a..748cb66 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -309,25 +309,6 @@ static bool tb_cmp(const void *p, const void *d)
     return false;
 }
 
-TranslationBlock *tb_from_jmp_cache(CPUArchState *env, target_ulong vaddr)
-{
-    CPUState *cpu = ENV_GET_CPU(env);
-    TranslationBlock *tb;
-    target_ulong cs_base, pc;
-    uint32_t flags;
-
-    if (unlikely(atomic_read(&cpu->exit_request))) {
-        return NULL;
-    }
-    cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
-    tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(vaddr)]);
-    if (likely(tb && tb->pc == vaddr && tb->cs_base == cs_base &&
-               tb->flags == flags)) {
-        return tb;
-    }
-    return NULL;
-}
-
 static TranslationBlock *tb_htable_lookup(CPUState *cpu,
                                           target_ulong pc,
                                           target_ulong cs_base,
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 18b80bc..bd76987 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -367,7 +367,29 @@ struct TranslationBlock {
 void tb_free(TranslationBlock *tb);
 void tb_flush(CPUState *cpu);
 void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr);
-TranslationBlock *tb_from_jmp_cache(CPUArchState *env, target_ulong vaddr);
+
+/* tb_hash_func() in tb-hash.h needs tb_page_addr_t, defined above */
+#include "tb-hash.h"
+
+static inline
+TranslationBlock *tb_from_jmp_cache(CPUArchState *env, target_ulong vaddr)
+{
+    CPUState *cpu = ENV_GET_CPU(env);
+    TranslationBlock *tb;
+    target_ulong cs_base, pc;
+    uint32_t flags;
+
+    if (unlikely(atomic_read(&cpu->exit_request))) {
+        return NULL;
+    }
+    cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
+    tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(vaddr)]);
+    if (likely(tb && tb->pc == vaddr && tb->cs_base == cs_base &&
+               tb->flags == flags)) {
+        return tb;
+    }
+    return NULL;
+}
 
 #if defined(USE_DIRECT_JUMP)
 
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:23 -0400
Message-Id: <1491959850-30756-4-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 03/10] target/arm: optimize cross-page block chaining in softmmu

Instead of unconditionally exiting to the exec loop, add a helper to
check whether the target TB is valid. As long as the hit rate in
tb_jmp_cache remains high, this improves performance.

Measurements:

- Boot time of ARM debian jessie on Intel host:

  | setup  | ARM debian boot+shutdown time | stddev |
  |--------+-------------------------------+--------|
  | master |                  10.050247057 | 0.0361 |
  | +cross |                  10.311265443 | 0.0721 |

  That is a 2.58% slowdown when booting. This is reasonable given
  that tb_jmp_cache's hit rate when booting is expected to be low.

- NBench, arm-softmmu.
  Host: Intel i7-4790K @ 4.00GHz
  (y axis: Speedup over 95b31d70)

  [ASCII bar chart lost in extraction. Series: cross+noinline and
  cross+inline. Y axis: 0.8x to 1.3x. X axis: one group of bars per
  NBench test, plus hmean.]
  png: http://imgur.com/1rmYSaF

  That is, a 4.04% hmean perf improvement over master with
  tb_from_jmp_cache not inlined, and a 5.82% hmean perf improvement
  over master with tb_from_jmp_cache inlined (i.e. this commit).
  The largest improvement is 21% for the FP_EMULATION benchmark.

Signed-off-by: Emilio G. Cota
---
 target/arm/helper.c    |  5 +++++
 target/arm/helper.h    |  2 ++
 target/arm/translate.c | 12 ++++++++++++
 3 files changed, 19 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 8cb7a94..10b8807 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -9922,3 +9922,8 @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
     /* Linux crc32c converts the output to one's complement. */
     return crc32c(acc, buf, bytes) ^ 0xffffffff;
 }
+
+uint32_t HELPER(cross_page_check)(CPUARMState *env, target_ulong vaddr)
+{
+    return !!tb_from_jmp_cache(env, vaddr);
+}
diff --git a/target/arm/helper.h b/target/arm/helper.h
index df86bf7..d4b779b 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -1,6 +1,8 @@
 DEF_HELPER_FLAGS_1(sxtb16, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_FLAGS_1(uxtb16, TCG_CALL_NO_RWG_SE, i32, i32)
 
+DEF_HELPER_2(cross_page_check, i32, env, tl)
+
 DEF_HELPER_3(add_setq, i32, env, i32, i32)
 DEF_HELPER_3(add_saturate, i32, env, i32, i32)
 DEF_HELPER_3(sub_saturate, i32, env, i32, i32)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index e32e38c..ce97d0c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4085,6 +4085,18 @@ static inline void gen_goto_tb(DisasContext *s, int n, target_ulong dest)
         gen_set_pc_im(s, dest);
         tcg_gen_exit_tb((uintptr_t)s->tb + n);
     } else {
+        TCGv vaddr = tcg_const_tl(dest);
+        TCGv_i32 valid = tcg_temp_new_i32();
+        TCGLabel *label = gen_new_label();
+
+        gen_helper_cross_page_check(valid, cpu_env, vaddr);
+        tcg_temp_free(vaddr);
+        tcg_gen_brcondi_i32(TCG_COND_EQ, valid, 0, label);
+        tcg_temp_free_i32(valid);
+        tcg_gen_goto_tb(n);
+        gen_set_pc_im(s, dest);
+        tcg_gen_exit_tb((uintptr_t)s->tb + n);
+        gen_set_label(label);
         gen_set_pc_im(s, dest);
         tcg_gen_exit_tb(0);
     }
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:24 -0400
Message-Id: <1491959850-30756-5-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 04/10] target/i386: optimize cross-page block chaining in softmmu

Instead of unconditionally exiting to the exec loop, add a helper to
check whether the target TB is valid. As long as the hit rate in
tb_jmp_cache remains high, this improves performance.

Measurements:

- specINT 2006 (test set), x86_64-softmmu.
  Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70

  [ASCII bar chart lost in extraction. Series: cross. Y axis: 0.9x to
  1.3x. X axis: one bar per specINT 2006 benchmark, plus hmean.]
  png: http://imgur.com/cwRnmCi

  That is, a hmean gain of 2.6%.

- specINT 2006 (train set), x86_64-softmmu.
  Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70

  [ASCII bar chart lost in extraction. Series: cross. Y axis: 0.9x to
  1.25x. X axis: one bar per specINT 2006 benchmark, plus hmean.]
  png: http://imgur.com/0CbG7dD

  This is the larger "train" set. We get a hmean improvement of 6.1%.

Signed-off-by: Emilio G. Cota
---
 target/i386/helper.h      |  2 ++
 target/i386/misc_helper.c |  5 +++++
 target/i386/translate.c   | 14 +++++++++++++-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/target/i386/helper.h b/target/i386/helper.h
index 6fb8fb9..dceb343 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -1,6 +1,8 @@
 DEF_HELPER_FLAGS_4(cc_compute_all, TCG_CALL_NO_RWG_SE, tl, tl, tl, tl, int)
 DEF_HELPER_FLAGS_4(cc_compute_c, TCG_CALL_NO_RWG_SE, tl, tl, tl, tl, int)
 
+DEF_HELPER_2(cross_page_check, i32, env, tl)
+
 DEF_HELPER_3(write_eflags, void, env, tl, i32)
 DEF_HELPER_1(read_eflags, tl, env)
 DEF_HELPER_2(divb_AL, void, env, tl)
diff --git a/target/i386/misc_helper.c b/target/i386/misc_helper.c
index ca2ea09..a41daed 100644
--- a/target/i386/misc_helper.c
+++ b/target/i386/misc_helper.c
@@ -637,3 +637,8 @@ void helper_wrpkru(CPUX86State *env, uint32_t ecx, uint64_t val)
     env->pkru = val;
     tlb_flush(cs);
 }
+
+uint32_t helper_cross_page_check(CPUX86State *env, target_ulong vaddr)
+{
+    return !!tb_from_jmp_cache(env, vaddr);
+}
diff --git a/target/i386/translate.c b/target/i386/translate.c
index 1d1372f..ffc8ccc 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2153,7 +2153,19 @@ static inline void gen_goto_tb(DisasContext *s, int tb_num, target_ulong eip)
         gen_jmp_im(eip);
         tcg_gen_exit_tb((uintptr_t)s->tb + tb_num);
     } else {
-        /* jump to another page: currently not optimized */
+        /* jump to another page */
+        TCGv vaddr = tcg_const_tl(eip);
+        TCGv_i32 valid = tcg_temp_new_i32();
+        TCGLabel *label = gen_new_label();
+
+        gen_helper_cross_page_check(valid, cpu_env, vaddr);
+        tcg_temp_free(vaddr);
+        tcg_gen_brcondi_i32(TCG_COND_EQ, valid, 0, label);
+        tcg_temp_free_i32(valid);
+        tcg_gen_goto_tb(tb_num);
+        gen_jmp_im(eip);
+        tcg_gen_exit_tb((uintptr_t)s->tb + tb_num);
+        gen_set_label(label);
         gen_jmp_im(eip);
         gen_eob(s);
     }
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:25 -0400
Message-Id: <1491959850-30756-6-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 05/10] tcg: add jr opcode

This will be used by TCG targets to implement a fast path for
indirect branches.

I have only implemented and tested this on an i386 host, so
make this opcode optional and mark it as not implemented by
other TCG backends.

Signed-off-by: Emilio G. Cota
---
 tcg/aarch64/tcg-target.h  | 1 +
 tcg/arm/tcg-target.h      | 1 +
 tcg/i386/tcg-target.h     | 1 +
 tcg/i386/tcg-target.inc.c | 7 +++++++
 tcg/ia64/tcg-target.h     | 1 +
 tcg/mips/tcg-target.h     | 1 +
 tcg/ppc/tcg-target.h      | 1 +
 tcg/s390/tcg-target.h     | 1 +
 tcg/sparc/tcg-target.h    | 1 +
 tcg/tcg-op.h              | 6 ++++++
 tcg/tcg-opc.h             | 1 +
 tcg/tcg.c                 | 1 +
 tcg/tci/tcg-target.h      | 1 +
 13 files changed, 24 insertions(+)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 1a5ea23..ed2fb84 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -77,6 +77,7 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i32 0
 #define TCG_TARGET_HAS_extrl_i64_i32 0
 #define TCG_TARGET_HAS_extrh_i64_i32 0
+#define TCG_TARGET_HAS_jr 0
 
 #define TCG_TARGET_HAS_div_i64 1
 #define TCG_TARGET_HAS_rem_i64 1
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 09a19c6..1c9f0a2 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -123,6 +123,7 @@ extern bool use_idiv_instructions;
 #define TCG_TARGET_HAS_mulsh_i32 0
 #define TCG_TARGET_HAS_div_i32 use_idiv_instructions
 #define TCG_TARGET_HAS_rem_i32 0
+#define TCG_TARGET_HAS_jr 0
 
 enum {
     TCG_AREG0 = TCG_REG_R6,
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 4275787..ebbddb3 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -107,6 +107,7 @@ extern bool have_popcnt;
 #define TCG_TARGET_HAS_muls2_i32 1
 #define TCG_TARGET_HAS_muluh_i32 0
 #define TCG_TARGET_HAS_mulsh_i32 0
+#define TCG_TARGET_HAS_jr 1
 
 #if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_extrl_i64_i32 0
diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c
index 5918008..53baf71 100644
--- a/tcg/i386/tcg-target.inc.c
+++ b/tcg/i386/tcg-target.inc.c
@@ -1909,6 +1909,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_br:
         tcg_out_jxx(s, JCC_JMP, arg_label(a0), 0);
         break;
+    case INDEX_op_jr:
+        tcg_out_modrm(s, OPC_GRP5, EXT5_JMPN_Ev, a0);
+        break;
     OP_32_64(ld8u):
         /* Note that we can ignore REXW for the zero-extend to 64-bit. */
         tcg_out_modrm_offset(s, OPC_MOVZBL, a0, a1, a2);
@@ -2277,6 +2280,7 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
 
 static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
 {
+    static const TCGTargetOpDef ri = { .args_ct_str = { "ri" } };
     static const TCGTargetOpDef ri_r = { .args_ct_str = { "ri", "r" } };
     static const TCGTargetOpDef re_r = { .args_ct_str = { "re", "r" } };
     static const TCGTargetOpDef qi_r = { .args_ct_str = { "qi", "r" } };
@@ -2324,6 +2328,9 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     case INDEX_op_st_i64:
         return &re_r;
 
+    case INDEX_op_jr:
+        return &ri;
+
     case INDEX_op_add_i32:
     case INDEX_op_add_i64:
         return &r_r_re;
diff --git a/tcg/ia64/tcg-target.h b/tcg/ia64/tcg-target.h
index 42aea03..a2760ba 100644
--- a/tcg/ia64/tcg-target.h
+++ b/tcg/ia64/tcg-target.h
@@ -173,6 +173,7 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i64 0
 #define TCG_TARGET_HAS_extrl_i64_i32 0
 #define TCG_TARGET_HAS_extrh_i64_i32 0
+#define TCG_TARGET_HAS_jr 0
 
 #define TCG_TARGET_deposit_i32_valid(ofs, len) ((len) <= 16)
 #define TCG_TARGET_deposit_i64_valid(ofs, len) ((len) <= 16)
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index f46d64a..d06e495 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -130,6 +130,7 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_muluh_i32 1
 #define TCG_TARGET_HAS_mulsh_i32 1
 #define TCG_TARGET_HAS_bswap32_i32 1
+#define TCG_TARGET_HAS_jr 0
 
 #if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_add2_i32 0
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index abd8b3d..461bb0c 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -82,6 +82,7 @@ extern bool have_isa_3_00;
 #define TCG_TARGET_HAS_muls2_i32 0
 #define TCG_TARGET_HAS_muluh_i32 1
 #define TCG_TARGET_HAS_mulsh_i32 1
+#define TCG_TARGET_HAS_jr 0
 
 #if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_add2_i32 0
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index cbdd2a6..b35c7b1 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -92,6 +92,7 @@ extern uint64_t s390_facilities;
 #define TCG_TARGET_HAS_mulsh_i32 0
 #define TCG_TARGET_HAS_extrl_i64_i32 0
 #define TCG_TARGET_HAS_extrh_i64_i32 0
+#define TCG_TARGET_HAS_jr 0
 
 #define TCG_TARGET_HAS_div2_i64 1
 #define TCG_TARGET_HAS_rot_i64 1
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index b8b74f9..3d6f872 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -123,6 +123,7 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_muls2_i32 1
 #define TCG_TARGET_HAS_muluh_i32 0
 #define TCG_TARGET_HAS_mulsh_i32 0
+#define TCG_TARGET_HAS_jr 0
 
 #define TCG_TARGET_HAS_extrl_i64_i32 1
 #define TCG_TARGET_HAS_extrh_i64_i32 1
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index c68e300..1924633 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -261,6 +261,12 @@ static inline void tcg_gen_br(TCGLabel *l)
     tcg_gen_op1(&tcg_ctx, INDEX_op_br, label_arg(l));
 }
 
+/* jump to a host address contained in a register */
+static inline void tcg_gen_jr(TCGv_ptr arg)
+{
+    tcg_gen_op1i(INDEX_op_jr, GET_TCGV_PTR(arg));
+}
+
 void tcg_gen_mb(TCGBar);
 
 /* Helper calls. */
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index f06f894..1e869af 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -34,6 +34,7 @@ DEF(set_label, 0, 0, 1, TCG_OPF_BB_END | TCG_OPF_NOT_PRESENT)
 DEF(call, 0, 0, 3, TCG_OPF_CALL_CLOBBER | TCG_OPF_NOT_PRESENT)
 
 DEF(br, 0, 0, 1, TCG_OPF_BB_END)
+DEF(jr, 0, 1, 0, TCG_OPF_BB_END)
 
 #define IMPL(X) (__builtin_constant_p(X) && !(X) ? TCG_OPF_NOT_PRESENT : 0)
 #if TCG_TARGET_REG_BITS == 32
diff --git a/tcg/tcg.c b/tcg/tcg.c
index cb898f1..a7e7842 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1139,6 +1139,7 @@ void tcg_dump_ops(TCGContext *s)
             switch (c) {
             case INDEX_op_set_label:
             case INDEX_op_br:
+            case INDEX_op_jr:
             case INDEX_op_brcond_i32:
             case INDEX_op_brcond_i64:
             case INDEX_op_brcond2_i32:
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 838bf3a..63d1a57 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -85,6 +85,7 @@
 #define TCG_TARGET_HAS_muls2_i32 0
 #define TCG_TARGET_HAS_muluh_i32 0
 #define TCG_TARGET_HAS_mulsh_i32 0
+#define TCG_TARGET_HAS_jr 0
 
 #if TCG_TARGET_REG_BITS == 64
 #define TCG_TARGET_HAS_extrl_i64_i32 0
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:26 -0400
Message-Id: <1491959850-30756-7-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 06/10] tcg: add brcondi_ptr
Cc: Peter Maydell, Eduardo Habkost, Peter Crosthwaite, Stefan Weil, Claudio Fontana, Alexander Graf, alex.bennee@linaro.org, qemu-arm@nongnu.org, Pranith Kumar, Paolo Bonzini, Aurelien Jarno, Richard Henderson

This will be used by TCG targets to implement a fast path for
indirect branches.

Signed-off-by: Emilio G.
Cota
---
 tcg/tcg-op.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 1924633..abf784b 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -1118,6 +1118,8 @@ void tcg_gen_atomic_xor_fetch_i64(TCGv_i64, TCGv, TCGv_i64, TCGArg, TCGMemOp);
     tcg_gen_addi_i32(TCGV_PTR_TO_NAT(R), TCGV_PTR_TO_NAT(A), (B))
 # define tcg_gen_ext_i32_ptr(R, A) \
     tcg_gen_mov_i32(TCGV_PTR_TO_NAT(R), (A))
+# define tcg_gen_brcondi_ptr(C, A, I, L) \
+    tcg_gen_brcondi_i32(C, TCGV_PTR_TO_NAT(A), (uintptr_t)I, L)
 #else
 # define tcg_gen_ld_ptr(R, A, O) \
     tcg_gen_ld_i64(TCGV_PTR_TO_NAT(R), (A), (O))
@@ -1129,4 +1131,6 @@ void tcg_gen_atomic_xor_fetch_i64(TCGv_i64, TCGv, TCGv_i64, TCGArg, TCGMemOp);
     tcg_gen_addi_i64(TCGV_PTR_TO_NAT(R), TCGV_PTR_TO_NAT(A), (B))
 # define tcg_gen_ext_i32_ptr(R, A) \
     tcg_gen_ext_i32_i64(TCGV_PTR_TO_NAT(R), (A))
+# define tcg_gen_brcondi_ptr(C, A, I, L) \
+    tcg_gen_brcondi_i64(C, TCGV_PTR_TO_NAT(A), (uintptr_t)I, L)
 #endif /* UINTPTR_MAX == UINT32_MAX */
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:27 -0400
Message-Id: <1491959850-30756-8-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 07/10] tcg: add tcg_temp_local_new_ptr
Cc: Peter Maydell, Eduardo Habkost, Peter Crosthwaite, Stefan Weil, Claudio Fontana, Alexander Graf, alex.bennee@linaro.org, qemu-arm@nongnu.org, Pranith Kumar, Paolo Bonzini, Aurelien Jarno, Richard Henderson

This will be used by TCG targets to implement a fast path for
indirect branches.

Signed-off-by: Emilio G.
Cota
---
 tcg/tcg.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index 6c216bb..37a7c8e 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -912,6 +912,7 @@ do {\
 #define tcg_global_mem_new_ptr(R, O, N) \
     TCGV_NAT_TO_PTR(tcg_global_mem_new_i32((R), (O), (N)))
 #define tcg_temp_new_ptr() TCGV_NAT_TO_PTR(tcg_temp_new_i32())
+#define tcg_temp_local_new_ptr() TCGV_NAT_TO_PTR(tcg_temp_local_new_i32())
 #define tcg_temp_free_ptr(T) tcg_temp_free_i32(TCGV_PTR_TO_NAT(T))
 #else
 #define TCGV_NAT_TO_PTR(n) MAKE_TCGV_PTR(GET_TCGV_I64(n))
@@ -923,6 +924,7 @@ do {\
 #define tcg_global_mem_new_ptr(R, O, N) \
     TCGV_NAT_TO_PTR(tcg_global_mem_new_i64((R), (O), (N)))
 #define tcg_temp_new_ptr() TCGV_NAT_TO_PTR(tcg_temp_new_i64())
+#define tcg_temp_local_new_ptr() TCGV_NAT_TO_PTR(tcg_temp_local_new_i64())
 #define tcg_temp_free_ptr(T) tcg_temp_free_i64(TCGV_PTR_TO_NAT(T))
 #endif
 
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:28 -0400
Message-Id: <1491959850-30756-9-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 08/10] target/arm: optimize indirect branches with TCG's jr op
Cc: Peter Maydell, Eduardo Habkost, Peter Crosthwaite, Stefan Weil, Claudio Fontana, Alexander Graf, alex.bennee@linaro.org, qemu-arm@nongnu.org, Pranith Kumar, Paolo Bonzini, Aurelien Jarno, Richard Henderson

Speed up indirect branches by adding a helper to look for the
TB in tb_jmp_cache. The helper returns either the corresponding
host address or NULL.

Measurements:

- Impact on Boot time

| setup   | ARM debian boot+shutdown time | stddev |
|---------+-------------------------------+--------|
| master  |                  10.050247057 | 0.0361 |
| +cross  |                  10.311265443 | 0.0721 |
| +jr     |                  10.216832579 | 0.0878 |
| +inline |                  10.405597879 | 0.0332 |

That is, a 3.5% slowdown. This is reasonable since booting has low
hit rates in tb_jmp_cache.

- NBench, arm-linux-user.
Host: Intel i7-4790K @ 4.00GHz
Y axis: speedup over 95b31d70

[ASCII bar chart: per-test speedup for "jr" and "jr+inline";
X axis: NBench tests and hmean]

png: http://imgur.com/ihqQj6l

That is, a 6.65% hmean improvement with jr+inline (5.92% w/o inlining).
Peak improvement is 21% for HUFFMAN.

- NBench, arm-softmmu. Host: Intel i7-4790K @ 4.00GHz
Y axis: speedup over 95b31d70

[ASCII bar chart: per-test speedup for "cross+noinline", "cross+inline",
"cross+jr+noinline" and "cross+jr+inline"; X axis: NBench tests and hmean]

png: http://imgur.com/yWJivBl

That is, a 9.86% hmean improvement when combining cross+jr+inline (this
commit) over current master. Peak improvement is 25% for FP_EMULATION.

Signed-off-by: Emilio G.
Cota
---
 target/arm/helper.c    | 11 +++++++++++
 target/arm/helper.h    |  1 +
 target/arm/translate.c | 23 +++++++++++++++++++++++
 3 files changed, 35 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 10b8807..dfbc488 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -9927,3 +9927,14 @@ uint32_t HELPER(cross_page_check)(CPUARMState *env, target_ulong vaddr)
 {
     return !!tb_from_jmp_cache(env, vaddr);
 }
+
+void *HELPER(get_hostptr)(CPUARMState *env, target_ulong vaddr)
+{
+    TranslationBlock *tb;
+
+    tb = tb_from_jmp_cache(env, vaddr);
+    if (unlikely(tb == NULL)) {
+        return NULL;
+    }
+    return tb->tc_ptr;
+}
diff --git a/target/arm/helper.h b/target/arm/helper.h
index d4b779b..0faacc1 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -2,6 +2,7 @@ DEF_HELPER_FLAGS_1(sxtb16, TCG_CALL_NO_RWG_SE, i32, i32)
 DEF_HELPER_FLAGS_1(uxtb16, TCG_CALL_NO_RWG_SE, i32, i32)
 
 DEF_HELPER_2(cross_page_check, i32, env, tl)
+DEF_HELPER_2(get_hostptr, ptr, env, tl)
 
 DEF_HELPER_3(add_setq, i32, env, i32, i32)
 DEF_HELPER_3(add_saturate, i32, env, i32, i32)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index ce97d0c..2510bb2 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -65,6 +65,14 @@ static TCGv_i32 cpu_R[16];
 TCGv_i32 cpu_CF, cpu_NF, cpu_VF, cpu_ZF;
 TCGv_i64 cpu_exclusive_addr;
 TCGv_i64 cpu_exclusive_val;
+static bool gen_jr;
+
+static inline void set_jr(void)
+{
+    if (TCG_TARGET_HAS_jr) {
+        gen_jr = true;
+    }
+}
 
 /* FIXME: These should be removed. */
 static TCGv_i32 cpu_F0s, cpu_F1s;
@@ -221,6 +229,7 @@ static void store_reg(DisasContext *s, int reg, TCGv_i32 var)
          */
         tcg_gen_andi_i32(var, var, s->thumb ? ~1 : ~3);
         s->is_jmp = DISAS_JUMP;
+        set_jr();
     }
     tcg_gen_mov_i32(cpu_R[reg], var);
     tcg_temp_free_i32(var);
@@ -893,6 +902,7 @@ static inline void gen_bx_im(DisasContext *s, uint32_t addr)
         tcg_temp_free_i32(tmp);
     }
     tcg_gen_movi_i32(cpu_R[15], addr & ~1);
+    set_jr();
 }
 
 /* Set PC and Thumb state from var. var is marked as dead. */
@@ -902,6 +912,7 @@ static inline void gen_bx(DisasContext *s, TCGv_i32 var)
     tcg_gen_andi_i32(cpu_R[15], var, ~1);
     tcg_gen_andi_i32(var, var, 1);
     store_cpu_field(var, thumb);
+    set_jr();
 }
 
 /* Variant of store_reg which uses branch&exchange logic when storing
@@ -12042,6 +12053,18 @@ void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
             gen_set_pc_im(dc, dc->pc);
             /* fall through */
         case DISAS_JUMP:
+            if (TCG_TARGET_HAS_jr && gen_jr) {
+                TCGv_ptr ptr = tcg_temp_local_new_ptr();
+                TCGLabel *label = gen_new_label();
+
+                gen_jr = false;
+                gen_helper_get_hostptr(ptr, cpu_env, cpu_R[15]);
+                tcg_gen_brcondi_ptr(TCG_COND_EQ, ptr, NULL, label);
+                tcg_gen_jr(ptr);
+                tcg_temp_free_ptr(ptr);
+                gen_set_label(label);
+                /* fall through */
+            }
         default:
             /* indicate that the hash table must be used to find the next TB */
             tcg_gen_exit_tb(0);
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 11 Apr 2017 21:17:29 -0400
Message-Id: <1491959850-30756-10-git-send-email-cota@braap.org>
In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org>
References: <1491959850-30756-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH 09/10] target/i386: optimize indirect branches with TCG's jr op
Cc: Peter Maydell, Eduardo Habkost, Peter Crosthwaite, Stefan Weil, Claudio Fontana, Alexander Graf, alex.bennee@linaro.org, qemu-arm@nongnu.org, Pranith Kumar, Paolo Bonzini, Aurelien Jarno, Richard Henderson

Speed up indirect branches by adding a helper to look for the
TB in tb_jmp_cache. The helper returns either the corresponding
host address or NULL.

Measurements:

- NBench, x86_64-linux-user. Host: Intel i7-4790K @ 4.00GHz
Y axis: Speedup over 95b31d70

[ASCII bar chart: per-test speedup for "jr" and "jr+inline";
X axis: NBench tests and hmean]

png: http://imgur.com/Jxj4hBd

The fact that NBench is not very sensitive to changes here is a
little surprising, especially given the significant improvements for
ARM shown in the previous commit. I wonder whether the compiler is doing
a better job compiling the x86_64 version (I'm using gcc 5.4.0), or I'm
simply missing some i386 instructions to which the jr optimization should
be applied.

- specINT 2006 (test set), x86_64-linux-user.
  Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70
  [ASCII bar chart not reproduced; legend: jr+inline]
  png: http://imgur.com/63Ncmx8

  That is a 4.4% hmean perf improvement.

- specINT 2006 (train set), x86_64-linux-user.
  Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70
  [ASCII bar chart not reproduced; legend: jr]
  png: http://imgur.com/hd0BhU6

  That is, a 4.39% hmean improvement for jr+inline, i.e. this commit.
  (4.5% for noinline). Peak improvement is 20% for xalancbmk.

- specINT 2006 (test set), x86_64-softmmu. Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70
  [ASCII bar chart not reproduced; legend: cross, jr, cross+jr]
  png: http://imgur.com/IV9UtSa

  Here we see how jr works best when combined with cross -- jr by itself
  is disappointingly around baseline performance. I attribute this to the
  frequent page invalidations and/or TLB flushes (I'm running Ubuntu 16.04
  as the guest, so there are many processes), which lowers the maximum
  attainable hit rate in tb_jmp_cache.
  Overall the greatest hmean improvement comes from cross+jr though.

- specINT 2006 (train set), x86_64-softmmu.
  Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70
  [ASCII bar chart not reproduced; legend: cross+inline, cross+jr+inline]
  png: http://imgur.com/CBMxrBH

  This is the larger "train" set of SPECint06. Here cross+jr comes slightly
  below cross, but it's within the noise margins (I didn't run this many
  times, since it takes several hours).

Signed-off-by: Emilio G. Cota
---
 target/i386/helper.h      |  1 +
 target/i386/misc_helper.c | 11 +++++++++++
 target/i386/translate.c   | 42 +++++++++++++++++++++++++++++++++---------
 3 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/target/i386/helper.h b/target/i386/helper.h
index dceb343..f7e9f9c 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -2,6 +2,7 @@ DEF_HELPER_FLAGS_4(cc_compute_all, TCG_CALL_NO_RWG_SE, tl, tl, tl, tl, int)
 DEF_HELPER_FLAGS_4(cc_compute_c, TCG_CALL_NO_RWG_SE, tl, tl, tl, tl, int)
 
 DEF_HELPER_2(cross_page_check, i32, env, tl)
+DEF_HELPER_2(get_hostptr, ptr, env, tl)
 
 DEF_HELPER_3(write_eflags, void, env, tl, i32)
 DEF_HELPER_1(read_eflags, tl, env)
diff --git a/target/i386/misc_helper.c b/target/i386/misc_helper.c
index a41daed..5d50ab0 100644
--- a/target/i386/misc_helper.c
+++ b/target/i386/misc_helper.c
@@ -642,3 +642,14 @@ uint32_t helper_cross_page_check(CPUX86State *env, target_ulong vaddr)
 {
     return !!tb_from_jmp_cache(env, vaddr);
 }
+
+void *helper_get_hostptr(CPUX86State *env, target_ulong vaddr)
+{
+    TranslationBlock *tb;
+
+    tb = tb_from_jmp_cache(env, vaddr);
+    if (unlikely(tb == NULL)) {
+        return NULL;
+    }
+    return tb->tc_ptr;
+}
diff --git a/target/i386/translate.c b/target/i386/translate.c
index ffc8ccc..aab5c13 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -2521,7 +2521,8 @@ static void gen_bnd_jmp(DisasContext *s)
    If INHIBIT, set HF_INHIBIT_IRQ_MASK if it isn't already set.
    If RECHECK_TF, emit a rechecking helper for #DB, ignoring the state of
    S->TF. This is used by the syscall/sysret insns. */
-static void gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf)
+static void
+gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf, TCGv jr)
 {
     gen_update_cc_op(s);
 
@@ -2542,6 +2543,22 @@ static void gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf)
         tcg_gen_exit_tb(0);
     } else if (s->tf) {
         gen_helper_single_step(cpu_env);
+    } else if (jr) {
+#ifdef TCG_TARGET_HAS_JR
+        TCGLabel *label = gen_new_label();
+        TCGv_ptr ptr = tcg_temp_local_new_ptr();
+        TCGv vaddr = tcg_temp_new();
+
+        tcg_gen_ld_tl(vaddr, cpu_env, offsetof(CPUX86State, segs[R_CS].base));
+        tcg_gen_add_tl(vaddr, vaddr, jr);
+        gen_helper_get_hostptr(ptr, cpu_env, vaddr);
+        tcg_temp_free(vaddr);
+        tcg_gen_brcondi_ptr(TCG_COND_EQ, ptr, NULL, label);
+        tcg_gen_jr(ptr);
+        tcg_temp_free_ptr(ptr);
+        gen_set_label(label);
+#endif
+        tcg_gen_exit_tb(0);
     } else {
         tcg_gen_exit_tb(0);
     }
@@ -2552,13 +2569,18 @@ static void gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf)
    If INHIBIT, set HF_INHIBIT_IRQ_MASK if it isn't already set. */
 static void gen_eob_inhibit_irq(DisasContext *s, bool inhibit)
 {
-    gen_eob_worker(s, inhibit, false);
+    gen_eob_worker(s, inhibit, false, NULL);
 }
 
 /* End of block, resetting the inhibit irq flag. */
 static void gen_eob(DisasContext *s)
 {
-    gen_eob_worker(s, false, false);
+    gen_eob_worker(s, false, false, NULL);
+}
+
+static void gen_jr(DisasContext *s, TCGv dest)
+{
+    gen_eob_worker(s, false, false, dest);
 }
 
 /* generate a jump to eip. No segment change must happen before as a
@@ -4985,7 +5007,7 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
             gen_push_v(s, cpu_T1);
             gen_op_jmp_v(cpu_T0);
             gen_bnd_jmp(s);
-            gen_eob(s);
+            gen_jr(s, cpu_T0);
             break;
         case 3: /* lcall Ev */
             gen_op_ld_v(s, ot, cpu_T1, cpu_A0);
@@ -5003,7 +5025,8 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
                                       tcg_const_i32(dflag - 1),
                                       tcg_const_i32(s->pc - s->cs_base));
             }
-            gen_eob(s);
+            tcg_gen_ld_tl(cpu_tmp4, cpu_env, offsetof(CPUX86State, eip));
+            gen_jr(s, cpu_tmp4);
             break;
         case 4: /* jmp Ev */
             if (dflag == MO_16) {
@@ -5011,7 +5034,7 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
             }
             gen_op_jmp_v(cpu_T0);
             gen_bnd_jmp(s);
-            gen_eob(s);
+            gen_jr(s, cpu_T0);
             break;
         case 5: /* ljmp Ev */
             gen_op_ld_v(s, ot, cpu_T1, cpu_A0);
@@ -5026,7 +5049,8 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
                 gen_op_movl_seg_T0_vm(R_CS);
                 gen_op_jmp_v(cpu_T1);
             }
-            gen_eob(s);
+            tcg_gen_ld_tl(cpu_tmp4, cpu_env, offsetof(CPUX86State, eip));
+            gen_jr(s, cpu_tmp4);
             break;
         case 6: /* push Ev */
             gen_push_v(s, cpu_T0);
@@ -7143,7 +7167,7 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
         /* TF handling for the syscall insn is different. The TF bit is checked
            after the syscall insn completes. This allows #DB to not be
            generated after one has entered CPL0 if TF is set in FMASK. */
-        gen_eob_worker(s, false, true);
+        gen_eob_worker(s, false, true, NULL);
         break;
     case 0x107: /* sysret */
         if (!s->pe) {
@@ -7158,7 +7182,7 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
                checked after the sysret insn completes. This allows #DB to be
                generated "as if" the syscall insn in userspace has just
                completed. */
-            gen_eob_worker(s, false, true);
+            gen_eob_worker(s, false, true, NULL);
         }
         break;
 #endif
-- 
2.7.4

From nobody Mon Apr 29 10:07:08 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1491960423330259.84318412222717; Tue, 11 Apr 2017 18:27:03 -0700 (PDT) Received: from localhost ([::1]:41804 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cy73l-0005xx-S0 for importer@patchew.org; Tue, 11 Apr 2017 21:27:01 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41275) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cy6uz-00085G-LY for qemu-devel@nongnu.org; Tue, 11 Apr 2017 21:17:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cy6ux-0006RZ-Jc for qemu-devel@nongnu.org; Tue, 11 Apr 2017 21:17:57 -0400 Received: from out1-smtp.messagingengine.com ([66.111.4.25]:48652) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cy6us-0006ML-2p; Tue, 11 Apr 2017 21:17:50 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id A1E3520B77; Tue, 11 Apr 2017 21:17:48 -0400 (EDT) Received: from frontend2 ([10.202.2.161]) by compute4.internal (MEProxy); Tue, 11 Apr 2017 21:17:48 -0400 Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 5FAD6241ED; Tue, 11 Apr 2017 21:17:48 -0400 (EDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc:x-sasl-enc; s=mesmtp; bh=Mh0 ya9PSTnUiWSVOAuPQqMNYf1lHJtq8eqF1Akv/HHI=; b=nomPzFv25frsswkv6Sy 1mvanbBfJ4w8t6Ab/BQigQLy5Yay0SmWxrmt2GaXiLlRCBqrfYHPIjpoGb55eFxB muvQbhM3oiRSLreZu9SrsikOAjj2k0mvw16qk4ZY/5gqDtr8NS/RPeadXE6uMx/g rjpwttw6Rws0siX4hRFP2wEg= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc :x-sasl-enc; s=fm1; bh=Mh0ya9PSTnUiWSVOAuPQqMNYf1lHJtq8eqF1Akv/H HI=; b=d+YFxccPSfZF0v6G8Ek13e64lS5s22jhCSFKmFHP6LfmEEX7dV9u0hc4H H0U5ahYB8hx3ZDrPAZAaI/rn3+5OHtHiz/UMxAMvxvOokfAUfzvmKhwUGeKgy+Sp L74mVSCrDQRIlUr0SbAXL8JDvyMmfsEln+RQzZOm108LvHPOR9HlzWTUph/JcNaB OiU2FkDLe23LdIAPlAoCPjE6l2VMI43zuPtFkiwDhRWJVlZ4fggNHcXHE0plido/ BWSWaH5ECn4szpJmxu9oJ5pagxRCvEpZyckFMhDinDbuMVKjEPVjvkjRXPmo4B6R rWXNfg8d4Kx8EvCzDLZxHUv4tYBlg== X-ME-Sender: X-Sasl-enc: JnWikoH95gmDlF90pyAA825Jx5d8cGmGMLLvPC+YIEX1 1491959868 From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Tue, 11 Apr 2017 21:17:30 -0400 Message-Id: <1491959850-30756-11-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1491959850-30756-1-git-send-email-cota@braap.org> References: <1491959850-30756-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.25 Subject: [Qemu-devel] [PATCH 10/10] tb-hash: improve tb_jmp_cache hash function in user mode X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Eduardo Habkost , Peter Crosthwaite , Stefan Weil , Claudio Fontana , Alexander Graf , alex.bennee@linaro.org, qemu-arm@nongnu.org, Pranith Kumar , Paolo Bonzini , Aurelien Jarno , Richard Henderson Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail: RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8"

Optimizations to cross-page chaining and indirect jumps make
performance more sensitive to the hit rate of tb_jmp_cache.
The constraint of reserving some bits for the page number
lowers the achievable quality of the hashing function.

However, user-mode does not have this requirement. Thus,
with this change we use for user-mode a hashing function that
is both faster and of better quality than the previous one.

Measurements:

- specINT 2006 (test set), x86_64-linux-user. Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70
  [ASCII bar chart not reproduced; legend: jr, jr+xxhash, jr+hash+inline]
  png: http://imgur.com/RiaBuIi

  That is, a 6.45% hmean improvement for this commit. Note that this is the
  test set, so some benchmarks take almost no time (and therefore aren't that
  sensitive to changes here). See "train" results below.

  Note also that hashing quality is not the only requirement: xxhash
  gives on average the highest hit rates. However, the time spent computing
  the hash negates the performance gains coming from the increase in hit rate.
  Given these results, I dropped xxhash from subsequent experiments.

- specINT 2006 (train set), x86_64-linux-user.
  Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70
  [ASCII bar chart not reproduced; legend: jr, jr+hash]
  png: http://imgur.com/55iJJgD

  That is, a 10.19% hmean improvement for jr+hash (this commit).

- NBench, arm-linux-user. Host: Intel i7-4790K @ 4.00GHz
  Y axis: Speedup over 95b31d70
  [ASCII bar chart not reproduced; legend: jr, jr+inline, jr+inline+hash]
  png: http://imgur.com/i5e1gdY

  That is, an 11% hmean perf gain--it almost doubles the perf gain
  from implementing the jr optimization.

- NBench, x86_64-linux-user. Host: Intel i7-4790K @ 4.00GHz
  [ASCII bar chart not reproduced; legend: jr, jr+inline, jr+inline+hash]
  png: http://imgur.com/Xu0Owgu

  The fact that NBench is not very sensitive to changes here was mentioned
  in the previous commit's log. We get a very slight overall decrease in hmean
  performance, although some workloads improve as well. Note that there are
  no error bars: NBench re-runs itself until confidence on the stability of
  the average is >= 95%, and it doesn't report the resulting stddev.

Signed-off-by: Emilio G. Cota
---
 include/exec/tb-hash.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/exec/tb-hash.h b/include/exec/tb-hash.h
index 2c27490..b1fe2d0 100644
--- a/include/exec/tb-hash.h
+++ b/include/exec/tb-hash.h
@@ -22,6 +22,8 @@
 
 #include "exec/tb-hash-xx.h"
 
+#ifdef CONFIG_SOFTMMU
+
 /* Only the bottom TB_JMP_PAGE_BITS of the jump cache hash bits vary for
    addresses on the same page. The top bits are the same. This allows
    TLB invalidation to quickly clear a subset of the hash table. */
@@ -45,6 +47,16 @@ static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
            | (tmp & TB_JMP_ADDR_MASK));
 }
 
+#else
+
+/* In user-mode we can get better hashing because we do not have a TLB */
+static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
+{
+    return (pc ^ (pc >> TB_JMP_CACHE_BITS)) & (TB_JMP_CACHE_SIZE - 1);
+}
+
+#endif /* CONFIG_SOFTMMU */
+
 static inline
 uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, uint32_t flags)
 {
-- 
2.7.4
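
For readers who want to try the user-mode hash outside of QEMU, here is a
minimal standalone sketch. It is not part of the patch. The sketch_* names,
the uint64_t stand-in for target_ulong and the value 12 for the cache bits
are assumptions made for illustration only. sketch_hash_fold() uses the same
expression as the new tb_jmp_cache_hash_func(); sketch_hash_mask() is a naive
low-bits-only hash added purely for comparison.

```c
#include <stdint.h>

/* Assumption: illustrative table size only; QEMU defines its own
 * TB_JMP_CACHE_BITS and TB_JMP_CACHE_SIZE. */
#define SKETCH_CACHE_BITS 12
#define SKETCH_CACHE_SIZE (1u << SKETCH_CACHE_BITS)

/* Assumption: stand-in for target_ulong on a 64-bit guest. */
typedef uint64_t sketch_target_ulong;

/* Same expression as the user-mode hash in the patch: XOR the bits just
 * above the index into the index, then mask to the table size. */
static inline unsigned int sketch_hash_fold(sketch_target_ulong pc)
{
    return (pc ^ (pc >> SKETCH_CACHE_BITS)) & (SKETCH_CACHE_SIZE - 1);
}

/* Hypothetical baseline for comparison: keep only the low bits, so any
 * two PCs that differ only above the index bits collide. */
static inline unsigned int sketch_hash_mask(sketch_target_ulong pc)
{
    return pc & (SKETCH_CACHE_SIZE - 1);
}
```

With these definitions, the PCs 0x400123 and 0x401123 share a slot under
sketch_hash_mask() but land in different slots under sketch_hash_fold(),
because the fold mixes in the bits that distinguish them.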