From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Mon, 5 Jun 2017 09:52:18 -0700
Message-Id: <20170605165233.4135-12-rth@twiddle.net>
In-Reply-To: <20170605165233.4135-1-rth@twiddle.net>
References: <20170605165233.4135-1-rth@twiddle.net>
Subject: [Qemu-devel] [PULL 11/26] tb-hash: improve tb_jmp_cache hash function in user mode
Cc: peter.maydell@linaro.org, "Emilio G. Cota"

From: "Emilio G. Cota"

Optimizations to cross-page chaining and indirect branches make
performance more sensitive to the hit rate of tb_jmp_cache.
The constraint of reserving some bits for the page number
lowers the achievable quality of the hashing function.

However, user-mode does not have this requirement. Thus,
with this change we use for user-mode a hashing function that
is both faster and of better quality than the previous one.

Measurements:

Note: baseline (i.e. speedup == 1x) is QEMU v2.9.0.

- SPECint06 (test set), x86_64-linux-user.
  Host: Intel i7-6700K @ 4.00GHz

  [Bar chart: speedup over baseline, y axis from 0.8x to 2.2x.
   Series: jr, jr+multhash, jr+hash.
   x axis: astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf,
   omnetpp, perlbench, sjeng, xalancbmk, hmean.]

  png: http://imgur.com/4UXTrEc

Here I also tried the hash function suggested by Paolo ("multhash"):

  return ((uint64_t) (pc * 2654435761) >> 32) & (TB_JMP_CACHE_SIZE - 1);

As you can see it is just as good as the other new function ("hash"),
which is what I ended up going with.

- SPECint06 (train set), x86_64-linux-user.
  Host: Intel i7-6700K @ 4.00GHz

  [Bar chart: speedup over baseline, y axis from 1x to 2.6x.
   Series: jr, jr+hash.
   x axis: astar, bzip2, gcc, gobmk, h264ref, hmmer, libquantum, mcf,
   omnetpp, perlbench, sjeng, xalancbmk, hmean.]

  png: http://imgur.com/ArCbHqo

- NBench, x86_64-linux-user.
  Host: Intel i7-6700K @ 4.00GHz

  [Bar chart: speedup over baseline, y axis from 0.96x to 1.12x.
   Series: jr, jr+hash.
   x axis (labels overlap in the original plot): ASSIGNMENT, BITFIELD,
   FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET,
   NUMERIC SORT, STRING SORT, hmean.]

  png: http://imgur.com/ZXFX0hJ

- NBench, arm-linux-user.
  Host: Intel i7-4790K @ 4.00GHz

  [Bar chart: speedup over baseline, y axis from 0.9x to 1.3x.
   Series: jr, jr+hash.
   x axis (labels overlap in the original plot): ASSIGNMENT, BITFIELD,
   FOURIER, FP EMULATION, HUFFMAN, LU DECOMPOSITION, NEURAL NET,
   NUMERIC SORT, STRING SORT, hmean.]

  png: http://imgur.com/FfD27ey

Reviewed-by: Alex Bennée
Reviewed-by: Richard Henderson
Signed-off-by: Emilio G. Cota
Message-Id: <1493263764-18657-12-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson
---
 include/exec/tb-hash.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/include/exec/tb-hash.h b/include/exec/tb-hash.h
index 2c27490..b1fe2d0 100644
--- a/include/exec/tb-hash.h
+++ b/include/exec/tb-hash.h
@@ -22,6 +22,8 @@
 
 #include "exec/tb-hash-xx.h"
 
+#ifdef CONFIG_SOFTMMU
+
 /* Only the bottom TB_JMP_PAGE_BITS of the jump cache hash bits vary for
    addresses on the same page.  The top bits are the same.  This allows
    TLB invalidation to quickly clear a subset of the hash table.  */
@@ -45,6 +47,16 @@ static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
            | (tmp & TB_JMP_ADDR_MASK));
 }
 
+#else
+
+/* In user-mode we can get better hashing because we do not have a TLB */
+static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
+{
+    return (pc ^ (pc >> TB_JMP_CACHE_BITS)) & (TB_JMP_CACHE_SIZE - 1);
+}
+
+#endif /* CONFIG_SOFTMMU */
+
 static inline
 uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, uint32_t flags)
 {
-- 
2.9.4
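
For experimenting with the two candidate hashes outside QEMU, here is a
minimal standalone sketch. It is not part of the patch. It assumes a
64-bit guest (target_ulong as uint64_t) and TB_JMP_CACHE_BITS == 12; the
names hash_xor_fold, hash_mult and distinct_slots are made up for this
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumptions for this sketch, not taken from the patch: a 64-bit guest
 * and a 4096-entry jump cache. */
typedef uint64_t target_ulong;
#define TB_JMP_CACHE_BITS 12
#define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)

/* Same expression as the user-mode hash added by the patch: fold the
 * bits just above the index into the index, then mask to table size. */
static inline unsigned int hash_xor_fold(target_ulong pc)
{
    return (pc ^ (pc >> TB_JMP_CACHE_BITS)) & (TB_JMP_CACHE_SIZE - 1);
}

/* The "multhash" expression quoted in the commit message. */
static inline unsigned int hash_mult(target_ulong pc)
{
    return ((uint64_t)(pc * 2654435761) >> 32) & (TB_JMP_CACHE_SIZE - 1);
}

/* Hypothetical helper: count how many distinct cache slots are hit by
 * n PCs starting at 'start' and spaced 'stride' bytes apart. */
static size_t distinct_slots(unsigned int (*fn)(target_ulong),
                             target_ulong start, target_ulong stride,
                             size_t n)
{
    unsigned char seen[TB_JMP_CACHE_SIZE];
    size_t used = 0;

    assert(fn != NULL);
    memset(seen, 0, sizeof(seen));
    for (size_t i = 0; i < n; i++) {
        unsigned int slot = fn(start + i * stride);
        if (!seen[slot]) {
            seen[slot] = 1;
            used++;
        }
    }
    return used;
}
```

Under these assumptions, distinct_slots(hash_xor_fold, 0, 4096, 4096)
evaluates to 4096: PCs spaced 4096 bytes apart, which a plain low-bits
mask would all send to slot 0, each land in their own slot.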