From nobody Fri Dec 19 20:12:21 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zoho.com; dkim=fail spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1493753850075151.55826541160252; Tue, 2 May 2017 12:37:30 -0700 (PDT) Received: from localhost ([::1]:33362 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1d5dbz-0005Sb-Mt for importer@patchew.org; Tue, 02 May 2017 15:37:27 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59487) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1d5dOM-0001q3-Rp for qemu-devel@nongnu.org; Tue, 02 May 2017 15:23:26 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1d5dOL-0001dw-2B for qemu-devel@nongnu.org; Tue, 02 May 2017 15:23:22 -0400 Received: from mail-qt0-x243.google.com ([2607:f8b0:400d:c0d::243]:32877) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1d5dOK-0001dc-TD for qemu-devel@nongnu.org; Tue, 02 May 2017 15:23:20 -0400 Received: by mail-qt0-x243.google.com with SMTP id c45so21742430qtb.0 for ; Tue, 02 May 2017 12:23:20 -0700 (PDT) Received: from bigtime.twiddle.net.com ([2602:47:d954:1500:5e51:4fff:fe40:9c64]) by smtp.gmail.com with ESMTPSA id d65sm10664657qkc.53.2017.05.02.12.23.18 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 02 May 2017 12:23:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Bn8ht2JwEYe+vTKPxUEZjoNNVAsQsgOakso8Qn7Q/so=; b=tXWwAGbzmK9Ay50GAQfgYfIqDY5dCwNKyDj2ZSP3yP0OvUSm2WgKkHM+VaW7WUimmT lf9+cqY6N3wYgnaurm26eeXmOcQU/nF4waipL+9fXiKP635eigC8vqiMKJfSEy/J+foZ cu6BhN8U5+6s1g789WFo9s5Q+8q48RZgv9I3Qa2mlmjSyg8ieWOcYV+uoag/iCDFhuzW jFSXjDHJugSUORLseZ4OyDjkHkOJHg2TZScO6uBnopb6omEs/WkpVbHDRMkboKrzcuHT YO11IracSjBdnfJx3HhnMNcZIlAQt4WOfYa2/RpZmuVDQJsF5vOpVGRWWW91/zoBEPnD fagw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references:mime-version:content-transfer-encoding; bh=Bn8ht2JwEYe+vTKPxUEZjoNNVAsQsgOakso8Qn7Q/so=; b=SxDlfxh+1RE/DiJ2m31JXj01Siqm3F5HdYfJ3F9QYngECKW1Fw/6vtYgXAKr/iu73G jfz3mglnQhsw1gQ3r0tpuVtGML1l2/xzcX4zhLdPMqeK+XtG93HNfX91BSQLf8bT/Nb/ Kpk+4B4J9mdpW8SAcbfGnYFDmq8ItMWWoZ3A/oXdw6lNKXxYFMlOs4sg8uZ69g5cvZts iENBmks2NAmG/C9eRJcQcWDdIONvPrrsQjuJ0bIN0rr/aOjFtDfV1Xf93b+m07UglSLw +CjsNhv2qk5z5gECr8nYEKD3emX9+caHOIumovZ83micmVE5y310P4wHvy5XTru8jZk9 bHiw== X-Gm-Message-State: AN3rC/6Eox61AP01JmDyVpgprC4Z63ZmIpTEH1YTyOvifKj9DDt4bsQa FrN2Qpw7s9QdlWueXo0= X-Received: by 10.200.37.70 with SMTP id 6mr30042451qtn.130.1493753000010; Tue, 02 May 2017 12:23:20 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 2 May 2017 12:22:46 -0700 Message-Id: <20170502192300.2124-12-rth@twiddle.net> X-Mailer: git-send-email 2.9.3 In-Reply-To: <20170502192300.2124-1-rth@twiddle.net> References: <20170502192300.2124-1-rth@twiddle.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400d:c0d::243 Subject: [Qemu-devel] [PATCH v6 11/25] tb-hash: improve tb_jmp_cache hash function in user mode X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: cota@braap.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 From: "Emilio G. Cota" Optimizations to cross-page chaining and indirect branches make performance more sensitive to the hit rate of tb_jmp_cache. The constraint of reserving some bits for the page number lowers the achievable quality of the hashing function. However, user-mode does not have this requirement. Thus, with this change we use for user-mode a hashing function that is both faster and of better quality than the previous one. Measurements: Note: baseline (i.e. speedup =3D=3D 1x) is QEMU v2.9.0. - SPECint06 (test set), x86_64-linux-user. Host: = Intel i7-6700K @ 4.00GHz 2.2x +-+------------------------------------------------------------------= --------------------------------------------+-+ | = | | jr = | 2x +jr+multhash +................................................= ....+++++...................................+-+ | jr+hash = |$$$ | | = |$+$ | | = ### $ | 1.8x +-+..................................................................= ....#|#.$...................................+-+ | = ++#+# $ | | = |# # $ | 1.6x +-+..................................................................= ..***.#.$....................++$$$..........+-+ | $$$ = *+* # $ |$+$ | | ++$$$ ### $ = * * # $ +++|$ $ | | ++###+$ # # $ = * * # $ ### ****## $ | 1.4x +-+...................***+#.$.........***.#.$........................= ..*.*.#.$...........#+#$$.*++*|#.$..........+-+ | *+* # $ * * # $ = * * # $ # # $ * *+# $ | | * * # $ +++++ * * # $ = * * # $ *** # $ * * # $ ###$$ | 1.2x +-+...................*.*.#.$.***##$$.*.*.#.$........................= ..*.*.#.$.........*.*.#.$.*..*.#.$.***+#+$..+-+ | * * # $ *+* # $ * * # $ +++ = * * # $ ++###$$ * * # $ * * # $ * * # $ | | ***##$$ * * # $ * * # $ * * # $ ***##$$ ++### = * * # $ *** #+$ * * # $ * * # $ * * # $ | | *+*+#+$ ***##$$$ * * # $ * * # $ * * # $ *+* # $ ++####$$ ***+# = * * # $ * * # $ * * # $ * * # $ * * # $ | 1x +-++-*+*+#+$+*+*+#-+$+*+*-#+$+*+*+#+$+*+*+#+$+*-*+#+$+***++#+$+*+*+#$= $+*+*+#+$+*+*+#+$+*+*-#+$+*+-*+#+$+*+*+#+$-++-+ | * * # $ * * # $ * * # $ * * # $ * * # $ * * # $ * * # $ * * # = $ * * # $ * * # $ * * # $ * * # $ * * # $ | | * * # $ * * # $ * * # $ * * # $ * * # $ * * # $ * * # $ * * # = $ * * # $ * * # $ * * # $ * * # $ * * # $ | 0.8x +-+--***##$$-***##$$$-***##$$-***##$$-***##$$-***##$$-***###$$-***##$= $-***##$$-***##$$-***##$$-****##$$-***##$$--+-+ astar bzip2 gcc gobmk h264ref hmmlibquantum mcf om= netpperlbench sjengxalancbmk hmean png: http://imgur.com/4UXTrEc Here I also tried the hash function suggested by Paolo ("multhash"): return ((uint64_t) (pc * 2654435761) >> 32) & (TB_JMP_CACHE_SIZE - 1); As you can see it is just as good as the other new function ("hash"), which is what I ended up going with. - SPECint06 (train set), x86_64-linux-user. Host: = Intel i7-6700K @ 4.00GHz 2.6x +-+------------------------------------------------------------------= --------------------------------------------+-+ | = | | jr = ### | 2.4x +jr+hash.............................................................= ..............................#.#...........+-+ | = # # | | = # # | 2.2x +-+..................................................................= ..............................#.#...........+-+ | = # # | | = # # | 2x +-+..................................................................= ..............................#.#...........+-+ | = **** # | | = * * # | 1.8x +-+..................................................................= ...........................*..*.#...........+-+ | = +++ * * # | | = #### #### * * # | 1.6x +-+......................................####........................= .....#..#.****..#..........*..*.#...........+-+ | +++ #++# = **** # * * # #### * * # | | ### # # = * * # * * # # # * * # | 1.4x +-+...................****+#..........****..#........................= ..*..*..#.*..*..#....#..#..*..*.#...........+-+ | *++* # * * # = * * # * * # *** # * * # #### | | * * # #### * * # = * * # * * # * * # * * # **** # | 1.2x +-+...................*..*.#..****++#.*..*..#........................= ..*..*..#.*..*..#..*.*..#..*..*.#..*..*..#..+-+ | ****### * * # * * # * * # = * * # * * # * * # * * # * * # | | * * # ***### * * # * * # * * # ****##= * * # * * # * * # * * # * * # | 1x +-+--****###--***###--****##--****###-****###--***###--***###--****##= --****###-****###--***###--****##--****###--+-+ astar bzip2 gcc gobmk h264ref hmmlibquantum mcf om= netpperlbench sjengxalancbmk hmean png: http://imgur.com/ArCbHqo - NBench, x86_64-linux-user. Host: Intel= i7-6700K @ 4.00GHz 1.12x +-+-----------------------------------------------------------------= --------------------------------------------+-+ | = | | jr += ++ | 1.1x +jr+hash...........................................................#= ###.........................................+-+ | +++#= | # | | | #= ++# | 1.08x +-+................................+++................+++.+++..*****= ..#.........................................+-+ | | +++ | | * | *= # | | | | | | *+++*= # | 1.06x +-+................................****###.............|...|...*...*= ..#.........................+++.............+-+ | *| * |# ****### * *= # | | | *| *++# *| * |# * *= # #### | 1.04x +-+................................*++*..#............*|.*.|#..*...*= ..#........................#.|#.............+-+ | * * # *++*++# * *= # +++#++# | | * * # * * # * *= # | # # +++#### | 1.02x +-+................................*..*..#......+++...*..*..#..*...*= ..#.....................****..#..*****++#...+-+ | +++ * * # +++ | * * # * *= # +++ *| * # *+++* # | | +++ | +++ +++ ++++++ * * # *****### * * # * *= # | +++ ++++++ *++* # * * # | 1x +-++-+++++####++****###++++-+####+-*++*++#-+*+++*-+#++*++*++#++*+-+*= ++#+-+++####-+*****###++*++*++#++*+-+*++#+-++-+ | *****| # *++* |# *****| # * * # * *++# * * # * *= # **** |# * * # * * # * * # | | * | *| # * *++# * | *++# * * # * * # * * # * *= # *| *++# * * # * * # * * # | 0.98x +-+...*.|.*++#..*..*..#..*+++*..#..*..*..#..*...*..#..*..*..#..*...*= ..#..*++*..#..*...*..#..*..*..#..*...*..#...+-+ | *+++* # * * # * * # * * # * * # * * # * *= # * * # * * # * * # * * # | | * * # * * # * * # * * # * * # * * # * *= # * * # * * # * * # * * # | 0.96x +-+---*****###--****###--*****###--****###--*****###--****###--*****= ###--****###--*****###--****###--*****###---+-+ ASSIGNMENT BITFIELD FOURFP EMULATION HUFFMAN LU DECOMPOSITIONE= URAL NNUMERIC SOSTRING SORT hmean png: http://imgur.com/ZXFX0hJ - NBench, arm-linux-user. Host: Intel i7-= 4790K @ 4.00GHz 1.3x +-+-----------------------------------------------------------------= --------------------------------------------+-+ | #### = | | jr # # = +++ | 1.25x +jr+hash.....................#..#...................................= ........####................................+-+ | # # = # # | | # # = # # | 1.2x +-+..........................#..#...................................= ........#..#................................+-+ | # # = # # | | # # = # # | 1.15x +-+..........................#..#...................................= ........#..#................................+-+ | # # #= ### # # | | # # #= # # # | 1.1x +-+..........................#..#..................................#= ..#.....#..#................................+-+ | # # #= # # # +++ | | # # #### #= # # # #### | 1.05x +-+..........................#..#...............#..#.....####......#= ..#.....#..#.........................#..#...+-+ | # # # # # # #= # # # +++ # # | | +++ ***** # #### ***** # # # +++#= # **** # ****### # # | 1x +-++-+*****###++****+++++*+-+*++#+-****++#-+*+++*-+#+++++#++#++*****= ++#+-*++*++#-+*****-++++*++*++#++*****++#+-++-+ | * * # * * | * * # * * # * * # **** # * *= # * * # * *### * *++# * * # | | * * # * *### * * # * * # * * # * * # * *= # * * # * * # * * # * * # | 0.95x +-+...*...*..#..*..*.|#..*...*..#..*..*..#..*...*..#..*..*..#..*...*= ..#..*..*..#..*...*..#..*..*..#..*...*..#...+-+ | * * # * * |# * * # * * # * * # * * # * *= # * * # * * # * * # * * # | | * * # * * |# * * # * * # * * # * * # * *= # * * # * * # * * # * * # | 0.9x +-+---*****###--****###--*****###--****###--*****###--****###--*****= ###--****###--*****###--****###--*****###---+-+ ASSIGNMENT BITFIELD FOURFP EMULATION HUFFMAN LU DECOMPOSITIONE= URAL NNUMERIC SOSTRING SORT hmean png: http://imgur.com/FfD27ey Reviewed-by: Alex Benn=C3=A9e Reviewed-by: Richard Henderson Signed-off-by: Emilio G. Cota Message-Id: <1493263764-18657-12-git-send-email-cota@braap.org> Signed-off-by: Richard Henderson --- include/exec/tb-hash.h | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/include/exec/tb-hash.h b/include/exec/tb-hash.h index 2c27490..b1fe2d0 100644 --- a/include/exec/tb-hash.h +++ b/include/exec/tb-hash.h @@ -22,6 +22,8 @@ =20 #include "exec/tb-hash-xx.h" =20 +#ifdef CONFIG_SOFTMMU + /* Only the bottom TB_JMP_PAGE_BITS of the jump cache hash bits vary for addresses on the same page. The top bits are the same. This allows TLB invalidation to quickly clear a subset of the hash table. */ @@ -45,6 +47,16 @@ static inline unsigned int tb_jmp_cache_hash_func(target= _ulong pc) | (tmp & TB_JMP_ADDR_MASK)); } =20 +#else + +/* In user-mode we can get better hashing because we do not have a TLB */ +static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc) +{ + return (pc ^ (pc >> TB_JMP_CACHE_BITS)) & (TB_JMP_CACHE_SIZE - 1); +} + +#endif /* CONFIG_SOFTMMU */ + static inline uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, uint32_t fl= ags) { --=20 2.9.3