From: "Emilio G.
 Cota"
To: qemu-devel@nongnu.org
Date: Thu, 8 Jun 2017 22:25:19 -0400
Message-Id: <1496975122-16999-5-git-send-email-cota@braap.org>
In-Reply-To: <1496975122-16999-1-git-send-email-cota@braap.org>
References: <1496975122-16999-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v8 4/7] exec: [tcg] Use different TBs according to the vCPU's dynamic tracing state
Cc: Lluís Vilanova, Stefan Hajnoczi, Richard Henderson

From: Lluís Vilanova

Every vCPU now uses a separate set of TBs for each set of dynamic tracing event
state values. Each set of TBs can be used by any number of vCPUs to maximize TB
reuse when vCPUs have the same tracing state.

This feature is later used by tracetool to optimize tracing of guest code
events.

The maximum number of TB sets is defined as 2^E, where E is the number of events
that have the 'vcpu' property (their state is stored in
CPUState->trace_dstate).

For this to work, a change on the dynamic tracing state of a vCPU will force it
to flush its virtual TB cache (which is only indexed by address), and fall back
to the physical TB cache (which now contains the vCPU's dynamic tracing state as
part of the hashing function).
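The lookup scheme described above can be illustrated with a small standalone
sketch. It is a toy model, not the code in this patch: every Toy*/toy_* name
and the table sizes are hypothetical, and the mixing function is a simple
stand-in rather than the xxhash variant in tb-hash-xx.h. It only shows the
idea: a per-vCPU cache indexed by address alone, backed by a shared table
whose key includes the tracing state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JMP_CACHE_SIZE 64   /* toy size, not QEMU's */
#define HTABLE_SIZE    256  /* toy size, not QEMU's */

/* Toy translation block: only the fields that take part in the lookup. */
typedef struct ToyTB {
    uint64_t pc;
    uint32_t flags;
    uint32_t trace_ds;  /* tracing state this TB was generated for */
    int valid;
} ToyTB;

/* Toy vCPU: current tracing state plus a cache indexed by pc only. */
typedef struct ToyCPU {
    uint32_t trace_dstate;
    ToyTB *jmp_cache[JMP_CACHE_SIZE];
} ToyCPU;

static ToyTB htable[HTABLE_SIZE];  /* shared by all vCPUs */

/* Stand-in hash (an assumption): the tracing state is part of the key. */
static uint32_t toy_hash(uint64_t pc, uint32_t flags, uint32_t trace_ds)
{
    uint64_t h = pc;
    h = (h ^ flags) * 0x9E3779B97F4A7C15ull;
    h = (h ^ trace_ds) * 0x9E3779B97F4A7C15ull;
    return (uint32_t)(h >> 32);
}

/* Shared-table lookup with linear probing; "translates" on a miss. */
static ToyTB *toy_htable_lookup(uint64_t pc, uint32_t flags, uint32_t trace_ds)
{
    uint32_t i = toy_hash(pc, flags, trace_ds) % HTABLE_SIZE;

    for (uint32_t n = 0; n < HTABLE_SIZE; n++, i = (i + 1) % HTABLE_SIZE) {
        ToyTB *tb = &htable[i];

        if (!tb->valid) {
            *tb = (ToyTB){ pc, flags, trace_ds, 1 };
            return tb;
        }
        if (tb->pc == pc && tb->flags == flags && tb->trace_ds == trace_ds) {
            return tb;
        }
    }
    return NULL;  /* toy table is full */
}

/* Fast path: per-vCPU cache; a hit must also match the tracing state. */
static ToyTB *toy_tb_find(ToyCPU *cpu, uint64_t pc, uint32_t flags)
{
    ToyTB **slot = &cpu->jmp_cache[pc % JMP_CACHE_SIZE];
    ToyTB *tb = *slot;

    if (!tb || tb->pc != pc || tb->flags != flags ||
        tb->trace_ds != cpu->trace_dstate) {
        tb = toy_htable_lookup(pc, flags, cpu->trace_dstate);
        *slot = tb;
    }
    return tb;
}

/* Changing the tracing state flushes the address-indexed cache. */
static void toy_set_trace_dstate(ToyCPU *cpu, uint32_t dstate)
{
    cpu->trace_dstate = dstate;
    memset(cpu->jmp_cache, 0, sizeof(cpu->jmp_cache));
}
```

In this model, two vCPUs with the same tracing state resolve the same address
to the same ToyTB, while a vCPU whose state was changed through
toy_set_trace_dstate() gets a different one for that address.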

Signed-off-by: Lluís Vilanova
Reviewed-by: Richard Henderson
[cota:
 - rename tb->trace_vcpu_dstate to the shorter tb->trace_ds
 - use uint32_t for tb->trace_ds instead of a typedef
 - add BUILD_BUG_ON check to make sure tb->trace_ds is big enough
 - fix xxhash
 - directly dereference cpu->trace_dstate instead of using bitmap_copy etc.
 - drop trace_dstate parameter from tb_htable_lookup; grab it directly from cpu.
]
Signed-off-by: Emilio G. Cota
---
 cpu-exec.c                |  8 ++++++--
 include/exec/exec-all.h   |  3 +++
 include/exec/tb-hash-xx.h |  7 +++++--
 include/exec/tb-hash.h    |  5 +++--
 tcg-runtime.c             |  3 ++-
 tests/qht-bench.c         |  2 +-
 trace/control-target.c    |  1 +
 trace/control.h           |  3 +++
 translate-all.c           | 10 ++++++++--
 9 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 5b181c1..b6679d9 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -280,6 +280,7 @@ struct tb_desc {
     CPUArchState *env;
     tb_page_addr_t phys_page1;
     uint32_t flags;
+    uint32_t trace_ds;
 };
 
 static bool tb_cmp(const void *p, const void *d)
@@ -291,6 +292,7 @@ static bool tb_cmp(const void *p, const void *d)
         tb->page_addr[0] == desc->phys_page1 &&
         tb->cs_base == desc->cs_base &&
         tb->flags == desc->flags &&
+        tb->trace_ds == desc->trace_ds &&
         !atomic_read(&tb->invalid)) {
         /* check next page if needed */
         if (tb->page_addr[1] == -1) {
@@ -319,10 +321,11 @@ TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
     desc.env = (CPUArchState *)cpu->env_ptr;
     desc.cs_base = cs_base;
     desc.flags = flags;
+    desc.trace_ds = *cpu->trace_dstate;
     desc.pc = pc;
     phys_pc = get_page_addr_code(desc.env, pc);
     desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
-    h = tb_hash_func(phys_pc, pc, flags);
+    h = tb_hash_func(phys_pc, pc, flags, *cpu->trace_dstate);
     return qht_lookup(&tcg_ctx.tb_ctx.htable, tb_cmp, &desc, h);
 }
 
@@ -342,7 +345,8 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
     tb = atomic_rcu_read(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)]);
     if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
-                 tb->flags != flags)) {
+                 tb->flags != flags ||
+                 tb->trace_ds != *cpu->trace_dstate)) {
         tb = tb_htable_lookup(cpu, pc, cs_base, flags);
         if (!tb) {
 
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index b0281b0..6bdc6e5 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -324,6 +324,9 @@ struct TranslationBlock {
 #define CF_USE_ICOUNT   0x20000
 #define CF_IGNORE_ICOUNT 0x40000 /* Do not generate icount code */
 
+    /* Tracing Dynamic State (hence '_ds') used to generate this TB */
+    uint32_t trace_ds;
+
     uint16_t invalid;
 
     void *tc_ptr;    /* pointer to the translated code */
diff --git a/include/exec/tb-hash-xx.h b/include/exec/tb-hash-xx.h
index 2c40b5c..6cd3022 100644
--- a/include/exec/tb-hash-xx.h
+++ b/include/exec/tb-hash-xx.h
@@ -49,7 +49,7 @@
  * contiguous in memory.
  */
 static inline
-uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e)
+uint32_t tb_hash_func6(uint64_t a0, uint64_t b0, uint32_t e, uint32_t f)
 {
     uint32_t v1 = TB_HASH_XX_SEED + PRIME32_1 + PRIME32_2;
     uint32_t v2 = TB_HASH_XX_SEED + PRIME32_2;
@@ -78,11 +78,14 @@ uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e)
     v4 *= PRIME32_1;
 
     h32 = rol32(v1, 1) + rol32(v2, 7) + rol32(v3, 12) + rol32(v4, 18);
-    h32 += 20;
+    h32 += 24;
 
     h32 += e * PRIME32_3;
     h32 = rol32(h32, 17) * PRIME32_4;
 
+    h32 += f * PRIME32_3;
+    h32 = rol32(h32, 17) * PRIME32_4;
+
     h32 ^= h32 >> 15;
     h32 *= PRIME32_2;
     h32 ^= h32 >> 13;
diff --git a/include/exec/tb-hash.h b/include/exec/tb-hash.h
index b1fe2d0..d64c2d9 100644
--- a/include/exec/tb-hash.h
+++ b/include/exec/tb-hash.h
@@ -58,9 +58,10 @@ static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
 #endif /* CONFIG_SOFTMMU */
 
 static inline
-uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, uint32_t flags)
+uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, uint32_t flags,
+                      uint32_t trace_ds)
 {
-    return tb_hash_func5(phys_pc, pc, flags);
+    return tb_hash_func6(phys_pc, pc, flags, trace_ds);
 }
 
 #endif
diff --git a/tcg-runtime.c b/tcg-runtime.c
index 7fa90ce..71d8956 100644
--- a/tcg-runtime.c
+++ b/tcg-runtime.c
@@ -155,9 +155,10 @@ void *HELPER(lookup_tb_ptr)(CPUArchState *env, target_ulong addr)
     if (likely(tb)) {
         cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
         if (likely(tb->pc == addr && tb->cs_base == cs_base &&
-                   tb->flags == flags)) {
+                   tb->flags == flags && tb->trace_ds == *cpu->trace_dstate)) {
             goto found;
         }
+
         tb = tb_htable_lookup(cpu, addr, cs_base, flags);
         if (likely(tb)) {
             atomic_set(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(addr)], tb);
diff --git a/tests/qht-bench.c b/tests/qht-bench.c
index 2afa09d..11c1cec 100644
--- a/tests/qht-bench.c
+++ b/tests/qht-bench.c
@@ -103,7 +103,7 @@ static bool is_equal(const void *obj, const void *userp)
 
 static inline uint32_t h(unsigned long v)
 {
-    return tb_hash_func5(v, 0, 0);
+    return tb_hash_func6(v, 0, 0, 0);
 }
 
 /*
diff --git a/trace/control-target.c b/trace/control-target.c
index 416d14e..ce7347e 100644
--- a/trace/control-target.c
+++ b/trace/control-target.c
@@ -40,6 +40,7 @@ static void trace_event_synchronize_vcpu_state_dynamic(
 {
     bitmap_copy(vcpu->trace_dstate, vcpu->trace_dstate_delayed,
                 CPU_TRACE_DSTATE_MAX_EVENTS);
+    tb_flush_jmp_cache_all(vcpu);
 }
 
 void trace_event_set_state_dynamic(TraceEvent *ev, bool state)
diff --git a/trace/control.h b/trace/control.h
index 4ea53e2..b931824 100644
--- a/trace/control.h
+++ b/trace/control.h
@@ -165,6 +165,9 @@ void trace_event_set_state_dynamic(TraceEvent *ev, bool state);
  * Set the dynamic tracing state of an event for the given vCPU.
  *
  * Pre-condition: trace_event_get_vcpu_state_static(ev) == true
+ *
+ * Note: Changes for execution-time events with the 'tcg' property will not be
+ *       propagated until the next TB is executed (iff executing in TCG mode).
  */
 void trace_event_set_vcpu_state_dynamic(CPUState *vcpu,
                                         TraceEvent *ev, bool state);
diff --git a/translate-all.c b/translate-all.c
index 8a5dc19..42beea2 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -53,6 +53,7 @@
 #include "exec/cputlb.h"
 #include "exec/tb-hash.h"
 #include "translate-all.h"
+#include "qemu/error-report.h"
 #include "qemu/bitmap.h"
 #include "qemu/timer.h"
 #include "qemu/main-loop.h"
@@ -112,6 +113,10 @@ typedef struct PageDesc {
 #define V_L2_BITS 10
 #define V_L2_SIZE (1 << V_L2_BITS)
 
+/* Make sure all possible CPU event bits fit in tb->trace_ds */
+QEMU_BUILD_BUG_ON(CPU_TRACE_DSTATE_MAX_EVENTS >
+                  sizeof(((TranslationBlock *)0)->trace_ds) * BITS_PER_BYTE);
+
 uintptr_t qemu_host_page_size;
 intptr_t qemu_host_page_mask;
 
@@ -1096,7 +1101,7 @@ void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
 
     /* remove the TB from the hash list */
     phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
-    h = tb_hash_func(phys_pc, tb->pc, tb->flags);
+    h = tb_hash_func(phys_pc, tb->pc, tb->flags, tb->trace_ds);
     qht_remove(&tcg_ctx.tb_ctx.htable, tb, h);
 
     /* remove the TB from the page list */
@@ -1241,7 +1246,7 @@ static void tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
     }
 
     /* add in the hash table */
-    h = tb_hash_func(phys_pc, tb->pc, tb->flags);
+    h = tb_hash_func(phys_pc, tb->pc, tb->flags, tb->trace_ds);
     qht_insert(&tcg_ctx.tb_ctx.htable, tb, h);
 
 #ifdef DEBUG_TB_CHECK
@@ -1286,6 +1291,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     tb->cs_base = cs_base;
     tb->flags = flags;
     tb->cflags = cflags;
+    tb->trace_ds = *cpu->trace_dstate;
 
 #ifdef CONFIG_PROFILER
     tcg_ctx.tb_count1++; /* includes aborted translations because of
-- 
2.7.4