From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini, Alex Bennée, Richard Henderson
Date: Wed, 14 Jun 2017 20:36:13 -0400
Message-Id: <1497486973-25845-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH] tcg: consistently access cpu->tb_jmp_cache atomically

Some code paths can lead to atomic accesses racing with memset()
on cpu->tb_jmp_cache, which can result in torn reads/writes
and is undefined behaviour in C11.
These torn accesses are unlikely to show up as bugs, but from code
inspection they seem possible. For example, tb_phys_invalidate does:

    /* remove the TB from the hash list */
    h = tb_jmp_cache_hash_func(tb->pc);
    CPU_FOREACH(cpu) {
        if (atomic_read(&cpu->tb_jmp_cache[h]) == tb) {
            atomic_set(&cpu->tb_jmp_cache[h], NULL);
        }
    }

Here atomic_set might race with a concurrent memset (such as the ones
scheduled via "unsafe" async work, e.g. tlb_flush_page) and therefore
we might end up with a torn pointer (or who knows what, because we are
under undefined behaviour).

This patch converts parallel accesses to cpu->tb_jmp_cache to use
atomic primitives, thereby bringing these accesses back to defined
behaviour. The price to pay is to potentially execute more instructions
when clearing cpu->tb_jmp_cache, but given how infrequently these
clears happen and the small size of the cache, the performance impact
I have measured is within noise range when booting debian-arm.

Note that under "safe async" work (e.g. do_tb_flush) we could use
memset because no other vcpus are running. However, I'm keeping these
accesses atomic as well to keep things simple and to avoid confusing
analysis tools such as ThreadSanitizer.

Signed-off-by: Emilio G. Cota
---
 cputlb.c          |  4 ++--
 include/qom/cpu.h | 11 ++++++++++-
 qom/cpu.c         |  5 +----
 translate-all.c   | 25 ++++++++++++-------------
 4 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 743776a..e85b6d3 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -118,7 +118,7 @@ static void tlb_flush_nocheck(CPUState *cpu)
 
     memset(env->tlb_table, -1, sizeof(env->tlb_table));
     memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
-    memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+    cpu_tb_jmp_cache_clear(cpu);
 
     env->vtlb_index = 0;
     env->tlb_flush_addr = -1;
@@ -183,7 +183,7 @@ static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
         }
     }
 
-    memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+    cpu_tb_jmp_cache_clear(cpu);
 
     tlb_debug("done\n");
 
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 89ddb68..2fe7cff 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -346,7 +346,7 @@ struct CPUState {
 
     void *env_ptr; /* CPUArchState */
 
-    /* Writes protected by tb_lock, reads not thread-safe */
+    /* Accessed in parallel; all accesses must be atomic */
     struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
 
     struct GDBRegisterState *gdb_regs;
@@ -422,6 +422,15 @@ extern struct CPUTailQ cpus;
 
 extern __thread CPUState *current_cpu;
 
+static inline void cpu_tb_jmp_cache_clear(CPUState *cpu)
+{
+    unsigned int i;
+
+    for (i = 0; i < TB_JMP_CACHE_SIZE; i++) {
+        atomic_set(&cpu->tb_jmp_cache[i], NULL);
+    }
+}
+
 /**
  * qemu_tcg_mttcg_enabled:
  * Check whether we are running MultiThread TCG or not.
diff --git a/qom/cpu.c b/qom/cpu.c
index 5069876..585419b 100644
--- a/qom/cpu.c
+++ b/qom/cpu.c
@@ -274,7 +274,6 @@ void cpu_reset(CPUState *cpu)
 static void cpu_common_reset(CPUState *cpu)
 {
     CPUClass *cc = CPU_GET_CLASS(cpu);
-    int i;
 
     if (qemu_loglevel_mask(CPU_LOG_RESET)) {
         qemu_log("CPU Reset (CPU %d)\n", cpu->cpu_index);
@@ -292,9 +291,7 @@ static void cpu_common_reset(CPUState *cpu)
     cpu->crash_occurred = false;
 
     if (tcg_enabled()) {
-        for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) {
-            atomic_set(&cpu->tb_jmp_cache[i], NULL);
-        }
+        cpu_tb_jmp_cache_clear(cpu);
 
 #ifdef CONFIG_SOFTMMU
         tlb_flush(cpu, 0);
diff --git a/translate-all.c b/translate-all.c
index b3ee876..af3c4df 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -923,11 +923,7 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count)
     }
 
     CPU_FOREACH(cpu) {
-        int i;
-
-        for (i = 0; i < TB_JMP_CACHE_SIZE; ++i) {
-            atomic_set(&cpu->tb_jmp_cache[i], NULL);
-        }
+        cpu_tb_jmp_cache_clear(cpu);
     }
 
     tcg_ctx.tb_ctx.nb_tbs = 0;
@@ -1806,19 +1802,22 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr)
     cpu_loop_exit_noexc(cpu);
 }
 
-void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
+static void tb_jmp_cache_clear_page(CPUState *cpu, target_ulong page_addr)
 {
+    unsigned int i0 = tb_jmp_cache_hash_page(page_addr);
     unsigned int i;
 
+    for (i = 0; i < TB_JMP_PAGE_SIZE; i++) {
+        atomic_set(&cpu->tb_jmp_cache[i0 + i], NULL);
+    }
+}
+
+void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
+{
     /* Discard jump cache entries for any tb which might potentially
        overlap the flushed page. */
-    i = tb_jmp_cache_hash_page(addr - TARGET_PAGE_SIZE);
-    memset(&cpu->tb_jmp_cache[i], 0,
-           TB_JMP_PAGE_SIZE * sizeof(TranslationBlock *));
-
-    i = tb_jmp_cache_hash_page(addr);
-    memset(&cpu->tb_jmp_cache[i], 0,
-           TB_JMP_PAGE_SIZE * sizeof(TranslationBlock *));
+    tb_jmp_cache_clear_page(cpu, addr - TARGET_PAGE_SIZE);
+    tb_jmp_cache_clear_page(cpu, addr);
 }
 
 static void print_qht_statistics(FILE *f, fprintf_function cpu_fprintf,
-- 
2.7.4