From nobody Mon Feb 9 07:55:17 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=intel.com ARC-Seal: i=1; a=rsa-sha256; t=1684418240; cv=none; d=zohomail.com; s=zohoarc; b=EaY4261OWtLDIHvGrvUlR1r50VZfZCmN4lvRTzsXgycrL6Nl00iYT+/v4DQdj7SyJHQj6FdcpZPnzJFaRnHN/WvuzhCLZ4hutoUf52U/7TbRAD/5GDY8ekUOrd9T/2Y1LU077FGa4b8VhgqI4y6LVnIJxHewwgrmSgaXJHR3+Ag= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1684418240; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=4vL2rVCQp/CDt2RJunKxsz+zS344afgq0AFymMlw9k4=; b=NwcepeyqBTu7xPN77NLg+t+hYclbTJsqCcahM421MV+JT4WmeL6Hb6oVRAdXVJkcd6+DNEQsOO+HxI40b1szye+EwHzd4R+Gu24bPg1TYpaz5yH32AVeM1sYn5JyS3SIxzPK7uMsJA7BOj+guK3q+D5VZE8Yud1OqjfDtvYQNfk= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass header.i=@intel.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1684418240690367.9644394158387; Thu, 18 May 2023 06:57:20 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1pze7u-0003IO-3B; Thu, 18 May 2023 09:57:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pze7n-0003Dm-W7 for qemu-devel@nongnu.org; Thu, 18 May 2023 09:57:00 -0400 Received: from mga12.intel.com ([192.55.52.136]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1pze7l-0002PX-Jn for qemu-devel@nongnu.org; Thu, 18 May 2023 09:56:59 -0400 Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 May 2023 06:56:55 -0700 Received: from wufei-optiplex-7090.sh.intel.com ([10.238.200.247]) by orsmga005.jf.intel.com with ESMTP; 18 May 2023 06:56:53 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1684418217; x=1715954217; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4Fx8zmTWGMZeSGq3I1RnLLjklsWjEPTEXW0ciKDnB7k=; b=KRWRK0q7F1lmqZoEFaMTpHW06dBGICnvmiTvU/8dkiGvtOmBaTiT5jwi fXF/+D+nvYpF29LBFW1V0GODQiqcIwq8Z3C2+QHjTZ3VGZgHbm8zqvYxt cxfiyXiTtupTpU+BvHkBqCY/RUnBzNZw1/lVuhNaNn8AVSOFbgHmLZhZ/ Ib9gohAbFqMGWLxe6UqZjejMKDoLsP0h8j1CTnOID81JZvCgZXzmHZKwD ztzHTCL2Hw18B4KI3Rk9ls2A1GcF99iFgOboKBHeI10Mlkm3bSu5fH7Uo M/Cvp40w+e4irzOpcGVHAsSwodgJUI+ReTlUJu51F//GZml31XWvxmdXj g==; X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="331685771" X-IronPort-AV: E=Sophos;i="5.99,285,1677571200"; d="scan'208";a="331685771" X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10714"; a="876428835" X-IronPort-AV: E=Sophos;i="5.99,285,1677571200"; d="scan'208";a="876428835" From: Fei Wu To: qemu-devel@nongnu.org, richard.henderson@linaro.org, alex.bennee@linaro.org, fei2.wu@intel.com Cc: "Vanderson M. do Rosario" , Paolo Bonzini , Markus Armbruster , Thomas Huth , Laurent Vivier Subject: [PATCH v12 08/15] accel: adding TB_JIT_TIME and full replacing CONFIG_PROFILER Date: Thu, 18 May 2023 21:57:50 +0800 Message-Id: <20230518135757.1442654-9-fei2.wu@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230518135757.1442654-1-fei2.wu@intel.com> References: <20230518135757.1442654-1-fei2.wu@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=192.55.52.136; envelope-from=fei2.wu@intel.com; helo=mga12.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @intel.com) X-ZM-MESSAGEID: 1684418242656100001 From: "Vanderson M. do Rosario" Replace all others CONFIG_PROFILER statistics and migrate it to TBStatistics system. However, TCGProfiler still exists and can be use to store global statistics and times. All TB related statistics goes to TBStatistics. Signed-off-by: Vanderson M. do Rosario Message-Id: <20190829173437.5926-6-vandersonmr2@gmail.com> [AJB: fix authorship] Signed-off-by: Alex Benn=C3=A9e Signed-off-by: Fei Wu --- accel/tcg/monitor.c | 25 ++++------ accel/tcg/tb-stats.c | 78 +++++++++++++++++++++++++++++ accel/tcg/tcg-accel-ops.c | 15 +++--- include/exec/tb-stats.h | 2 +- include/tcg/tcg.h | 5 +- softmmu/runstate.c | 8 +-- tcg/tcg.c | 100 ++++++++++--------------------------- tests/qtest/qmp-cmd-test.c | 2 +- 8 files changed, 127 insertions(+), 108 deletions(-) diff --git a/accel/tcg/monitor.c b/accel/tcg/monitor.c index d4e044f7f5..2bc87f2642 100644 --- a/accel/tcg/monitor.c +++ b/accel/tcg/monitor.c @@ -15,6 +15,7 @@ #include "sysemu/cpus.h" #include "sysemu/cpu-timers.h" #include "sysemu/tcg.h" +#include "exec/tb-stats.h" #include "internal.h" =20 =20 @@ -69,6 +70,11 @@ HumanReadableText *qmp_x_query_opcount(Error **errp) { g_autoptr(GString) buf =3D g_string_new(""); =20 + if (!tb_stats_collection_enabled()) { + error_setg(errp, "TB information not being recorded."); + return NULL; + } + if (!tcg_enabled()) { error_setg(errp, "Opcode count information is only available with accel= =3Dtcg"); @@ -80,23 +86,12 @@ HumanReadableText *qmp_x_query_opcount(Error **errp) return human_readable_text_from_str(buf); } =20 -#ifdef CONFIG_PROFILER - +#ifdef CONFIG_TCG HumanReadableText *qmp_x_query_profile(Error **errp) { g_autoptr(GString) buf =3D g_string_new(""); - static int64_t last_cpu_exec_time; - int64_t cpu_exec_time; - int64_t delta; - - cpu_exec_time =3D tcg_cpu_exec_time(); - delta =3D cpu_exec_time - last_cpu_exec_time; - - g_string_append_printf(buf, "async time %" PRId64 " (%0.3f)\n", - dev_time, dev_time / (double)NANOSECONDS_PER_SE= COND); - g_string_append_printf(buf, "qemu time %" PRId64 " (%0.3f)\n", - delta, delta / (double)NANOSECONDS_PER_SECOND); - last_cpu_exec_time =3D cpu_exec_time; + + dump_jit_exec_time_info(dev_time, buf); dev_time =3D 0; =20 return human_readable_text_from_str(buf); @@ -104,7 +99,7 @@ HumanReadableText *qmp_x_query_profile(Error **errp) #else HumanReadableText *qmp_x_query_profile(Error **errp) { - error_setg(errp, "Internal profiler not compiled"); + error_setg(errp, "TCG should be enabled!"); return NULL; } #endif diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c index c40c9a748e..78a3104c7f 100644 --- a/accel/tcg/tb-stats.c +++ b/accel/tcg/tb-stats.c @@ -13,6 +13,7 @@ #include "tcg/tcg.h" =20 #include "qemu/qemu-print.h" +#include "qemu/timer.h" =20 #include "exec/tb-stats.h" #include "tb-context.h" @@ -41,6 +42,13 @@ struct jit_profile_info { uint64_t host; uint64_t guest; uint64_t search_data; + + uint64_t interm_time; + uint64_t code_time; + uint64_t restore_count; + uint64_t restore_time; + uint64_t opt_time; + uint64_t la_time; }; =20 /* accumulate the statistics from all TBs */ @@ -62,6 +70,35 @@ static void collect_jit_profile_info(void *p, uint32_t h= ash, void *userp) jpi->host +=3D tbs->code.out_len; jpi->guest +=3D tbs->code.in_len; jpi->search_data +=3D tbs->code.search_out_len; + + jpi->interm_time +=3D stat_per_translation(tbs, gen_times.ir); + jpi->opt_time +=3D stat_per_translation(tbs, gen_times.ir_opt); + jpi->la_time +=3D stat_per_translation(tbs, gen_times.la); + jpi->code_time +=3D stat_per_translation(tbs, gen_times.code); + + /* + * The restore time covers how long we have spent restoring state + * from a given TB (e.g. recovering from a fault). It is therefor + * not related to the number of translations we have done. + */ + jpi->restore_time +=3D tbs->tb_restore_time; + jpi->restore_count +=3D tbs->tb_restore_count; +} + +void dump_jit_exec_time_info(uint64_t dev_time, GString *buf) +{ + static uint64_t last_cpu_exec_time; + uint64_t cpu_exec_time; + uint64_t delta; + + cpu_exec_time =3D tcg_cpu_exec_time(); + delta =3D cpu_exec_time - last_cpu_exec_time; + + g_string_append_printf(buf, "async time %" PRId64 " (%0.3f)\n", + dev_time, dev_time / (double)NANOSECONDS_PER_SE= COND); + g_string_append_printf(buf, "qemu time %" PRId64 " (%0.3f)\n", + delta, delta / (double)NANOSECONDS_PER_SECOND); + last_cpu_exec_time =3D cpu_exec_time; } =20 /* dump JIT statisticis using TCGProfile and TBStats */ @@ -88,6 +125,47 @@ void dump_jit_profile_info(TCGProfile *s, GString *buf) jpi->host / (double) jpi->translations); g_string_append_printf(buf, "avg search data/TB %0.1f\n", jpi->search_data / (double) jpi->translations); + + uint64_t tot =3D jpi->interm_time + jpi->code_time; + + g_string_append_printf(buf, "JIT cycles %" PRId64 + " (%0.3fs at 2.4 GHz)\n", + tot, tot / 2.4e9); + g_string_append_printf(buf, " cycles/op %0.1f\n", + jpi->ops ? (double)tot / jpi->ops : 0); + g_string_append_printf(buf, " cycles/in byte %0.1f\n", + jpi->guest ? (double)tot / jpi->guest : 0); + g_string_append_printf(buf, " cycles/out byte %0.1f\n", + jpi->host ? (double)tot / jpi->host : 0); + g_string_append_printf(buf, " cycles/search byte %0.1f\n", + jpi->search_data ? (double)tot / jpi->search_data : 0); + if (tot =3D=3D 0) { + tot =3D 1; + } + + g_string_append_printf(buf, " gen_interm time %0.1f%%\n", + (double)jpi->interm_time / tot * 100.0); + g_string_append_printf(buf, " gen_code time %0.1f%%\n", + (double)jpi->code_time / tot * 100.0); + + g_string_append_printf(buf, " optim./code time %0.1f%%\n", + (double)jpi->opt_time / (jpi->code_time ? jpi->code_time := 1) + * 100.0); + g_string_append_printf(buf, " liveness/code time %0.1f%%\n", + (double)jpi->la_time / (jpi->code_time ? jpi->code_time : = 1) + * 100.0); + + g_string_append_printf(buf, "cpu_restore count %" PRId64 "\n", + jpi->restore_count); + g_string_append_printf(buf, " avg cycles %0.1f\n", + jpi->restore_count ? + (double)jpi->restore_time / jpi->restore_count : 0); + + if (s) { + g_string_append_printf(buf, "cpu exec time %" PRId64 " (%0.3f= s)\n", + s->cpu_exec_time, + s->cpu_exec_time / (double) NANOSECONDS_PER_SECOND); + } } g_free(jpi); } diff --git a/accel/tcg/tcg-accel-ops.c b/accel/tcg/tcg-accel-ops.c index 58c8e64096..749ad182f2 100644 --- a/accel/tcg/tcg-accel-ops.c +++ b/accel/tcg/tcg-accel-ops.c @@ -70,20 +70,17 @@ void tcg_cpus_destroy(CPUState *cpu) int tcg_cpus_exec(CPUState *cpu) { int ret; -#ifdef CONFIG_PROFILER - int64_t ti; -#endif + uint64_t ti; + assert(tcg_enabled()); -#ifdef CONFIG_PROFILER ti =3D profile_getclock(); -#endif + cpu_exec_start(cpu); ret =3D cpu_exec(cpu); cpu_exec_end(cpu); -#ifdef CONFIG_PROFILER - qatomic_set(&tcg_ctx->prof.cpu_exec_time, - tcg_ctx->prof.cpu_exec_time + profile_getclock() - ti); -#endif + + qatomic_add(&tcg_ctx->prof.cpu_exec_time, profile_getclock() - ti); + return ret; } =20 diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h index 2543367c70..d93d42e085 100644 --- a/include/exec/tb-stats.h +++ b/include/exec/tb-stats.h @@ -110,6 +110,6 @@ bool tb_stats_cmp(const void *ap, const void *bp); void init_tb_stats_htable(void); =20 void dump_jit_profile_info(TCGProfile *s, GString *buf); -void dump_jit_exec_time_info(uint64_t dev_time); +void dump_jit_exec_time_info(uint64_t dev_time, GString *buf); =20 #endif diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h index 9b74c5acce..b987f6a9bd 100644 --- a/include/tcg/tcg.h +++ b/include/tcg/tcg.h @@ -564,6 +564,9 @@ typedef struct TCGProfile { uint64_t gen_opt_done_time; uint64_t gen_la_done_time; uint64_t gen_code_done_time; + + /* Lifetime count of TCGOps per TCGContext */ + uint64_t table_op_count[NB_OPS]; } TCGProfile; =20 struct TCGContext { @@ -932,7 +935,7 @@ static inline TCGv_ptr tcg_temp_new_ptr(void) return temp_tcgv_ptr(t); } =20 -int64_t tcg_cpu_exec_time(void); +uint64_t tcg_cpu_exec_time(void); void tcg_dump_info(GString *buf); void tcg_dump_op_count(GString *buf); =20 diff --git a/softmmu/runstate.c b/softmmu/runstate.c index 2f2396c819..392e03c906 100644 --- a/softmmu/runstate.c +++ b/softmmu/runstate.c @@ -728,18 +728,12 @@ static bool main_loop_should_exit(int *status) int qemu_main_loop(void) { int status =3D EXIT_SUCCESS; -#ifdef CONFIG_PROFILER - int64_t ti; -#endif + uint64_t ti; =20 while (!main_loop_should_exit(&status)) { -#ifdef CONFIG_PROFILER ti =3D profile_getclock(); -#endif main_loop_wait(false); -#ifdef CONFIG_PROFILER dev_time +=3D profile_getclock() - ti; -#endif } =20 return status; diff --git a/tcg/tcg.c b/tcg/tcg.c index 9e657719fa..2ff029ae13 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -5864,25 +5864,13 @@ static void tcg_out_st_helper_args(TCGContext *s, c= onst TCGLabelQemuLdst *ldst, tcg_out_helper_load_common_args(s, ldst, parm, info, next_arg); } =20 -#ifdef CONFIG_PROFILER - /* avoid copy/paste errors */ #define PROF_ADD(to, from, field) \ do { \ (to)->field +=3D qatomic_read(&((from)->field)); \ } while (0) =20 -#define PROF_MAX(to, from, field) \ - do { \ - typeof((from)->field) val__ =3D qatomic_read(&((from)->field)); \ - if (val__ > (to)->field) { \ - (to)->field =3D val__; \ - } \ - } while (0) - -/* Pass in a zero'ed @prof */ -static inline -void tcg_profile_snapshot(TCGProfile *prof, bool counters, bool table) +static void collect_tcg_profiler(TCGProfile *prof) { unsigned int n_ctxs =3D qatomic_read(&tcg_cur_ctxs); unsigned int i; @@ -5891,55 +5879,19 @@ void tcg_profile_snapshot(TCGProfile *prof, bool co= unters, bool table) TCGContext *s =3D qatomic_read(&tcg_ctxs[i]); const TCGProfile *orig =3D &s->prof; =20 - if (counters) { - PROF_ADD(prof, orig, cpu_exec_time); - PROF_ADD(prof, orig, interm_time); - PROF_ADD(prof, orig, code_time); - PROF_ADD(prof, orig, la_time); - PROF_ADD(prof, orig, opt_time); - PROF_ADD(prof, orig, restore_count); - PROF_ADD(prof, orig, restore_time); - } - if (table) { - int i; + PROF_ADD(prof, orig, cpu_exec_time); =20 - for (i =3D 0; i < NB_OPS; i++) { - PROF_ADD(prof, orig, table_op_count[i]); - } + for (i =3D 0; i < NB_OPS; i++) { + PROF_ADD(prof, orig, table_op_count[i]); } } } =20 -#undef PROF_ADD -#undef PROF_MAX - -static void tcg_profile_snapshot_counters(TCGProfile *prof) -{ - tcg_profile_snapshot(prof, true, false); -} - -static void tcg_profile_snapshot_table(TCGProfile *prof) -{ - tcg_profile_snapshot(prof, false, true); -} - -void tcg_dump_op_count(GString *buf) -{ - TCGProfile prof =3D {}; - int i; - - tcg_profile_snapshot_table(&prof); - for (i =3D 0; i < NB_OPS; i++) { - g_string_append_printf(buf, "%s %" PRId64 "\n", tcg_op_defs[i].nam= e, - prof.table_op_count[i]); - } -} - -int64_t tcg_cpu_exec_time(void) +uint64_t tcg_cpu_exec_time(void) { unsigned int n_ctxs =3D qatomic_read(&tcg_cur_ctxs); unsigned int i; - int64_t ret =3D 0; + uint64_t ret =3D 0; =20 for (i =3D 0; i < n_ctxs; i++) { const TCGContext *s =3D qatomic_read(&tcg_ctxs[i]); @@ -5949,19 +5901,6 @@ int64_t tcg_cpu_exec_time(void) } return ret; } -#else -void tcg_dump_op_count(GString *buf) -{ - g_string_append_printf(buf, "[TCG profiler not compiled]\n"); -} - -int64_t tcg_cpu_exec_time(void) -{ - error_report("%s: TCG profiler not compiled", __func__); - exit(EXIT_FAILURE); -} -#endif - =20 int tcg_gen_code(TCGContext *s, TranslationBlock *tb, uint64_t pc_start) { @@ -6077,14 +6016,17 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *t= b, uint64_t pc_start) s->pool_labels =3D NULL; #endif =20 + if (tb_stats_collection_enabled()) { + QTAILQ_FOREACH(op, &s->ops, link) { + TCGOpcode opc =3D op->opc; + s->prof.table_op_count[opc]++; + } + } + num_insns =3D -1; QTAILQ_FOREACH(op, &s->ops, link) { TCGOpcode opc =3D op->opc; =20 -#ifdef CONFIG_PROFILER - qatomic_set(&prof->table_op_count[opc], prof->table_op_count[opc] = + 1); -#endif - switch (opc) { case INDEX_op_mov_i32: case INDEX_op_mov_i64: @@ -6179,14 +6121,24 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *t= b, uint64_t pc_start) return tcg_current_code_size(s); } =20 +void tcg_dump_op_count(GString *buf) +{ + TCGProfile prof =3D {}; + int i; + + collect_tcg_profiler(&prof); + for (i =3D 0; i < NB_OPS; i++) { + g_string_append_printf(buf, "%s %" PRId64 "\n", + tcg_op_defs[i].name, prof.table_op_count[i]); + } +} + void tcg_dump_info(GString *buf) { TCGProfile *s =3D NULL; -#ifdef CONFIG_PROFILER TCGProfile prof =3D {}; - tcg_profile_snapshot_counters(&prof); s =3D &prof; -#endif + collect_tcg_profiler(s); dump_jit_profile_info(s, buf); } =20 diff --git a/tests/qtest/qmp-cmd-test.c b/tests/qtest/qmp-cmd-test.c index a58de48d2a..749aafe4da 100644 --- a/tests/qtest/qmp-cmd-test.c +++ b/tests/qtest/qmp-cmd-test.c @@ -46,7 +46,7 @@ static int query_error_class(const char *cmd) { "query-balloon", ERROR_CLASS_DEVICE_NOT_ACTIVE }, { "query-hotpluggable-cpus", ERROR_CLASS_GENERIC_ERROR }, { "query-vm-generation-id", ERROR_CLASS_GENERIC_ERROR }, -#ifndef CONFIG_PROFILER +#ifndef CONFIG_TCG { "x-query-profile", ERROR_CLASS_GENERIC_ERROR }, #endif /* Only valid with a USB bus added */ --=20 2.25.1