From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Fri, 17 Aug 2018 01:18:47 -0400
Message-Id: <20180817051853.23792-4-cota@braap.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180817051853.23792-1-cota@braap.org>
References: <20180817051853.23792-1-cota@braap.org>
Subject: [Qemu-devel] [PATCH 3/9] qsp: add qsp_reset
Cc: Fam Zheng, Peter Crosthwaite, Stefan Weil, "Dr. David Alan Gilbert",
 Peter Xu, Markus Armbruster, Paolo Bonzini, Richard Henderson

I first implemented this by deleting all entries in the global hash table.
But doing that safely slows down profiling, since we'd need to introduce
rcu_read_lock/unlock in the fast path.

What's implemented here avoids messing with the thread-local
data in the global hash table. It achieves this by taking a snapshot
of the current state, so that subsequent reports present the delta
wrt the snapshot.

Signed-off-by: Emilio G. Cota
---
 include/qemu/qsp.h |  1 +
 util/qsp.c         | 94 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+)

diff --git a/include/qemu/qsp.h b/include/qemu/qsp.h
index 209480b687..f8c6c9648a 100644
--- a/include/qemu/qsp.h
+++ b/include/qemu/qsp.h
@@ -24,5 +24,6 @@ void qsp_report(FILE *f, fprintf_function cpu_fprintf, size_t max,
 bool qsp_is_enabled(void);
 void qsp_enable(void);
 void qsp_disable(void);
+void qsp_reset(void);
 
 #endif /* QEMU_QSP_H */
diff --git a/util/qsp.c b/util/qsp.c
index b04d4d9986..ea4bb82a03 100644
--- a/util/qsp.c
+++ b/util/qsp.c
@@ -47,6 +47,11 @@
  *   an intermediate hash table. This would simplify the code only slightly, but
  *   would perform badly if there were many threads and objects to track.
  *
+ * - Wrap operations on qsp entries with RCU read-side critical sections, so
+ *   that qsp_reset() can delete entries. Unfortunately, the overhead of calling
+ *   rcu_read_lock/unlock slows down atomic_add-bench -m by 24%. Having
+ *   a snapshot that is updated on qsp_reset() avoids this overhead.
+ *
  * Related Work:
  * - Lennart Poettering's mutrace: http://0pointer.de/blog/projects/mutrace.html
  * - Lozi, David, Thomas, Lawall and Muller. "Remote Core Locking: Migrating
@@ -57,6 +62,7 @@
 #include "qemu/thread.h"
 #include "qemu/timer.h"
 #include "qemu/qht.h"
+#include "qemu/rcu.h"
 #include "exec/tb-hash-xx.h"
 
 enum QSPType {
@@ -81,6 +87,12 @@ struct QSPEntry {
 };
 typedef struct QSPEntry QSPEntry;
 
+struct QSPSnapshot {
+    struct rcu_head rcu;
+    struct qht ht;
+};
+typedef struct QSPSnapshot QSPSnapshot;
+
 /* initial sizing for hash tables */
 #define QSP_INITIAL_SIZE 64
 
@@ -100,6 +112,7 @@ static __thread int qsp_thread;
 static struct qht qsp_callsite_ht;
 
 static struct qht qsp_ht;
+static QSPSnapshot *qsp_snapshot;
 static bool qsp_initialized, qsp_initializing;
 
 static const char * const qsp_typenames[] = {
@@ -456,15 +469,69 @@ static void qsp_aggregate(struct qht *global_ht, void *p, uint32_t h, void *up)
     agg->n_acqs += e->n_acqs;
 }
 
+static void qsp_iter_diff(struct qht *orig, void *p, uint32_t hash, void *htp)
+{
+    struct qht *ht = htp;
+    QSPEntry *old = p;
+    QSPEntry *new;
+
+    new = qht_lookup(ht, old, hash);
+    /* entries are never deleted, so we must have this one */
+    g_assert(new != NULL);
+    /* our reading of the stats happened after the snapshot was taken */
+    g_assert(new->n_acqs >= old->n_acqs);
+    g_assert(new->ns >= old->ns);
+
+    new->n_acqs -= old->n_acqs;
+    new->ns -= old->ns;
+
+    /* No point in reporting an empty entry */
+    if (new->n_acqs == 0 && new->ns == 0) {
+        bool removed = qht_remove(ht, new, hash);
+
+        g_assert(removed);
+        g_free(new);
+    }
+}
+
+static void qsp_diff(struct qht *orig, struct qht *new)
+{
+    qht_iter(orig, qsp_iter_diff, new);
+}
+
+static void qsp_ht_delete(struct qht *ht, void *p, uint32_t h, void *htp)
+{
+    g_free(p);
+}
+
 static void qsp_mktree(GTree *tree)
 {
+    QSPSnapshot *snap;
     struct qht ht;
 
+    /*
+     * First, see if there's a prior snapshot, so that we read the global hash
+     * table _after_ the snapshot has been created, which guarantees that
+     * the entries we'll read will be a superset of the snapshot's entries.
+     *
+     * We must remain in an RCU read-side critical section until we're done
+     * with the snapshot.
+     */
+    rcu_read_lock();
+    snap = atomic_rcu_read(&qsp_snapshot);
+
     /* Aggregate all results from the global hash table into a local one */
     qht_init(&ht, qsp_entry_no_thread_cmp, QSP_INITIAL_SIZE,
              QHT_MODE_AUTO_RESIZE | QHT_MODE_RAW_MUTEXES);
     qht_iter(&qsp_ht, qsp_aggregate, &ht);
 
+    /* compute the difference wrt the snapshot, if any */
+    if (snap) {
+        qsp_diff(&snap->ht, &ht);
+    }
+    /* done with the snapshot; RCU can reclaim it */
+    rcu_read_unlock();
+
     /* sort the hash table elements by using a tree */
     qht_iter(&ht, qsp_sort, tree);
 
@@ -603,3 +670,30 @@ void qsp_report(FILE *f, fprintf_function cpu_fprintf, size_t max,
     pr_report(&rep, f, cpu_fprintf);
     report_destroy(&rep);
 }
+
+static void qsp_snapshot_destroy(QSPSnapshot *snap)
+{
+    qht_iter(&snap->ht, qsp_ht_delete, NULL);
+    qht_destroy(&snap->ht);
+    g_free(snap);
+}
+
+void qsp_reset(void)
+{
+    QSPSnapshot *new = g_new(QSPSnapshot, 1);
+    QSPSnapshot *old;
+
+    qsp_init();
+
+    qht_init(&new->ht, qsp_entry_cmp, QSP_INITIAL_SIZE,
+             QHT_MODE_AUTO_RESIZE | QHT_MODE_RAW_MUTEXES);
+
+    /* take a snapshot of the current state */
+    qht_iter(&qsp_ht, qsp_aggregate, &new->ht);
+
+    /* replace the previous snapshot, if any */
+    old = atomic_xchg(&qsp_snapshot, new);
+    if (old) {
+        call_rcu(old, qsp_snapshot_destroy, rcu);
+    }
+}
-- 
2.17.1
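
For readers who want the snapshot-and-delta idea in isolation, here is a
minimal standalone sketch. It is not the QEMU code: it assumes a fixed array
of counters instead of qht hash tables, it assumes reset and report are called
from a single thread (so it needs no RCU), and every name in it (N_COUNTERS,
counter_add, counters_reset, counter_delta) is hypothetical. The only point it
tries to show is that the increment path stays a plain atomic add, while
"reset" just records the current values for later subtraction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define N_COUNTERS 4   /* hypothetical fixed number of tracked objects */

/* Live counters: only ever incremented, never cleared (the fast path). */
static _Atomic uint64_t live[N_COUNTERS];

/* Values captured by the most recent reset; all zero before the first one. */
static uint64_t snapshot[N_COUNTERS];

/* Fast path: a single atomic add, with no locking and no reset logic. */
static void counter_add(size_t idx, uint64_t val)
{
    atomic_fetch_add_explicit(&live[idx], val, memory_order_relaxed);
}

/* "Reset": remember the current values instead of zeroing the live ones. */
static void counters_reset(void)
{
    for (size_t i = 0; i < N_COUNTERS; i++) {
        snapshot[i] = atomic_load_explicit(&live[i], memory_order_relaxed);
    }
}

/* Report: the live value minus whatever was captured at the last reset. */
static uint64_t counter_delta(size_t idx)
{
    uint64_t now = atomic_load_explicit(&live[idx], memory_order_relaxed);

    /* counters only grow, so a later read can never be below the snapshot */
    assert(now >= snapshot[idx]);
    return now - snapshot[idx];
}
```

Calling counter_add(0, 5), then counters_reset(), then counter_add(0, 3)
leaves the live counter at 8 while counter_delta(0) reports 3. The patch gets
the same effect with a hash table per snapshot, and additionally swaps the
snapshot pointer atomically and frees the old one via call_rcu so that a
concurrent report can keep reading it safely.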