From nobody Sun Feb 8 21:58:09 2026
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: Mario Casquero, Alexey Perevalov, "Dr. David Alan Gilbert",
 peterx@redhat.com, Juraj Marcin, Fabiano Rosas, Markus Armbruster
Subject: [PATCH v3 13/14] migration/postcopy: blocktime allows track / report non-vCPU faults
Date: Fri, 13 Jun 2025 10:12:16 -0400
Message-ID: <20250613141217.474825-14-peterx@redhat.com>
In-Reply-To: <20250613141217.474825-1-peterx@redhat.com>
References: <20250613141217.474825-1-peterx@redhat.com>

When used to report page fault latencies, the blocktime feature can be
almost useless when KVM async page fault is enabled, because in most
cases such a remote fault will kick off an async page fault, which is
then not trackable from the blocktime layer.

After the recent rewrites of the blocktime layer, it is finally easy to
also support tracking non-vCPU faults.  It would be even faster if we
could always index fault records by TID; unfortunately, we need to
maintain the blocktime API, which reports things by vCPU index.

Of course, this works not only for kworkers but also for any guest
access that may reach a missing page, which is very likely to happen in
the QEMU main thread too (and in any other thread where applicable).

In this case we don't care about "how long the threads are blocked"; we
only care about "how long it takes to resolve the fault".

Cc: Markus Armbruster
Cc: Dr. David Alan Gilbert
Reviewed-by: Fabiano Rosas
Tested-by: Mario Casquero
Signed-off-by: Peter Xu
---
 qapi/migration.json                   | 12 ++++-
 migration/migration-hmp-cmds.c        |  5 +++
 migration/postcopy-ram.c              | 64 +++++++++++++++++++++------
 tests/qtest/migration/migration-qmp.c |  1 +
 migration/trace-events                |  2 +-
 5 files changed, 67 insertions(+), 17 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 30302f36cf..0f5b2d914c 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -247,6 +247,12 @@
 #     this is the per-vCPU statistics.  This is only present when the
 #     postcopy-blocktime migration capability is enabled.  (Since 10.1)
 #
+# @postcopy-non-vcpu-latency: average remote page fault latency for all
+#     faults happened in non-vCPU threads (in ns).  It has the same
+#     definition of @postcopy-latency but this only provides statistics to
+#     non-vCPU faults.  This is only present when the postcopy-blocktime
+#     migration capability is enabled.  (Since 10.1)
+#
 # @socket-address: Only used for tcp, to know what the real port is
 #     (Since 4.0)
 #
@@ -273,8 +279,8 @@
 #
 # Features:
 #
-# @unstable: Members @postcopy-latency, @postcopy-vcpu-latency are
-#     experimental.
+# @unstable: Members @postcopy-latency, @postcopy-vcpu-latency,
+#     @postcopy-non-vcpu-latency are experimental.
 #
 # Since: 0.14
 ##
@@ -295,6 +301,8 @@
                'type': 'uint64', 'features': [ 'unstable' ] },
            '*postcopy-vcpu-latency': {
                'type': ['uint64'], 'features': [ 'unstable' ] },
+           '*postcopy-non-vcpu-latency': {
+               'type': 'uint64', 'features': [ 'unstable' ] },
            '*socket-address': ['SocketAddress'],
            '*dirty-limit-throttle-time-per-round': 'uint64',
            '*dirty-limit-ring-full-time': 'uint64'} }
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 8b3846dab5..e1f9530520 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -80,6 +80,11 @@ static void migration_dump_blocktime(Monitor *mon, MigrationInfo *info)
                        info->postcopy_latency);
     }
 
+    if (info->has_postcopy_non_vcpu_latency) {
+        monitor_printf(mon, "Postcopy non-vCPU Latencies (ns): %" PRIu64 "\n",
+                       info->postcopy_non_vcpu_latency);
+    }
+
     if (info->has_postcopy_vcpu_latency) {
         uint64List *item = info->postcopy_vcpu_latency;
         const char *sep = "";
diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
index 91c23b446e..f4cb23b3e0 100644
--- a/migration/postcopy-ram.c
+++ b/migration/postcopy-ram.c
@@ -192,6 +192,8 @@ typedef struct PostcopyBlocktimeContext {
     GHashTable *tid_to_vcpu_hash;
     /* Count of non-vCPU faults.  This is only for debugging purpose. */
     uint64_t non_vcpu_faults;
+    /* total blocktime when a non-vCPU thread is stopped */
+    uint64_t non_vcpu_blocktime_total;
 
     /*
      * Handler for exit event, necessary for
@@ -203,7 +205,10 @@ typedef struct PostcopyBlocktimeContext {
 typedef struct {
     /* The time the fault was triggered */
     uint64_t fault_time;
-    /* The vCPU index that was blocked */
+    /*
+     * The vCPU index that was blocked, when cpu==-1, it means it's a
+     * fault from non-vCPU threads.
+     */
     int cpu;
 } BlocktimeVCPUEntry;
 
@@ -344,6 +349,12 @@ void fill_destination_postcopy_migration_info(MigrationInfo *info)
         QAPI_LIST_PREPEND(list_latency, latency);
     }
 
+    latency_total += bc->non_vcpu_blocktime_total;
+    faults += bc->non_vcpu_faults;
+
+    info->has_postcopy_non_vcpu_latency = true;
+    info->postcopy_non_vcpu_latency = bc->non_vcpu_faults ?
+        (bc->non_vcpu_blocktime_total / bc->non_vcpu_faults) : 0;
     info->has_postcopy_blocktime = true;
     /* Convert ns -> ms */
     info->postcopy_blocktime = (uint32_t)(bc->total_blocktime / SCALE_MS);
@@ -983,7 +994,10 @@ static uint64_t get_current_ns(void)
     return (uint64_t)qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 }
 
-/* Inject an (cpu, fault_time) entry into the database, using addr as key */
+/*
+ * Inject an (cpu, fault_time) entry into the database, using addr as key.
+ * When cpu==-1, it means it's a non-vCPU fault.
+ */
 static void blocktime_fault_inject(PostcopyBlocktimeContext *ctx,
                                    uintptr_t addr, int cpu, uint64_t time)
 {
@@ -1066,9 +1080,17 @@ void mark_postcopy_blocktime_begin(uintptr_t addr, uint32_t ptid,
         /* Making sure it won't overflow - it really should never! */
         assert(dc->vcpu_faults_current[cpu] <= 255);
     } else {
-        /* We do not support non-vCPU thread tracking yet */
+        /*
+         * For non-vCPU thread faults, we don't care about tid or cpu index
+         * or time the thread is blocked (e.g., a kworker trying to help
+         * KVM when async_pf=on is OK to be blocked and not affect guest
+         * responsiveness), but we care about latency.  Track it with
+         * cpu=-1.
+         *
+         * Note that this will NOT affect blocktime reports on vCPU being
+         * blocked, but only about system-wide latency reports.
+         */
         dc->non_vcpu_faults++;
-        return;
     }
 
     blocktime_fault_inject(dc, addr, cpu, current);
@@ -1078,6 +1100,7 @@ typedef struct {
     PostcopyBlocktimeContext *ctx;
     uint64_t current;
     int affected_cpus;
+    int affected_non_cpus;
 } BlockTimeVCPUIter;
 
 static void blocktime_cpu_list_iter_fn(gpointer data, gpointer user_data)
@@ -1085,6 +1108,7 @@ static void blocktime_cpu_list_iter_fn(gpointer data, gpointer user_data)
     BlockTimeVCPUIter *iter = user_data;
     PostcopyBlocktimeContext *ctx = iter->ctx;
     BlocktimeVCPUEntry *entry = data;
+    uint64_t time_passed;
     int cpu = entry->cpu;
 
     /*
@@ -1092,17 +1116,27 @@ static void blocktime_cpu_list_iter_fn(gpointer data, gpointer user_data)
      * later than when it was faulted.
      */
     assert(iter->current >= entry->fault_time);
+    time_passed = iter->current - entry->fault_time;
 
-    /*
-     * If we resolved all pending faults on one vCPU due to this page
-     * resolution, take a note.
-     */
-    if (--ctx->vcpu_faults_current[cpu] == 0) {
-        ctx->vcpu_blocktime_total[cpu] += iter->current - entry->fault_time;
-        iter->affected_cpus += 1;
+    if (cpu >= 0) {
+        /*
+         * If we resolved all pending faults on one vCPU due to this page
+         * resolution, take a note.
+         */
+        if (--ctx->vcpu_faults_current[cpu] == 0) {
+            ctx->vcpu_blocktime_total[cpu] += time_passed;
+            iter->affected_cpus += 1;
+        }
+        trace_postcopy_blocktime_end_one(cpu, ctx->vcpu_faults_current[cpu]);
+    } else {
+        iter->affected_non_cpus++;
+        ctx->non_vcpu_blocktime_total += time_passed;
+        /*
+         * We do not maintain how many pending non-vCPU faults because we
+         * do not care about blocktime, only latency.
+         */
+        trace_postcopy_blocktime_end_one(-1, 0);
     }
-
-    trace_postcopy_blocktime_end_one(cpu, ctx->vcpu_faults_current[cpu]);
 }
 
 /*
@@ -1141,6 +1175,7 @@ static void mark_postcopy_blocktime_end(uintptr_t addr)
     BlockTimeVCPUIter iter = {
         .current = get_current_ns(),
         .affected_cpus = 0,
+        .affected_non_cpus = 0,
         .ctx = dc,
     };
     gpointer key = (gpointer)addr;
@@ -1174,7 +1209,8 @@ static void mark_postcopy_blocktime_end(uintptr_t addr)
     }
     dc->smp_cpus_down -= iter.affected_cpus;
 
-    trace_postcopy_blocktime_end(addr, iter.current, iter.affected_cpus);
+    trace_postcopy_blocktime_end(addr, iter.current, iter.affected_cpus,
+                                 iter.affected_non_cpus);
 }
 
 static void postcopy_pause_fault_thread(MigrationIncomingState *mis)
diff --git a/tests/qtest/migration/migration-qmp.c b/tests/qtest/migration/migration-qmp.c
index 1a5ab2d229..67a67d4bd6 100644
--- a/tests/qtest/migration/migration-qmp.c
+++ b/tests/qtest/migration/migration-qmp.c
@@ -361,6 +361,7 @@ void read_blocktime(QTestState *who)
     g_assert(qdict_haskey(rsp_return, "postcopy-vcpu-blocktime"));
     g_assert(qdict_haskey(rsp_return, "postcopy-latency"));
     g_assert(qdict_haskey(rsp_return, "postcopy-vcpu-latency"));
+    g_assert(qdict_haskey(rsp_return, "postcopy-non-vcpu-latency"));
     qobject_unref(rsp_return);
 }
 
diff --git a/migration/trace-events b/migration/trace-events
index a36a78f01a..706db97def 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -310,7 +310,7 @@ postcopy_preempt_thread_entry(void) ""
 postcopy_preempt_thread_exit(void) ""
 postcopy_blocktime_tid_cpu_map(int cpu, uint32_t tid) "cpu: %d, tid: %u"
 postcopy_blocktime_begin(uint64_t addr, uint64_t time, int cpu, bool exists) "addr: 0x%" PRIx64 ", time: %" PRIu64 ", cpu: %d, exist: %d"
-postcopy_blocktime_end(uint64_t addr, uint64_t time, int affected_cpu) "addr: 0x%" PRIx64 ", time: %" PRIu64 ", affected_cpus: %d"
+postcopy_blocktime_end(uint64_t addr, uint64_t time, int affected_cpu, int affected_non_cpus) "addr: 0x%" PRIx64 ", time: %" PRIu64 ", affected_cpus: %d, affected_non_cpus: %d"
 postcopy_blocktime_end_one(int cpu, uint8_t left_faults) "cpu: %d, left_faults: %" PRIu8
 
 # exec.c
-- 
2.49.0