From nobody Mon May 18 04:02:48 2026 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1556971831; cv=none; d=zoho.com; s=zohoarc; b=gsEU+yXiVO0KddIvsamD4B9/OGH1ZYpAVP6c3i0P4VlFmf9V+NnNPuFlDMpC2xoIG7vZ4Ji72811G88LAgW/LZuAJiSEu2fsMUJSLv7OYgnTnszh130zPl1ioth6BZXOtAn3sqqAF0Uvkbr3J8qXqhaE+Vgq/1B9XmhSN8OpZUo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zoho.com; s=zohoarc; t=1556971831; h=Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To:ARC-Authentication-Results; bh=JLVM6LVe8Ci/wZhURG+abGdUOpJKnFkyRFzRhc1SJWU=; b=XHYhplz+yZJV2B3cK6kASxTa2XhQIPLaFr7RYrOZXzVVaO1MEGNtf/lDcW87kkQoaJG1YWLGJVb7eF36fO8AMieVlzBllnfbCN+L/077mt5tepUqNo8+VeYzqcyBHsZkHfjmL6AimNuvNp3bNyoNJfbHVYEnqtCSd/PcnyqDgy8= ARC-Authentication-Results: i=1; mx.zoho.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1556971831623537.4782755796806; Sat, 4 May 2019 05:10:31 -0700 (PDT) Received: from localhost ([127.0.0.1]:55696 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hMtUq-0002F7-Dw for importer@patchew.org; Sat, 04 May 2019 08:10:28 -0400 Received: from eggs.gnu.org ([209.51.188.92]:50375) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1hMtQA-000795-Dd for qemu-devel@nongnu.org; Sat, 04 May 2019 08:05:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1hMtQ7-0001CK-0m for qemu-devel@nongnu.org; Sat, 04 May 2019 08:05:37 -0400 Received: from mail-ot1-x344.google.com ([2607:f8b0:4864:20::344]:41695) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1hMtQ6-0001Bq-Oi; Sat, 04 May 2019 08:05:34 -0400 Received: by mail-ot1-x344.google.com with SMTP id g8so7624747otl.8; Sat, 04 May 2019 05:05:34 -0700 (PDT) Received: from localhost.localdomain ([198.59.53.9]) by smtp.gmail.com with ESMTPSA id s26sm1844978otk.24.2019.05.04.05.05.32 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 04 May 2019 05:05:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=JLVM6LVe8Ci/wZhURG+abGdUOpJKnFkyRFzRhc1SJWU=; b=XjGrNLQ7KSyKQyu5UmbwKRrE0WcO8JeUSinNAWStQOurprmn5Vzj9dGaemEKFp7tM4 8NpNqASxAcAOQ1faSaaSrjq3b5QU623zKuwPkGlgrvLHx2PYX3uVc25whrztlJBR9sch jpkwCVBMuBCkNCqP7NA3MPXnnuN8RIjqh0Z1HAdCqCW/3yo3xuYDblazypY7j67Nc/5Z Vm5ItF8sFd+3FJJXPa+7eBpmzRRJPgN2GEsMxKzbudloWE89441BvJo53gUTitQ40I9a /GKVAE4CpfMxQEarVQHoVjee67mjtaghic1GlCfNCnKR3x01lay3cBnnr1k7UlBITfse PV/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:from:to:cc:subject:date:message-id :in-reply-to:references:mime-version:content-transfer-encoding; bh=JLVM6LVe8Ci/wZhURG+abGdUOpJKnFkyRFzRhc1SJWU=; b=gysFq4+r4qqfWaDoJcqfyADhBMqmXLPVB/rpIcbu1vuticTvYp/igIPJE68Ob/63ve 9P27Y10kWyB77K3uDxpoo0Agmfkm/HrdvSeXTs2pRgCP0II4d1BW/kgmVadToKjDPs+0 TQHhSSUamv95lxrY1dGBmrPe8s70uz8GuH4FFxqZnmAi8sV9DrDyS9qe33qdJh4i9nwp KhuC+FN3e7Gx+c0xU51xZ6aDALrl5M63dhj04m9/Qk8ZIxtktUC/0HzLPT6XA74lUt26 h7tKZmPhNF/w6lKbVWf2IMESue1vgMWnKeJkUBFlSo9n9HxZw1+b4rc0ELapQzEMTArX YPwA== X-Gm-Message-State: APjAAAX7UHDjBKVq/dItYqJB6CZWdTqL3nkkOaT/fdE8ibLkhVjUNL2z iXVs8r7mnud1ZNe1nWU+uzptT+NC X-Google-Smtp-Source: APXvYqwFgA6xZyLcLIz6WrCS29JaaXR+3+Z4YuzBt62cE9J8ViTsxDmC7YyA6payPMOi+y6zu2wDvQ== X-Received: by 2002:a9d:67:: with SMTP id 94mr10123010ota.57.1556971533699; Sat, 04 May 2019 05:05:33 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Date: Sat, 4 May 2019 06:05:21 -0600 Message-Id: <20190504120528.6389-4-pbonzini@redhat.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20190504120528.6389-1-pbonzini@redhat.com> References: <20190504120528.6389-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::344 Subject: [Qemu-devel] [PATCH 3/9] coroutine: add host specific coroutine backend for 64-bit x86 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, cohuck@redhat.com, richard.henderson@linaro.org, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Type: text/plain; charset="utf-8" This backend is faster (100ns vs 150ns per switch on my laptop), but especially it will be possible to add CET support to it. Most of the code is actually not architecture specific. Signed-off-by: Paolo Bonzini Reviewed-by: Richard Henderson --- configure | 10 ++ scripts/qemugdb/coroutine.py | 5 +- scripts/qemugdb/coroutine_asm.py | 20 +++ util/Makefile.objs | 1 + util/coroutine-asm.c | 230 +++++++++++++++++++++++++++++++ 5 files changed, 264 insertions(+), 2 deletions(-) create mode 100644 scripts/qemugdb/coroutine_asm.py create mode 100644 util/coroutine-asm.c diff --git a/configure b/configure index 5b183c2e39..c01f57a3ae 100755 --- a/configure +++ b/configure @@ -5200,6 +5200,8 @@ fi if test "$coroutine" =3D ""; then if test "$mingw32" =3D "yes"; then coroutine=3Dwin32 + elif test "$cpu" =3D "x86_64"; then + coroutine=3Dasm elif test "$ucontext_works" =3D "yes"; then coroutine=3Ducontext else @@ -5225,6 +5227,14 @@ else error_exit "only the 'windows' coroutine backend is valid for Window= s" fi ;; + asm) + if test "$mingw32" =3D "yes"; then + error_exit "only the 'windows' coroutine backend is valid for Window= s" + fi + if test "$cpu" !=3D "x86_64"; then + error_exit "the 'asm' backend is only valid for x86_64 hosts" + fi + ;; *) error_exit "unknown coroutine backend $coroutine" ;; diff --git a/scripts/qemugdb/coroutine.py b/scripts/qemugdb/coroutine.py index 076f6808ab..dc760235e7 100644 --- a/scripts/qemugdb/coroutine.py +++ b/scripts/qemugdb/coroutine.py @@ -10,14 +10,15 @@ # This work is licensed under the terms of the GNU GPL, version 2 # or later. See the COPYING file in the top-level directory. =20 -from . import coroutine_ucontext +from . import coroutine_ucontext, coroutine_asm import gdb =20 VOID_PTR =3D gdb.lookup_type('void').pointer() UINTPTR_T =3D gdb.lookup_type('uintptr_t') =20 backends =3D { - 'CoroutineUContext': coroutine_ucontext + 'CoroutineUContext': coroutine_ucontext, + 'CoroutineAsm': coroutine_asm } =20 def coroutine_backend(): diff --git a/scripts/qemugdb/coroutine_asm.py b/scripts/qemugdb/coroutine_a= sm.py new file mode 100644 index 0000000000..b4ac1291db --- /dev/null +++ b/scripts/qemugdb/coroutine_asm.py @@ -0,0 +1,20 @@ +#!/usr/bin/python + +# GDB debugging support +# +# Copyright 2019 Red Hat, Inc. +# +# Authors: +# Paolo Bonzini +# +# This work is licensed under the terms of the GNU GPL, version 2 or +# later. See the COPYING file in the top-level directory. + +import gdb + +U64_PTR =3D gdb.lookup_type('uint64_t').pointer() + +def get_coroutine_regs(addr): + addr =3D addr.cast(gdb.lookup_type('CoroutineAsm').pointer()) + rsp =3D addr['sp'].cast(U64_PTR) + return {'sp': rsp, 'pc': rsp.dereference()} diff --git a/util/Makefile.objs b/util/Makefile.objs index 9206878dec..41a10539cf 100644 --- a/util/Makefile.objs +++ b/util/Makefile.objs @@ -39,6 +39,7 @@ util-obj-$(CONFIG_MEMBARRIER) +=3D sys_membarrier.o util-obj-y +=3D qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o util-obj-y +=3D qemu-coroutine-sleep.o util-obj-y +=3D coroutine-$(CONFIG_COROUTINE_BACKEND).o +coroutine-asm.o-cflags :=3D -mno-red-zone util-obj-y +=3D buffer.o util-obj-y +=3D timed-average.o util-obj-y +=3D base64.o diff --git a/util/coroutine-asm.c b/util/coroutine-asm.c new file mode 100644 index 0000000000..a06ecbcb0a --- /dev/null +++ b/util/coroutine-asm.c @@ -0,0 +1,230 @@ +/* + * Host-specific coroutine initialization code + * + * Copyright (C) 2006 Anthony Liguori + * Copyright (C) 2011 Kevin Wolf + * Copyright (C) 2019 Paolo Bonzini + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.0 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" +#include "qemu/coroutine_int.h" + +#ifdef CONFIG_VALGRIND_H +#include +#endif + +#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer) +#ifdef CONFIG_ASAN_IFACE_FIBER +#define CONFIG_ASAN 1 +#include +#endif +#endif + +#define COROUTINE_SHADOW_STACK_SIZE 4096 + +typedef struct { + Coroutine base; + void *sp; + + void *stack; + size_t stack_size; + +#ifdef CONFIG_VALGRIND_H + unsigned int valgrind_stack_id; +#endif +} CoroutineAsm; + +/** + * Per-thread coroutine bookkeeping + */ +static __thread CoroutineAsm leader; +static __thread Coroutine *current; + +static void finish_switch_fiber(void *fake_stack_save) +{ +#ifdef CONFIG_ASAN + const void *bottom_old; + size_t size_old; + + __sanitizer_finish_switch_fiber(fake_stack_save, &bottom_old, &size_ol= d); + + if (!leader.stack) { + leader.stack =3D (void *)bottom_old; + leader.stack_size =3D size_old; + } +#endif +} + +static void start_switch_fiber(void **fake_stack_save, + const void *bottom, size_t size) +{ +#ifdef CONFIG_ASAN + __sanitizer_start_switch_fiber(fake_stack_save, bottom, size); +#endif +} + +#ifdef __x86_64__ +/* + * We hardcode all operands to specific registers so that we can write dow= n all the + * others in the clobber list. Note that action also needs to be hardcode= d so that + * it is the same register in all expansions of this macro. Also, we use = %rdi + * for the coroutine because that is the ABI's first argument register; + * coroutine_trampoline can then retrieve the current coroutine from there. + * + * Note that push and call would clobber the red zone. Makefile.objs comp= iles this + * file with -mno-red-zone. The alternative is to subtract/add 128 bytes = from rsp + * around the switch, with slightly lower cache performance. + */ +#define CO_SWITCH(from, to, action, jump) ({ = \ + int action_ =3D action; = \ + void *from_ =3D from; = \ + void *to_ =3D to; = \ + asm volatile( = \ + "pushq %%rbp\n" /* save frame register on sour= ce stack */ \ + ".cfi_adjust_cfa_offset 8\n" = \ + "call 1f\n" /* switch continues at label 1= */ \ + "jmp 2f\n" /* switch back continues at la= bel 2 */ \ + = \ + "1: .cfi_adjust_cfa_offset 8\n" = \ + "movq %%rsp, %c[SP](%[FROM])\n" /* save source SP */ = \ + "movq %c[SP](%[TO]), %%rsp\n" /* load destination SP */ = \ + jump "\n" /* coroutine switch */ = \ + = \ + "2: .cfi_adjust_cfa_offset -8\n" = \ + "popq %%rbp\n" = \ + ".cfi_adjust_cfa_offset -8\n" = \ + : "+a" (action_), [FROM] "+b" (from_), [TO] "+D" (to_) = \ + : [SP] "i" (offsetof(CoroutineAsm, sp)) = \ + : "rcx", "rdx", "rsi", "r8", "r9", "r10", "r11", "r12", "r13", "r1= 4", "r15", \ + "memory"); = \ + action_; = \ +}) +/* Use "call" to ensure the stack is aligned correctly. */ +#define CO_SWITCH_NEW(from, to) CO_SWITCH(from, to, 0, "call coroutine_tra= mpoline") +#define CO_SWITCH_RET(from, to, action) CO_SWITCH(from, to, action, "ret") +#else +#error coroutine-asm.c not ported to this architecture. +#endif + +static void __attribute__((__used__)) coroutine_trampoline(CoroutineAsm *s= elf) +{ + finish_switch_fiber(NULL); + + while (true) { + Coroutine *co =3D &self->base; + qemu_coroutine_switch(co, co->caller, COROUTINE_TERMINATE); + co->entry(co->entry_arg); + } +} + +Coroutine *qemu_coroutine_new(void) +{ + CoroutineAsm *co; + void *fake_stack_save =3D NULL; + + co =3D g_malloc0(sizeof(*co)); + co->stack_size =3D COROUTINE_STACK_SIZE; + co->stack =3D qemu_alloc_stack(&co->stack_size); + co->sp =3D co->stack + co->stack_size; + +#ifdef CONFIG_VALGRIND_H + co->valgrind_stack_id =3D + VALGRIND_STACK_REGISTER(co->stack, co->stack + co->stack_size); +#endif + + /* + * Immediately enter the coroutine once to initialize the stack + * and program counter. We could instead just push the address + * of coroutine_trampoline and let qemu_coroutine_switch return + * to it, but doing it this way confines the non-portable code + * to the CO_SWITCH* macros. + */ + co->base.caller =3D qemu_coroutine_self(); + start_switch_fiber(&fake_stack_save, co->stack, co->stack_size); + CO_SWITCH_NEW(current, co); + finish_switch_fiber(fake_stack_save); + co->base.caller =3D NULL; + + return &co->base; +} + +#ifdef CONFIG_VALGRIND_H +#if defined(CONFIG_PRAGMA_DIAGNOSTIC_AVAILABLE) && !defined(__clang__) +/* Work around an unused variable in the valgrind.h macro... */ +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wunused-but-set-variable" +#endif +static inline void valgrind_stack_deregister(CoroutineAsm *co) +{ + VALGRIND_STACK_DEREGISTER(co->valgrind_stack_id); +} +#if defined(CONFIG_PRAGMA_DIAGNOSTIC_AVAILABLE) && !defined(__clang__) +#pragma GCC diagnostic pop +#endif +#endif + +void qemu_coroutine_delete(Coroutine *co_) +{ + CoroutineAsm *co =3D DO_UPCAST(CoroutineAsm, base, co_); + +#ifdef CONFIG_VALGRIND_H + valgrind_stack_deregister(co); +#endif + + qemu_free_stack(co->stack, co->stack_size); + g_free(co); +} + +/* + * This function is marked noinline to prevent GCC from inlining it + * into coroutine_trampoline(). If we allow it to do that then it + * hoists the code to get the address of the TLS variable "current" + * out of the while() loop. This is an invalid transformation because + * qemu_coroutine_switch() may be called when running thread A but + * return in thread B, and so we might be in a different thread + * context each time round the loop. + */ +CoroutineAction __attribute__((noinline)) +qemu_coroutine_switch(Coroutine *from_, Coroutine *to_, + CoroutineAction action) +{ + CoroutineAsm *from =3D DO_UPCAST(CoroutineAsm, base, from_); + CoroutineAsm *to =3D DO_UPCAST(CoroutineAsm, base, to_); + void *fake_stack_save =3D NULL; + + current =3D to_; + + start_switch_fiber(action =3D=3D COROUTINE_TERMINATE ? + NULL : &fake_stack_save, to->stack, to->stack_size); + action =3D CO_SWITCH_RET(from, to, action); + finish_switch_fiber(fake_stack_save); + + return action; +} + +Coroutine *qemu_coroutine_self(void) +{ + if (!current) { + current =3D &leader.base; + } + return current; +} + +bool qemu_in_coroutine(void) +{ + return current && current->caller; +} --=20 2.21.0