From nobody Tue Apr 21 14:38:13 2026
From: "Denis V. Lunev" via qemu development
Reply-to: "Denis V. Lunev"
To: qemu-devel@nongnu.org, qemu-block@nongnu.org, qemu-stable@nongnu.org
Cc: kwolf@redhat.com, hreitz@redhat.com, stefanha@redhat.com,
 pbonzini@redhat.com, "Denis V. Lunev"
Subject: [PATCH 1/1] block/linux-aio: bound ioq_submit() recursion depth #VSTOR-129345
Date: Mon, 20 Apr 2026 12:06:54 +0200
Message-ID: <20260420100655.3318452-2-den@openvz.org>
In-Reply-To: <20260420100655.3318452-1-den@openvz.org>
References: <20260420100655.3318452-1-den@openvz.org>
X-Mailer: git-send-email 2.51.0

qemu_laio_process_completions() wraps its body in defer_call_begin /
defer_call_end. Inside the section, completion callbacks wake coroutines
that queue new aiocbs; laio_do_submit() defers laio_deferred_fn.
At the bottom of qemu_laio_process_completions() the defer_call_end()
fires laio_deferred_fn, which calls ioq_submit(), closing the cycle:

    ioq_submit
      -> io_submit(2)                         // some sync completions
      -> qemu_laio_process_completions        // defer_call_begin
         -> aio_co_wake                       // resumes coroutine
            -> laio_do_submit
               -> defer_call(laio_deferred_fn, s)  // enqueued
         -> defer_call_end                    // nesting drops to 0
            -> laio_deferred_fn
               -> ioq_submit                  // +1 stack frame, loop

When io_submit(2) returns asynchronously (O_DIRECT) the cycle terminates
in one extra frame: the fresh aiocb is still in flight, no completion is
drained, no coroutine wakes, no new submission queues. When submissions
complete synchronously (non-O_DIRECT, or per-descriptor drivers such as
vmdk) each level enqueues more work for the next defer_call_end() to
drain, so recursion grows without bound and QEMU crashes with SIGSEGV on
the thread guard page.

The cycle was closed by two performance commits, each correct in
isolation:

  076682885d ("block/linux-aio: convert to blk_io_plug_call() API")
    -- introduced laio_deferred_fn and wired
       laio_do_submit -> defer_call(laio_deferred_fn, s).

  84d61e5f36 ("virtio: use defer_call() in virtio_irqfd_notify()")
    -- added defer_call_begin/end around
       qemu_laio_process_completions so virtio-irqfd notifications
       batch across a completion pass.

The supported aio=native + cache=none pairing keeps submissions
asynchronous, so the cycle stays bounded; nothing in the code enforces
that contract. Observed in production as a SIGSEGV during a backup job
configured with --cached + aio=native; reproducible on upstream with
qemu-io against vmdk.

Cap ioq_submit() recursion with a per-thread counter. On overflow,
return without submitting. The pending work is drained by
s->completion_bh, which qemu_laio_process_completions() has already
scheduled on entry -- no work is lost; one event-loop round-trip of
latency is paid only when the bound is hit, which cannot happen on a
supported configuration.

Signed-off-by: Denis V. Lunev
CC: Kevin Wolf
CC: Hanna Reitz
CC: Stefan Hajnoczi
CC: Paolo Bonzini
---
 block/linux-aio.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 0a7424fbb3..f98bb6e766 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -36,6 +36,19 @@
 /* Maximum number of requests in a batch. (default value) */
 #define DEFAULT_MAX_BATCH 32
 
+/*
+ * Bound on how deep ioq_submit() may recurse on a single thread via the
+ * ioq_submit -> qemu_laio_process_completions -> defer_call_end ->
+ * laio_deferred_fn -> ioq_submit cycle.  The cycle terminates naturally
+ * when io_submit(2) returns asynchronously (O_DIRECT), but can grow
+ * without bound when submissions complete synchronously.  On overflow
+ * the caller returns without submitting; the outermost
+ * qemu_laio_process_completions() has already scheduled s->completion_bh
+ * (via qemu_bh_schedule() at the top of that function), which resumes
+ * submission from the next event-loop dispatch.
+ */
+#define IOQ_SUBMIT_MAX_DEPTH 8
+
 struct qemu_laiocb {
     Coroutine *co;
     LinuxAioState *ctx;
@@ -80,6 +93,9 @@ struct LinuxAioState {
 static void ioq_submit(LinuxAioState *s);
 static int laio_do_submit(struct qemu_laiocb *laiocb);
 
+/* Per-thread recursion counter for ioq_submit(). See IOQ_SUBMIT_MAX_DEPTH. */
+static __thread unsigned ioq_submit_depth;
+
 static inline ssize_t io_event_ret(struct io_event *ev)
 {
     return (ssize_t)(((uint64_t)ev->res2 << 32) | ev->res);
@@ -340,6 +356,11 @@ static void ioq_submit(LinuxAioState *s)
     QEMU_UNINITIALIZED struct iocb *iocbs[MAX_EVENTS];
     QSIMPLEQ_HEAD(, qemu_laiocb) completed;
 
+    if (ioq_submit_depth >= IOQ_SUBMIT_MAX_DEPTH) {
+        return;
+    }
+    ioq_submit_depth++;
+
     do {
         if (s->io_q.in_flight >= MAX_EVENTS) {
             break;
@@ -385,6 +406,8 @@ static void ioq_submit(LinuxAioState *s)
          * pended requests will be submitted from there.
          */
     }
+
+    ioq_submit_depth--;
 }
 
 static uint64_t laio_max_batch(LinuxAioState *s, uint64_t dev_max_batch)
-- 
2.51.0
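
For readers who want to see the bounding behaviour in isolation, the
following is a minimal standalone sketch of a depth-capped re-entrant
submit path. It is a toy model, not QEMU code: every name in it
(toy_queue, toy_submit, toy_drain, TOY_MAX_DEPTH, bh_scheduled) is
hypothetical. It assumes each submission completes synchronously and
immediately re-enters the submit path, and it sets a flag on overflow to
stand in for the bottom half that, per the commit message, is already
scheduled in the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bound, chosen to mirror IOQ_SUBMIT_MAX_DEPTH in the patch. */
#define TOY_MAX_DEPTH 8

/* Toy stand-in for the submission queue state; not LinuxAioState. */
struct toy_queue {
    unsigned pending;        /* requests still waiting to be submitted */
    unsigned submitted;      /* requests handed over so far */
    unsigned max_depth_seen; /* deepest nesting of toy_submit() observed */
    bool bh_scheduled;       /* models a pending bottom-half dispatch */
};

/* Per-thread recursion counter, as in the patch. */
static __thread unsigned toy_depth;

/*
 * Submit one request, then re-enter to model a synchronous completion
 * whose callback queues the next submission.  Once the depth bound is
 * reached, return and leave the remainder to the event loop.
 */
void toy_submit(struct toy_queue *q)
{
    if (toy_depth >= TOY_MAX_DEPTH) {
        /* Modelling assumption: flag the deferred pass explicitly. */
        q->bh_scheduled = true;
        return;
    }
    toy_depth++;
    if (toy_depth > q->max_depth_seen) {
        q->max_depth_seen = toy_depth;
    }

    if (q->pending > 0) {
        q->pending--;
        q->submitted++;
        toy_submit(q); /* the re-entry that the bound cuts short */
    }

    toy_depth--;
}

/*
 * Model of the event loop: start a fresh top-level pass each time the
 * previous one stopped at the depth bound with work still pending.
 * Returns the number of top-level passes needed.
 */
unsigned toy_drain(struct toy_queue *q)
{
    unsigned passes = 0;

    do {
        q->bh_scheduled = false;
        toy_submit(q);
        passes++;
    } while (q->bh_scheduled && q->pending > 0);

    return passes;
}
```

With 20 pending requests, the model submits 8 per top-level pass, so the
queue drains in three passes while nesting never exceeds the bound. The
cost of hitting the cap is extra passes, not lost requests.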