From: Hanna Czenczek
To: qemu-block@nongnu.org
Cc: qemu-devel@nongnu.org, Hanna Czenczek, Kevin Wolf, Brian Song
Subject: [PATCH v5 23/25] fuse: Implement multi-threading
Date: Mon, 9 Mar 2026 16:08:54 +0100
Message-ID: <20260309150856.26800-24-hreitz@redhat.com>
In-Reply-To: <20260309150856.26800-1-hreitz@redhat.com>
References: <20260309150856.26800-1-hreitz@redhat.com>

FUSE allows creating multiple request queues by "cloning" /dev/fuse FDs
(via open("/dev/fuse") + ioctl(FUSE_DEV_IOC_CLONE)).  We can use this to
implement multi-threading.
For configuration, we don't need any more information beyond the simple
array provided by the core block export interface: The FUSE kernel driver
feeds these FDs in a round-robin fashion, so all of them are equivalent
and we want to have exactly one per thread.

These are the benchmark results when using four threads (compared to a
single thread); note that fio still only uses a single job, but
performance can still be improved because of said round-robin usage for
the queues.  (Not in the sync case, though, in which case I guess it just
adds overhead.)

file:
  read:
    seq aio:   261.7k ±1.7k  (+168%)
    rand aio:  129.2k ±14.3k (+35%)
    seq sync:   36.6k ±0.6k  (+6%)
    rand sync:  10.1k ±0.1k  (+2%)
  write:
    seq aio:   235.7k ±2.8k  (+243%)
    rand aio:  232.0k ±6.7k  (+237%)
    seq sync:   31.7k ±0.6k  (+4%)
    rand sync:  31.8k ±0.5k  (+4%)
null:
  read:
    seq aio:   253.8k ±12.3k (+45%)
    rand aio:  248.2k ±12.0k (+45%)
    seq sync:   91.6k ±2.4k  (+12%)
    rand sync:  91.3k ±2.1k  (+17%)
  write:
    seq aio:   208.2k ±9.8k  (+6%)
    rand aio:  207.0k ±7.4k  (+8%)
    seq sync:   91.2k ±1.9k  (+9%)
    rand sync:  90.4k ±2.5k  (+14%)

So moderate improvements in most cases, but quite improved AIO
performance with an actual underlying file.

Here's results for numjobs=4: "Before", i.e. without multithreading in
QSD/FUSE (results compared to numjobs=1):

file:
  read:
    seq aio:    85.5k ±0.4k  (-13%)
    rand aio:   92.5k ±0.5k  (-3%)
    seq sync:   54.5k ±9.1k  (+58%)
    rand sync:  38.0k ±0.2k  (+283%)
  write:
    seq aio:    67.3k ±0.3k  (-2%)
    rand aio:   67.6k ±0.3k  (-2%)
    seq sync:   69.3k ±0.5k  (+126%)
    rand sync:  69.3k ±0.3k  (+126%)
null:
  read:
    seq aio:   170.6k ±0.8k  (-2%)
    rand aio:  170.9k ±0.9k  (±0%)
    seq sync:  187.6k ±1.3k  (+129%)
    rand sync: 188.9k ±0.9k  (+142%)
  write:
    seq aio:   191.5k ±1.2k  (-2%)
    rand aio:  193.8k ±1.4k  (-1%)
    seq sync:  206.1k ±1.3k  (+147%)
    rand sync: 206.1k ±1.2k  (+159%)

As probably expected, little difference in the AIO case, but great
improvements in the sync cases because it kind of gives it an artificial
iodepth of 4.

"After", i.e. with four threads in QSD/FUSE (now results compared to the
above):

file:
  read:
    seq aio:   198.7k ±2.7k  (+132%)
    rand aio:  317.3k ±0.6k  (+243%)
    seq sync:   55.9k ±8.9k  (+3%)
    rand sync:  39.1k ±0.0k  (+3%)
  write:
    seq aio:   229.0k ±0.8k  (+240%)
    rand aio:  227.0k ±1.3k  (+235%)
    seq sync:  102.5k ±0.2k  (+48%)
    rand sync: 101.7k ±0.2k  (+47%)
null:
  read:
    seq aio:   584.0k ±1.5k  (+242%)
    rand aio:  581.9k ±1.9k  (+240%)
    seq sync:  270.6k ±0.9k  (+44%)
    rand sync: 270.4k ±0.7k  (+43%)
  write:
    seq aio:   598.4k ±2.0k  (+212%)
    rand aio:  605.2k ±2.0k  (+212%)
    seq sync:  274.0k ±0.8k  (+33%)
    rand sync: 275.0k ±0.7k  (+33%)

So this helps mainly for the AIO cases, but also in the null sync cases,
because null is always CPU-bound, so more threads help.

One unsolved mystery: When using a multithreaded export, running fio with
1 job (benchmark at the top of this commit) yields better seqread
performance than doing so with 4 jobs.  Actually, with 4 jobs, seqread is
significantly slower than randread, which is quite strange.
Signed-off-by: Hanna Czenczek
---
 block/export/fuse.c | 193 +++++++++++++++++++++++++++++++++++---------
 1 file changed, 153 insertions(+), 40 deletions(-)

diff --git a/block/export/fuse.c b/block/export/fuse.c
index fe1b6ad5ff..a2a478d293 100644
--- a/block/export/fuse.c
+++ b/block/export/fuse.c
@@ -31,11 +31,13 @@
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "system/block-backend.h"
+#include "system/iothread.h"
 
 #include
 #include
 
 #include "standard-headers/linux/fuse.h"
+#include
 
 #if defined(CONFIG_FALLOCATE_ZERO_RANGE)
 #include
@@ -118,12 +120,17 @@ QEMU_BUILD_BUG_ON(sizeof(((FuseRequestInHeaderBuf *)0)->head) +
                    sizeof(((FuseRequestInHeaderBuf *)0)->tail) !=
                    sizeof(FuseRequestInHeader));
 
-typedef struct FuseExport {
-    BlockExport common;
+typedef struct FuseExport FuseExport;
 
-    struct fuse_session *fuse_session;
-    unsigned int in_flight; /* atomic */
-    bool mounted, fd_handler_set_up;
+/*
+ * One FUSE "queue", representing one FUSE FD from which requests are fetched
+ * and processed.  Each queue is tied to an AioContext.
+ */
+typedef struct FuseQueue {
+    FuseExport *exp;
+
+    AioContext *ctx;
+    int fuse_fd;
 
     /*
      * Cached buffer to receive the data of WRITE requests.  Cached because:
@@ -140,6 +147,14 @@ typedef struct FuseExport {
      * via blk_blockalign() and thus need to be freed via qemu_vfree().
      */
     void *req_write_data_cached;
+} FuseQueue;
+
+struct FuseExport {
+    BlockExport common;
+
+    struct fuse_session *fuse_session;
+    unsigned int in_flight; /* atomic */
+    bool mounted, fd_handler_set_up;
 
     /*
      * Set when there was an unrecoverable error and no requests should be read
@@ -148,7 +163,15 @@ typedef struct FuseExport {
      */
     bool halted;
 
-    int fuse_fd;
+    int num_queues;
+    FuseQueue *queues;
+    /*
+     * True if this export should follow the generic export's AioContext.
+     * Will be false if the queues' AioContexts have been explicitly set by the
+     * user, i.e. are expected to stay in those contexts.
+     * (I.e. is always false if there is more than one queue.)
+     */
+    bool follow_aio_context;
 
     char *mountpoint;
     bool writable;
@@ -160,7 +183,7 @@ typedef struct FuseExport {
     mode_t st_mode;
     uid_t st_uid;
     gid_t st_gid;
-} FuseExport;
+};
 
 /*
  * Verify that the size of FuseRequestInHeaderBuf.head plus the data
@@ -179,12 +202,13 @@ static void fuse_export_halt(FuseExport *exp);
 static void init_exports_table(void);
 
 static int mount_fuse_export(FuseExport *exp, Error **errp);
+static int clone_fuse_fd(int fd, Error **errp);
 
 static bool is_regular_file(const char *path, Error **errp);
 
 static void read_from_fuse_fd(void *opaque);
 static void coroutine_fn
-fuse_co_process_request(FuseExport *exp, const FuseRequestInHeader *in_hdr,
+fuse_co_process_request(FuseQueue *q, const FuseRequestInHeader *in_hdr,
                         const void *data_buffer);
 static int fuse_write_err(int fd, const struct fuse_in_header *in_hdr, int err);
 
@@ -216,8 +240,11 @@ static void fuse_attach_handlers(FuseExport *exp)
         return;
     }
 
-    aio_set_fd_handler(exp->common.ctx, exp->fuse_fd,
-                       read_from_fuse_fd, NULL, NULL, NULL, exp);
+    for (int i = 0; i < exp->num_queues; i++) {
+        aio_set_fd_handler(exp->queues[i].ctx, exp->queues[i].fuse_fd,
+                           read_from_fuse_fd, NULL, NULL, NULL,
+                           &exp->queues[i]);
+    }
     exp->fd_handler_set_up = true;
 }
 
@@ -226,8 +253,10 @@ static void fuse_attach_handlers(FuseExport *exp)
  */
 static void fuse_detach_handlers(FuseExport *exp)
 {
-    aio_set_fd_handler(exp->common.ctx, exp->fuse_fd,
-                       NULL, NULL, NULL, NULL, NULL);
+    for (int i = 0; i < exp->num_queues; i++) {
+        aio_set_fd_handler(exp->queues[i].ctx, exp->queues[i].fuse_fd,
+                           NULL, NULL, NULL, NULL, NULL);
+    }
     exp->fd_handler_set_up = false;
 }
 
@@ -242,6 +271,11 @@ static void fuse_export_drained_end(void *opaque)
 
     /* Refresh AioContext in case it changed */
     exp->common.ctx = blk_get_aio_context(exp->common.blk);
+    if (exp->follow_aio_context) {
+        assert(exp->num_queues == 1);
+        exp->queues[0].ctx = exp->common.ctx;
+    }
+
     fuse_attach_handlers(exp);
 }
 
@@ -273,8 +307,32 @@ static int fuse_export_create(BlockExport *blk_exp,
     assert(blk_exp_args->type == BLOCK_EXPORT_TYPE_FUSE);
 
     if (multithread) {
-        error_setg(errp, "FUSE export does not support multi-threading");
-        return -EINVAL;
+        /* Guaranteed by common export code */
+        assert(mt_count >= 1);
+
+        exp->follow_aio_context = false;
+        exp->num_queues = mt_count;
+        exp->queues = g_new(FuseQueue, mt_count);
+
+        for (size_t i = 0; i < mt_count; i++) {
+            exp->queues[i] = (FuseQueue) {
+                .exp = exp,
+                .ctx = multithread[i],
+                .fuse_fd = -1,
+            };
+        }
+    } else {
+        /* Guaranteed by common export code */
+        assert(mt_count == 0);
+
+        exp->follow_aio_context = true;
+        exp->num_queues = 1;
+        exp->queues = g_new(FuseQueue, 1);
+        exp->queues[0] = (FuseQueue) {
+            .exp = exp,
+            .ctx = exp->common.ctx,
+            .fuse_fd = -1,
+        };
     }
 
     /* For growable and writable exports, take the RESIZE permission */
@@ -286,7 +344,7 @@ static int fuse_export_create(BlockExport *blk_exp,
         ret = blk_set_perm(exp->common.blk, blk_perm | BLK_PERM_RESIZE,
                            blk_shared_perm, errp);
         if (ret < 0) {
-            return ret;
+            goto fail;
         }
     }
 
@@ -362,13 +420,23 @@ static int fuse_export_create(BlockExport *blk_exp,
 
     g_hash_table_insert(exports, g_strdup(exp->mountpoint), NULL);
 
-    exp->fuse_fd = fuse_session_fd(exp->fuse_session);
-    ret = qemu_fcntl_addfl(exp->fuse_fd, O_NONBLOCK);
+    assert(exp->num_queues >= 1);
+    exp->queues[0].fuse_fd = fuse_session_fd(exp->fuse_session);
+    ret = qemu_fcntl_addfl(exp->queues[0].fuse_fd, O_NONBLOCK);
     if (ret < 0) {
         error_setg_errno(errp, -ret, "Failed to make FUSE FD non-blocking");
         goto fail;
     }
 
+    for (int i = 1; i < exp->num_queues; i++) {
+        int fd = clone_fuse_fd(exp->queues[0].fuse_fd, errp);
+        if (fd < 0) {
+            ret = fd;
+            goto fail;
+        }
+        exp->queues[i].fuse_fd = fd;
+    }
+
     fuse_attach_handlers(exp);
     return 0;
 
@@ -461,28 +529,28 @@ fail:
 /**
  * Allocate a buffer to receive WRITE data, or take the cached one.
  */
-static void *get_write_data_buffer(FuseExport *exp)
+static void *get_write_data_buffer(FuseQueue *q)
 {
-    if (exp->req_write_data_cached) {
-        void *cached = exp->req_write_data_cached;
-        exp->req_write_data_cached = NULL;
+    if (q->req_write_data_cached) {
+        void *cached = q->req_write_data_cached;
+        q->req_write_data_cached = NULL;
         return cached;
     } else {
-        return blk_blockalign(exp->common.blk, FUSE_MAX_WRITE_BYTES);
+        return blk_blockalign(q->exp->common.blk, FUSE_MAX_WRITE_BYTES);
     }
 }
 
 /**
  * Release a WRITE data buffer, possibly reusing it for a subsequent request.
  */
-static void release_write_data_buffer(FuseExport *exp, void **buffer)
+static void release_write_data_buffer(FuseQueue *q, void **buffer)
 {
     if (!*buffer) {
         return;
     }
 
-    if (!exp->req_write_data_cached) {
-        exp->req_write_data_cached = *buffer;
+    if (!q->req_write_data_cached) {
+        q->req_write_data_cached = *buffer;
     } else {
         qemu_vfree(*buffer);
     }
@@ -528,9 +596,42 @@ static ssize_t req_op_hdr_len(const FuseRequestInHeader *in_hdr)
     }
 }
 
+/**
+ * Clone the given /dev/fuse file descriptor, yielding a second FD from which
+ * requests can be pulled for the associated filesystem.  Returns an FD on
+ * success, and -errno on error.
+ */
+static int clone_fuse_fd(int fd, Error **errp)
+{
+    uint32_t src_fd = fd;
+    int new_fd;
+    int ret;
+
+    /*
+     * The name "/dev/fuse" is fixed, see libfuse's lib/fuse_loop_mt.c
+     * (fuse_clone_chan()).
+     */
+    new_fd = open("/dev/fuse", O_RDWR | O_CLOEXEC | O_NONBLOCK);
+    if (new_fd < 0) {
+        ret = -errno;
+        error_setg_errno(errp, errno, "Failed to open /dev/fuse");
+        return ret;
+    }
+
+    ret = ioctl(new_fd, FUSE_DEV_IOC_CLONE, &src_fd);
+    if (ret < 0) {
+        ret = -errno;
+        error_setg_errno(errp, errno, "Failed to clone FUSE FD");
+        close(new_fd);
+        return ret;
+    }
+
+    return new_fd;
+}
+
 /**
  * Try to read a single request from the FUSE FD.
- * Takes a FuseExport pointer in `opaque`.
+ * Takes a FuseQueue pointer in `opaque`.
  *
  * Assumes the export's in-flight counter has already been incremented.
  *
@@ -538,8 +639,9 @@ static ssize_t req_op_hdr_len(const FuseRequestInHeader *in_hdr)
 static void coroutine_fn co_read_from_fuse_fd(void *opaque)
 {
-    FuseExport *exp = opaque;
-    int fuse_fd = exp->fuse_fd;
+    FuseQueue *q = opaque;
+    int fuse_fd = q->fuse_fd;
+    FuseExport *exp = q->exp;
     ssize_t ret;
     FuseRequestInHeaderBuf in_hdr_buf;
     const FuseRequestInHeader *in_hdr;
@@ -551,7 +653,7 @@ static void coroutine_fn co_read_from_fuse_fd(void *opaque)
         goto no_request;
     }
 
-    data_buffer = get_write_data_buffer(exp);
+    data_buffer = get_write_data_buffer(q);
 
     /* Construct the I/O vector to hold the FUSE request */
     iov[0] = (struct iovec) { &in_hdr_buf.head, sizeof(in_hdr_buf.head) };
@@ -612,29 +714,29 @@ static void coroutine_fn co_read_from_fuse_fd(void *opaque)
             memcpy(in_hdr_buf.tail, data_buffer, len);
         }
 
-        release_write_data_buffer(exp, &data_buffer);
+        release_write_data_buffer(q, &data_buffer);
     }
 
-    fuse_co_process_request(exp, in_hdr, data_buffer);
+    fuse_co_process_request(q, in_hdr, data_buffer);
 
 no_request:
-    release_write_data_buffer(exp, &data_buffer);
+    release_write_data_buffer(q, &data_buffer);
     fuse_dec_in_flight(exp);
 }
 
 /**
  * Try to read and process a single request from the FUSE FD.
  * (To be used as a handler for when the FUSE FD becomes readable.)
- * Takes a FuseExport pointer in `opaque`.
+ * Takes a FuseQueue pointer in `opaque`.
  */
 static void read_from_fuse_fd(void *opaque)
 {
-    FuseExport *exp = opaque;
+    FuseQueue *q = opaque;
     Coroutine *co;
 
-    co = qemu_coroutine_create(co_read_from_fuse_fd, exp);
+    co = qemu_coroutine_create(co_read_from_fuse_fd, q);
     /* Decremented by co_read_from_fuse_fd() */
-    fuse_inc_in_flight(exp);
+    fuse_inc_in_flight(q->exp);
     qemu_coroutine_enter(co);
 }
 
@@ -659,6 +761,17 @@ static void fuse_export_delete(BlockExport *blk_exp)
 {
     FuseExport *exp = container_of(blk_exp, FuseExport, common);
 
+    for (int i = 0; i < exp->num_queues; i++) {
+        FuseQueue *q = &exp->queues[i];
+
+        /* Queue 0's FD belongs to the FUSE session */
+        if (i > 0 && q->fuse_fd >= 0) {
+            close(q->fuse_fd);
+        }
+        qemu_vfree(q->req_write_data_cached);
+    }
+    g_free(exp->queues);
+
     if (exp->fuse_session) {
         if (exp->mounted) {
             fuse_session_unmount(exp->fuse_session);
@@ -667,7 +780,6 @@ static void fuse_export_delete(BlockExport *blk_exp)
         fuse_session_destroy(exp->fuse_session);
     }
 
-    qemu_vfree(exp->req_write_data_cached);
     g_free(exp->mountpoint);
 }
 
@@ -1344,10 +1456,11 @@ static int fuse_write_buf_response(int fd,
  * Process a FUSE request, incl. writing the response.
  */
 static void coroutine_fn
-fuse_co_process_request(FuseExport *exp, const FuseRequestInHeader *in_hdr,
+fuse_co_process_request(FuseQueue *q, const FuseRequestInHeader *in_hdr,
                         const void *data_buffer)
 {
     FuseRequestOutHeader out_hdr;
+    FuseExport *exp = q->exp;
     /* For read requests: Data to be returned */
     void *out_data_buffer = NULL;
     ssize_t ret;
@@ -1471,10 +1584,10 @@ fuse_co_process_request(FuseExport *exp, const FuseRequestInHeader *in_hdr,
     }
 
     if (out_data_buffer) {
-        fuse_write_buf_response(exp->fuse_fd, &out_hdr.common, out_data_buffer);
+        fuse_write_buf_response(q->fuse_fd, &out_hdr.common, out_data_buffer);
         qemu_vfree(out_data_buffer);
     } else {
-        fuse_write_response(exp->fuse_fd, &out_hdr);
+        fuse_write_response(q->fuse_fd, &out_hdr);
     }
 }
 
-- 
2.53.0