From nobody Sun May 10 16:24:14 2026
From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/9] io-wq: add a worker flag for individual exit
Date: Fri, 29 Apr 2022 18:18:50 +0800
Message-Id: <20220429101858.90282-2-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>
References: <20220429101858.90282-1-haoxu.linux@gmail.com>

From: Hao Xu

Add a worker flag to control the exit of an individual worker. This is
needed by the fixed-worker patches later in this series, but it is also
useful as a piece of generic functionality.
Signed-off-by: Hao Xu
---
 fs/io-wq.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 824623bcf1a5..0c26805ca6de 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -26,6 +26,7 @@ enum {
 	IO_WORKER_F_RUNNING	= 2,	/* account as running */
 	IO_WORKER_F_FREE	= 4,	/* worker on free list */
 	IO_WORKER_F_BOUND	= 8,	/* is doing bounded work */
+	IO_WORKER_F_EXIT	= 32,	/* worker is exiting */
 };
 
 enum {
@@ -639,8 +640,12 @@ static int io_wqe_worker(void *data)
 	while (!test_bit(IO_WQ_BIT_EXIT, &wq->state)) {
 		long ret;
 
+		if (worker->flags & IO_WORKER_F_EXIT)
+			break;
+
 		set_current_state(TASK_INTERRUPTIBLE);
-		while (io_acct_run_queue(acct))
+		while (!(worker->flags & IO_WORKER_F_EXIT) &&
+		       io_acct_run_queue(acct))
 			io_worker_handle_work(worker);
 
 		raw_spin_lock(&wqe->lock);
@@ -656,6 +661,10 @@ static int io_wqe_worker(void *data)
 		raw_spin_unlock(&wqe->lock);
 		if (io_flush_signals())
 			continue;
+		if (worker->flags & IO_WORKER_F_EXIT) {
+			__set_current_state(TASK_RUNNING);
+			break;
+		}
 		ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
 		if (signal_pending(current)) {
 			struct ksignal ksig;
-- 
2.36.0
From nobody Sun May 10 16:24:14 2026
From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/9] io-wq: change argument of create_io_worker() for convenience
Date: Fri, 29 Apr 2022 18:18:51 +0800
Message-Id: <20220429101858.90282-3-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>
References: <20220429101858.90282-1-haoxu.linux@gmail.com>

From: Hao Xu

Pass the acct itself to create_io_worker() instead of its index, for
convenience in the following patches.

Signed-off-by: Hao Xu
---
 fs/io-wq.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 0c26805ca6de..35ce622f77ba 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -139,7 +139,8 @@ struct io_cb_cancel_data {
 	bool cancel_all;
 };
 
-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index);
+static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe,
+			     struct io_wqe_acct *acct);
 static void io_wqe_dec_running(struct io_worker *worker);
 static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
 					struct io_wqe_acct *acct,
@@ -306,7 +307,7 @@ static bool io_wqe_create_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
 	raw_spin_unlock(&wqe->lock);
 	atomic_inc(&acct->nr_running);
 	atomic_inc(&wqe->wq->worker_refs);
-	return create_io_worker(wqe->wq, wqe, acct->index);
+	return create_io_worker(wqe->wq, wqe, acct);
 }
 
 static void io_wqe_inc_running(struct io_worker *worker)
@@ -335,7 +336,7 @@ static void create_worker_cb(struct callback_head *cb)
 	}
 	raw_spin_unlock(&wqe->lock);
 	if (do_create) {
-		create_io_worker(wq, wqe, worker->create_index);
+		create_io_worker(wq, wqe, acct);
 	} else {
 		atomic_dec(&acct->nr_running);
 		io_worker_ref_put(wq);
@@ -812,9 +813,10 @@ static void io_workqueue_create(struct work_struct *work)
 	kfree(worker);
 }
 
-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
+static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe,
+			     struct io_wqe_acct *acct)
 {
-	struct io_wqe_acct *acct = &wqe->acct[index];
+	int index = acct->index;
 	struct io_worker *worker;
 	struct task_struct *tsk;
 
-- 
2.36.0
From nobody Sun May 10 16:24:14 2026
From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 3/9] io-wq: add infra data structure for fixed workers
Date: Fri, 29 Apr 2022 18:18:52 +0800
Message-Id: <20220429101858.90282-4-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>
References: <20220429101858.90282-1-haoxu.linux@gmail.com>

From: Hao Xu

Add the data structure and basic initialization for fixed workers.
Signed-off-by: Hao Xu
---
 fs/io-wq.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 87 insertions(+), 11 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 35ce622f77ba..ac8faf1f7a0a 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -26,6 +26,7 @@ enum {
 	IO_WORKER_F_RUNNING	= 2,	/* account as running */
 	IO_WORKER_F_FREE	= 4,	/* worker on free list */
 	IO_WORKER_F_BOUND	= 8,	/* is doing bounded work */
+	IO_WORKER_F_FIXED	= 16,	/* is a fixed worker */
 	IO_WORKER_F_EXIT	= 32,	/* worker is exiting */
 };
 
@@ -37,6 +38,61 @@ enum {
 	IO_ACCT_STALLED_BIT	= 0,	/* stalled on hash */
 };
 
+struct io_wqe_acct {
+	/*
+	 * union {
+	 *	1) for normal worker
+	 *	struct {
+	 *		unsigned nr_workers;
+	 *		unsigned max_workers;
+	 *		struct io_wq_work_list work_list;
+	 *	};
+	 *	2) for fixed worker
+	 *	struct {
+	 *		unsigned nr_workers;	// not meaningful
+	 *		unsigned max_workers;	// not meaningful
+	 *		unsigned nr_fixed;
+	 *		unsigned max_works;
+	 *		struct io_worker **fixed_workers;
+	 *	};
+	 *	3) for fixed worker's private acct
+	 *	struct {
+	 *		unsigned nr_works;
+	 *		unsigned max_works;
+	 *		struct io_wq_work_list work_list;
+	 *	};
+	 * };
+	 */
+	union {
+		unsigned nr_workers;
+		unsigned nr_works;
+	};
+	unsigned max_workers;
+	unsigned nr_fixed;
+	unsigned max_works;
+	union {
+		struct io_wq_work_list work_list;
+		struct io_worker **fixed_workers;
+	};
+	/*
+	 * nr_running is not meaningful for a fixed worker, but keep the
+	 * same logic for it for convenience for now. The same goes for
+	 * nr_workers and max_workers.
+	 */
+	atomic_t nr_running;
+	/*
+	 * For 1), it protects the work_list; the other two members,
+	 * nr_workers and max_workers, are protected by wqe->lock.
+	 * For 2), it protects nr_fixed, max_works and fixed_workers.
+	 * For 3), it protects nr_works, max_works and work_list.
+	 */
+	raw_spinlock_t lock;
+	int index;
+	unsigned long flags;
+	bool fixed_worker_registered;
+};
+
 /*
  * One for each thread in a wqe pool
  */
@@ -62,6 +118,8 @@ struct io_worker {
 		struct rcu_head rcu;
 		struct work_struct work;
 	};
+	int index;
+	struct io_wqe_acct acct;
 };
 
 #if BITS_PER_LONG == 64
@@ -72,16 +130,6 @@ struct io_worker {
 
 #define IO_WQ_NR_HASH_BUCKETS	(1u << IO_WQ_HASH_ORDER)
 
-struct io_wqe_acct {
-	unsigned nr_workers;
-	unsigned max_workers;
-	int index;
-	atomic_t nr_running;
-	raw_spinlock_t lock;
-	struct io_wq_work_list work_list;
-	unsigned long flags;
-};
-
 enum {
 	IO_WQ_ACCT_BOUND,
 	IO_WQ_ACCT_UNBOUND,
@@ -94,6 +142,7 @@ enum {
 struct io_wqe {
 	raw_spinlock_t lock;
 	struct io_wqe_acct acct[IO_WQ_ACCT_NR];
+	struct io_wqe_acct fixed_acct[IO_WQ_ACCT_NR];
 
 	int node;
 
@@ -1205,6 +1254,31 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 			atomic_set(&acct->nr_running, 0);
 			INIT_WQ_LIST(&acct->work_list);
 			raw_spin_lock_init(&acct->lock);
+
+			acct = &wqe->fixed_acct[i];
+			acct->index = i;
+			INIT_WQ_LIST(&acct->work_list);
+			raw_spin_lock_init(&acct->lock);
+			/*
+			 * nr_running for a fixed worker is meaningless for
+			 * now; init it to 1 to work around the
+			 * io_wqe_dec_running logic.
+			 */
+			atomic_set(&acct->nr_running, 1);
+			/*
+			 * max_workers for a fixed worker is meaningless for
+			 * now; init it like this since the number of fixed
+			 * workers should be controlled by users.
+			 */
+			acct->max_workers = task_rlimit(current, RLIMIT_NPROC);
+			raw_spin_lock_init(&acct->lock);
+			/*
+			 * Not necessary for a fixed worker, but do it
+			 * explicitly for clarity.
+			 */
+			acct->nr_fixed = 0;
+			acct->max_works = 0;
+			acct->fixed_workers = NULL;
 		}
 		wqe->wq = wq;
 		raw_spin_lock_init(&wqe->lock);
@@ -1287,7 +1361,7 @@ static void io_wq_exit_workers(struct io_wq *wq)
 
 static void io_wq_destroy(struct io_wq *wq)
 {
-	int node;
+	int i, node;
 
 	cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
 
@@ -1299,6 +1373,8 @@ static void io_wq_destroy(struct io_wq *wq)
 		};
 		io_wqe_cancel_pending_work(wqe, &match);
 		free_cpumask_var(wqe->cpu_mask);
+		for (i = 0; i < IO_WQ_ACCT_NR; i++)
+			kfree(wqe->fixed_acct[i].fixed_workers);
 		kfree(wqe);
 	}
 	io_wq_put_hash(wq->hash);
-- 
2.36.0
From nobody Sun May 10 16:24:14 2026
From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 4/9] io-wq: tweak io_get_acct()
Date: Fri, 29 Apr 2022 18:18:53 +0800
Message-Id: <20220429101858.90282-5-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>
References: <20220429101858.90282-1-haoxu.linux@gmail.com>

From: Hao Xu

Add an argument to io_get_acct() to indicate a fixed or normal worker.

Signed-off-by: Hao Xu
---
 fs/io-wq.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index ac8faf1f7a0a..c67bd5e5d117 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -208,20 +208,24 @@ static void io_worker_release(struct io_worker *worker)
 	complete(&worker->ref_done);
 }
 
-static inline struct io_wqe_acct *io_get_acct(struct io_wqe *wqe, bool bound)
+static inline struct io_wqe_acct *io_get_acct(struct io_wqe *wqe, bool bound,
+					      bool fixed)
 {
-	return &wqe->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND];
+	unsigned index = bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND;
+
+	return fixed ? &wqe->fixed_acct[index] : &wqe->acct[index];
 }
 
 static inline struct io_wqe_acct *io_work_get_acct(struct io_wqe *wqe,
 						   struct io_wq_work *work)
 {
-	return io_get_acct(wqe, !(work->flags & IO_WQ_WORK_UNBOUND));
+	return io_get_acct(wqe, !(work->flags & IO_WQ_WORK_UNBOUND), false);
 }
 
 static inline struct io_wqe_acct *io_wqe_get_acct(struct io_worker *worker)
 {
-	return io_get_acct(worker->wqe, worker->flags & IO_WORKER_F_BOUND);
+	return io_get_acct(worker->wqe, worker->flags & IO_WORKER_F_BOUND,
+			   worker->flags & IO_WORKER_F_FIXED);
 }
 
 static void io_worker_ref_put(struct io_wq *wq)
@@ -1124,7 +1128,7 @@ static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
 	int i;
retry:
 	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
-		struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
+		struct io_wqe_acct *acct = io_get_acct(wqe, i == 0, false);
 
 		if (io_acct_cancel_pending_work(wqe, acct, match)) {
 			if (match->cancel_all)
-- 
2.36.0
From nobody Sun May 10 16:24:14 2026
From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 5/9] io-wq: fixed worker initialization
Date: Fri, 29 Apr 2022 18:18:54 +0800
Message-Id: <20220429101858.90282-6-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>
References: <20220429101858.90282-1-haoxu.linux@gmail.com>

From: Hao Xu

Implement fixed worker initialization.

Signed-off-by: Hao Xu
---
 fs/io-wq.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index c67bd5e5d117..a1a10fb204a7 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -774,6 +774,26 @@ void io_wq_worker_sleeping(struct task_struct *tsk)
 	io_wqe_dec_running(worker);
 }
 
+static void io_init_new_fixed_worker(struct io_wqe *wqe,
+				     struct io_worker *worker)
+{
+	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+	struct io_wqe_acct *iw_acct = &worker->acct;
+	unsigned index = acct->index;
+	unsigned *nr_fixed;
+
+	raw_spin_lock(&acct->lock);
+	nr_fixed = &acct->nr_fixed;
+	acct->fixed_workers[*nr_fixed] = worker;
+	worker->index = (*nr_fixed)++;
+	iw_acct->nr_works = 0;
+	iw_acct->max_works = acct->max_works;
+	iw_acct->index = index;
+	INIT_WQ_LIST(&iw_acct->work_list);
+	raw_spin_lock_init(&iw_acct->lock);
+	raw_spin_unlock(&acct->lock);
+}
+
 static void io_init_new_worker(struct io_wqe *wqe, struct io_worker *worker,
 			       struct task_struct *tsk)
 {
@@ -787,6 +807,8 @@ static void io_init_new_worker(struct io_wqe *wqe, struct io_worker *worker,
 	list_add_tail_rcu(&worker->all_list, &wqe->all_list);
 	worker->flags |= IO_WORKER_F_FREE;
 	raw_spin_unlock(&wqe->lock);
+	if (worker->flags & IO_WORKER_F_FIXED)
+		io_init_new_fixed_worker(wqe, worker);
 	wake_up_new_task(tsk);
 }
 
@@ -893,6 +915,8 @@ static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe,
 
 	if (index == IO_WQ_ACCT_BOUND)
 		worker->flags |= IO_WORKER_F_BOUND;
+	if (&wqe->fixed_acct[index] == acct)
+		worker->flags |= IO_WORKER_F_FIXED;
 
 	tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
 	if (!IS_ERR(tsk)) {
-- 
2.36.0
From nobody Sun May 10 16:24:14 2026
From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 6/9] io-wq: fixed worker exit
Date: Fri, 29 Apr 2022 18:18:55 +0800
Message-Id: <20220429101858.90282-7-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>
References: <20220429101858.90282-1-haoxu.linux@gmail.com>

From: Hao Xu

Implement fixed worker exit.

Signed-off-by: Hao Xu
---
 fs/io-wq.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index a1a10fb204a7..2feff19970ca 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -259,6 +259,29 @@ static bool io_task_worker_match(struct callback_head *cb, void *data)
 	return worker == data;
 }
 
+static void io_fixed_worker_exit(struct io_worker *worker)
+{
+	unsigned *nr_fixed;
+	int index = worker->acct.index;
+	struct io_wqe *wqe = worker->wqe;
+	struct io_wqe_acct *acct = io_get_acct(wqe, index == 0, true);
+	struct io_worker **fixed_workers;
+
+	raw_spin_lock(&acct->lock);
+	fixed_workers = acct->fixed_workers;
+	if (!fixed_workers || worker->index == -1) {
+		raw_spin_unlock(&acct->lock);
+		return;
+	}
+	nr_fixed = &acct->nr_fixed;
+	/* reuse variable index to represent fixed worker index in its array */
+	index = worker->index;
+	fixed_workers[index] = fixed_workers[*nr_fixed - 1];
+	(*nr_fixed)--;
+	fixed_workers[index]->index = index;
+	raw_spin_unlock(&acct->lock);
+}
+
 static void io_worker_exit(struct io_worker *worker)
 {
 	struct io_wqe *wqe = worker->wqe;
@@ -682,6 +705,7 @@ static int io_wqe_worker(void *data)
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
 	bool last_timeout = false;
+	bool fixed = worker->flags & IO_WORKER_F_FIXED;
 	char buf[TASK_COMM_LEN];
 
 	worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
@@ -732,6 +756,8 @@ static int io_wqe_worker(void *data)
 
 	if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
 		io_worker_handle_work(worker);
+	if (fixed)
+		io_fixed_worker_exit(worker);
 
 	audit_free(current);
 	io_worker_exit(worker);
-- 
2.36.0
06:22:31 -0400 Received: from mail-pj1-x102c.google.com (mail-pj1-x102c.google.com [IPv6:2607:f8b0:4864:20::102c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6A25C6ED8; Fri, 29 Apr 2022 03:19:11 -0700 (PDT) Received: by mail-pj1-x102c.google.com with SMTP id fv2so6727941pjb.4; Fri, 29 Apr 2022 03:19:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NpPFG4XRH1a1LgmEG5Xpjkc4A8sSdHnFyJLNANxCbg0=; b=dK9IAaPlt9ujn+B5tqW/USQCvnaupj4k+JHAVPywrFLWuoQHXoWHTIESjY0H5EtVwF 3vTzukISs4GltKcLJcEUAJ/vIOR9cUM3sb6ZyLoZn0xruh2qUzogdm5m193ixKaPqtiN 8FMyRHYrb0k9+63KtyaOG80uRDRTBbOP1jVw53N70PcXHuwELhHw+pt8NvIAyHkixR/M BLQVloHkY0mcsh5JrF6Kplg3r/6Nyh9Bi9Jbfv71wJ+Y5i1L6YQ0+FHaEfJiWZITX9uf jbFOKduS6laY0pVKfJQSa1S/WlYY750VS55iwZGBtJrdGhpTXLPEwTmEYfo/se51yCSc 10+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NpPFG4XRH1a1LgmEG5Xpjkc4A8sSdHnFyJLNANxCbg0=; b=tLhSglLoPYXCe9sCsuIG0ou3ku3iWS+31PWonMEzvHM2DY2bG7MfRq54ZFrQb6hY6r 9E6Jj2l/hGxn4PFFJCcRwplRLnHjkDgN1pT8Mi/Pb6ie3eTRRlw/jEq3edvjXlD+Dvz3 QpkWb94sBVctR8jRs3MvMTqH17HWkG8FUA4VvEXOL91UKWGG3FbLIM/2WpjWX7BgQSOt UwOUCu9XJQtAmzoPk+wMjyvdzO0WSipzv8JoBVVte4VTge32qx8/GwCUQ0ZYCEliGOxZ aRF4T9NaykHXA+g3zBJvB4PjIY/gnMwDJDGjsJDai0APiUj+ayQpd+ldi/7+mpBe61VA u7pw== X-Gm-Message-State: AOAM532KiKGDm5dbB3dWYGaDDwbeyn3krjsRfjqOd7tdCALafblYP25p N1ULYwsXch0zq9hL1YXXZkZK9yoVvKU= X-Google-Smtp-Source: ABdhPJw7hWIEZ+pKMbdG9T9TYW1aAji3AdnP9yc3HWJuQ65rYBUyZXNdr9ob8qrGrVlrHWVoSQ/VaQ== X-Received: by 2002:a17:902:e5c2:b0:15e:8061:87 with SMTP id u2-20020a170902e5c200b0015e80610087mr1096130plf.123.1651227550942; Fri, 29 Apr 2022 03:19:10 -0700 (PDT) Received: from HOWEYXU-MB0.tencent.com ([106.53.33.166]) by smtp.gmail.com with ESMTPSA id 
k17-20020a628e11000000b0050d8d373331sm2600016pfe.214.2022.04.29.03.19.09 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Fri, 29 Apr 2022 03:19:10 -0700 (PDT) From: Hao Xu To: io-uring@vger.kernel.org Cc: Jens Axboe , Pavel Begunkov , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 7/9] io-wq: implement fixed worker logic Date: Fri, 29 Apr 2022 18:18:56 +0800 Message-Id: <20220429101858.90282-8-haoxu.linux@gmail.com> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com> References: <20220429101858.90282-1-haoxu.linux@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Hao Xu The current implementation of io-wq has big spinlock contension. The main reason is the single work list model. All producers(who insert works) and consumers(io-workers) have to grap wqe->lock to move ahead. Set max_worker to 3 or 4, do a fio read test, we can see 40%~50% lock contension. Introduce fixed io-workers which sticks there to handle works and have their own work list. previous: producer0 ---insert---> work_list ---get---> io-worker0,1,2 now: ---> private work_list0 --get--> fixed-worker0 / producer0 --insert----> private work_list1 --get--> fixed-worker1 | \ | ---> private work_list2 --get--> fixed-worker2 | |---insert---> public work_list --get--> (normal)io-worker Since each fixed-worker has a private work list, the contension will be limited to a smaller range(the private work list). Logic of fixed-worker: first handle private works then public ones. Logic of normal io-worker: only handle public works. Logic of producer: 1) randomly pick a private work list and check if it is full, insert the work if it's not 2) insert the work to the public work list if 1) fails. 
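The producer policy above (io_wqe_insert_private_work() in the diff below picks the list by round-robin on work_seq) can be modeled in a few lines of userspace C. All types and names here are hypothetical stand-ins for the kernel structures, just to illustrate the single-pick-then-fallback behaviour:

```c
#include <assert.h>

/* Stand-in for a fixed worker's bounded private work list. */
struct private_list {
	int nr_works;	/* current length */
	int max_works;	/* capacity */
};

/*
 * Try to place one work: returns the index of the private list used,
 * or -1 when the work must go to the public list instead. Note there
 * is no second attempt on another private list; one full pick sends
 * the work straight to the public list, as in the patch.
 */
static int try_insert_private(struct private_list *lists, int nr_fixed,
			      unsigned int *work_seq)
{
	int idx;

	if (nr_fixed == 0)
		return -1;	/* no fixed workers registered */
	idx = (int)((*work_seq)++ % (unsigned int)nr_fixed);
	if (lists[idx].nr_works >= lists[idx].max_works)
		return -1;	/* chosen list full: use the public list */
	lists[idx].nr_works++;
	return idx;
}
```

The round-robin counter spreads load across fixed workers without scanning every list, which keeps the producer's critical section short.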
The get logic for a private list: a fixed worker grabs all the works in
its private work list at once (like tctx_task_work() does) rather than
one by one (this is done in a later patch as an optimization).

To achieve this, an io_wqe_acct is added to each fixed-worker struct;
through it we can leverage the old code as much as possible, which keeps
the new design clean and compatible.

Benefits of this feature:
 1) the bound and unbound work lists now have different spinlocks;
 2) much smaller contention between work producers and consumers;
 3) fixed workers are easy for users to control: bind CPUs, reset
    priority, etc.

Wrote a nop test program to compare 3 fixed workers with 3 normal
workers.

normal workers:
 ./run_nop_wqe.sh nop_wqe_normal 200000 100 3 1-3
 time spent: 10464397 usecs      IOPS: 1911242
 time spent: 9610976 usecs       IOPS: 2080954
 time spent: 9807361 usecs       IOPS: 2039284

fixed workers:
 ./run_nop_wqe.sh nop_wqe_fixed 200000 100 3 1-3
 time spent: 17314274 usecs      IOPS: 1155116
 time spent: 17016942 usecs      IOPS: 1175299
 time spent: 17908684 usecs      IOPS: 1116776

About 2x improvement. From the perf result, there is almost no
acct->lock contention.
Test program: https://github.com/HowHsu/liburing/tree/fixed_worker

Signed-off-by: Hao Xu
---
 fs/io-wq.c | 148 +++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 122 insertions(+), 26 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 2feff19970ca..aaa9cea7d39a 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -70,6 +70,7 @@ struct io_wqe_acct {
 	unsigned max_workers;
 	unsigned nr_fixed;
 	unsigned max_works;
+	unsigned work_seq;
 	union {
 		struct io_wq_work_list work_list;
 		struct io_worker **fixed_workers;
@@ -624,9 +625,9 @@ static void io_assign_current_work(struct io_worker *worker,
 
 static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
 
-static void io_worker_handle_work(struct io_worker *worker)
+static void io_worker_handle_work(struct io_worker *worker,
+				  struct io_wqe_acct *acct)
 {
-	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
 	bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
@@ -698,19 +699,31 @@ static void io_worker_handle_work(struct io_worker *worker)
 	} while (1);
 }
 
+static inline void io_worker_handle_private_work(struct io_worker *worker)
+{
+	io_worker_handle_work(worker, &worker->acct);
+}
+
+static inline void io_worker_handle_public_work(struct io_worker *worker)
+{
+	io_worker_handle_work(worker, io_wqe_get_acct(worker));
+}
+
 static int io_wqe_worker(void *data)
 {
 	struct io_worker *worker = data;
-	struct io_wqe_acct *acct = io_wqe_get_acct(worker);
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
-	bool last_timeout = false;
+	struct io_wqe_acct *acct =
+		io_get_acct(wqe, worker->flags & IO_WORKER_F_BOUND, false);
 	bool fixed = worker->flags & IO_WORKER_F_FIXED;
+	bool last_timeout = false;
 	char buf[TASK_COMM_LEN];
 
 	worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING);
 
-	snprintf(buf, sizeof(buf), "iou-wrk-%d", wq->task->pid);
+	snprintf(buf, sizeof(buf), fixed ? "iou-fix-%d" : "iou-wrk-%d",
+		 wq->task->pid);
 	set_task_comm(current, buf);
 
 	audit_alloc_kernel(current);
@@ -722,13 +735,24 @@ static int io_wqe_worker(void *data)
 			break;
 
 		set_current_state(TASK_INTERRUPTIBLE);
-		while (!(worker->flags & IO_WORKER_F_EXIT) &&
-		       io_acct_run_queue(acct))
-			io_worker_handle_work(worker);
-
+		if (fixed) {
+			while (io_acct_run_queue(&worker->acct))
+				io_worker_handle_private_work(worker);
+			if (io_acct_run_queue(acct))
+				io_worker_handle_public_work(worker);
+		} else {
+			while (io_acct_run_queue(acct))
+				io_worker_handle_public_work(worker);
+		}
 		raw_spin_lock(&wqe->lock);
-		/* timed out, exit unless we're the last worker */
-		if (last_timeout && acct->nr_workers > 1) {
+		/* timed out, a worker will exit only if:
+		 * - it is not a fixed worker
+		 * - it is not the last non-fixed worker
+		 *
+		 * the second condition is because we need at least one
+		 * worker to handle the public work list.
+		 */
+		if (last_timeout && !fixed && acct->nr_workers > 1) {
 			acct->nr_workers--;
 			raw_spin_unlock(&wqe->lock);
 			__set_current_state(TASK_RUNNING);
@@ -754,10 +778,18 @@ static int io_wqe_worker(void *data)
 		last_timeout = !ret;
 	}
 
-	if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
-		io_worker_handle_work(worker);
-	if (fixed)
+	if (test_bit(IO_WQ_BIT_EXIT, &wq->state) && !fixed)
+		io_worker_handle_public_work(worker);
+	if (fixed) {
 		io_fixed_worker_exit(worker);
+		/*
+		 * Check and handle the private work list again
+		 * to avoid a race with private work insertion.
+		 * TODO: an alternative is to deliver the works
+		 * to the public work list.
+		 */
+		io_worker_handle_private_work(worker);
+	}
 
 	audit_free(current);
 	io_worker_exit(worker);
@@ -1001,9 +1033,9 @@ static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
 	} while (work);
 }
 
-static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
+static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work,
+			       struct io_wqe_acct *acct)
 {
-	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
 	unsigned int hash;
 	struct io_wq_work *tail;
 
@@ -1022,6 +1054,45 @@ static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
 	wq_list_add_after(&work->list, &tail->list, &acct->work_list);
 }
 
+static bool io_wqe_insert_private_work(struct io_wqe *wqe,
+				       struct io_wq_work *work,
+				       struct io_wqe_acct *acct)
+{
+	unsigned int nr_fixed;
+	struct io_worker *fixed_worker;
+	struct io_wqe_acct *iw_acct;
+	unsigned int fixed_worker_index;
+
+	raw_spin_lock(&acct->lock);
+	nr_fixed = acct->nr_fixed;
+	if (!nr_fixed) {
+		raw_spin_unlock(&acct->lock);
+		return false;
+	}
+
+	fixed_worker_index = (acct->work_seq++) % nr_fixed;
+	fixed_worker = acct->fixed_workers[fixed_worker_index];
+	if (!fixed_worker || fixed_worker->flags & IO_WORKER_F_EXIT) {
+		raw_spin_unlock(&acct->lock);
+		return false;
+	}
+	iw_acct = &fixed_worker->acct;
+
+	raw_spin_lock(&iw_acct->lock);
+	if (iw_acct->nr_works < iw_acct->max_works) {
+		io_wqe_insert_work(wqe, work, iw_acct);
+		iw_acct->nr_works++;
+		raw_spin_unlock(&iw_acct->lock);
+		wake_up_process(fixed_worker->task);
+		raw_spin_unlock(&acct->lock);
+		return true;
+	}
+	raw_spin_unlock(&iw_acct->lock);
+	raw_spin_unlock(&acct->lock);
+
+	return false;
+}
+
 static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
 {
 	return work == data;
@@ -1030,6 +1101,7 @@ static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
 static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
 {
 	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+	struct io_wqe_acct *fixed_acct;
 	struct io_cb_cancel_data match;
 	unsigned work_flags = work->flags;
 	bool do_create;
@@ -1044,8 +1116,14 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
 		return;
 	}
 
+	fixed_acct = io_get_acct(wqe, !acct->index, true);
+	if (fixed_acct->fixed_worker_registered && !io_wq_is_hashed(work)) {
+		if (io_wqe_insert_private_work(wqe, work, fixed_acct))
+			return;
+	}
+
 	raw_spin_lock(&acct->lock);
-	io_wqe_insert_work(wqe, work);
+	io_wqe_insert_work(wqe, work, acct);
 	clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 	raw_spin_unlock(&acct->lock);
 
@@ -1131,9 +1209,9 @@ static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
 
 static inline void io_wqe_remove_pending(struct io_wqe *wqe,
 					 struct io_wq_work *work,
-					 struct io_wq_work_node *prev)
+					 struct io_wq_work_node *prev,
+					 struct io_wqe_acct *acct)
 {
-	struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
 	unsigned int hash = io_get_work_hash(work);
 	struct io_wq_work *prev_work = NULL;
 
@@ -1160,7 +1238,7 @@ static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
 		work = container_of(node, struct io_wq_work, list);
 		if (!match->fn(work, match->data))
 			continue;
-		io_wqe_remove_pending(wqe, work, prev);
+		io_wqe_remove_pending(wqe, work, prev, acct);
 		raw_spin_unlock(&acct->lock);
 		io_run_cancel(work, wqe);
 		match->nr_pending++;
@@ -1175,17 +1253,35 @@ static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
 static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
 				       struct io_cb_cancel_data *match)
 {
-	int i;
-retry:
-	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
-		struct io_wqe_acct *acct = io_get_acct(wqe, i == 0, false);
+	int i, j;
+	struct io_wqe_acct *acct, *iw_acct;
 
+retry_public:
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		acct = io_get_acct(wqe, i == 0, false);
 		if (io_acct_cancel_pending_work(wqe, acct, match)) {
 			if (match->cancel_all)
-				goto retry;
-			break;
+				goto retry_public;
+			return;
 		}
 	}
+
+retry_private:
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		acct = io_get_acct(wqe, i == 0, true);
+		raw_spin_lock(&acct->lock);
+		for (j = 0; j < acct->nr_fixed; j++) {
+			iw_acct = &acct->fixed_workers[j]->acct;
+			if (io_acct_cancel_pending_work(wqe, iw_acct, match)) {
+				if (match->cancel_all) {
+					raw_spin_unlock(&acct->lock);
+					goto retry_private;
+				}
+				break;
+			}
+		}
+		raw_spin_unlock(&acct->lock);
+	}
 }
 
 static void io_wqe_cancel_running_work(struct io_wqe *wqe,
-- 
2.36.0

From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 8/9] io-wq: batch the handling of fixed worker private works
Date: Fri, 29 Apr 2022 18:18:57 +0800
Message-Id: <20220429101858.90282-9-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>

Reduce acct->lock contention by batching the handling of the private
work lists of fixed workers.
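The batching idea in this patch is: under the lock, detach the whole private list in O(1) (wq_list_clean() below), then drain the detached snapshot without holding the lock. A minimal userspace model of that splice, with stand-in types (not the kernel's), might look like:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal singly linked list, standing in for io_wq_work_list. */
struct node { struct node *next; };
struct list { struct node *first, *last; };

/*
 * Detach the whole list in O(1): copy head/tail into a snapshot and
 * empty the source, mirroring what io_worker_handle_private_work()
 * does under worker->acct.lock in the diff below.
 */
static struct list detach_all(struct list *src)
{
	struct list snapshot = *src;	/* copy head and tail pointers */
	src->first = src->last = NULL;	/* source is now empty */
	return snapshot;
}
```

Producers can refill the (now empty) source list concurrently while the worker walks the snapshot, so the lock is held only for the pointer copy, not per work item.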
Signed-off-by: Hao Xu
---
 fs/io-wq.c | 42 +++++++++++++++++++++++++++++++++---------
 fs/io-wq.h |  5 +++++
 2 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index aaa9cea7d39a..df2d480395e8 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -540,8 +540,23 @@ static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
 	return ret;
 }
 
+static inline void conditional_acct_lock(struct io_wqe_acct *acct,
+					 bool needs_lock)
+{
+	if (needs_lock)
+		raw_spin_lock(&acct->lock);
+}
+
+static inline void conditional_acct_unlock(struct io_wqe_acct *acct,
+					   bool needs_lock)
+{
+	if (needs_lock)
+		raw_spin_unlock(&acct->lock);
+}
+
 static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
-					   struct io_worker *worker)
+					   struct io_worker *worker,
+					   bool needs_lock)
 	__must_hold(acct->lock)
 {
 	struct io_wq_work_node *node, *prev;
@@ -549,6 +564,7 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 	unsigned int stall_hash = -1U;
 	struct io_wqe *wqe = worker->wqe;
 
+	conditional_acct_lock(acct, needs_lock);
 	wq_list_for_each(node, prev, &acct->work_list) {
 		unsigned int hash;
 
@@ -557,6 +573,7 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 		/* not hashed, can run anytime */
 		if (!io_wq_is_hashed(work)) {
 			wq_list_del(&acct->work_list, node, prev);
+			conditional_acct_unlock(acct, needs_lock);
 			return work;
 		}
 
@@ -568,6 +585,7 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 		if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
 			wqe->hash_tail[hash] = NULL;
 			wq_list_cut(&acct->work_list, &tail->list, prev);
+			conditional_acct_unlock(acct, needs_lock);
 			return work;
 		}
 		if (stall_hash == -1U)
@@ -584,15 +602,16 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
 		 * work being added and clearing the stalled bit.
 		 */
 		set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
-		raw_spin_unlock(&acct->lock);
+		conditional_acct_unlock(acct, needs_lock);
 		unstalled = io_wait_on_hash(wqe, stall_hash);
-		raw_spin_lock(&acct->lock);
+		conditional_acct_lock(acct, needs_lock);
 		if (unstalled) {
 			clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
 			if (wq_has_sleeper(&wqe->wq->hash->wait))
 				wake_up(&wqe->wq->hash->wait);
 		}
 	}
+	conditional_acct_unlock(acct, needs_lock);
 
 	return NULL;
 }
@@ -626,7 +645,7 @@ static void io_assign_current_work(struct io_worker *worker,
 static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
 
 static void io_worker_handle_work(struct io_worker *worker,
-				  struct io_wqe_acct *acct)
+				  struct io_wqe_acct *acct, bool needs_lock)
 {
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
@@ -642,9 +661,7 @@ static void io_worker_handle_work(struct io_worker *worker,
 		 * can't make progress, any work completion or insertion will
 		 * clear the stalled flag.
 		 */
-		raw_spin_lock(&acct->lock);
-		work = io_get_next_work(acct, worker);
-		raw_spin_unlock(&acct->lock);
+		work = io_get_next_work(acct, worker, needs_lock);
 		if (work) {
 			__io_worker_busy(wqe, worker);
 
@@ -701,12 +718,19 @@ static void io_worker_handle_work(struct io_worker *worker,
 
 static inline void io_worker_handle_private_work(struct io_worker *worker)
 {
-	io_worker_handle_work(worker, &worker->acct);
+	struct io_wqe_acct acct;
+
+	raw_spin_lock(&worker->acct.lock);
+	acct = worker->acct;
+	wq_list_clean(&worker->acct.work_list);
+	worker->acct.nr_works = 0;
+	raw_spin_unlock(&worker->acct.lock);
+	io_worker_handle_work(worker, &acct, false);
 }
 
 static inline void io_worker_handle_public_work(struct io_worker *worker)
 {
-	io_worker_handle_work(worker, io_wqe_get_acct(worker));
+	io_worker_handle_work(worker, io_wqe_get_acct(worker), true);
 }
 
 static int io_wqe_worker(void *data)
diff --git a/fs/io-wq.h b/fs/io-wq.h
index ba6eee76d028..ef3ce577e6b7 100644
--- a/fs/io-wq.h
+++ b/fs/io-wq.h
@@ -40,6 +40,11 @@ struct io_wq_work_list {
 	(list)->first = NULL;					\
 } while (0)
 
+static inline void wq_list_clean(struct io_wq_work_list *list)
+{
+	list->first = list->last = NULL;
+}
+
 static inline void wq_list_add_after(struct io_wq_work_node *node,
 				     struct io_wq_work_node *pos,
 				     struct io_wq_work_list *list)
-- 
2.36.0

From: Hao Xu
To: io-uring@vger.kernel.org
Cc: Jens Axboe, Pavel Begunkov, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 9/9] io_uring: add register fixed worker interface
Date: Fri, 29 Apr 2022 18:18:58 +0800
Message-Id: <20220429101858.90282-10-haoxu.linux@gmail.com>
In-Reply-To: <20220429101858.90282-1-haoxu.linux@gmail.com>

Add an io_uring_register() interface to register fixed workers and
indicate their work capacity.
The argument is an array of two elements, each of which is:

	struct {
		__s32 nr_workers;
		__s32 max_works;
	};

The (nr_workers, max_works) values mean:

	nr_workers or max_works < -1	invalid
	nr_workers or max_works == -1	get the old value back
	nr_workers or max_works >= 0	get the old value and set to the
					new value

Signed-off-by: Hao Xu
---
 fs/io-wq.c                    | 101 ++++++++++++++++++++++++++++++++++
 fs/io-wq.h                    |   3 +
 fs/io_uring.c                 |  71 ++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  11 ++++
 4 files changed, 186 insertions(+)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index df2d480395e8..c1e87b29c960 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -1671,6 +1671,107 @@ int io_wq_max_workers(struct io_wq *wq, int *new_count)
 	return 0;
 }
 
+/*
+ * Set the max number of fixed workers and the capacity of the private
+ * work list; returns the old values. If new_count is -1, just return
+ * the old value.
+ */
+int io_wq_fixed_workers(struct io_wq *wq,
+			struct io_uring_fixed_worker_arg *new_count)
+{
+	struct io_uring_fixed_worker_arg prev[IO_WQ_ACCT_NR];
+	bool first_node = true;
+	int i, node;
+	bool readonly[2] = {
+		(new_count[0].nr_workers == -1 && new_count[0].max_works == -1),
+		(new_count[1].nr_workers == -1 && new_count[1].max_works == -1),
+	};
+
+	BUILD_BUG_ON((int) IO_WQ_ACCT_BOUND   != (int) IO_WQ_BOUND);
+	BUILD_BUG_ON((int) IO_WQ_ACCT_UNBOUND != (int) IO_WQ_UNBOUND);
+	BUILD_BUG_ON((int) IO_WQ_ACCT_NR != 2);
+
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		if (new_count[i].nr_workers > task_rlimit(current, RLIMIT_NPROC))
+			new_count[i].nr_workers =
+				task_rlimit(current, RLIMIT_NPROC);
+	}
+
+	rcu_read_lock();
+	for_each_node(node) {
+		int j;
+		struct io_wqe *wqe = wq->wqes[node];
+
+		for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+			struct io_wqe_acct *acct = &wqe->fixed_acct[i];
+			int *nr_fixed, *max_works;
+			struct io_worker **fixed_workers;
+			int nr = new_count[i].nr_workers;
+
+			raw_spin_lock(&acct->lock);
+			nr_fixed = &acct->nr_fixed;
+			max_works = &acct->max_works;
+			fixed_workers = acct->fixed_workers;
+			if (first_node) {
+				prev[i].nr_workers = *nr_fixed;
+				prev[i].max_works = *max_works;
+			}
+			if (readonly[i]) {
+				raw_spin_unlock(&acct->lock);
+				continue;
+			}
+			if (*nr_fixed == nr || nr == -1) {
+				*max_works = new_count[i].max_works;
+				raw_spin_unlock(&acct->lock);
+				continue;
+			}
+			for (j = 0; j < *nr_fixed; j++) {
+				struct io_worker *worker = fixed_workers[j];
+
+				if (!worker)
+					continue;
+				worker->flags |= IO_WORKER_F_EXIT;
+				/*
+				 * Mark index as -1 to avoid false deletion
+				 * in io_fixed_worker_exit()
+				 */
+				worker->index = -1;
+				/*
+				 * Once a worker is in the fixed_workers array
+				 * it is definitely there before we release
+				 * the acct->lock below. That's why we don't
+				 * need to increment worker->ref here.
+				 */
+				wake_up_process(worker->task);
+			}
+			kfree(fixed_workers);
+			acct->fixed_workers = NULL;
+			*nr_fixed = 0;
+			*max_works = new_count[i].max_works;
+			acct->fixed_workers = kzalloc_node(
+						sizeof(*fixed_workers) * nr,
+						GFP_KERNEL, wqe->node);
+			if (!acct->fixed_workers) {
+				raw_spin_unlock(&acct->lock);
+				return -ENOMEM;
+			}
+			raw_spin_unlock(&acct->lock);
+			for (j = 0; j < nr; j++)
+				io_wqe_create_worker(wqe, acct);
+
+			acct->fixed_worker_registered = !!nr;
+		}
+		first_node = false;
+	}
+	rcu_read_unlock();
+
+	for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+		new_count[i].nr_workers = prev[i].nr_workers;
+		new_count[i].max_works = prev[i].max_works;
+	}
+
+	return 0;
+}
+
 static __init int io_wq_init(void)
 {
 	int ret;
diff --git a/fs/io-wq.h b/fs/io-wq.h
index ef3ce577e6b7..bf90488b0283 100644
--- a/fs/io-wq.h
+++ b/fs/io-wq.h
@@ -2,6 +2,7 @@
 #define INTERNAL_IO_WQ_H
 
 #include
+#include
 
 struct io_wq;
 
@@ -202,6 +203,8 @@ void io_wq_hash_work(struct io_wq_work *work, void *val);
 
 int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask);
 int io_wq_max_workers(struct io_wq *wq, int *new_count);
+int io_wq_fixed_workers(struct io_wq *wq,
+			struct io_uring_fixed_worker_arg *new_count);
 
 static inline bool io_wq_is_hashed(struct io_wq_work *work)
 {
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1e7466079af7..c0c7c1fd94fd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -11806,6 +11806,71 @@ static __cold int io_register_iowq_max_workers(struct io_ring_ctx *ctx,
 	return ret;
 }
 
+static __cold int io_register_iowq_fixed_workers(struct io_ring_ctx *ctx,
+						 void __user *arg)
+	__must_hold(&ctx->uring_lock)
+{
+	struct io_uring_task *tctx = NULL;
+	struct io_sq_data *sqd = NULL;
+	struct io_uring_fixed_worker_arg new_count[2];
+	int i, ret = 0;
+
+	if (copy_from_user(new_count, arg, sizeof(new_count)))
+		return -EFAULT;
+	for (i = 0; i < ARRAY_SIZE(new_count); i++) {
+		int nr_workers = new_count[i].nr_workers;
+		int max_works = new_count[i].max_works;
+
+		if (nr_workers < -1 || max_works < -1)
+			return -EINVAL;
+	}
+
+	if (ctx->flags & IORING_SETUP_SQPOLL) {
+		sqd = ctx->sq_data;
+		if (sqd) {
+			/*
+			 * Observe the correct sqd->lock -> ctx->uring_lock
+			 * ordering. Fine to drop uring_lock here, we hold
+			 * a ref to the ctx.
+			 */
+			refcount_inc(&sqd->refs);
+			mutex_unlock(&ctx->uring_lock);
+			mutex_lock(&sqd->lock);
+			mutex_lock(&ctx->uring_lock);
+			if (sqd->thread)
+				tctx = sqd->thread->io_uring;
+		}
+	} else {
+		tctx = current->io_uring;
+	}
+
+	if (tctx && tctx->io_wq) {
+		ret = io_wq_fixed_workers(tctx->io_wq, new_count);
+		if (ret)
+			goto err;
+	} else {
+		memset(new_count, -1, sizeof(new_count));
+	}
+
+	if (sqd) {
+		mutex_unlock(&sqd->lock);
+		io_put_sq_data(sqd);
+	}
+
+	if (copy_to_user(arg, new_count, sizeof(new_count)))
+		return -EFAULT;
+
+	/* that's it for SQPOLL, only the SQPOLL task creates requests */
+	if (sqd)
+		return 0;
+
+	return 0;
+err:
+	if (sqd) {
+		mutex_unlock(&sqd->lock);
+		io_put_sq_data(sqd);
+	}
+	return ret;
+}
+
 static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			       void __user *arg, unsigned nr_args)
 	__releases(ctx->uring_lock)
@@ -11934,6 +11999,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 	case IORING_UNREGISTER_RING_FDS:
 		ret = io_ringfd_unregister(ctx, arg, nr_args);
 		break;
+	case IORING_REGISTER_IOWQ_FIXED_WORKERS:
+		ret = -EINVAL;
+		if (!arg || nr_args != 2)
+			break;
+		ret = io_register_iowq_fixed_workers(ctx, arg);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index fad63564678a..f0ec9523ab42 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -360,6 +360,12 @@ enum {
 	IORING_REGISTER_RING_FDS		= 20,
 	IORING_UNREGISTER_RING_FDS		= 21,
 
+	/*
+	 * set the number of fixed workers and the number of works in
+	 * a private work list which belongs to a fixed worker
+	 */
+	IORING_REGISTER_IOWQ_FIXED_WORKERS	= 22,
+
 	/* this goes last */
 	IORING_REGISTER_LAST
 };
@@ -457,4 +463,9 @@ struct io_uring_getevents_arg {
 	__u64 ts;
 };
 
+struct io_uring_fixed_worker_arg {
+	__s32 nr_workers;
+	__s32 max_works;
+};
+
 #endif
-- 
2.36.0
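For illustration, the argument semantics of the proposed interface can be checked in userspace before issuing the register call. This is a sketch only: the struct layout and the `< -1` rule are taken from this RFC series (the opcode and struct are not in released kernel headers), and `check_fixed_worker_arg()` is a hypothetical helper mirroring the kernel-side validation in io_register_iowq_fixed_workers():

```c
#include <errno.h>
#include <stdint.h>

/*
 * Userspace mirror of the proposed uapi struct: element 0 describes
 * bound workers, element 1 unbound. -1 in a field means "query the
 * old value only"; values below -1 are rejected with -EINVAL.
 */
struct io_uring_fixed_worker_arg {
	int32_t nr_workers;
	int32_t max_works;
};

/* Hypothetical pre-flight check matching the kernel's validation. */
static int check_fixed_worker_arg(const struct io_uring_fixed_worker_arg arg[2])
{
	for (int i = 0; i < 2; i++) {
		if (arg[i].nr_workers < -1 || arg[i].max_works < -1)
			return -EINVAL;
	}
	return 0;
}
```

A caller would then pass the two-element array to io_uring_register() with the new opcode and nr_args == 2, and read the previous values back from the same array on return.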