From: Juergen Gross
To: xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org
Cc: Juergen Gross, Boris Ostrovsky, Stefano Stabellini, Jan Beulich
Subject: [PATCH] xen/balloon: use a kernel thread instead a workqueue
Date: Fri, 27 Aug 2021 14:32:06 +0200
Message-Id: <20210827123206.15429-1-jgross@suse.com>

Today Xen ballooning is done via delayed work in a workqueue. This can
result in workqueue hangups being reported when large amounts of memory
are ballooned in one go (here 16GB):

BUG: workqueue lockup - pool cpus=6 node=0 flags=0x0 nice=0 stuck for 64s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=2/256 refcnt=3
    in-flight: 229:balloon_process
    pending: cache_reap
workqueue events_freezable_power_: flags=0x84
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
    pending: disk_events_workfn
workqueue mm_percpu_wq: flags=0x8
  pwq 12: cpus=6 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
    pending: vmstat_update
pool 12: cpus=6 node=0 flags=0x0 nice=0 hung=64s workers=3 idle: 2222 43

This can easily be avoided by using a dedicated kernel thread for doing
the ballooning work.

Reported-by: Jan Beulich
Signed-off-by: Juergen Gross
Reviewed-by: Boris Ostrovsky
---
 drivers/xen/balloon.c | 62 +++++++++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 17 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 671c71245a7b..2d2803883306 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -43,6 +43,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -115,7 +117,7 @@ static struct ctl_table xen_root[] = {
 #define EXTENT_ORDER (fls(XEN_PFN_PER_PAGE) - 1)
 
 /*
- * balloon_process() state:
+ * balloon_thread() state:
  *
  * BP_DONE: done or nothing to do,
  * BP_WAIT: wait to be rescheduled,
@@ -130,6 +132,8 @@ enum bp_state {
 	BP_ECANCELED
 };
 
+/* Main waiting point for xen-balloon thread. */
+static DECLARE_WAIT_QUEUE_HEAD(balloon_thread_wq);
 
 static DEFINE_MUTEX(balloon_mutex);
 
@@ -144,10 +148,6 @@ static xen_pfn_t frame_list[PAGE_SIZE / sizeof(xen_pfn_t)];
 static LIST_HEAD(ballooned_pages);
 static DECLARE_WAIT_QUEUE_HEAD(balloon_wq);
 
-/* Main work function, always executed in process context. */
-static void balloon_process(struct work_struct *work);
-static DECLARE_DELAYED_WORK(balloon_worker, balloon_process);
-
 /* When ballooning out (allocating memory to return to Xen) we don't really
    want the kernel to try too hard since that can trigger the oom killer. */
 #define GFP_BALLOON \
@@ -366,7 +366,7 @@ static void xen_online_page(struct page *page, unsigned int order)
 static int xen_memory_notifier(struct notifier_block *nb, unsigned long val, void *v)
 {
 	if (val == MEM_ONLINE)
-		schedule_delayed_work(&balloon_worker, 0);
+		wake_up(&balloon_thread_wq);
 
 	return NOTIFY_OK;
 }
@@ -491,18 +491,43 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 }
 
 /*
- * As this is a work item it is guaranteed to run as a single instance only.
+ * Stop waiting if either state is not BP_EAGAIN and ballooning action is
+ * needed, or if the credit has changed while state is BP_EAGAIN.
+ */
+static bool balloon_thread_cond(enum bp_state state, long credit)
+{
+	if (state != BP_EAGAIN)
+		credit = 0;
+
+	return current_credit() != credit || kthread_should_stop();
+}
+
+/*
+ * As this is a kthread it is guaranteed to run as a single instance only.
  * We may of course race updates of the target counts (which are protected
  * by the balloon lock), or with changes to the Xen hard limit, but we will
  * recover from these in time.
  */
-static void balloon_process(struct work_struct *work)
+static int balloon_thread(void *unused)
 {
 	enum bp_state state = BP_DONE;
 	long credit;
+	unsigned long timeout;
+
+	set_freezable();
+	for (;;) {
+		if (state == BP_EAGAIN)
+			timeout = balloon_stats.schedule_delay * HZ;
+		else
+			timeout = 3600 * HZ;
+		credit = current_credit();
 
+		wait_event_interruptible_timeout(balloon_thread_wq,
+				 balloon_thread_cond(state, credit), timeout);
+
+		if (kthread_should_stop())
+			return 0;
 
-	do {
 		mutex_lock(&balloon_mutex);
 
 		credit = current_credit();
@@ -529,12 +554,7 @@ static void balloon_process(struct work_struct *work)
 		mutex_unlock(&balloon_mutex);
 
 		cond_resched();
-
-	} while (credit && state == BP_DONE);
-
-	/* Schedule more work if there is some still to be done. */
-	if (state == BP_EAGAIN)
-		schedule_delayed_work(&balloon_worker, balloon_stats.schedule_delay * HZ);
+	}
 }
 
 /* Resets the Xen limit, sets new target, and kicks off processing. */
@@ -542,7 +562,7 @@ void balloon_set_new_target(unsigned long target)
 {
 	/* No need for lock. Not read-modify-write updates. */
 	balloon_stats.target_pages = target;
-	schedule_delayed_work(&balloon_worker, 0);
+	wake_up(&balloon_thread_wq);
 }
 EXPORT_SYMBOL_GPL(balloon_set_new_target);
 
@@ -647,7 +667,7 @@ void free_xenballooned_pages(int nr_pages, struct page **pages)
 
 	/* The balloon may be too large now. Shrink it if needed. */
 	if (current_credit())
-		schedule_delayed_work(&balloon_worker, 0);
+		wake_up(&balloon_thread_wq);
 
 	mutex_unlock(&balloon_mutex);
 }
@@ -679,6 +699,8 @@ static void __init balloon_add_region(unsigned long start_pfn,
 
 static int __init balloon_init(void)
 {
+	struct task_struct *task;
+
 	if (!xen_domain())
 		return -ENODEV;
 
@@ -722,6 +744,12 @@ static int __init balloon_init(void)
 	}
 #endif
 
+	task = kthread_run(balloon_thread, NULL, "xen-balloon");
+	if (IS_ERR(task)) {
+		pr_err("xen-balloon thread could not be started, ballooning will not work!\n");
+		return PTR_ERR(task);
+	}
+
 	/* Init the xen-balloon driver. */
 	xen_balloon_init();
 
-- 
2.26.2
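
For readers who want to check the wake-up rule without booting a guest, the
condition in balloon_thread_cond() and the timeout choice in balloon_thread()
can be modelled in userspace. This is only a sketch under assumptions, not part
of the patch: model_credit, model_should_stop, MODEL_HZ and the model_*()
helpers are hypothetical stand-ins for current_credit(), kthread_should_stop(),
HZ and the kernel code, and the numeric enum values are assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* State names as used by the patch; the numeric values are an assumption. */
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

/* Hypothetical stand-ins for kernel state; these are not kernel APIs. */
static long model_credit;      /* plays the role of current_credit() */
static bool model_should_stop; /* plays the role of kthread_should_stop() */

#define MODEL_HZ 250L          /* assumed tick rate, for illustration only */

/*
 * Same decision as balloon_thread_cond(): outside BP_EAGAIN the thread
 * wakes whenever the credit is non-zero; in BP_EAGAIN it wakes only if
 * the credit differs from the value sampled before sleeping. A stop
 * request always ends the wait.
 */
static bool model_thread_cond(enum bp_state state, long credit)
{
	if (state != BP_EAGAIN)
		credit = 0;

	return model_credit != credit || model_should_stop;
}

/*
 * Same timeout choice as in balloon_thread(): retry after schedule_delay
 * seconds in BP_EAGAIN, otherwise sleep for up to an hour.
 */
static long model_timeout(enum bp_state state, long schedule_delay)
{
	assert(schedule_delay >= 0);
	return (state == BP_EAGAIN ? schedule_delay : 3600L) * MODEL_HZ;
}
```

With model_credit set to 5, model_thread_cond(BP_DONE, 5) is true (work is
pending), while model_thread_cond(BP_EAGAIN, 5) is false until the credit
changes or the timeout expires, which matches the retry behaviour the old
delayed work provided.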