From nobody Wed Dec 17 19:18:06 2025
From: Nicholas Piggin
To: Tejun Heo
Cc: Nicholas Piggin, "Paul E . McKenney", Peter Zijlstra, Lai Jiangshan,
 Srikar Dronamraju, linux-kernel@vger.kernel.org
Subject: [PATCH 1/4] workqueue: wq_watchdog_touch is always called with valid CPU
Date: Tue, 25 Jun 2024 21:42:44 +1000
Message-ID: <20240625114249.289014-2-npiggin@gmail.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240625114249.289014-1-npiggin@gmail.com>
References: <20240625114249.289014-1-npiggin@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Warn in the case it is called with cpu == -1. This does not appear to
happen anywhere.

Signed-off-by: Nicholas Piggin
Reviewed-by: Paul E. McKenney
---
 kernel/workqueue.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 003474c9a77d..0954b778b315 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7562,6 +7562,8 @@ notrace void wq_watchdog_touch(int cpu)
 {
 	if (cpu >= 0)
 		per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies;
+	else
+		WARN_ONCE(1, "%s should be called with valid CPU", __func__);
 
 	wq_watchdog_touched = jiffies;
 }
-- 
2.45.1

From nobody Wed Dec 17 19:18:06 2025
From: Nicholas Piggin
To: Tejun Heo
Cc: Nicholas Piggin, "Paul E . McKenney", Peter Zijlstra, Lai Jiangshan,
 Srikar Dronamraju, linux-kernel@vger.kernel.org
Subject: [PATCH 2/4] workqueue: Improve scalability of workqueue watchdog touch
Date: Tue, 25 Jun 2024 21:42:45 +1000
Message-ID: <20240625114249.289014-3-npiggin@gmail.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240625114249.289014-1-npiggin@gmail.com>
References: <20240625114249.289014-1-npiggin@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

On a ~2000 CPU powerpc system, hard lockups have been observed in the
workqueue code when stop_machine runs (in this case due to CPU hotplug).
This is due to lots of CPUs spinning in multi_cpu_stop, calling
touch_nmi_watchdog() which ends up calling wq_watchdog_touch().
wq_watchdog_touch() writes to the global variable wq_watchdog_touched,
and that can find itself in the same cacheline as other important
workqueue data, which slows down operations to the point of lockups.

In the case of the following abridged trace, worker_pool_idr was in
the hot line, causing the lockups to always appear at idr_find.

  watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
  Call Trace:
  get_work_pool
  __queue_work
  call_timer_fn
  run_timer_softirq
  __do_softirq
  do_softirq_own_stack
  irq_exit
  timer_interrupt
  decrementer_common_virt
  * interrupt: 900 (timer) at multi_cpu_stop
  multi_cpu_stop
  cpu_stopper_thread
  smpboot_thread_fn
  kthread

Fix this by having wq_watchdog_touch() only write to the line if the
last time a touch was recorded exceeds 1/4 of the watchdog threshold.

Reported-by: Srikar Dronamraju
Signed-off-by: Nicholas Piggin
Reviewed-by: Paul E. McKenney
---
 kernel/workqueue.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0954b778b315..f60886782f31 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -7560,12 +7560,18 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
 
 notrace void wq_watchdog_touch(int cpu)
 {
+	unsigned long thresh = READ_ONCE(wq_watchdog_thresh) * HZ;
+	unsigned long touch_ts = READ_ONCE(wq_watchdog_touched);
+	unsigned long now = jiffies;
+
 	if (cpu >= 0)
-		per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies;
+		per_cpu(wq_watchdog_touched_cpu, cpu) = now;
 	else
 		WARN_ONCE(1, "%s should be called with valid CPU", __func__);
 
-	wq_watchdog_touched = jiffies;
+	/* Don't unnecessarily store to global cacheline */
+	if (time_after(now, touch_ts + thresh / 4))
+		WRITE_ONCE(wq_watchdog_touched, jiffies);
 }
 
 static void wq_watchdog_set_thresh(unsigned long thresh)
-- 
2.45.1

From nobody Wed Dec 17 19:18:06 2025
From: Nicholas Piggin
To: Tejun Heo
Cc: Nicholas Piggin, "Paul E . McKenney", Peter Zijlstra, Lai Jiangshan,
 Srikar Dronamraju, linux-kernel@vger.kernel.org
Subject: [PATCH 3/4] stop_machine: Rearrange multi_cpu_stop state machine loop
Date: Tue, 25 Jun 2024 21:42:46 +1000
Message-ID: <20240625114249.289014-4-npiggin@gmail.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240625114249.289014-1-npiggin@gmail.com>
References: <20240625114249.289014-1-npiggin@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

More clearly separate the state-machine progress case from the
non-progress case. Move stop_machine_yield() and
rcu_momentary_dyntick_idle() calls to the non-progress case like
touch_nmi_watchdog(), rather than always calling them, since there is
no reason to yield or touch watchdogs if the state machine progressed.

Signed-off-by: Nicholas Piggin
Reviewed-by: Paul E. McKenney
---
 kernel/stop_machine.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index cedb17ba158a..1e5c4702e36c 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -225,8 +225,6 @@ static int multi_cpu_stop(void *data)
 
 	/* Simple state machine */
 	do {
-		/* Chill out and ensure we re-read multi_stop_state. */
-		stop_machine_yield(cpumask);
 		newstate = READ_ONCE(msdata->state);
 		if (newstate != curstate) {
 			curstate = newstate;
@@ -243,15 +241,22 @@ static int multi_cpu_stop(void *data)
 				break;
 			}
 			ack_state(msdata);
-		} else if (curstate > MULTI_STOP_PREPARE) {
-			/*
-			 * At this stage all other CPUs we depend on must spin
-			 * in the same loop. Any reason for hard-lockup should
-			 * be detected and reported on their side.
-			 */
-			touch_nmi_watchdog();
+
+		} else {
+			/* No state change, chill out */
+			stop_machine_yield(cpumask);
+			if (curstate > MULTI_STOP_PREPARE) {
+				/*
+				 * At this stage all other CPUs we depend on
+				 * must spin in the same loop. Any reason for
+				 * hard-lockup should be detected and reported
+				 * on their side.
+				 */
+				touch_nmi_watchdog();
+			}
+			rcu_momentary_dyntick_idle();
 		}
-		rcu_momentary_dyntick_idle();
+
 	} while (curstate != MULTI_STOP_EXIT);
 
 	local_irq_restore(flags);
-- 
2.45.1

From nobody Wed Dec 17 19:18:06 2025
From: Nicholas Piggin
To: Tejun Heo
Cc: Nicholas Piggin, "Paul E . McKenney", Peter Zijlstra, Lai Jiangshan,
 Srikar Dronamraju, linux-kernel@vger.kernel.org
Subject: [PATCH 4/4] stop_machine: Add a delay between multi_cpu_stop touching watchdogs
Date: Tue, 25 Jun 2024 21:42:47 +1000
Message-ID: <20240625114249.289014-5-npiggin@gmail.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240625114249.289014-1-npiggin@gmail.com>
References: <20240625114249.289014-1-npiggin@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

If a lot of CPUs call rcu_momentary_dyntick_idle() in a tight loop,
this can cause contention that could slow other CPUs reaching
multi_cpu_stop. Add a 10ms delay between patting the various dogs.

Signed-off-by: Nicholas Piggin
Reviewed-by: Paul E. McKenney
---
 kernel/stop_machine.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1e5c4702e36c..626199b572c6 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -243,8 +243,18 @@ static int multi_cpu_stop(void *data)
 			ack_state(msdata);
 
 		} else {
-			/* No state change, chill out */
-			stop_machine_yield(cpumask);
+			/*
+			 * No state change, chill out. Delay here to prevent
+			 * the watchdogs and RCU being hit too hard by lots
+			 * of CPUs, which can cause contention and slowdowns.
+			 */
+			unsigned long t = jiffies + msecs_to_jiffies(10);
+
+			while (time_before(jiffies, t)) {
+				if (READ_ONCE(msdata->state) != curstate)
+					break;
+				stop_machine_yield(cpumask);
+			}
 			if (curstate > MULTI_STOP_PREPARE) {
 				/*
 				 * At this stage all other CPUs we depend on
-- 
2.45.1
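
Illustrative note (not part of the series): the "only store when the
recorded touch is stale" idea from patch 2/4 can be sketched in user
space roughly as below. This is a minimal sketch under stated
assumptions, not the kernel implementation. The names last_touched,
ticks_after(), touch() and count_stores() are hypothetical stand-ins,
C11 relaxed atomics stand in for READ_ONCE()/WRITE_ONCE(), and the
sketch stores the sampled `now` value where the patch re-reads jiffies.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the shared wq_watchdog_touched word. */
_Atomic unsigned long last_touched;

/* Wraparound-tolerant "a is after b", in the spirit of time_after(). */
int ticks_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Record a touch at tick `now`. The shared word is only written when
 * the last recorded touch is more than thresh / 4 ticks old, so most
 * callers only read the cacheline. Returns 1 if a store was performed.
 */
int touch(unsigned long now, unsigned long thresh)
{
	unsigned long last = atomic_load_explicit(&last_touched,
						  memory_order_relaxed);

	if (!ticks_after(now, last + thresh / 4))
		return 0;

	atomic_store_explicit(&last_touched, now, memory_order_relaxed);
	return 1;
}

/* Demo driver: one touch per tick for `ticks` ticks, counting stores. */
unsigned long count_stores(unsigned long ticks, unsigned long thresh)
{
	unsigned long now, stores = 0;

	assert(thresh > 0);
	atomic_store_explicit(&last_touched, 0, memory_order_relaxed);

	for (now = 1; now <= ticks; now++)
		stores += touch(now, thresh);

	return stores;
}
```

With an assumed threshold of 400 ticks, count_stores(1000, 400) performs
9 stores for 1000 touches (at ticks 101, 202, ..., 909), which is the
kind of reduction in writes to the shared line that the patch aims for.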