Date: Mon, 30 Mar 2026 18:20:57 -0700 (PDT)
From: David Rientjes
To: Andrew Morton, Vlastimil Babka
cc: Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Johannes Weiner,
    Zi Yan, Petr Mladek, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [patch v3] mm, page_alloc: reintroduce page allocation stall warning
In-Reply-To: <58a10940-e44c-a120-dd6e-ee9f480c4946@google.com>
Message-ID: <371c86c8-1d47-bd70-b74c-769842718b1f@google.com>
References: <30945cc3-9c4d-94bb-e7e7-dde71483800c@google.com>
    <231154f8-a3c3-229a-31a7-f91ab8ec1773@google.com>
    <58a10940-e44c-a120-dd6e-ee9f480c4946@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Previously, we had warnings when a single page allocation took longer
than reasonably expected.  This was introduced in commit 63f53dea0c98
("mm: warn about allocations which stall for too long").

The warning was subsequently reverted in commit 400e22499dd9 ("mm: don't
warn about allocations which stall for too long") because it was possible
to generate memory pressure that would effectively stall further progress
through printk execution.

Page allocation stalls in excess of 10 seconds are always useful to debug
because they can result in severe userspace unresponsiveness.
This artifact can be correlated with userspace going out to lunch and used
to understand the state of memory at the time.

There should be a reasonable expectation that this warning will never
trigger given it is very passive: it will only be emitted when a page
allocation takes longer than 10 seconds.  If it does trigger, this reveals
an issue that should be fixed: a single page allocation should never loop
for more than 10 seconds without OOM killing to make memory available.

Unlike the original implementation, this implementation only reports
stalls once for the system every 10 seconds.  Otherwise, many concurrent
reclaimers could spam the kernel log unnecessarily.  Stalls are only
reported when calling into direct reclaim.

Acked-by: Vlastimil Babka (SUSE)
Signed-off-by: David Rientjes
Acked-by: Michal Hocko
Reviewed-by: Shakeel Butt
---
 v3:
  - initialize to INITIAL_JIFFIES per AI review so warnings are not
    suppressed in the first five minutes after boot
  - time_before(jiffies, a) -> time_is_after_jiffies(a)

 v2:
  - commit message update per Michal
  - check_alloc_stall_warn() cleanup per Vlastimil

 mm/page_alloc.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -316,6 +316,14 @@ EXPORT_SYMBOL(nr_node_ids);
 EXPORT_SYMBOL(nr_online_nodes);
 #endif
 
+/*
+ * When page allocations stall for longer than a threshold,
+ * ALLOC_STALL_WARN_MSECS, leave a warning in the kernel log. Only one warning
+ * will be printed during this duration for the entire system.
+ */
+#define ALLOC_STALL_WARN_MSECS	(10 * 1000UL)
+static unsigned long alloc_stall_warn_jiffies = INITIAL_JIFFIES;
+
 static bool page_contains_unaccepted(struct page *page, unsigned int order);
 static bool cond_accept_memory(struct zone *zone, unsigned int order,
 			       int alloc_flags);
@@ -4706,6 +4714,40 @@ check_retry_cpuset(int cpuset_mems_cookie, struct alloc_context *ac)
 	return false;
 }
 
+static void check_alloc_stall_warn(gfp_t gfp_mask, nodemask_t *nodemask,
+				   unsigned int order, unsigned long alloc_start_time)
+{
+	static DEFINE_SPINLOCK(alloc_stall_lock);
+	unsigned long stall_msecs = jiffies_to_msecs(jiffies - alloc_start_time);
+
+	if (likely(stall_msecs < ALLOC_STALL_WARN_MSECS))
+		return;
+	if (time_is_after_jiffies(READ_ONCE(alloc_stall_warn_jiffies)))
+		return;
+	if (gfp_mask & __GFP_NOWARN)
+		return;
+
+	if (!spin_trylock(&alloc_stall_lock))
+		return;
+
+	/* Check again, this time under the lock */
+	if (time_is_after_jiffies(alloc_stall_warn_jiffies)) {
+		spin_unlock(&alloc_stall_lock);
+		return;
+	}
+
+	WRITE_ONCE(alloc_stall_warn_jiffies, jiffies + msecs_to_jiffies(ALLOC_STALL_WARN_MSECS));
+	spin_unlock(&alloc_stall_lock);
+
+	pr_warn("%s: page allocation stall for %lu secs: order:%d, mode:%#x(%pGg) nodemask=%*pbl",
+		current->comm, stall_msecs / MSEC_PER_SEC, order, gfp_mask, &gfp_mask,
+		nodemask_pr_args(nodemask));
+	cpuset_print_current_mems_allowed();
+	pr_cont("\n");
+	dump_stack();
+	warn_alloc_show_mem(gfp_mask, nodemask);
+}
+
 static inline struct page *
 __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 						struct alloc_context *ac)
@@ -4726,6 +4768,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	int reserve_flags;
 	bool compact_first = false;
 	bool can_retry_reserves = true;
+	unsigned long alloc_start_time = jiffies;
 
 	if (unlikely(nofail)) {
 		/*
@@ -4841,6 +4884,9 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (current->flags & PF_MEMALLOC)
 		goto nopage;
 
+	/* If allocation has taken excessively long, warn about it */
+	check_alloc_stall_warn(gfp_mask, ac->nodemask, order, alloc_start_time);
+
 	/* Try direct reclaim and then allocating */
 	if (!compact_first) {
 		page = __alloc_pages_direct_reclaim(gfp_mask, order, alloc_flags,