Message-ID: <20260206143741.557251404@redhat.com>
User-Agent: quilt/0.66
Date: Fri, 06 Feb 2026 11:34:32 -0300
From: Marcelo Tosatti
To: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Shakeel Butt,
 Muchun Song, Andrew Morton, Christoph Lameter, Pekka Enberg,
 David Rientjes, Joonsoo Kim, Vlastimil Babka,
 Hyeonggon Yoo <42.hyeyoo@gmail.com>, Leonardo Bras, Thomas Gleixner,
 Waiman Long, Boqun Feng, Marcelo Tosatti
Subject: [PATCH 2/4] mm/swap: move bh draining into a separate workqueue
References:
 <20260206143430.021026873@redhat.com>

Separate the bh draining into a separate workqueue
(from the mm lru draining), so that it's possible to switch
the mm lru draining to QPW.

To switch bh draining to QPW, it would be necessary to add
a spinlock to the addition of bhs to the percpu cache, and that
is a very hot path.

Signed-off-by: Marcelo Tosatti

---
 mm/swap.c |   52 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 37 insertions(+), 15 deletions(-)

Index: slab/mm/swap.c
===================================================================
--- slab.orig/mm/swap.c
+++ slab/mm/swap.c
@@ -745,12 +745,11 @@ void lru_add_drain(void)
  * the same cpu. It shouldn't be a problem in !SMP case since
  * the core is only one and the locks will disable preemption.
  */
-static void lru_add_and_bh_lrus_drain(void)
+static void lru_add_mm_drain(void)
 {
 	local_lock(&cpu_fbatches.lock);
 	lru_add_drain_cpu(smp_processor_id());
 	local_unlock(&cpu_fbatches.lock);
-	invalidate_bh_lrus_cpu();
 	mlock_drain_local();
 }
 
@@ -769,10 +768,17 @@ static DEFINE_PER_CPU(struct work_struct
 
 static void lru_add_drain_per_cpu(struct work_struct *dummy)
 {
-	lru_add_and_bh_lrus_drain();
+	lru_add_mm_drain();
 }
 
-static bool cpu_needs_drain(unsigned int cpu)
+static DEFINE_PER_CPU(struct work_struct, bh_add_drain_work);
+
+static void bh_add_drain_per_cpu(struct work_struct *dummy)
+{
+	invalidate_bh_lrus_cpu();
+}
+
+static bool cpu_needs_mm_drain(unsigned int cpu)
 {
 	struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
 
@@ -783,8 +789,12 @@ static bool cpu_needs_drain(unsigned int
 		folio_batch_count(&fbatches->lru_deactivate) ||
 		folio_batch_count(&fbatches->lru_lazyfree) ||
 		folio_batch_count(&fbatches->lru_activate) ||
-		need_mlock_drain(cpu) ||
-		has_bh_in_lru(cpu, NULL);
+		need_mlock_drain(cpu);
+}
+
+static bool cpu_needs_bh_drain(unsigned int cpu)
+{
+	return has_bh_in_lru(cpu, NULL);
 }
 
 /*
@@ -807,7 +817,7 @@ static inline void __lru_add_drain_all(b
 	 * each CPU.
 	 */
 	static unsigned int lru_drain_gen;
-	static struct cpumask has_work;
+	static struct cpumask has_mm_work, has_bh_work;
 	static DEFINE_MUTEX(lock);
 	unsigned cpu, this_gen;
 
@@ -870,20 +880,31 @@ static inline void __lru_add_drain_all(b
 	WRITE_ONCE(lru_drain_gen, lru_drain_gen + 1);
 	smp_mb();
 
-	cpumask_clear(&has_work);
+	cpumask_clear(&has_mm_work);
+	cpumask_clear(&has_bh_work);
 	for_each_online_cpu(cpu) {
-		struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
+		struct work_struct *mm_work = &per_cpu(lru_add_drain_work, cpu);
+		struct work_struct *bh_work = &per_cpu(bh_add_drain_work, cpu);
+
+		if (cpu_needs_mm_drain(cpu)) {
+			INIT_WORK(mm_work, lru_add_drain_per_cpu);
+			queue_work_on(cpu, mm_percpu_wq, mm_work);
+			__cpumask_set_cpu(cpu, &has_mm_work);
+		}
 
-		if (cpu_needs_drain(cpu)) {
-			INIT_WORK(work, lru_add_drain_per_cpu);
-			queue_work_on(cpu, mm_percpu_wq, work);
-			__cpumask_set_cpu(cpu, &has_work);
+		if (cpu_needs_bh_drain(cpu)) {
+			INIT_WORK(bh_work, bh_add_drain_per_cpu);
+			queue_work_on(cpu, mm_percpu_wq, bh_work);
+			__cpumask_set_cpu(cpu, &has_bh_work);
 		}
 	}
 
-	for_each_cpu(cpu, &has_work)
+	for_each_cpu(cpu, &has_mm_work)
 		flush_work(&per_cpu(lru_add_drain_work, cpu));
 
+	for_each_cpu(cpu, &has_bh_work)
+		flush_work(&per_cpu(bh_add_drain_work, cpu));
+
 done:
 	mutex_unlock(&lock);
 }
@@ -929,7 +950,8 @@ void lru_cache_disable(void)
 #ifdef CONFIG_SMP
 	__lru_add_drain_all(true);
 #else
-	lru_add_and_bh_lrus_drain();
+	lru_add_mm_drain();
+	invalidate_bh_lrus_cpu();
 #endif
 }
 