From nobody Thu Jun 11 00:35:51 2026
From: Tal Zussman
Date: Thu, 14 May 2026 17:51:14 -0400
Subject: [PATCH v6 1/4] block: add task-context bio completion infrastructure
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260514-blk-dontcache-v6-1-782e2fa7477b@columbia.edu>
References: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
In-Reply-To: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
To: Jens Axboe, "Matthew Wilcox (Oracle)", Christian Brauner,
 "Darrick J. Wong", Carlos Maiolino, Alexander Viro, Jan Kara,
 Christoph Hellwig
Cc: Dave Chinner, Bart Van Assche, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Gao Xiang, Tal Zussman
X-Mailer: b4 0.14.3-dev-d7477

Some bio completion handlers need to run from preemptible task context,
but bio_endio() may be called from IRQ context (e.g., buffer_head
writeback). Callers need a way to ensure their callback eventually runs
from a sleepable context.

Add infrastructure for that, in two forms:

1. BIO_COMPLETE_IN_TASK, a bio flag the submitter sets when it knows in
   advance that its callback needs task context (e.g., dropbehind
   writeback). bio_endio() sees the flag and offloads completion to a
   worker automatically.

2. bio_complete_in_task(), a helper that completion callbacks can invoke
   from within bi_end_io() when the deferral decision is dynamic (e.g.,
   fserror reporting).

Both share a per-CPU batch list drained by a delayed work item on a
WQ_PERCPU workqueue. Producers push the bio onto the local CPU's batch
and schedule the work item, which then dispatches each bio's bi_end_io()
from task context. The delayed work item uses a 1-jiffie delay to allow
batches of completions to accumulate before processing.

Both methods are gated on bio_in_atomic(), which returns true in any
context where a sleeping bi_end_io() is unsafe, including non-preemptible
task context. This logic is copied from commit c99fab6e80b7 ("erofs: fix
atomic context detection when !CONFIG_DEBUG_LOCK_ALLOC").

Two CPU hotplug callbacks are used to drain remaining bios from the
departing CPU's batch, while maintaining the per-CPU behavior. The
CPUHP_AP_ONLINE_DYN callback disables the per-CPU delayed work while the
CPU is still online, preventing it from running on an unbound worker
later. CPUHP_BP_PREPARE_DYN then drains any bios added between disabling
the work item and CPU offline.

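For illustration only (not part of this patch): a minimal userspace
sketch of the flag-gated deferral described above. It assumes a single
CPU and a single thread, and it leaves out the local lock, the 1-jiffie
delay, the need_resched() rescheduling and CPU hotplug. All model_*
names are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct model_bio {
	struct model_bio *next;
	bool complete_in_task;	/* models the BIO_COMPLETE_IN_TASK flag */
	bool ran_in_atomic;	/* context recorded by the callback */
	bool done;
};

static bool model_atomic;	/* models the result of bio_in_atomic() */
static struct model_bio *batch_head, *batch_tail;
static bool work_pending;	/* models the scheduled delayed work item */

/* Completion callback: records which context it ran in. */
static void model_end_io(struct model_bio *bio)
{
	bio->ran_in_atomic = model_atomic;
	bio->done = true;
}

/* Models __bio_complete_in_task(): queue, and kick the work if it was idle. */
static void model_defer(struct model_bio *bio)
{
	bool was_empty = (batch_head == NULL);

	bio->next = NULL;
	if (batch_tail)
		batch_tail->next = bio;
	else
		batch_head = bio;
	batch_tail = bio;
	if (was_empty)
		work_pending = true;
}

/* Models the branch this patch adds to bio_endio(). */
static void model_endio(struct model_bio *bio)
{
	if (bio->complete_in_task && model_atomic)
		model_defer(bio);
	else
		model_end_io(bio);
}

/* Models the worker: detach the batch, then complete from task context. */
static int model_run_work(void)
{
	struct model_bio *list = batch_head;
	int n = 0;

	batch_head = batch_tail = NULL;
	work_pending = false;
	model_atomic = false;	/* the worker is not in atomic context */
	while (list) {
		struct model_bio *bio = list;

		list = list->next;
		model_end_io(bio);
		n++;
	}
	return n;
}
```

In this model a flagged bio completed while model_atomic is set stays
pending until model_run_work() runs, while an unflagged bio is completed
inline, matching the two branches added to bio_endio().
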
Link: https://lore.kernel.org/all/20260409160243.1008358-1-hch@lst.de/
Suggested-by: Matthew Wilcox
Suggested-by: Christoph Hellwig
Signed-off-by: Tal Zussman
---
 block/bio.c               | 147 +++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/bio.h       |  32 ++++++++++
 include/linux/blk_types.h |   1 +
 3 files changed, 179 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index b8972dba68a0..6864ee737400 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include "blk.h"
@@ -1717,6 +1718,79 @@ void bio_check_pages_dirty(struct bio *bio)
 }
 EXPORT_SYMBOL_GPL(bio_check_pages_dirty);
 
+/*
+ * Infrastructure for deferring bio completions to task-context via a per-CPU
+ * workqueue. Triggered either by the BIO_COMPLETE_IN_TASK bio flag (static
+ * decision at submit time) or by calling bio_complete_in_task() from
+ * bi_end_io() (dynamic decision at completion time).
+ */
+
+struct bio_complete_batch {
+	local_lock_t lock;
+	struct bio_list list;
+	struct delayed_work work;
+	int cpu;
+};
+
+static DEFINE_PER_CPU(struct bio_complete_batch, bio_complete_batch) = {
+	.lock = INIT_LOCAL_LOCK(lock),
+};
+static struct workqueue_struct *bio_complete_wq;
+
+static void bio_complete_work_fn(struct work_struct *w)
+{
+	struct delayed_work *dw = to_delayed_work(w);
+	struct bio_complete_batch *batch =
+		container_of(dw, struct bio_complete_batch, work);
+
+	while (1) {
+		struct bio_list list;
+		struct bio *bio;
+
+		local_lock_irq(&bio_complete_batch.lock);
+		list = batch->list;
+		bio_list_init(&batch->list);
+		local_unlock_irq(&bio_complete_batch.lock);
+
+		if (bio_list_empty(&list))
+			break;
+
+		while ((bio = bio_list_pop(&list)))
+			bio->bi_end_io(bio);
+
+		if (need_resched()) {
+			bool is_empty;
+
+			local_lock_irq(&bio_complete_batch.lock);
+			is_empty = bio_list_empty(&batch->list);
+			local_unlock_irq(&bio_complete_batch.lock);
+			if (!is_empty)
+				mod_delayed_work_on(batch->cpu,
+						    bio_complete_wq,
+						    &batch->work, 0);
+			break;
+		}
+	}
+}
+
+void __bio_complete_in_task(struct bio *bio)
+{
+	struct bio_complete_batch *batch;
+	unsigned long flags;
+	bool was_empty;
+
+	local_lock_irqsave(&bio_complete_batch.lock, flags);
+	batch = this_cpu_ptr(&bio_complete_batch);
+	was_empty = bio_list_empty(&batch->list);
+	bio_list_add(&batch->list, bio);
+	local_unlock_irqrestore(&bio_complete_batch.lock, flags);
+
+	if (was_empty)
+		mod_delayed_work_on(batch->cpu, bio_complete_wq,
+				    &batch->work, 1);
+}
+EXPORT_SYMBOL_GPL(__bio_complete_in_task);
+
 static inline bool bio_remaining_done(struct bio *bio)
 {
 	/*
@@ -1791,7 +1865,9 @@ void bio_endio(struct bio *bio)
 	}
 #endif
 
-	if (bio->bi_end_io)
+	if (bio_flagged(bio, BIO_COMPLETE_IN_TASK) && bio_in_atomic())
+		__bio_complete_in_task(bio);
+	else if (bio->bi_end_io)
 		bio->bi_end_io(bio);
 }
 EXPORT_SYMBOL(bio_endio);
@@ -1977,6 +2053,51 @@ int bioset_init(struct bio_set *bs,
 }
 EXPORT_SYMBOL(bioset_init);
 
+static int bio_complete_batch_cpu_online(unsigned int cpu)
+{
+	enable_delayed_work(&per_cpu(bio_complete_batch, cpu).work);
+	return 0;
+}
+
+/*
+ * Disable this CPU's delayed work so that it cannot run on an unbound worker
+ * after the CPU is offlined.
+ */
+static int bio_complete_batch_cpu_down_prep(unsigned int cpu)
+{
+	disable_delayed_work_sync(&per_cpu(bio_complete_batch, cpu).work);
+	return 0;
+}
+
+/*
+ * Drain a dead CPU's deferred bio completions. The CPU is dead and the worker
+ * is canceled so no locking is needed.
+ */
+static int bio_complete_batch_cpu_dead(unsigned int cpu)
+{
+	struct bio_complete_batch *batch =
+		per_cpu_ptr(&bio_complete_batch, cpu);
+	struct bio *bio;
+
+	while ((bio = bio_list_pop(&batch->list)))
+		bio->bi_end_io(bio);
+
+	return 0;
+}
+
+static void __init bio_complete_batch_init(int cpu)
+{
+	struct bio_complete_batch *batch =
+		per_cpu_ptr(&bio_complete_batch, cpu);
+
+	bio_list_init(&batch->list);
+	INIT_DELAYED_WORK(&batch->work, bio_complete_work_fn);
+	batch->cpu = cpu;
+
+	if (!cpu_online(cpu))
+		disable_delayed_work_sync(&batch->work);
+}
+
 static int __init init_bio(void)
 {
 	int i;
@@ -1991,6 +2112,30 @@ static int __init init_bio(void)
 					SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
 	}
 
+	for_each_possible_cpu(i)
+		bio_complete_batch_init(i);
+
+	bio_complete_wq = alloc_workqueue("bio_complete",
+					  WQ_MEM_RECLAIM | WQ_PERCPU, 0);
+	if (!bio_complete_wq)
+		panic("bio: can't allocate bio_complete workqueue\n");
+
+	/*
+	 * bio task-context completion draining on hot-unplugged CPUs:
+	 *
+	 * 1. Stop the per-CPU delayed work while the CPU is still online, so
+	 *    that it cannot run on an unbound worker later.
+	 * 2. Drain leftover bios added between worker disabling and CPU
+	 *    offlining.
+	 */
+	cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+				  "block/bio:complete:online",
+				  bio_complete_batch_cpu_online,
+				  bio_complete_batch_cpu_down_prep);
+	cpuhp_setup_state_nocalls(CPUHP_BP_PREPARE_DYN,
+				  "block/bio:complete:dead",
+				  NULL, bio_complete_batch_cpu_dead);
+
 	cpuhp_setup_state_multi(CPUHP_BIO_DEAD, "block/bio:dead", NULL,
 					bio_cpu_dead);
 
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 97d747320b35..c0214d6c28d6 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -369,6 +369,38 @@ static inline struct bio *bio_alloc(struct block_device *bdev,
 
 void submit_bio(struct bio *bio);
 
+/**
+ * bio_in_atomic - check if the current context is unsafe for bio completion
+ *
+ * Return: %true in atomic contexts (e.g. hard/soft IRQ, preempt-disabled);
+ * %false when a bio can be safely completed in the current context.
+ */
+static inline bool bio_in_atomic(void)
+{
+	if (IS_ENABLED(CONFIG_PREEMPTION) && rcu_preempt_depth())
+		return true;
+	if (!IS_ENABLED(CONFIG_PREEMPT_COUNT))
+		return true;
+	return !preemptible();
+}
+
+void __bio_complete_in_task(struct bio *bio);
+
+/**
+ * bio_complete_in_task - ensure a bio is completed in preemptible task context
+ * @bio: bio to complete
+ *
+ * If called from non-task context, offload the bio completion to a worker
+ * thread and return %true. Else return %false and do nothing.
+ */
+static inline bool bio_complete_in_task(struct bio *bio)
+{
+	if (!bio_in_atomic())
+		return false;
+	__bio_complete_in_task(bio);
+	return true;
+}
+
 extern void bio_endio(struct bio *);
 
 static inline void bio_io_error(struct bio *bio)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 8808ee76e73c..d49d97a050d0 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -322,6 +322,7 @@ enum {
 	BIO_REMAPPED,
 	BIO_ZONE_WRITE_PLUGGING, /* bio handled through zone write plugging */
 	BIO_EMULATES_ZONE_APPEND, /* bio emulates a zone append operation */
+	BIO_COMPLETE_IN_TASK,	/* complete bi_end_io() in task context */
 	BIO_FLAG_LAST
 };
 
-- 
2.39.5

From nobody Thu Jun 11 00:35:51 2026
From: Tal Zussman
Date: Thu, 14 May 2026 17:51:15 -0400
Subject: [PATCH v6 2/4] iomap: use BIO_COMPLETE_IN_TASK for dropbehind writeback
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260514-blk-dontcache-v6-2-782e2fa7477b@columbia.edu>
References: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
In-Reply-To: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
To: Jens Axboe, "Matthew Wilcox (Oracle)", Christian Brauner,
 "Darrick J. Wong", Carlos Maiolino, Alexander Viro, Jan Kara,
 Christoph Hellwig
Cc: Dave Chinner, Bart Van Assche, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Gao Xiang, Tal Zussman
X-Mailer: b4 0.14.3-dev-d7477

Set BIO_COMPLETE_IN_TASK on iomap writeback bios when a dropbehind folio
is added. This ensures that bi_end_io runs in task context, where
folio_end_dropbehind() can safely invalidate folios.

With the bio layer now handling task-context deferral generically,
IOMAP_IOEND_DONTCACHE is no longer needed, as XFS no longer needs to
route DONTCACHE ioends through its completion workqueue. Remove the flag
and its NOMERGE entry.

Without the NOMERGE, regular I/Os that get merged with a dropbehind folio
will also have their completion deferred to task context.

Signed-off-by: Tal Zussman
Reviewed-by: Christoph Hellwig
---
 fs/iomap/ioend.c      | 5 +++--
 fs/xfs/xfs_aops.c     | 4 ----
 include/linux/iomap.h | 5 +----
 3 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/fs/iomap/ioend.c b/fs/iomap/ioend.c
index acf3cf98b23a..892dbfc77ae9 100644
--- a/fs/iomap/ioend.c
+++ b/fs/iomap/ioend.c
@@ -237,8 +237,6 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 
 	if (wpc->iomap.flags & IOMAP_F_SHARED)
 		ioend_flags |= IOMAP_IOEND_SHARED;
-	if (folio_test_dropbehind(folio))
-		ioend_flags |= IOMAP_IOEND_DONTCACHE;
 	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
 		ioend_flags |= IOMAP_IOEND_BOUNDARY;
 
@@ -255,6 +253,9 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
 		goto new_ioend;
 
+	if (folio_test_dropbehind(folio))
+		bio_set_flag(&ioend->io_bio, BIO_COMPLETE_IN_TASK);
+
 	/*
 	 * Clamp io_offset and io_size to the incore EOF so that ondisk
 	 * file size updates in the ioend completion are byte-accurate.
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index f279055fcea0..0dcf78beae8a 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -511,10 +511,6 @@ xfs_ioend_needs_wq_completion(
 	if (ioend->io_flags & (IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED))
 		return true;
 
-	/* Page cache invalidation cannot be done in irq context. */
-	if (ioend->io_flags & IOMAP_IOEND_DONTCACHE)
-		return true;
-
 	return false;
 }
 
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 2c5685adf3a9..fef04e01116f 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -399,16 +399,13 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
 #define IOMAP_IOEND_BOUNDARY		(1U << 2)
 /* is direct I/O */
 #define IOMAP_IOEND_DIRECT		(1U << 3)
-/* is DONTCACHE I/O */
-#define IOMAP_IOEND_DONTCACHE		(1U << 4)
 
 /*
  * Flags that if set on either ioend prevent the merge of two ioends.
  * (IOMAP_IOEND_BOUNDARY also prevents merges, but only one-way)
  */
 #define IOMAP_IOEND_NOMERGE_FLAGS \
-	(IOMAP_IOEND_SHARED | IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_DIRECT | \
-	 IOMAP_IOEND_DONTCACHE)
+	(IOMAP_IOEND_SHARED | IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_DIRECT)
 
 /*
  * Structure for writeback I/O completions.
-- 
2.39.5

From nobody Thu Jun 11 00:35:51 2026
From: Tal Zussman
Date: Thu, 14 May 2026 17:51:16 -0400
Subject: [PATCH v6 3/4] buffer: add dropbehind writeback support
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260514-blk-dontcache-v6-3-782e2fa7477b@columbia.edu>
References: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
In-Reply-To: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
To: Jens Axboe, "Matthew Wilcox (Oracle)", Christian Brauner,
 "Darrick J. Wong", Carlos Maiolino, Alexander Viro, Jan Kara,
 Christoph Hellwig
Cc: Dave Chinner, Bart Van Assche, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Gao Xiang, Tal Zussman
X-Mailer: b4 0.14.3-dev-d7477

Add block_write_begin_iocb() which threads the kiocb through to
__filemap_get_folio() so that buffer_head-based I/O can use DONTCACHE
behavior. When the iocb has IOCB_DONTCACHE set, FGP_DONTCACHE is passed
to mark the folio for dropbehind. The existing block_write_begin() is
preserved as a wrapper that passes a NULL iocb.

Set BIO_COMPLETE_IN_TASK in submit_bh_wbc() when the folio has dropbehind
set, so that buffer_head writeback completions get deferred to task
context.

Signed-off-by: Tal Zussman
Reviewed-by: Christoph Hellwig
---
 fs/buffer.c                 | 19 +++++++++++++++++--
 include/linux/buffer_head.h |  3 +++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index b0b3792b1496..d0abaf44d782 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2138,14 +2138,19 @@ EXPORT_SYMBOL(block_commit_write);
  *
  * The filesystem needs to handle block truncation upon failure.
*/ -int block_write_begin(struct address_space *mapping, loff_t pos, unsigned = len, +int block_write_begin_iocb(const struct kiocb *iocb, + struct address_space *mapping, loff_t pos, unsigned len, struct folio **foliop, get_block_t *get_block) { pgoff_t index =3D pos >> PAGE_SHIFT; + fgf_t fgp_flags =3D FGP_WRITEBEGIN; struct folio *folio; int status; =20 - folio =3D __filemap_get_folio(mapping, index, FGP_WRITEBEGIN, + if (iocb && iocb->ki_flags & IOCB_DONTCACHE) + fgp_flags |=3D FGP_DONTCACHE; + + folio =3D __filemap_get_folio(mapping, index, fgp_flags, mapping_gfp_mask(mapping)); if (IS_ERR(folio)) return PTR_ERR(folio); @@ -2160,6 +2165,13 @@ int block_write_begin(struct address_space *mapping,= loff_t pos, unsigned len, *foliop =3D folio; return status; } + +int block_write_begin(struct address_space *mapping, loff_t pos, unsigned = len, + struct folio **foliop, get_block_t *get_block) +{ + return block_write_begin_iocb(NULL, mapping, pos, len, foliop, + get_block); +} EXPORT_SYMBOL(block_write_begin); =20 int block_write_end(loff_t pos, unsigned len, unsigned copied, @@ -2715,6 +2727,9 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffe= r_head *bh, =20 bio =3D bio_alloc(bh->b_bdev, 1, opf, GFP_NOIO); =20 + if (folio_test_dropbehind(bh->b_folio)) + bio_set_flag(bio, BIO_COMPLETE_IN_TASK); + if (IS_ENABLED(CONFIG_FS_ENCRYPTION)) buffer_set_crypto_ctx(bio, bh, GFP_NOIO); =20 diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h index e4939e33b4b5..4ce50882d621 100644 --- a/include/linux/buffer_head.h +++ b/include/linux/buffer_head.h @@ -260,6 +260,9 @@ int block_read_full_folio(struct folio *, get_block_t *= ); bool block_is_partially_uptodate(struct folio *, size_t from, size_t count= ); int block_write_begin(struct address_space *mapping, loff_t pos, unsigned = len, struct folio **foliop, get_block_t *get_block); +int block_write_begin_iocb(const struct kiocb *iocb, + struct address_space *mapping, loff_t pos, unsigned len, + struct 
folio **foliop, get_block_t *get_block); int __block_write_begin(struct folio *folio, loff_t pos, unsigned len, get_block_t *get_block); int block_write_end(loff_t pos, unsigned len, unsigned copied, struct foli= o *); --=20 2.39.5 From nobody Thu Jun 11 00:35:51 2026 Received: from mx0b-00364e01.pphosted.com (mx0b-00364e01.pphosted.com [148.163.139.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 079EB3CEB9E for ; Thu, 14 May 2026 21:52:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.139.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778795530; cv=none; b=Vga1QCzdRtEsUfcXTk+l/sNHm1OZhyvDkQMfCzGwooBJ/mB6ZihEt8/o3mSFbaparDjRWqwV/eTP4utIFRN9NKkS9gGVvvh8Dcrruas0EupMeioNAquJPZyxY4JVLMHaqkn8/BBRuwUfPSRzPP6WxJiK32QLgclwJxbeD1sSuQM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778795530; c=relaxed/simple; bh=4kYnZvc+owx18DdRK+b0WjIntKdQQVbjejy+2ia87ck=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=eJiE1AIypmC3Qo5wNMcwVCc4o3OEH6TacPASSozYvlnSj8BqLex6fCVKn1oaSCTecHS88Fl6a+4UpYX7azmruI7OzSv/r/6MEDafj6xrT+GHboT+HpZIPDhEVuFpZsWYXvbSGc9NVXAtesD7gR7z+MlVoq3O94Iz7iRHHUo56iA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=columbia.edu; spf=pass smtp.mailfrom=columbia.edu; dkim=pass (2048-bit key) header.d=columbia.edu header.i=@columbia.edu header.b=eZyHj6/o; arc=none smtp.client-ip=148.163.139.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=columbia.edu Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=columbia.edu Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=columbia.edu header.i=@columbia.edu header.b="eZyHj6/o" Received: from pps.filterd 
From: Tal Zussman
Date: Thu, 14 May 2026 17:51:17 -0400
Subject: [PATCH v6 4/4] block: enable RWF_DONTCACHE for block devices
Message-Id: <20260514-blk-dontcache-v6-4-782e2fa7477b@columbia.edu>
References: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
In-Reply-To: <20260514-blk-dontcache-v6-0-782e2fa7477b@columbia.edu>
To: Jens Axboe, "Matthew Wilcox (Oracle)", Christian Brauner,
 "Darrick J. Wong", Carlos Maiolino, Alexander Viro, Jan Kara,
 Christoph Hellwig
Cc: Dave Chinner, Bart Van Assche, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Gao Xiang,
 Tal Zussman

Block device buffered reads and writes already pass through
filemap_read() and iomap_file_buffered_write() respectively, both of
which handle IOCB_DONTCACHE. Enable RWF_DONTCACHE for block device files
by setting FOP_DONTCACHE in def_blk_fops.

For CONFIG_BUFFER_HEAD=y paths, use block_write_begin_iocb() in
blkdev_write_begin() to thread the kiocb through so that buffer_head
writeback gets dropbehind support.

CONFIG_BUFFER_HEAD=n paths are handled by the previously added iomap
BIO_COMPLETE_IN_TASK support.

This support is useful for databases that operate on raw block devices,
among other userspace applications.

Signed-off-by: Tal Zussman
Reviewed-by: Christoph Hellwig
---
 block/fops.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index bb6642b45937..31b073181d87 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -504,7 +504,8 @@ static int blkdev_write_begin(const struct kiocb *iocb,
 			      unsigned len, struct folio **foliop,
 			      void **fsdata)
 {
-	return block_write_begin(mapping, pos, len, foliop, blkdev_get_block);
+	return block_write_begin_iocb(iocb, mapping, pos, len, foliop,
+				      blkdev_get_block);
 }
 
 static int blkdev_write_end(const struct kiocb *iocb,
@@ -966,7 +967,7 @@ const struct file_operations def_blk_fops = {
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= blkdev_fallocate,
 	.uring_cmd	= blkdev_uring_cmd,
-	.fop_flags	= FOP_BUFFER_RASYNC,
+	.fop_flags	= FOP_BUFFER_RASYNC | FOP_DONTCACHE,
 };
 
 static __init int blkdev_init(void)
-- 
2.39.5
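
For reference, a userspace caller might exercise RWF_DONTCACHE roughly as follows. This is only a sketch and not part of the series: write_dontcache() is a hypothetical helper name, the RWF_DONTCACHE fallback definition assumes the value from the Linux UAPI headers, and the fallback to pwrite() on EOPNOTSUPP, EINVAL, or ENOSYS is an assumption about how an application would want to behave on kernels or files without FOP_DONTCACHE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumed UAPI value; only defined here if the toolchain headers lack it. */
#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080
#endif

/*
 * Hypothetical helper (not from the patch series): write len bytes from
 * buf at offset off, asking the kernel to drop the cached folios once
 * writeback completes. If the flag is rejected, retry as a plain
 * buffered pwrite(). Returns the byte count or -1 with errno set.
 */
static ssize_t write_dontcache(int fd, const void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret;

	assert(buf != NULL);

	ret = pwritev2(fd, &iov, 1, off, RWF_DONTCACHE);
	if (ret < 0 && (errno == EOPNOTSUPP || errno == EINVAL ||
			errno == ENOSYS))
		ret = pwrite(fd, buf, len, off);	/* flag unsupported */
	return ret;
}
```

With this series applied, the same call on a file descriptor for a block device would be expected to take the pwritev2() path rather than the fallback. A real caller would also loop on short writes, which the sketch omits.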