From nobody Thu Apr 16 20:49:10 2026
From: Tal Zussman
Date: Wed, 08 Apr 2026 19:08:49 -0400
Subject: [PATCH RFC v5 1/3] block: add BIO_COMPLETE_IN_TASK for task-context completion
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260408-blk-dontcache-v5-1-0f080c20a96f@columbia.edu>
References: <20260408-blk-dontcache-v5-0-0f080c20a96f@columbia.edu>
In-Reply-To: <20260408-blk-dontcache-v5-0-0f080c20a96f@columbia.edu>
To: Jens Axboe, "Matthew Wilcox (Oracle)", Christian Brauner,
 "Darrick J. Wong", Carlos Maiolino, Alexander Viro, Jan Kara
Cc: Christoph Hellwig, Dave Chinner, Bart Van Assche,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, Tal Zussman

Some bio completion handlers need to run in task context but bio_endio()
can be called from IRQ context (e.g. buffer_head writeback). Add a
BIO_COMPLETE_IN_TASK flag that bio submitters can set to request
task-context completion of their bi_end_io callback.

When bio_endio() sees this flag and is running in non-task context, it
queues the bio to a per-cpu lockless list and schedules a delayed work
item to call bi_end_io() from task context. The delayed work uses a
1-jiffy delay to allow batches of completions to accumulate before
processing. A CPU hotplug dead callback drains any remaining bios from
the departing CPU's batch.

This will be used to enable RWF_DONTCACHE for block devices, and could
be used for other subsystems like fscrypt that need task-context bio
completion.

Suggested-by: Matthew Wilcox
Signed-off-by: Tal Zussman
---
 block/bio.c               | 83 ++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/blk_types.h |  7 +++-
 2 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 8203bb7455a9..21b403eb1c04 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include "blk.h"
@@ -1714,6 +1715,51 @@ void bio_check_pages_dirty(struct bio *bio)
 }
 EXPORT_SYMBOL_GPL(bio_check_pages_dirty);
 
+struct bio_complete_batch {
+	struct llist_head list;
+	struct delayed_work work;
+	int cpu;
+};
+
+static DEFINE_PER_CPU(struct bio_complete_batch, bio_complete_batch);
+static struct workqueue_struct *bio_complete_wq;
+
+static void bio_complete_work_fn(struct work_struct *w)
+{
+	struct delayed_work *dw = to_delayed_work(w);
+	struct bio_complete_batch *batch =
+		container_of(dw, struct bio_complete_batch, work);
+	struct llist_node *node;
+	struct bio *bio, *next;
+
+	do {
+		node = llist_del_all(&batch->list);
+		if (!node)
+			break;
+
+		node = llist_reverse_order(node);
+		llist_for_each_entry_safe(bio, next, node, bi_llist)
+			bio->bi_end_io(bio);
+
+		if (need_resched()) {
+			if (!llist_empty(&batch->list))
+				mod_delayed_work_on(batch->cpu,
+						    bio_complete_wq,
+						    &batch->work, 0);
+			break;
+		}
+	} while (1);
+}
+
+static void bio_queue_completion(struct bio *bio)
+{
+	struct bio_complete_batch *batch = this_cpu_ptr(&bio_complete_batch);
+
+	if (llist_add(&bio->bi_llist, &batch->list))
+		mod_delayed_work_on(batch->cpu, bio_complete_wq,
+				    &batch->work, 1);
+}
+
 static inline bool bio_remaining_done(struct bio *bio)
 {
 	/*
@@ -1788,7 +1834,9 @@ void bio_endio(struct bio *bio)
 	}
 #endif
 
-	if (bio->bi_end_io)
+	if (!in_task() && bio_flagged(bio, BIO_COMPLETE_IN_TASK))
+		bio_queue_completion(bio);
+	else if (bio->bi_end_io)
 		bio->bi_end_io(bio);
 }
 EXPORT_SYMBOL(bio_endio);
@@ -1974,6 +2022,24 @@ int bioset_init(struct bio_set *bs,
 }
 EXPORT_SYMBOL(bioset_init);
 
+/*
+ * Drain a dead CPU's deferred bio completions.
+ */
+static int bio_complete_batch_cpu_dead(unsigned int cpu)
+{
+	struct bio_complete_batch *batch =
+		per_cpu_ptr(&bio_complete_batch, cpu);
+	struct llist_node *node;
+	struct bio *bio, *next;
+
+	node = llist_del_all(&batch->list);
+	node = llist_reverse_order(node);
+	llist_for_each_entry_safe(bio, next, node, bi_llist)
+		bio->bi_end_io(bio);
+
+	return 0;
+}
+
 static int __init init_bio(void)
 {
 	int i;
@@ -1988,6 +2054,21 @@ static int __init init_bio(void)
 					SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
 	}
 
+	for_each_possible_cpu(i) {
+		struct bio_complete_batch *batch =
+			per_cpu_ptr(&bio_complete_batch, i);
+
+		init_llist_head(&batch->list);
+		INIT_DELAYED_WORK(&batch->work, bio_complete_work_fn);
+		batch->cpu = i;
+	}
+
+	bio_complete_wq = alloc_workqueue("bio_complete", WQ_MEM_RECLAIM, 0);
+	if (!bio_complete_wq)
+		panic("bio: can't allocate bio_complete workqueue\n");
+
+	cpuhp_setup_state(CPUHP_BP_PREPARE_DYN, "block/bio:complete:dead",
+			  NULL, bio_complete_batch_cpu_dead);
 	cpuhp_setup_state_multi(CPUHP_BIO_DEAD, "block/bio:dead", NULL,
 					bio_cpu_dead);
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 8808ee76e73c..0b55159d110d 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -11,6 +11,7 @@
 #include
 #include
 #include
+#include
 
 struct bio_set;
 struct bio;
@@ -208,7 +209,10 @@ typedef unsigned int blk_qc_t;
  * stacking drivers)
  */
 struct bio {
-	struct bio		*bi_next;	/* request queue link */
+	union {
+		struct bio	*bi_next;	/* request queue link */
+		struct llist_node bi_llist;	/* deferred completion */
+	};
 	struct block_device	*bi_bdev;
 	blk_opf_t		bi_opf;		/* bottom bits REQ_OP, top bits
 						 * req_flags.
@@ -322,6 +326,7 @@ enum {
 	BIO_REMAPPED,
 	BIO_ZONE_WRITE_PLUGGING, /* bio handled through zone write plugging */
 	BIO_EMULATES_ZONE_APPEND, /* bio emulates a zone append operation */
+	BIO_COMPLETE_IN_TASK, /* complete bi_end_io() in task context */
 	BIO_FLAG_LAST
 };
 
-- 
2.39.5

From nobody Thu Apr 16 20:49:10 2026
From: Tal Zussman
Date: Wed, 08 Apr 2026 19:08:50 -0400
Subject: [PATCH RFC v5 2/3] iomap: use BIO_COMPLETE_IN_TASK for dropbehind writeback
Message-Id: <20260408-blk-dontcache-v5-2-0f080c20a96f@columbia.edu>
References: <20260408-blk-dontcache-v5-0-0f080c20a96f@columbia.edu>
In-Reply-To: <20260408-blk-dontcache-v5-0-0f080c20a96f@columbia.edu>

Set BIO_COMPLETE_IN_TASK on iomap writeback bios when a dropbehind folio
is added. This ensures that bi_end_io runs in task context, where
folio_end_dropbehind() can safely invalidate folios.

With the bio layer now handling task-context deferral generically,
IOMAP_IOEND_DONTCACHE is no longer needed, as XFS no longer needs to
route DONTCACHE ioends through its completion workqueue. Remove the flag
and its NOMERGE entry.

Without the NOMERGE, regular I/Os that get merged with a dropbehind
folio will also have their completion deferred to task context.

Signed-off-by: Tal Zussman
---
 fs/iomap/ioend.c      | 5 +++--
 fs/xfs/xfs_aops.c     | 4 ----
 include/linux/iomap.h | 6 +-----
 3 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/fs/iomap/ioend.c b/fs/iomap/ioend.c
index e4d57cb969f1..fe2a4c3dae42 100644
--- a/fs/iomap/ioend.c
+++ b/fs/iomap/ioend.c
@@ -182,8 +182,6 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 		ioend_flags |= IOMAP_IOEND_UNWRITTEN;
 	if (wpc->iomap.flags & IOMAP_F_SHARED)
 		ioend_flags |= IOMAP_IOEND_SHARED;
-	if (folio_test_dropbehind(folio))
-		ioend_flags |= IOMAP_IOEND_DONTCACHE;
 	if (pos == wpc->iomap.offset && (wpc->iomap.flags & IOMAP_F_BOUNDARY))
 		ioend_flags |= IOMAP_IOEND_BOUNDARY;
 
@@ -200,6 +198,9 @@ ssize_t iomap_add_to_ioend(struct iomap_writepage_ctx *wpc, struct folio *folio,
 	if (!bio_add_folio(&ioend->io_bio, folio, map_len, poff))
 		goto new_ioend;
 
+	if (folio_test_dropbehind(folio))
+		bio_set_flag(&ioend->io_bio, BIO_COMPLETE_IN_TASK);
+
 	/*
 	 * Clamp io_offset and io_size to the incore EOF so that ondisk
 	 * file size updates in the ioend completion are byte-accurate.
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 76678814f46f..0d469b91377d 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -510,10 +510,6 @@ xfs_ioend_needs_wq_completion(
 	if (ioend->io_flags & (IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_SHARED))
 		return true;
 
-	/* Page cache invalidation cannot be done in irq context. */
-	if (ioend->io_flags & IOMAP_IOEND_DONTCACHE)
-		return true;
-
 	return false;
 }
 
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 99b7209dabd7..a5d6401ebd80 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -392,16 +392,12 @@ sector_t iomap_bmap(struct address_space *mapping, sector_t bno,
 #define IOMAP_IOEND_BOUNDARY		(1U << 2)
 /* is direct I/O */
 #define IOMAP_IOEND_DIRECT		(1U << 3)
-/* is DONTCACHE I/O */
-#define IOMAP_IOEND_DONTCACHE		(1U << 4)
-
 /*
  * Flags that if set on either ioend prevent the merge of two ioends.
  * (IOMAP_IOEND_BOUNDARY also prevents merges, but only one-way)
  */
 #define IOMAP_IOEND_NOMERGE_FLAGS \
-	(IOMAP_IOEND_SHARED | IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_DIRECT | \
-	 IOMAP_IOEND_DONTCACHE)
+	(IOMAP_IOEND_SHARED | IOMAP_IOEND_UNWRITTEN | IOMAP_IOEND_DIRECT)
 
 /*
  * Structure for writeback I/O completions.
-- 
2.39.5

From nobody Thu Apr 16 20:49:10 2026
From: Tal Zussman
Date: Wed, 08 Apr 2026 19:08:51 -0400
Subject: [PATCH RFC v5 3/3] block: enable RWF_DONTCACHE for block devices
Message-Id: <20260408-blk-dontcache-v5-3-0f080c20a96f@columbia.edu>
References: <20260408-blk-dontcache-v5-0-0f080c20a96f@columbia.edu>
In-Reply-To: <20260408-blk-dontcache-v5-0-0f080c20a96f@columbia.edu>

Block device buffered reads and writes already pass through
filemap_read() and iomap_file_buffered_write() respectively, both of
which handle IOCB_DONTCACHE. Enable RWF_DONTCACHE for block device files
by setting FOP_DONTCACHE in def_blk_fops.

For CONFIG_BUFFER_HEAD=y paths, add block_write_begin_iocb() which
threads the kiocb through so that buffer_head-based I/O can use
DONTCACHE behavior. The existing block_write_begin() is preserved as a
wrapper that passes a NULL iocb. Set BIO_COMPLETE_IN_TASK in
submit_bh_wbc() when the folio has dropbehind so that buffer_head
writeback completions get deferred to task context.

CONFIG_BUFFER_HEAD=n paths are handled by the previously added iomap
BIO_COMPLETE_IN_TASK support.

This support is useful for databases that operate on raw block devices,
among other userspace applications.

Signed-off-by: Tal Zussman
---
 block/fops.c                |  5 +++--
 fs/buffer.c                 | 22 +++++++++++++++++++---
 include/linux/buffer_head.h |  3 +++
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/block/fops.c b/block/fops.c
index 4d32785b31d9..d8165f6ba71c 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -505,7 +505,8 @@ static int blkdev_write_begin(const struct kiocb *iocb,
 				     unsigned len, struct folio **foliop,
 				     void **fsdata)
 {
-	return block_write_begin(mapping, pos, len, foliop, blkdev_get_block);
+	return block_write_begin_iocb(iocb, mapping, pos, len, foliop,
+				      blkdev_get_block);
 }
 
 static int blkdev_write_end(const struct kiocb *iocb,
@@ -967,7 +968,7 @@ const struct file_operations def_blk_fops = {
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= blkdev_fallocate,
 	.uring_cmd	= blkdev_uring_cmd,
-	.fop_flags	= FOP_BUFFER_RASYNC,
+	.fop_flags	= FOP_BUFFER_RASYNC | FOP_DONTCACHE,
 };
 
 static __init int blkdev_init(void)
diff --git a/fs/buffer.c b/fs/buffer.c
index ed724a902657..c60c0ad6cc35 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -2239,14 +2239,19 @@ EXPORT_SYMBOL(block_commit_write);
  *
  * The filesystem needs to handle block truncation upon failure.
  */
-int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
+int block_write_begin_iocb(const struct kiocb *iocb,
+		struct address_space *mapping, loff_t pos, unsigned len,
 		struct folio **foliop, get_block_t *get_block)
 {
 	pgoff_t index = pos >> PAGE_SHIFT;
+	fgf_t fgp_flags = FGP_WRITEBEGIN;
 	struct folio *folio;
 	int status;
 
-	folio = __filemap_get_folio(mapping, index, FGP_WRITEBEGIN,
+	if (iocb && iocb->ki_flags & IOCB_DONTCACHE)
+		fgp_flags |= FGP_DONTCACHE;
+
+	folio = __filemap_get_folio(mapping, index, fgp_flags,
 			mapping_gfp_mask(mapping));
 	if (IS_ERR(folio))
 		return PTR_ERR(folio);
@@ -2261,6 +2266,13 @@ int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
 	*foliop = folio;
 	return status;
 }
+
+int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
+		struct folio **foliop, get_block_t *get_block)
+{
+	return block_write_begin_iocb(NULL, mapping, pos, len, foliop,
+				      get_block);
+}
 EXPORT_SYMBOL(block_write_begin);
 
 int block_write_end(loff_t pos, unsigned len, unsigned copied,
@@ -2589,7 +2601,8 @@ int cont_write_begin(const struct kiocb *iocb, struct address_space *mapping,
 		(*bytes)++;
 	}
 
-	return block_write_begin(mapping, pos, len, foliop, get_block);
+	return block_write_begin_iocb(iocb, mapping, pos, len, foliop,
+				      get_block);
 }
 EXPORT_SYMBOL(cont_write_begin);
 
@@ -2801,6 +2814,9 @@ static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
 
 	bio = bio_alloc(bh->b_bdev, 1, opf, GFP_NOIO);
 
+	if (folio_test_dropbehind(bh->b_folio))
+		bio_set_flag(bio, BIO_COMPLETE_IN_TASK);
+
 	fscrypt_set_bio_crypt_ctx_bh(bio, bh, GFP_NOIO);
 
 	bio->bi_iter.bi_sector = bh->b_blocknr * (bh->b_size >> 9);
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index b16b88bfbc3e..ddf88ce290f2 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -260,6 +260,9 @@ int block_read_full_folio(struct folio *, get_block_t *);
 bool block_is_partially_uptodate(struct folio *, size_t from, size_t count);
 int block_write_begin(struct address_space *mapping, loff_t pos, unsigned len,
 		struct folio **foliop, get_block_t *get_block);
+int block_write_begin_iocb(const struct kiocb *iocb,
+		struct address_space *mapping, loff_t pos, unsigned len,
+		struct folio **foliop, get_block_t *get_block);
 int __block_write_begin(struct folio *folio, loff_t pos, unsigned len,
 		get_block_t *get_block);
 int block_write_end(loff_t pos, unsigned len, unsigned copied, struct folio *);
-- 
2.39.5
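
For readers wondering what the userspace side of 3/3 might look like:
below is a minimal sketch, not part of this series, of a buffered write
that asks for drop-behind via pwritev2() and RWF_DONTCACHE. The helper
names write_dontcache() and roundtrip_ok() are hypothetical, and the
fallback value used when the libc headers do not define RWF_DONTCACHE is
an assumption that should be checked against the uapi <linux/fs.h>. The
sketch retries as a plain buffered write when the kernel or the file
rejects the flag, so it should behave the same on files that do not set
FOP_DONTCACHE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* pwritev2() is a GNU extension in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DONTCACHE
/* Assumption: value taken from the uapi <linux/fs.h>; verify locally. */
#define RWF_DONTCACHE 0x00000080
#endif

/*
 * Hypothetical helper: write len bytes at offset off, requesting
 * drop-behind. Falls back to a plain buffered write if the flag is
 * rejected. Returns 0 on success, -1 with errno set on failure.
 */
static int write_dontcache(int fd, const void *buf, size_t len, off_t off)
{
	const char *p = buf;
	int flags = RWF_DONTCACHE;

	assert(buf != NULL || len == 0);

	while (len > 0) {
		struct iovec iov = { .iov_base = (void *)p, .iov_len = len };
		ssize_t n = pwritev2(fd, &iov, 1, off, flags);

		if (n < 0) {
			if (flags && (errno == EOPNOTSUPP || errno == ENOSYS)) {
				flags = 0;	/* retry without drop-behind */
				continue;
			}
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0) {		/* no progress: avoid spinning */
			errno = EIO;
			return -1;
		}
		p += n;
		off += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Hypothetical self-check: write a payload to a temp file, read it back. */
static int roundtrip_ok(void)
{
	char path[] = "/tmp/dontcache-XXXXXX";
	const char msg[] = "drop-behind test payload";
	char back[sizeof(msg)] = { 0 };
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return 0;
	ok = write_dontcache(fd, msg, sizeof(msg), 0) == 0 &&
	     pread(fd, back, sizeof(back), 0) == (ssize_t)sizeof(back) &&
	     memcmp(msg, back, sizeof(msg)) == 0;
	close(fd);
	unlink(path);
	return ok;
}
```

With this series applied, the same call on a file descriptor opened on a
block device without O_DIRECT would be expected to take the
RWF_DONTCACHE path rather than the fallback. The fallback keeps the data
in the page cache, so callers that depend on drop-behind may prefer to
report the rejection instead of retrying silently.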