From: guzebing
To: brauner@kernel.org, djwong@kernel.org
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, guzebing, Fengnan Chang
Subject: [PATCH v2] iomap: add allocation cache for iomap_dio
Date: Wed, 14 Jan 2026 15:41:13 +0800
Message-Id: <20260114074113.151089-1-guzebing1612@gmail.com>

Following the approach already used for struct bio, add a per-cpu cache
for iomap_dio allocations so that they can be recycled quickly instead
of going through the slab allocator.

This reduces memory allocation on the direct I/O path, so direct I/O is
less likely to block when system memory is low. In addition, io_uring
direct I/O read performance improves by about 2.6%.

v2:
Factor the per-cpu cache into common code that the iomap module uses.
v1:
https://lore.kernel.org/all/20251121090052.384823-1-guzebing1612@gmail.com/

Suggested-by: Fengnan Chang
Signed-off-by: guzebing
---
 fs/iomap/direct-io.c | 135 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 132 insertions(+), 3 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 5d5d63efbd57..b152fd2c7042 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -56,6 +56,132 @@ struct iomap_dio {
 	};
 };
 
+#define PCPU_CACHE_IRQ_THRESHOLD	16
+#define PCPU_CACHE_ELEMENT_SIZE(pcpu_cache_list) \
+	(sizeof(struct pcpu_cache_element) + pcpu_cache_list->element_size)
+#define PCPU_CACHE_ELEMENT_GET_HEAD_FROM_PAYLOAD(payload) \
+	((struct pcpu_cache_element *)((unsigned long)(payload) - \
+				       sizeof(struct pcpu_cache_element)))
+#define PCPU_CACHE_ELEMENT_GET_PAYLOAD_FROM_HEAD(head) \
+	((void *)((unsigned long)(head) + sizeof(struct pcpu_cache_element)))
+
+struct pcpu_cache_element {
+	struct pcpu_cache_element	*next;
+	char	payload[];
+};
+struct pcpu_cache {
+	struct pcpu_cache_element	*free_list;
+	struct pcpu_cache_element	*free_list_irq;
+	int		nr;
+	int		nr_irq;
+};
+struct pcpu_cache_list {
+	struct pcpu_cache __percpu *cache;
+	size_t element_size;
+	int max_nr;
+};
+
+static struct pcpu_cache_list *pcpu_cache_list_create(int max_nr, size_t size)
+{
+	struct pcpu_cache_list *pcpu_cache_list;
+
+	pcpu_cache_list = kmalloc(sizeof(struct pcpu_cache_list), GFP_KERNEL);
+	if (!pcpu_cache_list)
+		return NULL;
+
+	pcpu_cache_list->element_size = size;
+	pcpu_cache_list->max_nr = max_nr;
+	pcpu_cache_list->cache = alloc_percpu(struct pcpu_cache);
+	if (!pcpu_cache_list->cache) {
+		kfree(pcpu_cache_list);
+		return NULL;
+	}
+	return pcpu_cache_list;
+}
+
+static void pcpu_cache_list_destroy(struct pcpu_cache_list *pcpu_cache_list)
+{
+	free_percpu(pcpu_cache_list->cache);
+	kfree(pcpu_cache_list);
+}
+
+static void irq_cache_splice(struct pcpu_cache *cache)
+{
+	unsigned long flags;
+
+	/* cache->free_list must be empty */
+	if (WARN_ON_ONCE(cache->free_list))
+		return;
+
+	local_irq_save(flags);
+	cache->free_list = cache->free_list_irq;
+	cache->free_list_irq = NULL;
+	cache->nr += cache->nr_irq;
+	cache->nr_irq = 0;
+	local_irq_restore(flags);
+}
+
+static void *pcpu_cache_list_alloc(struct pcpu_cache_list *pcpu_cache_list)
+{
+	struct pcpu_cache *cache;
+	struct pcpu_cache_element *cache_element;
+
+	cache = per_cpu_ptr(pcpu_cache_list->cache, get_cpu());
+	if (!cache->free_list) {
+		if (READ_ONCE(cache->nr_irq) >= PCPU_CACHE_IRQ_THRESHOLD)
+			irq_cache_splice(cache);
+		if (!cache->free_list) {
+			cache_element = kmalloc(PCPU_CACHE_ELEMENT_SIZE(pcpu_cache_list),
+						GFP_KERNEL);
+			if (!cache_element) {
+				put_cpu();
+				return NULL;
+			}
+			put_cpu();
+			return PCPU_CACHE_ELEMENT_GET_PAYLOAD_FROM_HEAD(cache_element);
+		}
+	}
+
+	cache_element = cache->free_list;
+	cache->free_list = cache_element->next;
+	cache->nr--;
+	put_cpu();
+	return PCPU_CACHE_ELEMENT_GET_PAYLOAD_FROM_HEAD(cache_element);
+}
+
+static void pcpu_cache_list_free(void *payload, struct pcpu_cache_list *pcpu_cache_list)
+{
+	struct pcpu_cache *cache;
+	struct pcpu_cache_element *cache_element;
+
+	cache_element = PCPU_CACHE_ELEMENT_GET_HEAD_FROM_PAYLOAD(payload);
+
+	cache = per_cpu_ptr(pcpu_cache_list->cache, get_cpu());
+	if (READ_ONCE(cache->nr_irq) + cache->nr >= pcpu_cache_list->max_nr)
+		goto out_free;
+
+	if (in_task()) {
+		cache_element->next = cache->free_list;
+		cache->free_list = cache_element;
+		cache->nr++;
+	} else if (in_hardirq()) {
+		lockdep_assert_irqs_disabled();
+		cache_element->next = cache->free_list_irq;
+		cache->free_list_irq = cache_element;
+		cache->nr_irq++;
+	} else {
+		goto out_free;
+	}
+	put_cpu();
+	return;
+out_free:
+	put_cpu();
+	kfree(cache_element);
+}
+
+#define DIO_ALLOC_CACHE_MAX		256
+static struct pcpu_cache_list *dio_pcpu_cache_list;
+
 static struct bio *iomap_dio_alloc_bio(const struct iomap_iter *iter,
 		struct iomap_dio *dio, unsigned short nr_vecs, blk_opf_t opf)
 {
@@ -135,7 +261,7 @@ ssize_t iomap_dio_complete(struct iomap_dio *dio)
 			ret += dio->done_before;
 	}
 	trace_iomap_dio_complete(iocb, dio->error, ret);
-	kfree(dio);
+	pcpu_cache_list_free(dio, dio_pcpu_cache_list);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(iomap_dio_complete);
@@ -620,7 +746,7 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	if (!iomi.len)
 		return NULL;
 
-	dio = kmalloc(sizeof(*dio), GFP_KERNEL);
+	dio = pcpu_cache_list_alloc(dio_pcpu_cache_list);
 	if (!dio)
 		return ERR_PTR(-ENOMEM);
 
@@ -804,7 +930,7 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
 	return dio;
 
 out_free_dio:
-	kfree(dio);
+	pcpu_cache_list_free(dio, dio_pcpu_cache_list);
 	if (ret)
 		return ERR_PTR(ret);
 	return NULL;
@@ -834,6 +960,9 @@ static int __init iomap_dio_init(void)
 	if (!zero_page)
 		return -ENOMEM;
 
+	dio_pcpu_cache_list = pcpu_cache_list_create(DIO_ALLOC_CACHE_MAX, sizeof(struct iomap_dio));
+	if (!dio_pcpu_cache_list)
+		return -ENOMEM;
 	return 0;
 }
 fs_initcall(iomap_dio_init);
-- 
2.20.1
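
Illustrative note (not part of the patch): the cache above stores a small
header directly in front of each payload and threads free elements onto a
singly linked list. The following is a minimal single-threaded userspace
sketch of that layout only, assuming malloc()/free() in place of
kmalloc()/kfree() and leaving out the per-cpu and IRQ handling entirely.
The names elem_cache, elem_cache_alloc, elem_cache_free and head_of are
hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hidden header stored directly in front of every payload. */
struct cache_element {
	struct cache_element *next;
	char payload[];
};

/* Hypothetical single-threaded stand-in for one per-cpu cache. */
struct elem_cache {
	struct cache_element *free_list;
	size_t element_size;	/* payload size in bytes */
	int nr;			/* elements currently on free_list */
	int max_nr;		/* upper bound on cached elements */
};

/* Recover the header from a payload pointer handed out earlier. */
static struct cache_element *head_of(void *payload)
{
	return (struct cache_element *)((char *)payload -
					offsetof(struct cache_element, payload));
}

/* Pop a cached element if one exists, otherwise fall back to malloc(). */
static void *elem_cache_alloc(struct elem_cache *c)
{
	struct cache_element *e = c->free_list;

	if (e) {
		c->free_list = e->next;
		c->nr--;
	} else {
		e = malloc(sizeof(*e) + c->element_size);
		if (!e)
			return NULL;
	}
	return e->payload;
}

/* Push the element back, or release it once the cache is full. */
static void elem_cache_free(struct elem_cache *c, void *payload)
{
	struct cache_element *e;

	assert(payload);
	e = head_of(payload);
	if (c->nr >= c->max_nr) {
		free(e);
		return;
	}
	e->next = c->free_list;
	c->free_list = e;
	c->nr++;
}
```

In this sketch a freed element is handed back by the next allocation
without touching the underlying allocator, which is the recycling effect
the commit message describes. Elements freed once max_nr is reached go
straight back to the allocator.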