From: Keith Busch <kbusch@kernel.org>

In furthering direct IO use from user space buffers without bouncing to
align to unnecessary kernel software constraints, this series removes
the requirement that io vector lengths align to the logical block size.
The downside (if you want to call it that) is that mis-aligned io vectors
are caught further down the block stack rather than closer to the
syscall.

This change also removes one walking of the io vector, so that's nice
too.

Keith Busch (7):
  block: check for valid bio while splitting
  block: align the bio after building it
  block: simplify direct io validity check
  iomap: simplify direct io validity check
  block: remove bdev_iter_is_aligned
  blk-integrity: use simpler alignment check
  iov_iter: remove iov_iter_is_aligned

 block/bio-integrity.c  |  4 +-
 block/bio.c            | 58 +++++++++++++++++---------
 block/blk-merge.c      |  5 +++
 block/fops.c           |  4 +-
 fs/iomap/direct-io.c   |  3 +-
 include/linux/blkdev.h |  7 ----
 include/linux/uio.h    |  2 -
 lib/iov_iter.c         | 95 ------------------------------------------
 8 files changed, 49 insertions(+), 129 deletions(-)

-- 
2.47.3
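To illustrate the relaxed rule from user space: a minimal sketch, not code from the series, contrasting the removed per-vector length requirement with a check on the summed length only. The helper names `each_len_aligned` and `total_len_aligned` are hypothetical, and the assumption that the total length must still be a multiple of the logical block size comes from the test description later in this thread, not from the patches themselves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Removed requirement (as described in the cover letter): every vector
 * length is a multiple of the logical block size.
 * Hypothetical helper, for illustration only.
 */
static bool each_len_aligned(const struct iovec *iov, int cnt, size_t block)
{
	for (int i = 0; i < cnt; i++)
		if (iov[i].iov_len % block)
			return false;
	return true;
}

/*
 * Relaxed rule (assumed from the test description): only the summed
 * length has to be a multiple of the logical block size.
 */
static bool total_len_aligned(const struct iovec *iov, int cnt, size_t block)
{
	size_t total = 0;

	for (int i = 0; i < cnt; i++)
		total += iov[i].iov_len;
	return total % block == 0;
}
```

With vector lengths of 1024, 4096, 4096 and 3072 bytes and a 4096-byte block, the first helper rejects the request while the second accepts it, since the lengths sum to 12288 bytes.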
On 8/1/25 5:47 PM, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> In furthering direct IO use from user space buffers without bouncing to
> align to unnecessary kernel software constraints, this series removes
> the requirement that io vector lengths align to the logical block size.
> The downside (if you want to call it that) is that mis-aligned io vectors
> are caught further down the block stack rather than closer to the
> syscall.

That's not a downside imho, it's much nicer to have the correct/expected
case be fast, and catch the unexpected error case down the line when we
have to iterate the vecs anyway.

IOW, I love this patchset. I'll spend some time going over the details.
Did you write some test cases for this?

> This change also removes one walking of the io vector, so that's nice
> too.
>
> Keith Busch (7):
>   block: check for valid bio while splitting
>   block: align the bio after building it
>   block: simplify direct io validity check
>   iomap: simplify direct io validity check
>   block: remove bdev_iter_is_aligned
>   blk-integrity: use simpler alignment check
>   iov_iter: remove iov_iter_is_aligned
>
>  block/bio-integrity.c  |  4 +-
>  block/bio.c            | 58 +++++++++++++++++---------
>  block/blk-merge.c      |  5 +++
>  block/fops.c           |  4 +-
>  fs/iomap/direct-io.c   |  3 +-
>  include/linux/blkdev.h |  7 ----
>  include/linux/uio.h    |  2 -
>  lib/iov_iter.c         | 95 ------------------------------------------
>  8 files changed, 49 insertions(+), 129 deletions(-)

Now that's a beautiful diffstat.

-- 
Jens Axboe
On Sat, Aug 02, 2025 at 09:37:32AM -0600, Jens Axboe wrote:
> Did you write some test cases for this?

I have some crude unit tests to hit specific conditions that might
happen with nvme.

Note, the "second" test here will fail with the wrong result with this
version of the patchset due to the issue I mentioned on patch 2, but
I've a fix for it ready for the next version.

---
/*
 * This test is aligned to NVMe's PRP virtual boundary. It is intended to
 * execute on such a device with 4k formatted logical block size.
 *
 * The first test will submit a vectored read with a total size aligned to a 4k
 * block, but individual vectors may not be. This should be successful.
 *
 * The second test will submit a vectored read with a total size aligned to a
 * 4k block, but the first vector contains an invalid address. This should get
 * EFAULT.
 *
 * The third one will submit an IO with a total size aligned to a 4k block,
 * but it will fail the virtual boundary condition, which should result in a
 * split to a 0 length bio. This should get an EINVAL.
 *
 * The fourth test will submit IO with a total size aligned to a 4k block, but
 * with invalid DMA offsets. This should get an EINVAL.
 *
 * The last test will submit a large IO with a page offset that should exceed
 * the bio max vectors limit, resulting in reverting part of a bio iteration.
 * This should be successful.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/uio.h>
#include <string.h>

#define BSIZE (8 * 1024 * 1024)
#define VECS 4

int main(int argc, char **argv)
{
	int fd, ret, i, j;
	struct iovec iov[VECS];
	char *buf;

	if (argc < 2)
		return -1;

	fd = open(argv[1], O_RDONLY | O_DIRECT);
	if (fd < 0)
		return fd;

	ret = posix_memalign((void **)&buf, 4096, BSIZE);
	if (ret)
		return ret;

	memset(buf, 0, BSIZE);

	/* Test 1: unaligned vector lengths, total aligned to 4k */
	iov[0].iov_base = buf + 3072;
	iov[0].iov_len = 1024;

	iov[1].iov_base = buf + (2 * 4096);
	iov[1].iov_len = 4096;

	iov[2].iov_base = buf + (8 * 4096);
	iov[2].iov_len = 4096;

	iov[3].iov_base = buf + (16 * 4096);
	iov[3].iov_len = 3072;

	ret = preadv(fd, iov, VECS, 0);
	if (ret < 0)
		perror("unexpected read failure");

	/* Test 2: same layout, but the first vector has an invalid address */
	iov[0].iov_base = 0;
	ret = preadv(fd, iov, VECS, 0);
	if (ret < 0)
		perror("expected read failure for invalid address");

	/* Test 3: total aligned to 4k, but violates the virtual boundary */
	iov[0].iov_base = buf;
	iov[0].iov_len = 1024;

	iov[1].iov_base = buf + (2 * 4096);
	iov[1].iov_len = 1024;

	iov[2].iov_base = buf + (8 * 4096);
	iov[2].iov_len = 1024;

	iov[3].iov_base = buf + (16 * 4096);
	iov[3].iov_len = 1024;

	ret = preadv(fd, iov, VECS, 0);
	if (ret < 0)
		perror("expected read for invalid virtual boundary");

	/* Test 4: invalid DMA offsets */
	iov[0].iov_base = buf + 3072;
	iov[0].iov_len = 1025;

	iov[1].iov_base = buf + (2 * 4096);
	iov[1].iov_len = 4096;

	iov[2].iov_base = buf + (8 * 4096);
	iov[2].iov_len = 4096;

	iov[3].iov_base = buf + (16 * 4096);
	iov[3].iov_len = 3073;

	ret = preadv(fd, iov, VECS, 0);
	if (ret < 0)
		perror("expected read for invalid dma boundary");

	/* Test 5: large read at a page offset, exceeding the bio max vectors */
	ret = pread(fd, buf + 2048, BSIZE - 8192, 0);
	if (ret < 0)
		perror("unexpected large read failure");

	free(buf);
	return errno;
}
--
On Mon, Aug 04, 2025 at 11:06:12AM -0600, Keith Busch wrote:
> On Sat, Aug 02, 2025 at 09:37:32AM -0600, Jens Axboe wrote:
> > Did you write some test cases for this?
>
> I have some crude unit tests to hit specific conditions that might
> happen with nvme.

I've made improvements today that make these targeted tests fit into
the blktests framework.

Just fyi, I took a look at what 'fio' needs in order to exercise these
new use cases. This patchset requires multiple io-vectors for anything
interesting to happen, which 'fio' currently doesn't do. I'm not even
sure what new command line parameters could best convey how you want to
construct iovecs! Maybe just make it random within some alignment
constraints?
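One way the "random within some alignment constraints" idea could look, as a rough sketch only: the helper below splits a buffer into a fixed number of vectors whose lengths are random multiples of an alignment unit and sum to the requested total. This is not fio code and not part of the series. The name `split_random_iovecs` and the `align` parameter (standing in for whatever DMA alignment the device needs) are assumptions made for illustration, and the vectors here are kept contiguous for simplicity.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: carve `total` bytes of `buf` into `cnt` back-to-back
 * vectors. Each length is a random non-zero multiple of `align`, and the
 * lengths sum to exactly `total`. Seed the generator with srand() first.
 * Returns 0 on success, -1 if the request cannot be satisfied.
 */
static int split_random_iovecs(struct iovec *iov, int cnt, char *buf,
			       size_t total, size_t align)
{
	size_t units, off = 0;

	if (cnt <= 0 || align == 0 || total % align)
		return -1;

	units = total / align;
	if (units < (size_t)cnt)
		return -1;	/* not enough units for one per vector */

	for (int i = 0; i < cnt; i++) {
		/* vectors still to fill after this one */
		size_t left = (size_t)(cnt - i - 1);
		/* keep at least one unit in reserve for each of them */
		size_t max = units - left;
		/* the last vector takes whatever remains */
		size_t take = left ? 1 + (size_t)rand() % max : units;

		iov[i].iov_base = buf + off;
		iov[i].iov_len = take * align;
		off += take * align;
		units -= take;
	}
	return 0;
}
```

Calling it with a total of 12288 bytes and an `align` of 512 would give vectors that are individually not 4k-sized while the whole request still covers three 4k blocks, the kind of layout the first test in this thread builds by hand.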