[PATCH 002/437] fs: add generic read/write iterator helpers

Jens Axboe posted 437 patches 1 year, 8 months ago
[PATCH 002/437] fs: add generic read/write iterator helpers
Posted by Jens Axboe 1 year, 8 months ago
We already do this internally for vfs_readv() and vfs_writev(), which
need to check what method to use. Add generic helpers for this so that
drivers can do this themselves, if they haven't converted to using the
read/write iterator file_operations hooks just yet.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/read_write.c    | 18 ++++++++++++++++++
 include/linux/fs.h |  6 ++++++
 2 files changed, 24 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index 82ec75937b08..1d035293607b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -802,6 +802,24 @@ static ssize_t do_loop_writev(struct file *file, struct iov_iter *iter,
 	return ret;
 }
 
+/* generic read side helper for drivers converting to ->read_iter() */
+ssize_t vfs_read_iter(struct kiocb *iocb, struct iov_iter *to,
+		      ssize_t (*read)(struct file *, char __user *,
+				     size_t, loff_t *))
+{
+	return do_loop_readv(iocb->ki_filp, to, &iocb->ki_pos, 0, read);
+}
+EXPORT_SYMBOL(vfs_read_iter);
+
+/* generic write side helper for drivers converting to ->write_iter() */
+ssize_t vfs_write_iter(struct kiocb *iocb, struct iov_iter *from,
+		       ssize_t (*write)(struct file *, const char __user *,
+				       size_t, loff_t *))
+{
+	return do_loop_writev(iocb->ki_filp, from, &iocb->ki_pos, 0, write);
+}
+EXPORT_SYMBOL(vfs_write_iter);
+
 ssize_t vfs_iocb_iter_read(struct file *file, struct kiocb *iocb,
 			   struct iov_iter *iter)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8dfd53b52744..fd862985a309 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2119,6 +2119,12 @@ extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
 				   loff_t, size_t, unsigned int);
+ssize_t vfs_write_iter(struct kiocb *iocb, struct iov_iter *from,
+		       ssize_t (*write)(struct file *, const char __user *,
+				        size_t, loff_t *));
+ssize_t vfs_read_iter(struct kiocb *iocb, struct iov_iter *to,
+		      ssize_t (*read)(struct file *, char __user *,
+				      size_t, loff_t *));
 int __generic_remap_file_range_prep(struct file *file_in, loff_t pos_in,
 				    struct file *file_out, loff_t pos_out,
 				    loff_t *len, unsigned int remap_flags,
-- 
2.43.0
Re: [PATCH 002/437] fs: add generic read/write iterator helpers
Posted by Al Viro 1 year, 8 months ago
On Thu, Apr 11, 2024 at 09:12:22AM -0600, Jens Axboe wrote:

> +/* generic read side helper for drivers converting to ->read_iter() */
> +ssize_t vfs_read_iter(struct kiocb *iocb, struct iov_iter *to,
> +		      ssize_t (*read)(struct file *, char __user *,
> +				     size_t, loff_t *))
> +{
> +	return do_loop_readv(iocb->ki_filp, to, &iocb->ki_pos, 0, read);
> +}
> +EXPORT_SYMBOL(vfs_read_iter);
> +
> +/* generic write side helper for drivers converting to ->write_iter() */
> +ssize_t vfs_write_iter(struct kiocb *iocb, struct iov_iter *from,
> +		       ssize_t (*write)(struct file *, const char __user *,
> +				       size_t, loff_t *))
> +{
> +	return do_loop_writev(iocb->ki_filp, from, &iocb->ki_pos, 0, write);
> +}
> +EXPORT_SYMBOL(vfs_write_iter);

Wait a minute; just what do you expect to happen if that ever gets called
for ITER_BVEC or ITER_XARRAY?
RE: [PATCH 002/437] fs: add generic read/write iterator helpers
Posted by David Laight 1 year, 8 months ago
From: Al Viro
> Sent: 15 April 2024 20:55
> 
> On Thu, Apr 11, 2024 at 09:12:22AM -0600, Jens Axboe wrote:
> 
> > +/* generic read side helper for drivers converting to ->read_iter() */
> > +ssize_t vfs_read_iter(struct kiocb *iocb, struct iov_iter *to,
> > +		      ssize_t (*read)(struct file *, char __user *,
> > +				     size_t, loff_t *))
> > +{
> > +	return do_loop_readv(iocb->ki_filp, to, &iocb->ki_pos, 0, read);
> > +}
> > +EXPORT_SYMBOL(vfs_read_iter);
> > +
> > +/* generic write side helper for drivers converting to ->write_iter() */
> > +ssize_t vfs_write_iter(struct kiocb *iocb, struct iov_iter *from,
> > +		       ssize_t (*write)(struct file *, const char __user *,
> > +				       size_t, loff_t *))
> > +{
> > +	return do_loop_writev(iocb->ki_filp, from, &iocb->ki_pos, 0, write);
> > +}
> > +EXPORT_SYMBOL(vfs_write_iter);
> 
> Wait a minute; just what do you expect to happen if that ever gets called
> for ITER_BVEC or ITER_XARRAY?

The extra indirect call is also going to be noticeable.
You need a code loop with a direct call.
That probably requires the loop to be a #define.

I was also thinking about drivers that only handle 'user' buffers and
where there really isn't a requirement to do anything else.

I've a driver that basically does:
	if (!access_ok(....))
		return -EFAULT;
	for (off = 0; off < len; off += 8) {
		if (__put_user(readq(io_addr + off), uaddr + off))
			return -EFAULT;
	}

Any non-trivial change requires a function that return the first/only
user buffer address/length and an error for a non-user address.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Re: [PATCH 002/437] fs: add generic read/write iterator helpers
Posted by Jens Axboe 1 year, 8 months ago
On 4/15/24 1:55 PM, Al Viro wrote:
> On Thu, Apr 11, 2024 at 09:12:22AM -0600, Jens Axboe wrote:
> 
>> +/* generic read side helper for drivers converting to ->read_iter() */
>> +ssize_t vfs_read_iter(struct kiocb *iocb, struct iov_iter *to,
>> +		      ssize_t (*read)(struct file *, char __user *,
>> +				     size_t, loff_t *))
>> +{
>> +	return do_loop_readv(iocb->ki_filp, to, &iocb->ki_pos, 0, read);
>> +}
>> +EXPORT_SYMBOL(vfs_read_iter);
>> +
>> +/* generic write side helper for drivers converting to ->write_iter() */
>> +ssize_t vfs_write_iter(struct kiocb *iocb, struct iov_iter *from,
>> +		       ssize_t (*write)(struct file *, const char __user *,
>> +				       size_t, loff_t *))
>> +{
>> +	return do_loop_writev(iocb->ki_filp, from, &iocb->ki_pos, 0, write);
>> +}
>> +EXPORT_SYMBOL(vfs_write_iter);
> 
> Wait a minute; just what do you expect to happen if that ever gets called
> for ITER_BVEC or ITER_XARRAY?

do_loop_{readv,writev} need to look like the one io_uring had, which was
just an augmented version of the fs/ version. At least for handling
anything that is IOVEC/UBUF/BVEC. Outside of that, should not be
callable for eg ITER_XARRAY, who would do that? We should probably add a
check at the top of each just to vet the iter type and -EINVAL if it's
not one of the supported ones. With a WARN_ON_ONCE(), I suspect.

I'll be posting the first patches separately again, I've made some local
tweaks, with some drivers that can support it as-is. Just haven't gotten
around to doing this weeks iteration on it yet.

-- 
Jens Axboe
Re: [PATCH 002/437] fs: add generic read/write iterator helpers
Posted by Al Viro 1 year, 8 months ago
On Mon, Apr 15, 2024 at 02:11:56PM -0600, Jens Axboe wrote:

> do_loop_{readv,writev} need to look like the one io_uring had, which was
> just an augmented version of the fs/ version. At least for handling
> anything that is IOVEC/UBUF/BVEC.

IOVEC and UBUF: pointer will be __user one; copy_from_user() works.
KVEC: kernel pointer.  Try copy_from_user() on that on x86 with SMAP (or
on e.g. sparc64, etc.) and you'll get an oops.
BVEC: page + offset, page quite possibly not mapped anywhere in kernel page
tables.  And "just kmap() around the call of your callback" is not a good
idea either, for even more reason that for KVEC.
Re: [PATCH 002/437] fs: add generic read/write iterator helpers
Posted by Jens Axboe 1 year, 8 months ago
On 4/15/24 3:08 PM, Al Viro wrote:
> On Mon, Apr 15, 2024 at 02:11:56PM -0600, Jens Axboe wrote:
> 
>> do_loop_{readv,writev} need to look like the one io_uring had, which was
>> just an augmented version of the fs/ version. At least for handling
>> anything that is IOVEC/UBUF/BVEC.
> 
> IOVEC and UBUF: pointer will be __user one; copy_from_user() works.
> KVEC: kernel pointer.  Try copy_from_user() on that on x86 with SMAP (or
> on e.g. sparc64, etc.) and you'll get an oops.
> BVEC: page + offset, page quite possibly not mapped anywhere in kernel page
> tables.  And "just kmap() around the call of your callback" is not a good
> idea either, for even more reason that for KVEC.

The old read/write path only handled user pointers, with the exception
being bvecs mapped on the io_uring side (which the io_uring version
dealt with) which also originally came from user pointers. So just user
memory. Why would we need to expand that now? Who would be doing
->read_iter() or ->write_iter() with something that isn't either UBUF or
IOVEC? Because that would break horrible as it is in the current kernel.

-- 
Jens Axboe
Re: [PATCH 002/437] fs: add generic read/write iterator helpers
Posted by Al Viro 1 year, 8 months ago
On Mon, Apr 15, 2024 at 03:16:12PM -0600, Jens Axboe wrote:

> The old read/write path only handled user pointers, with the exception
> being bvecs mapped on the io_uring side (which the io_uring version
> dealt with) which also originally came from user pointers. So just user
> memory. Why would we need to expand that now? Who would be doing
> ->read_iter() or ->write_iter() with something that isn't either UBUF or
> IOVEC? Because that would break horrible as it is in the current kernel.

No, it will not.  And yes, it does happen.  Look, for example, at
fs/coredump.c:dump_emit_page().  ->write_iter() (regular file or pipe one)
is called.  On ITER_BVEC.

It happens, and not only there.  Look at how /dev/loop works, for crying
out loud!  You get a struct request; the backing pages might very well have
_never_ been mapped to any userland address (e.g. when it's something like
a directory block).  And you hit this:

static int lo_write_simple(struct loop_device *lo, struct request *rq,
                loff_t pos)
{
        struct bio_vec bvec;
        struct req_iterator iter;
        int ret = 0;

        rq_for_each_segment(bvec, rq, iter) {
                ret = lo_write_bvec(lo->lo_backing_file, &bvec, &pos);
                if (ret < 0)
                        break;
                cond_resched();
        }

        return ret;
}

with lo_write_bvec() being

static int lo_write_bvec(struct file *file, struct bio_vec *bvec, loff_t *ppos)
{
        struct iov_iter i;
        ssize_t bw;

        iov_iter_bvec(&i, ITER_SOURCE, bvec, 1, bvec->bv_len);

        bw = vfs_iter_write(file, &i, ppos, 0);

        if (likely(bw ==  bvec->bv_len))
                return 0;   

        printk_ratelimited(KERN_ERR
                "loop: Write error at byte offset %llu, length %i.\n",
                (unsigned long long)*ppos, bvec->bv_len);
        if (bw >= 0)
                bw = -EIO;
        return bw;
}


Neither ->read_iter() nor ->write_iter() can make an assumption that it
will be used with userland buffer.  And yes, it works...