[PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()

David Howells posted 2 patches 2 years, 4 months ago
There is a newer version of this series
[PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
Posted by David Howells 2 years, 4 months ago
iter->copy_mc is only used with a bvec iterator and only by
dump_emit_page() in fs/coredump.c so rather than handle this in
memcpy_from_iter_mc() where it is checked repeatedly by _copy_from_iter() and
copy_page_from_iter_atomic(),
---
 lib/iov_iter.c | 39 +++++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 378da0cb3069..33febccadd9d 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -450,14 +450,33 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
 #endif /* CONFIG_ARCH_HAS_COPY_MC */
 
-static size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
-				  size_t len, void *to, void *priv2)
+static __always_inline
+size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
+			   size_t len, void *to, void *priv2)
 {
-	struct iov_iter *iter = priv2;
+	return copy_mc_to_kernel(to + progress, iter_from, len);
+}
 
-	if (iov_iter_is_copy_mc(iter))
-		return copy_mc_to_kernel(to + progress, iter_from, len);
-	return memcpy_from_iter(iter_from, progress, len, to, priv2);
+static size_t __copy_from_iter_mc(void *addr, size_t bytes, struct iov_iter *i)
+{
+	size_t progress;
+
+	if (unlikely(i->count < bytes))
+		bytes = i->count;
+	if (unlikely(!bytes))
+		return 0;
+	progress = iterate_bvec(i, bytes, addr, NULL, memcpy_from_iter_mc);
+	i->count -= progress;
+	return progress;
+}
+
+static __always_inline
+size_t __copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
+{
+	if (unlikely(iov_iter_is_copy_mc(i)))
+		return __copy_from_iter_mc(addr, bytes, i);
+	return iterate_and_advance(i, bytes, addr,
+				   copy_from_user_iter, memcpy_from_iter);
 }
 
 size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
@@ -467,9 +486,7 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 
 	if (user_backed_iter(i))
 		might_fault();
-	return iterate_and_advance2(i, bytes, addr, i,
-				    copy_from_user_iter,
-				    memcpy_from_iter_mc);
+	return __copy_from_iter(addr, bytes, i);
 }
 EXPORT_SYMBOL(_copy_from_iter);
 
@@ -690,9 +707,7 @@ size_t copy_page_from_iter_atomic(struct page *page, size_t offset,
 		}
 
 		p = kmap_atomic(page) + offset;
-		n = iterate_and_advance2(i, n, p, i,
-					 copy_from_user_iter,
-					 memcpy_from_iter_mc);
+		__copy_from_iter(p, n, i);
 		kunmap_atomic(p);
 		copied += n;
 		offset += n;
RE: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
Posted by David Laight 2 years, 4 months ago
From: David Howells
> Sent: Wednesday, August 16, 2023 1:08 PM

Given:

> iter->copy_mc is only used with a bvec iterator and only by
> dump_emit_page() in fs/coredump.c so rather than handle this in
> memcpy_from_iter_mc() where it is checked repeatedly by _copy_from_iter() and
> copy_page_from_iter_atomic(),

Instead of adding the extra check to every copy:

> +static __always_inline
> +size_t __copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	if (unlikely(iov_iter_is_copy_mc(i)))
> +		return __copy_from_iter_mc(addr, bytes, i);
> +	return iterate_and_advance(i, bytes, addr,
> +				   copy_from_user_iter, memcpy_from_iter);
>  }

Couldn't the relevant code directly call __copy_frtom_iter_mc() ?
Or a version then checked iov_is_copy_mc() and then fell
back to the standard function.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Re: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
Posted by David Howells 2 years, 4 months ago
David Laight <David.Laight@ACULAB.COM> wrote:

> 
> Couldn't the relevant code directly call __copy_from_iter_mc() ?
> Or a version then checked iov_is_copy_mc() and then fell
> back to the standard function.

No, because the marked iterator is handed by the coredump code to
__kernel_write_iter() and thence on to who-knows-what driver - which will call
copy_from_iter() or some such.  $DRIVER shouldn't need to know about
->copy_mc.

One thing I do wonder about, though, is what should happen if they call, say,
csum_and_copy_from_iter()?  That doesn't have an _mc variant.  Or what if they
extract the pages and operate directly on those?

David
RE: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
Posted by David Laight 2 years, 4 months ago
From: David Howells
> Sent: Wednesday, August 16, 2023 2:01 PM
> 
> David Laight <David.Laight@ACULAB.COM> wrote:
> 
> >
> > Couldn't the relevant code directly call __copy_from_iter_mc() ?
> > Or a version then checked iov_is_copy_mc() and then fell
> > back to the standard function.
> 
> No, because the marked iterator is handed by the coredump code to
> __kernel_write_iter() and thence on to who-knows-what driver - which will call
> copy_from_iter() or some such.  $DRIVER shouldn't need to know about
> ->copy_mc.

What about ITER_BVEC_MC ??

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Re: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
Posted by Linus Torvalds 2 years, 4 months ago
On Wed, 16 Aug 2023 at 14:19, David Laight <David.Laight@aculab.com> wrote:
>
> What about ITER_BVEC_MC ??

That probably would be the best option. Just make it a proper
ITER_xyz, instead of an odd sub-case for one ITER (but set up in such
a way that it looks like it might happen for other ITER_xyz cases).

                Linus
Re: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
Posted by David Howells 2 years, 4 months ago
Would it make sense to always check for MCE in _copy_from_iter() and always
return a short transfer if we encounter one?  It looks pretty cheap in terms
of code size as the exception table stuff handles it, so we don't need to do
anything in the normal path.

I guess this would change the handling of memory errors and DAX errors.

David
---
iov_iter: Always handle MCE in _copy_to_iter()

(incomplete)

---
 arch/x86/include/asm/mce.h |   22 ++++++++++++++++++++++
 fs/coredump.c              |    1 -
 include/linux/uio.h        |   16 ----------------
 lib/iov_iter.c             |   17 +++++------------
 4 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index 180b1cbfcc4e..ee3ff090360d 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -353,4 +353,26 @@ static inline void mce_hygon_feature_init(struct cpuinfo_x86 *c)	{ return mce_am
 
 unsigned long copy_mc_fragile_handle_tail(char *to, char *from, unsigned len);
 
+static __always_inline __must_check
+unsigned long memcpy_mc(void *to, const void *from, unsigned long len)
+{
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	/*
+	 * If CPU has FSRM feature, use 'rep movs'.
+	 * Otherwise, use rep_movs_alternative.
+	 */
+	asm volatile(
+		"1:\n\t"
+		ALTERNATIVE("rep movsb",
+			    "call rep_movs_alternative", ALT_NOT(X86_FEATURE_FSRM))
+		"2:\n"
+		_ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_DEFAULT_MCE_SAFE)
+		:"+c" (len), "+D" (to), "+S" (from), ASM_CALL_CONSTRAINT
+		: : "memory", "rax", "r8", "r9", "r10", "r11");
+#else
+	return memcpy(to, from, len);
+#endif
+	return len;
+}
+
 #endif /* _ASM_X86_MCE_H */
diff --git a/fs/coredump.c b/fs/coredump.c
index 9d235fa14ab9..ad54102a5e14 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -884,7 +884,6 @@ static int dump_emit_page(struct coredump_params *cprm, struct page *page)
 	pos = file->f_pos;
 	bvec_set_page(&bvec, page, PAGE_SIZE, 0);
 	iov_iter_bvec(&iter, ITER_SOURCE, &bvec, 1, PAGE_SIZE);
-	iov_iter_set_copy_mc(&iter);
 	n = __kernel_write_iter(cprm->file, &iter, &pos);
 	if (n != PAGE_SIZE)
 		return 0;
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 42bce38a8e87..73078ba297b7 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -40,7 +40,6 @@ struct iov_iter_state {
 
 struct iov_iter {
 	u8 iter_type;
-	bool copy_mc;
 	bool nofault;
 	bool data_source;
 	bool user_backed;
@@ -252,22 +251,8 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i);
 
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
-static inline void iov_iter_set_copy_mc(struct iov_iter *i)
-{
-	i->copy_mc = true;
-}
-
-static inline bool iov_iter_is_copy_mc(const struct iov_iter *i)
-{
-	return i->copy_mc;
-}
 #else
 #define _copy_mc_to_iter _copy_to_iter
-static inline void iov_iter_set_copy_mc(struct iov_iter *i) { }
-static inline bool iov_iter_is_copy_mc(const struct iov_iter *i)
-{
-	return false;
-}
 #endif
 
 size_t iov_iter_zero(size_t bytes, struct iov_iter *);
@@ -382,7 +367,6 @@ static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction,
 	WARN_ON(direction & ~(READ | WRITE));
 	*i = (struct iov_iter) {
 		.iter_type = ITER_UBUF,
-		.copy_mc = false,
 		.user_backed = true,
 		.data_source = direction,
 		.ubuf = buf,
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d282fd4d348f..887d9cb9be4e 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -14,6 +14,7 @@
 #include <linux/scatterlist.h>
 #include <linux/instrumented.h>
 #include <linux/iov_iter.h>
+#include <asm/mce.h>
 
 static __always_inline
 size_t copy_to_user_iter(void __user *iter_to, size_t progress,
@@ -168,7 +169,6 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction,
 	WARN_ON(direction & ~(READ | WRITE));
 	*i = (struct iov_iter) {
 		.iter_type = ITER_IOVEC,
-		.copy_mc = false,
 		.nofault = false,
 		.user_backed = true,
 		.data_source = direction,
@@ -254,14 +254,11 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
 #endif /* CONFIG_ARCH_HAS_COPY_MC */
 
-static size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
-				  size_t len, void *to, void *priv2)
+static __always_inline
+size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
+			   size_t len, void *to, void *priv2)
 {
-	struct iov_iter *iter = priv2;
-
-	if (iov_iter_is_copy_mc(iter))
-		return copy_mc_to_kernel(to + progress, iter_from, len);
-	return memcpy_from_iter(iter_from, progress, len, to, priv2);
+	return memcpy_mc(to + progress, iter_from, len);
 }
 
 size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
@@ -632,7 +629,6 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction,
 	WARN_ON(direction & ~(READ | WRITE));
 	*i = (struct iov_iter){
 		.iter_type = ITER_KVEC,
-		.copy_mc = false,
 		.data_source = direction,
 		.kvec = kvec,
 		.nr_segs = nr_segs,
@@ -649,7 +645,6 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
 	WARN_ON(direction & ~(READ | WRITE));
 	*i = (struct iov_iter){
 		.iter_type = ITER_BVEC,
-		.copy_mc = false,
 		.data_source = direction,
 		.bvec = bvec,
 		.nr_segs = nr_segs,
@@ -678,7 +673,6 @@ void iov_iter_xarray(struct iov_iter *i, unsigned int direction,
 	BUG_ON(direction & ~1);
 	*i = (struct iov_iter) {
 		.iter_type = ITER_XARRAY,
-		.copy_mc = false,
 		.data_source = direction,
 		.xarray = xarray,
 		.xarray_start = start,
@@ -702,7 +696,6 @@ void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
 	BUG_ON(direction != READ);
 	*i = (struct iov_iter){
 		.iter_type = ITER_DISCARD,
-		.copy_mc = false,
 		.data_source = false,
 		.count = count,
 		.iov_offset = 0
Re: [PATCH v3 2/2] iov_iter: Don't deal with iter->copy_mc in memcpy_from_iter_mc()
Posted by David Howells 2 years, 4 months ago
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> > What about ITER_BVEC_MC ??
> 
> That probably would be the best option. Just make it a proper
> ITER_xyz, instead of an odd sub-case for one ITER (but set up in such
> a way that it looks like it might happen for other ITER_xyz cases).

I'm not sure that buys us anything.  It would then require every call to
iov_iter_is_bvec()[*] to check for two values instead of one - including in
iterate_and_advance() - and *still* we'd have to have the special-casing in
_copy_from_iter() and copy_page_from_iter_atomic().

The issue is that ITER_xyz changes the iteration function - but we don't
actually want to do that; rather, we need to change the step function.

David

[*] There's a bunch of them outside of iov_iter.c.