RE: [PATCH v10 0/4] Replace fallback for IO memcpy and IO memset

Posted by David Laight 1 month ago
From: Julian Vetter
> Sent: 21 October 2024 14:32
> 
> Thank you again for your remarks Arnd and Christoph! I have updated the
> patchset, and placed the functions directly in asm-generic/io.h. I have
> dropped lib/iomem_copy.c and have updated/clarified the commit
> message in the first patch.

Apart from build 'issues' what is the justification for inlining
these functions?

They are quite large for inlining and some drivers could easily
call them many times.

The I/O cycles themselves are likely to be slow enough that
the cost of a function call is lost in the noise.

	David

Re: [PATCH v10 0/4] Replace fallback for IO memcpy and IO memset
Posted by Arnd Bergmann 1 month ago
On Mon, Oct 21, 2024, at 14:16, David Laight wrote:
> From: Julian Vetter
>> Sent: 21 October 2024 14:32
>> 
>> Thank you again for your remarks Arnd and Christoph! I have updated the
>> patchset, and placed the functions directly in asm-generic/io.h. I have
>> dropped lib/iomem_copy.c and have updated/clarified the commit
>> message in the first patch.
>
> Apart from build 'issues' what is the justification for inlining
> these functions?

I think I wasn't clear enough with my previous comment, and Julian
just misunderstood what I was asking him to do. Sorry about causing
extra work here.

> They are quite large for inlining and some drivers could easily
> call them many times.
>
> The I/O cycles themselves are likely to be slow enough that
> the cost of a function call is lost in the noise.

I'm not overly worried about this, as the functions are
not that big and there are not that many callers. If a file
contains multiple calls to this function, we can expect the
compiler to be smart enough to keep it out of line, though it
still gets duplicated in each driver calling it.

The bit that I am worried about however is the extra #include
for linux/unaligned.h that pulls in fairly large headers
and may lead to circular header dependencies.

To be clear: what I had expected here was to not have any
changes to the v9 version of lib/iomem_copy.c and to simplify
the asm-generic/io.h change to the version below.

       Arnd

---
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -1211,7 +1211,6 @@ static inline void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
 #endif
 
 #ifndef memset_io
-#define memset_io memset_io
 /**
  * memset_io   Set a range of I/O memory to a constant value
  * @addr:      The beginning of the I/O-memory range to set
@@ -1220,15 +1219,10 @@ static inline void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr)
  *
  * Set a range of I/O memory to a given value.
  */
-static inline void memset_io(volatile void __iomem *addr, int value,
-                            size_t size)
-{
-       memset(__io_virt(addr), value, size);
-}
+void memset_io(volatile void __iomem *addr, int value, size_t size);
 #endif
 
 #ifndef memcpy_fromio
-#define memcpy_fromio memcpy_fromio
 /**
  * memcpy_fromio       Copy a block of data from I/O memory
  * @dst:               The (RAM) destination for the copy
@@ -1237,16 +1231,11 @@ static inline void memset_io(volatile void __iomem *addr, int value,
  *
  * Copy a block of data from I/O memory.
  */
-static inline void memcpy_fromio(void *buffer,
-                                const volatile void __iomem *addr,
-                                size_t size)
-{
-       memcpy(buffer, __io_virt(addr), size);
-}
+void memcpy_fromio(void *buffer, const volatile void __iomem *addr,
+                  size_t size);
 #endif
 
 #ifndef memcpy_toio
-#define memcpy_toio memcpy_toio
 /**
  * memcpy_toio         Copy a block of data into I/O memory
  * @dst:               The (I/O memory) destination for the copy
@@ -1255,11 +1244,8 @@ static inline void memcpy_fromio(void *buffer,
  *
  * Copy a block of data to I/O memory.
  */
-static inline void memcpy_toio(volatile void __iomem *addr, const void *buffer,
-                              size_t size)
-{
-       memcpy(__io_virt(addr), buffer, size);
-}
+void memcpy_toio(volatile void __iomem *addr, const void *buffer,
+                size_t size);
 #endif
 
 extern int devmem_is_allowed(unsigned long pfn);
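
For readers following the thread, here is a rough userspace sketch of the
kind of out-of-line fallback the declarations above would resolve to. It is
not the contents of lib/iomem_copy.c from the series. The names
sketch_read8(), sketch_read_long() and sketch_memcpy_fromio() are
hypothetical, plain volatile loads stand in for the kernel's raw I/O
accessors, and memcpy() of a single word stands in for the unaligned-store
helper that linux/unaligned.h would provide in the kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's raw I/O read accessors. */
static inline uint8_t sketch_read8(const volatile void *addr)
{
	return *(const volatile uint8_t *)addr;
}

static inline unsigned long sketch_read_long(const volatile void *addr)
{
	return *(const volatile unsigned long *)addr;
}

/*
 * Copy count bytes from a (simulated) I/O region to normal memory.
 * Only the I/O side is required to be aligned for the word accesses;
 * the RAM side is written with an alignment-agnostic store.
 */
void sketch_memcpy_fromio(void *dst, const volatile void *src, size_t count)
{
	unsigned char *d = dst;
	const volatile unsigned char *s = src;

	/* Leading bytes: byte accesses until the I/O pointer is word aligned. */
	while (count && ((uintptr_t)s % sizeof(unsigned long))) {
		*d++ = sketch_read8(s++);
		count--;
	}

	/* Bulk: one word-sized I/O access per iteration. */
	while (count >= sizeof(unsigned long)) {
		unsigned long v = sketch_read_long(s);

		memcpy(d, &v, sizeof(v));	/* dst may be unaligned */
		d += sizeof(v);
		s += sizeof(v);
		count -= sizeof(v);
	}

	/* Trailing bytes that do not fill a whole word. */
	while (count) {
		*d++ = sketch_read8(s++);
		count--;
	}
}
```

Because the definition lives in a single .c file, the header only needs the
prototype and does not have to include the unaligned-access helpers, which
is the header dependency concern raised above.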