[PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy

Gustavo A. R. Silva posted 4 patches 1 month ago
[PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Gustavo A. R. Silva 1 month ago
We are currently working on enabling the -Wflex-array-member-not-at-end
compiler option. This option has helped us detect several objects of
the type `struct sockaddr` that appear in the middle of composite
structures like `struct rtentry`, `struct compat_rtentry`, and others:

include/uapi/linux/wireless.h:751:33: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/wireless.h:776:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/wireless.h:833:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/wireless.h:857:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/wireless.h:864:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/route.h:33:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/route.h:34:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/route.h:35:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/if_arp.h:118:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/if_arp.h:119:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/if_arp.h:121:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/if_arp.h:126:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/uapi/linux/if_arp.h:127:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/net/compat.h:34:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
include/net/compat.h:35:25: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]

In order to fix the warnings above, we introduce `struct sockaddr_legacy`.
The intention is to use it to replace the type of several struct members
in the middle of composite structures, currently of type `struct sockaddr`.

These middle struct members are currently causing thousands of warnings
because `struct sockaddr` contains a flexible-array member, introduced
by commit b5f0de6df6dce ("net: dev: Convert sa_data to flexible array in
struct sockaddr").

The new `struct sockaddr_legacy` doesn't include a flexible-array
member, making it suitable for use as the type of middle members
in composite structs that don't really require the flexible-array
member in `struct sockaddr`, thus avoiding -Wflex-array-member-not-at-end
warnings.

As this new struct will live in UAPI, to avoid breaking user-space code
that expects `struct sockaddr`, the `__kernel_sockaddr_legacy` macro is
introduced. This macro allows us to use either `struct sockaddr` or
`struct sockaddr_legacy` depending on the context in which the code is
used: kernel-space or user-space.

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
---
 include/uapi/linux/socket.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/include/uapi/linux/socket.h b/include/uapi/linux/socket.h
index d3fcd3b5ec53..2e179706bec4 100644
--- a/include/uapi/linux/socket.h
+++ b/include/uapi/linux/socket.h
@@ -35,4 +35,32 @@ struct __kernel_sockaddr_storage {
 #define SOCK_TXREHASH_DISABLED	0
 #define SOCK_TXREHASH_ENABLED	1
 
+typedef __kernel_sa_family_t    sa_family_t;
+
+/*
+ * This is the legacy form of `struct sockaddr`. The original `struct sockaddr`
+ * was modified in commit b5f0de6df6dce ("net: dev: Convert sa_data to flexible
+ * array in struct sockaddr") due to the fact that "One of the worst offenders
+ * of "fake flexible arrays" is struct sockaddr". This means that the original
+ * `char sa_data[14]` behaved as a flexible array at runtime, so a proper
+ * flexible-array member was introduced.
+ *
+ * This caused several flexible-array-in-the-middle issues:
+ * https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end
+ *
+ * `struct sockaddr_legacy` replaces `struct sockaddr` in all instances where
+ * objects of this type do not appear at the end of composite structures.
+ */
+struct sockaddr_legacy {
+        sa_family_t     sa_family;      /* address family, AF_xxx       */
+        char            sa_data[14];    /* 14 bytes of protocol address */
+};
+
+#ifdef __KERNEL__
+#	define __kernel_sockaddr_legacy		sockaddr_legacy
+#else
+#	define __kernel_sockaddr_legacy		sockaddr
+#endif
+
+
 #endif /* _UAPI_LINUX_SOCKET_H */
-- 
2.34.1
Re: [PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Jakub Kicinski 3 weeks, 3 days ago
On Thu, 24 Oct 2024 15:11:24 -0600 Gustavo A. R. Silva wrote:
> + * This is the legacy form of `struct sockaddr`. The original `struct sockaddr`
> + * was modified in commit b5f0de6df6dce ("net: dev: Convert sa_data to flexible
> + * array in struct sockaddr") due to the fact that "One of the worst offenders
> + * of "fake flexible arrays" is struct sockaddr". This means that the original
> + * `char sa_data[14]` behaved as a flexible array at runtime, so a proper
> + * flexible-array member was introduced.

This isn't spelled out in the commit messages AFACT so let me ask..
Why aren't we reverting b5f0de6df6dce, then?
Feels like the best solution would be to have a separate type with
the flex array to clearly annotate users who treat it as such.
Is that not going to work?

My noob reading of b5f0de6df6dce is that it was a simpler workaround
for the previous problem, avoided adding a new type (and the conversion
churn). But now we are adding a type and another workaround on top.
Sorry if I'm misunderstanding. No question that the struct is a mess,
but I don't feel like this is helping the messiness...
Re: [PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Kees Cook 3 weeks ago
On Thu, Oct 31, 2024 at 06:01:45PM -0700, Jakub Kicinski wrote:
> On Thu, 24 Oct 2024 15:11:24 -0600 Gustavo A. R. Silva wrote:
> > + * This is the legacy form of `struct sockaddr`. The original `struct sockaddr`
> > + * was modified in commit b5f0de6df6dce ("net: dev: Convert sa_data to flexible
> > + * array in struct sockaddr") due to the fact that "One of the worst offenders
> > + * of "fake flexible arrays" is struct sockaddr". This means that the original
> > + * `char sa_data[14]` behaved as a flexible array at runtime, so a proper
> > + * flexible-array member was introduced.
> 
> This isn't spelled out in the commit messages AFACT so let me ask..
> Why aren't we reverting b5f0de6df6dce, then?
> Feels like the best solution would be to have a separate type with
> the flex array to clearly annotate users who treat it as such.
> Is that not going to work?
> 
> My noob reading of b5f0de6df6dce is that it was a simpler workaround
> for the previous problem, avoided adding a new type (and the conversion
> churn). But now we are adding a type and another workaround on top.
> Sorry if I'm misunderstanding. No question that the struct is a mess,
> but I don't feel like this is helping the messiness...

This is a pretty reasonable position. I think sockaddr turns out to be a
really difficult problem, given its exposure to UAPI. Linux has been
trying to deal with it for a while (e.g. sockaddr_storage), and perhaps
b5f0de6df6dce wasn't the right solution. (It looked correct since it
looks like what we've done in many other tricky places, but perhaps
sockaddr should be dealt with differently yet.)

So, the rationale with b5f0de6df6dce was to change things as minimally
as possible. The goal is to stop lying to the compiler about object
sizes, with the big lie being "this object ends with a 14 byte array".

I think this results in two primary goals:
1- make sockaddr-like objects unambiguously sized
2- do not break userspace

For 2, we cannot change the existing (and externally defined) struct
sockaddr. It will stay the historical horrible thing that it is.

The specific UAPI sockaddr-related things we have that are _fixed_ sizes
are all fine. These can be seen with:
	git grep '^struct sockaddr_' -- include/uapi/

For 1, we cannot use (the externally defined) sockaddr as-is since we
run into size problems. (Basically anything casting from sockaddr,
anything copying out of sockaddr, etc.)

The storage problem for sockaddr has been attempted to be addressed with
the internally defined sockaddr_storage, but this doesn't capture the
"I don't know how big this object is yet" issues associated with taking
a buffer of arbitrary size from userspace that is initially defined
as sockaddr.

If we internally add struct sockaddr_unknown, or _bytes, or _unspec, or
_flex, or _array, or _raw, we can use that everywhere in the kernel, but
it is going to cause a lot of churn. This is why b5f0de6df6dce happened,
and why we went the direction of thinking sockaddr_legacy would be
workable: the kernel's view of sockaddr from userspace (sockaddr_legacy)
would be the only change needed.

Now, if we want to get to a place with the least ambiguity, we need to
abolish sockaddr from the kernel internally, and I think that might take
a while. I only think so because of what I went through just for doing a
portion of NFS in cf0d7e7f4520 ("NFS: Avoid memcpy() run-time warning
for struct sockaddr overflows").

But maybe it won't be as bad all that since we already do a lot of "what
is the family then cast to a fixed-size struct and carry on" stuff. But
the passing things between layers (usually "up" from a family into the
networking core) still depends on using sockaddr, and all of those APIs
would need to switch to sockaddr_unknown both in the core and in all
families. Maybe there is some Coccinelle magic that could be worked.

I think it's worth spending the time to try and figure out the size of
the effort needed to make that change.

-Kees

-- 
Kees Cook
Re: [PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Kees Cook 3 weeks, 6 days ago
On Thu, Oct 24, 2024 at 03:11:24PM -0600, Gustavo A. R. Silva wrote:
> diff --git a/include/uapi/linux/socket.h b/include/uapi/linux/socket.h
> index d3fcd3b5ec53..2e179706bec4 100644
> --- a/include/uapi/linux/socket.h
> +++ b/include/uapi/linux/socket.h
> @@ -35,4 +35,32 @@ struct __kernel_sockaddr_storage {
>  #define SOCK_TXREHASH_DISABLED	0
>  #define SOCK_TXREHASH_ENABLED	1
>  
> +typedef __kernel_sa_family_t    sa_family_t;
> +
> +/*
> + * This is the legacy form of `struct sockaddr`. The original `struct sockaddr`
> + * was modified in commit b5f0de6df6dce ("net: dev: Convert sa_data to flexible
> + * array in struct sockaddr") due to the fact that "One of the worst offenders
> + * of "fake flexible arrays" is struct sockaddr". This means that the original
> + * `char sa_data[14]` behaved as a flexible array at runtime, so a proper
> + * flexible-array member was introduced.
> + *
> + * This caused several flexible-array-in-the-middle issues:
> + * https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wflex-array-member-not-at-end
> + *
> + * `struct sockaddr_legacy` replaces `struct sockaddr` in all instances where
> + * objects of this type do not appear at the end of composite structures.
> + */
> +struct sockaddr_legacy {
> +        sa_family_t     sa_family;      /* address family, AF_xxx       */
> +        char            sa_data[14];    /* 14 bytes of protocol address */
> +};
> +
> +#ifdef __KERNEL__
> +#	define __kernel_sockaddr_legacy		sockaddr_legacy
> +#else
> +#	define __kernel_sockaddr_legacy		sockaddr
> +#endif

Yeah, this matches what I'd expect.

Reviewed-by: Kees Cook <kees@kernel.org>

-- 
Kees Cook
Re: [PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Andrew Lunn 3 weeks, 6 days ago
> As this new struct will live in UAPI, to avoid breaking user-space code
> that expects `struct sockaddr`, the `__kernel_sockaddr_legacy` macro is
> introduced. This macro allows us to use either `struct sockaddr` or
> `struct sockaddr_legacy` depending on the context in which the code is
> used: kernel-space or user-space.

Are there cases of userspace API structures where the flexiable array
appears in the middle? I assume this new compiler flag is not only for
use in the kernel? When it gets turned on in user space, will the
kernel headers will again produce warnings? Should we be considering
allowing user space to opt in to using sockaddr_legacy?

    Andrew
Re: [PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Kees Cook 3 weeks, 6 days ago
On Mon, Oct 28, 2024 at 09:38:46PM +0100, Andrew Lunn wrote:
> > As this new struct will live in UAPI, to avoid breaking user-space code
> > that expects `struct sockaddr`, the `__kernel_sockaddr_legacy` macro is
> > introduced. This macro allows us to use either `struct sockaddr` or
> > `struct sockaddr_legacy` depending on the context in which the code is
> > used: kernel-space or user-space.
> 
> Are there cases of userspace API structures where the flexiable array
> appears in the middle? I assume this new compiler flag is not only for
> use in the kernel? When it gets turned on in user space, will the
> kernel headers will again produce warnings? Should we be considering
> allowing user space to opt in to using sockaddr_legacy?

I expect that the userspace usage of -Wflex-array-member-not-at-end will
be driven by the libc projects, as it'll need to happen there before it
can be done anywhere else. We'll be able to coordinate with them at that
time, but I'm not aware of any plans by any libc to use this flag yet.

-- 
Kees Cook
Re: [PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Johannes Berg 3 weeks, 6 days ago
On Mon, 2024-10-28 at 21:38 +0100, Andrew Lunn wrote:
> > As this new struct will live in UAPI, to avoid breaking user-space code
> > that expects `struct sockaddr`, the `__kernel_sockaddr_legacy` macro is
> > introduced. This macro allows us to use either `struct sockaddr` or
> > `struct sockaddr_legacy` depending on the context in which the code is
> > used: kernel-space or user-space.
> 
> Are there cases of userspace API structures where the flexiable array
> appears in the middle?

Clearly, it's the case for all the three other patches in this series.

> I assume this new compiler flag is not only for
> use in the kernel? When it gets turned on in user space, will the
> kernel headers will again produce warnings? Should we be considering
> allowing user space to opt in to using sockaddr_legacy?

For the userspace covered by patch 2 this will almost certainly never
happen, and I suspect that might also be true for the others (arp and
rtnetlink ioctls)? But it probably wouldn't be difficult either.

johannes 
Re: [PATCH v2 1/4][next] uapi: socket: Introduce struct sockaddr_legacy
Posted by Kees Cook 3 weeks, 6 days ago
On Mon, Oct 28, 2024 at 09:47:08PM +0100, Johannes Berg wrote:
> On Mon, 2024-10-28 at 21:38 +0100, Andrew Lunn wrote:
> > > As this new struct will live in UAPI, to avoid breaking user-space code
> > > that expects `struct sockaddr`, the `__kernel_sockaddr_legacy` macro is
> > > introduced. This macro allows us to use either `struct sockaddr` or
> > > `struct sockaddr_legacy` depending on the context in which the code is
> > > used: kernel-space or user-space.
> > 
> > Are there cases of userspace API structures where the flexiable array
> > appears in the middle?
> 
> Clearly, it's the case for all the three other patches in this series.

The issue is that the kernel uses these structures, and the kernel's view
of sockaddr is that it (correctly) has a flexible array.  Userspace's view
of sockaddr is the old struct (which comes from the libc, not the kernel)
which ends with a fake flexible array. We need to correct the kernel's
view of these structures to use the introduced legacy struct to avoid
lying to the compiler about what's going on. :)

-- 
Kees Cook