tools/testing/selftests/drivers/net/hw/ncdevmem.c | 4 ++++ 1 file changed, 4 insertions(+)
devmem test fails on NIPA. Most likely we get skb(s) with readable
frags (why?) but the failure manifests as an OOM. The OOM happens
because ncdevmem spams the following message:
recvmsg ret=-1
recvmsg: Bad address
As of today, ncdevmem can't deal with various reasons of EFAULT:
- falling back to regular recvmsg for non-devmem skbs
- increasing ctrl_data size (can't happen with ncdevmem's large buffer)
Exit (cleanly) with error when recvmsg returns EFAULT. This should at
least cause the test to cleanup its state.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
tools/testing/selftests/drivers/net/hw/ncdevmem.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/testing/selftests/drivers/net/hw/ncdevmem.c b/tools/testing/selftests/drivers/net/hw/ncdevmem.c
index 8dc9511d046f..c0a22938bed2 100644
--- a/tools/testing/selftests/drivers/net/hw/ncdevmem.c
+++ b/tools/testing/selftests/drivers/net/hw/ncdevmem.c
@@ -945,6 +945,10 @@ static int do_server(struct memory_buffer *mem)
continue;
if (ret < 0) {
perror("recvmsg");
+ if (errno == EFAULT) {
+ pr_err("received EFAULT, won't recover");
+ goto err_close_client;
+ }
continue;
}
if (ret == 0) {
--
2.51.0
On Thu, Sep 4, 2025 at 11:27 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> devmem test fails on NIPA. Most likely we get skb(s) with readable
> frags (why?)
I would expect if we get readable frags that the frags land in the
host buffer we provide ncdevmem and we actually hit this error:
```
1 if (!is_devmem) {
0 pr_err("flow steering error");
1 goto err_close_client;
2 }
```
which as it says, should be root caused in a flow steering error. I
don't know what would cause an EFAULT off the top of my head.
> but the failure manifests as an OOM. The OOM happens
> because ncdevmem spams the following message:
>
> recvmsg ret=-1
> recvmsg: Bad address
>
> As of today, ncdevmem can't deal with various reasons of EFAULT:
> - falling back to regular recvmsg for non-devmem skbs
> - increasing ctrl_data size (can't happen with ncdevmem's large buffer)
>
> Exit (cleanly) with error when recvmsg returns EFAULT. This should at
> least cause the test to cleanup its state.
>
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Either way, change looks good to me.
Reviewed-by: Mina Almasry <almasrymina@google.com>
--
Thanks,
Mina
On 09/04, Mina Almasry wrote:
> On Thu, Sep 4, 2025 at 11:27 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > devmem test fails on NIPA. Most likely we get skb(s) with readable
> > frags (why?)
>
> I would expect if we get readable frags that the frags land in the
> host buffer we provide ncdevmem and we actually hit this error:
>
> ```
> 1 if (!is_devmem) {
> 0 pr_err("flow steering error");
> 1 goto err_close_client;
> 2 }
> ```
>
> which as it says, should be root caused in a flow steering error. I
> don't know what would cause an EFAULT off the top of my head.
Yea, I don't understand what happens :-( I'm thinking of doing the
following as well:
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 40b774b4f587..0c18a8c7965f 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2820,7 +2820,7 @@ static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len,
used);
if (err <= 0) {
if (!copied)
- copied = -EFAULT;
+ copied = err;
break;
}
Should give us more info for the devmem case... LMK if you don't like
it. If I don't hear from you in a couple of days, I'll send it out..
On Fri, Sep 5, 2025 at 8:22 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
>
> On 09/04, Mina Almasry wrote:
> > On Thu, Sep 4, 2025 at 11:27 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > >
> > > devmem test fails on NIPA. Most likely we get skb(s) with readable
> > > frags (why?)
> >
> > I would expect if we get readable frags that the frags land in the
> > host buffer we provide ncdevmem and we actually hit this error:
> >
> > ```
> > 1 if (!is_devmem) {
> > 0 pr_err("flow steering error");
> > 1 goto err_close_client;
> > 2 }
> > ```
> >
> > which as it says, should be root caused in a flow steering error. I
> > don't know what would cause an EFAULT off the top of my head.
>
> Yea, I don't understand what happens :-( I'm thinking of doing the
> following as well:
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 40b774b4f587..0c18a8c7965f 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -2820,7 +2820,7 @@ static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len,
> used);
> if (err <= 0) {
> if (!copied)
> - copied = -EFAULT;
> + copied = err;
>
> break;
> }
>
> Should give us more info for the devmem case... LMK if you don't like
> it. If I don't hear from you in a couple of days, I'll send it out..
Hmm, the other code paths overwrite the error to EFAULT; I don't know
if that's significant in some way. But seems fine to me, I don't see
why not do this, other than maybe potentional confusion with recvmsg
returning an error not documented here:
https://linux.die.net/man/2/recvmsg
But that seems a minor point.
--
Thanks,
Mina
On 09/05, Mina Almasry wrote:
> On Fri, Sep 5, 2025 at 8:22 AM Stanislav Fomichev <stfomichev@gmail.com> wrote:
> >
> > On 09/04, Mina Almasry wrote:
> > > On Thu, Sep 4, 2025 at 11:27 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
> > > >
> > > > devmem test fails on NIPA. Most likely we get skb(s) with readable
> > > > frags (why?)
> > >
> > > I would expect if we get readable frags that the frags land in the
> > > host buffer we provide ncdevmem and we actually hit this error:
> > >
> > > ```
> > > 1 if (!is_devmem) {
> > > 0 pr_err("flow steering error");
> > > 1 goto err_close_client;
> > > 2 }
> > > ```
> > >
> > > which as it says, should be root caused in a flow steering error. I
> > > don't know what would cause an EFAULT off the top of my head.
> >
> > Yea, I don't understand what happens :-( I'm thinking of doing the
> > following as well:
> >
> > diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> > index 40b774b4f587..0c18a8c7965f 100644
> > --- a/net/ipv4/tcp.c
> > +++ b/net/ipv4/tcp.c
> > @@ -2820,7 +2820,7 @@ static int tcp_recvmsg_locked(struct sock *sk, struct msghdr *msg, size_t len,
> > used);
> > if (err <= 0) {
> > if (!copied)
> > - copied = -EFAULT;
> > + copied = err;
> >
> > break;
> > }
> >
> > Should give us more info for the devmem case... LMK if you don't like
> > it. If I don't hear from you in a couple of days, I'll send it out..
>
> Hmm, the other code paths overwrite the error to EFAULT; I don't know
> if that's significant in some way. But seems fine to me, I don't see
> why not do this, other than maybe potentional confusion with recvmsg
> returning an error not documented here:
>
> https://linux.die.net/man/2/recvmsg
>
> But that seems a minor point.
SG! Exporting new errnos seems to be ok because we are dealing with a
new devmem mode? I think the bug we have on the fbnic is that we don't
properly set skb->unreadable, so it's completely unrelated to this,
but still feels like it might help with future debugging..
© 2016 - 2026 Red Hat, Inc.