[PATCH net v2] net-sysfs: check device is present when showing duplex

Jamie Bainbridge posted 1 patch 1 year, 6 months ago
There is a newer version of this series
Posted by Jamie Bainbridge 1 year, 6 months ago
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present.

This is the same sort of panic as observed in commit 4224cfd7fb65
("net-sysfs: add check for netdevice being present to speed_show"):

     [exception RIP: qed_get_current_link+17]
  #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
   state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).
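
The decoding above can be double-checked with a small userspace sketch.
It only assumes the three bit positions quoted in this message (0b1,
0b10, 0b100). The enum names, state_test_bit() and print_link_state()
are hypothetical helpers for illustration, not kernel symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Bit positions as quoted in the commit message: 0b1, 0b10, 0b100.
 * The names are local to this sketch, not the kernel's identifiers. */
enum link_state_bit {
	LINK_STATE_START     = 0,
	LINK_STATE_PRESENT   = 1,
	LINK_STATE_NOCARRIER = 2,
};

/* Return true if bit 'nr' is set in 'state'. */
static bool state_test_bit(unsigned long state, int nr)
{
	assert(nr >= 0 && nr < (int)(8 * sizeof(state)));
	return (state >> nr) & 1UL;
}

/* Print a one-line decode of a net_device.state value from a crash dump. */
static void print_link_state(unsigned long state)
{
	printf("state=%lu start=%d present=%d nocarrier=%d\n",
	       state,
	       state_test_bit(state, LINK_STATE_START),
	       state_test_bit(state, LINK_STATE_PRESENT),
	       state_test_bit(state, LINK_STATE_NOCARRIER));
}
```

Calling print_link_state(5) reports the start and nocarrier bits as set
and the present bit as clear, which matches the reading given above.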

Resolve by adding the same netif_device_present() check to duplex_show.

Fixes: 8ae6daca85c8 ("ethtool: Call ethtool's get/set_settings callbacks with cleaned data")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
---
v2: Restrict patch to just required path and describe problem in more
    detail as suggested by Johannes Berg. Improve commit message format
    as suggested by Shigeru Yoshida.
---
 net/core/net-sysfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 0e2084ce7b7572bff458ed7e02358d9258c74628..22801d165d852a6578ca625b9674090519937be5 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -261,7 +261,7 @@ static ssize_t duplex_show(struct device *dev,
 	if (!rtnl_trylock())
 		return restart_syscall();
 
-	if (netif_running(netdev)) {
+	if (netif_running(netdev) && netif_device_present(netdev)) {
 		struct ethtool_link_ksettings cmd;
 
 		if (!__ethtool_get_link_ksettings(netdev, &cmd)) {
-- 
2.39.2
Re: [PATCH net v2] net-sysfs: check device is present when showing duplex
Posted by Shigeru Yoshida 1 year, 6 months ago
Hi Jamie,

On Mon, 29 Jul 2024 10:12:10 +1000, Jamie Bainbridge wrote:
> A sysfs reader can race with a device reset or removal, attempting to
> read device state when the device is not actually present.
> 
> This is the same sort of panic as observed in commit 4224cfd7fb65
> ("net-sysfs: add check for netdevice being present to speed_show"):
> 
>      [exception RIP: qed_get_current_link+17]
>   #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
>   #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
>  #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
>  #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
>  #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
>  #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
>  #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
>  #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
>  #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
>  #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb
> 
>  crash> struct net_device.state ffff9a9d21336000
>    state = 5,
> 
> state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
> The device is not present, note lack of __LINK_STATE_PRESENT (0b10).
> 
> Resolve by adding the same netif_device_present() check to duplex_show.
> 
> Fixes: 8ae6daca85c8 ("ethtool: Call ethtool's get/set_settings callbacks with cleaned data")
> Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
> ---
> v2: Restrict patch to just required path and describe problem in more
>     detail as suggested by Johannes Berg. Improve commit message format
>     as suggested by Shigeru Yoshida.
> ---
>  net/core/net-sysfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
> index 0e2084ce7b7572bff458ed7e02358d9258c74628..22801d165d852a6578ca625b9674090519937be5 100644
> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -261,7 +261,7 @@ static ssize_t duplex_show(struct device *dev,
>  	if (!rtnl_trylock())
>  		return restart_syscall();
>  
> -	if (netif_running(netdev)) {
> +	if (netif_running(netdev) && netif_device_present(netdev)) {
>  		struct ethtool_link_ksettings cmd;
>  
>  		if (!__ethtool_get_link_ksettings(netdev, &cmd)) {

As for the qede driver mentioned in the commit log, I assume the race
is between duplex_show() and qede_recovery_handler().
qede_recovery_handler() clears __LINK_STATE_PRESENT on recovery
failure and is called with the rtnl lock held, so I think the patch
works correctly.
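
To illustrate the reasoning, here is a minimal userspace sketch of the
guarded reader. It is not the kernel code: every mock_* name is
hypothetical, the lock is modelled with a C11 atomic_flag instead of
the real rtnl mutex, and the -EINVAL / -EAGAIN return values are
assumptions of this sketch. It shows that once a recovery failure has
cleared the PRESENT bit under the lock, a reader that checks both bits
under the same lock never reaches the driver callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

#define STATE_START   (1UL << 0)
#define STATE_PRESENT (1UL << 1)

/* Hypothetical stand-in for struct net_device. */
struct mock_netdev {
	unsigned long state;
	int driver_calls;	/* how often the mock driver was queried */
};

/* Hypothetical stand-in for the rtnl lock. */
static atomic_flag mock_rtnl = ATOMIC_FLAG_INIT;

static bool mock_rtnl_trylock(void)
{
	return !atomic_flag_test_and_set(&mock_rtnl);
}

static void mock_rtnl_unlock(void)
{
	atomic_flag_clear(&mock_rtnl);
}

/* Stand-in for the driver callback; only valid while the device is present. */
static int mock_get_duplex(struct mock_netdev *dev)
{
	assert(dev->state & STATE_PRESENT);
	dev->driver_calls++;
	return 1;	/* means "full duplex" in this sketch */
}

/* Models a failed recovery: clear PRESENT while holding the lock. */
static void mock_recovery_failed(struct mock_netdev *dev)
{
	while (!mock_rtnl_trylock())
		;	/* spin until the lock is free */
	dev->state &= ~STATE_PRESENT;
	mock_rtnl_unlock();
}

/* Models the patched reader: query the driver only if running and present. */
static int mock_duplex_show(struct mock_netdev *dev)
{
	int ret = -EINVAL;

	if (!mock_rtnl_trylock())
		return -EAGAIN;	/* the real code restarts the syscall instead */

	if ((dev->state & STATE_START) && (dev->state & STATE_PRESENT))
		ret = mock_get_duplex(dev);

	mock_rtnl_unlock();
	return ret;
}
```

With a device that starts out running and present, mock_duplex_show()
reaches the mock driver; after mock_recovery_failed() it returns an
error without touching it.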

As Paolo mentioned, I think the issue dates back to when
duplex_show()/show_duplex() was first added.

Anyway,

Reviewed-by: Shigeru Yoshida <syoshida@redhat.com>

> -- 
> 2.39.2
> 
>
Re: [PATCH net v2] net-sysfs: check device is present when showing duplex
Posted by Paolo Abeni 1 year, 6 months ago
Hi,

On 7/29/24 02:12, Jamie Bainbridge wrote:
> A sysfs reader can race with a device reset or removal, attempting to
> read device state when the device is not actually present.
> 
> This is the same sort of panic as observed in commit 4224cfd7fb65
> ("net-sysfs: add check for netdevice being present to speed_show"):
> 
>       [exception RIP: qed_get_current_link+17]
>    #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
>    #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
>   #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
>   #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
>   #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
>   #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
>   #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
>   #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
>   #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
>   #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb
> 
>   crash> struct net_device.state ffff9a9d21336000
>     state = 5,
> 
> state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
> The device is not present, note lack of __LINK_STATE_PRESENT (0b10).
> 
> Resolve by adding the same netif_device_present() check to duplex_show.
> 
> Fixes: 8ae6daca85c8 ("ethtool: Call ethtool's get/set_settings callbacks with cleaned data")

The patch LGTM, but it looks like the issue predates the blamed commit
above. Possibly:

Fixes: d519e17e2d01 ("net: export device speed and duplex via sysfs")

Also, please explicitly CC people who gave feedback on previous revisions.

Thanks,

Paolo