It is impossible to use init_dummy_netdev() as the 'setup' argument of
alloc_netdev().

This is because alloc_netdev() initializes some fields in the net_device
structure, and init_dummy_netdev() later zeroes them all with memset().
This causes problems, as reported here:

	https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/

Split init_dummy_netdev() in two. Create a new function,
init_dummy_netdev_core(), that does not zero the net_device structure.
Then have init_dummy_netdev() zero the structure and call
init_dummy_netdev_core(), keeping the old behaviour.

init_dummy_netdev_core() is the function that can be passed as the
'setup' argument of alloc_netdev().

Also, create a helper, alloc_netdev_dummy(), that allocates and
initializes a dummy net device in one call, using
init_dummy_netdev_core() as the setup argument. Devices allocated this
way are still freed through free_netdev().

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 include/linux/netdevice.h |  3 +++
 net/core/dev.c            | 54 ++++++++++++++++++++++++++-------------
 2 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0c198620ac93..544767d218c0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4517,6 +4517,9 @@ static inline void netif_addr_unlock_bh(struct net_device *dev)
 
 void ether_setup(struct net_device *dev);
 
+/* Allocate dummy net_device */
+struct net_device *alloc_netdev_dummy(int sizeof_priv);
+
 /* Support for loadable net-drivers */
 struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 				    unsigned char name_assign_type,
diff --git a/net/core/dev.c b/net/core/dev.c
index bf0a335781aa..5d2cb97d0ae6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10413,25 +10413,12 @@ int register_netdevice(struct net_device *dev)
 }
 EXPORT_SYMBOL(register_netdevice);
 
-/**
- * init_dummy_netdev - init a dummy network device for NAPI
- * @dev: device to init
- *
- * This takes a network device structure and initialize the minimum
- * amount of fields so it can be used to schedule NAPI polls without
- * registering a full blown interface. This is to be used by drivers
- * that need to tie several hardware interfaces to a single NAPI
- * poll scheduler due to HW limitations.
+/* Initialize the core of a dummy net device.
+ * This is useful if you are calling this function after alloc_netdev(),
+ * since it does not memset the net_device fields.
  */
-void init_dummy_netdev(struct net_device *dev)
+static void init_dummy_netdev_core(struct net_device *dev)
 {
-	/* Clear everything. Note we don't initialize spinlocks
-	 * are they aren't supposed to be taken by any of the
-	 * NAPI code and this dummy netdev is supposed to be
-	 * only ever used for NAPI polls
-	 */
-	memset(dev, 0, sizeof(struct net_device));
-
 	/* make sure we BUG if trying to hit standard
 	 * register/unregister code path
 	 */
@@ -10452,8 +10439,28 @@ void init_dummy_netdev(struct net_device *dev)
 	 * its refcount.
 	 */
 }
-EXPORT_SYMBOL_GPL(init_dummy_netdev);
 
+/**
+ * init_dummy_netdev - init a dummy network device for NAPI
+ * @dev: device to init
+ *
+ * This takes a network device structure and initialize the minimum
+ * amount of fields so it can be used to schedule NAPI polls without
+ * registering a full blown interface. This is to be used by drivers
+ * that need to tie several hardware interfaces to a single NAPI
+ * poll scheduler due to HW limitations.
+ */
+void init_dummy_netdev(struct net_device *dev)
+{
+	/* Clear everything. Note we don't initialize spinlocks
+	 * are they aren't supposed to be taken by any of the
+	 * NAPI code and this dummy netdev is supposed to be
+	 * only ever used for NAPI polls
+	 */
+	memset(dev, 0, sizeof(struct net_device));
+	init_dummy_netdev_core(dev);
+}
+EXPORT_SYMBOL_GPL(init_dummy_netdev);
 
 /**
  * register_netdev - register a network device
@@ -11065,6 +11072,17 @@ void free_netdev(struct net_device *dev)
 }
 EXPORT_SYMBOL(free_netdev);
 
+/**
+ * alloc_netdev_dummy - Allocate and initialize a dummy net device.
+ * @sizeof_priv: size of private data to allocate space for
+ */
+struct net_device *alloc_netdev_dummy(int sizeof_priv)
+{
+	return alloc_netdev(sizeof_priv, "dummy#", NET_NAME_UNKNOWN,
+			    init_dummy_netdev_core);
+}
+EXPORT_SYMBOL_GPL(alloc_netdev_dummy);
+
 /**
  * synchronize_net - Synchronize with packet receive processing
  *
--
2.43.0
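
For readers following the thread, the problem described in the commit
message can be modelled in plain user-space C. This is only a sketch
under stated assumptions, not kernel code: struct fake_netdev,
fake_alloc(), the alloc_cookie field and FAKE_COOKIE are hypothetical
stand-ins. The only behaviour borrowed from the patch is that the
allocator fills in some state before it invokes the 'setup' callback,
and that a callback which zeroes the structure destroys that state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FAKE_COOKIE 0x5a5a /* hypothetical allocator-owned value */

/* Hypothetical stand-in for struct net_device; not the kernel layout. */
struct fake_netdev {
	int alloc_cookie; /* written by the allocator before setup() runs */
	int is_dummy;     /* written by the setup callback */
};

/* Models init_dummy_netdev_core(): touches only its own fields. */
void fake_dummy_core(struct fake_netdev *dev)
{
	dev->is_dummy = 1;
}

/* Models init_dummy_netdev(): zero everything, then run the core. */
void fake_dummy_init(struct fake_netdev *dev)
{
	memset(dev, 0, sizeof(*dev));
	fake_dummy_core(dev);
}

/* Models alloc_netdev(): set allocator state, then call setup(). */
struct fake_netdev *fake_alloc(void (*setup)(struct fake_netdev *))
{
	struct fake_netdev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->alloc_cookie = FAKE_COOKIE;
	setup(dev);
	return dev;
}

/* Models alloc_netdev_dummy(): always pass the non-zeroing callback. */
struct fake_netdev *fake_alloc_dummy(void)
{
	return fake_alloc(fake_dummy_core);
}
```

In this model, fake_alloc(fake_dummy_init) returns a device whose
alloc_cookie has been wiped back to 0, while fake_alloc_dummy() keeps it
at FAKE_COOKIE. Both are released with free(), in the same way that the
patch keeps free_netdev() as the release path.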

On Tue, Apr 09, 2024 at 05:57:16AM -0700, Breno Leitao wrote:
> It is impossible to use init_dummy_netdev together with alloc_netdev()
> as the 'setup' argument.
>
> This is because alloc_netdev() initializes some fields in the net_device
> structure, and later init_dummy_netdev() memzero them all. This causes
> some problems as reported here:
>
> https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/
>
> Split the init_dummy_netdev() function in two. Create a new function called
> init_dummy_netdev_core() that does not memzero the net_device structure.
> Then have init_dummy_netdev() memzero-ing and calling
> init_dummy_netdev_core(), keeping the old behaviour.
>
> init_dummy_netdev_core() is the new function that could be called as an
> argument for alloc_netdev().
>
> Also, create a helper to allocate and initialize dummy net devices,
> leveraging init_dummy_netdev_core() as the setup argument. This function
> basically simplify the allocation of dummy devices, by allocating and
> initializing it. Freeing the device continue to be done through
> free_netdev()
>
> Suggested-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>

We were about to submit another user of init_dummy_netdev() when I
noticed this patch. Converted the code to use alloc_netdev_dummy() [1]
and it seems to be working fine. Will submit after your patch is
accepted.

See a few minor comments below.

[...]

> +/**
> + * init_dummy_netdev - init a dummy network device for NAPI
> + * @dev: device to init
> + *
> + * This takes a network device structure and initialize the minimum

s/initialize/initializes/

> + * amount of fields so it can be used to schedule NAPI polls without
> + * registering a full blown interface. This is to be used by drivers
> + * that need to tie several hardware interfaces to a single NAPI
> + * poll scheduler due to HW limitations.
> + */
> +void init_dummy_netdev(struct net_device *dev)
> +{
> +	/* Clear everything. Note we don't initialize spinlocks
> +	 * are they aren't supposed to be taken by any of the

I assume you meant s/are/as/ ?

> +	 * NAPI code and this dummy netdev is supposed to be
> +	 * only ever used for NAPI polls
> +	 */
> +	memset(dev, 0, sizeof(struct net_device));
> +	init_dummy_netdev_core(dev);
> +}
> +EXPORT_SYMBOL_GPL(init_dummy_netdev);

[1]
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index db2950baf6b4..bf66d996e32e 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -132,20 +132,40 @@ struct mlxsw_pci {
 	u8 num_cqs; /* Number of CQs */
 	u8 num_sdqs; /* Number of SDQs */
 	bool skip_reset;
-	struct net_device napi_dev_tx;
-	struct net_device napi_dev_rx;
+	struct net_device *napi_dev_tx;
+	struct net_device *napi_dev_rx;
 };
 
-static void mlxsw_pci_napi_devs_init(struct mlxsw_pci *mlxsw_pci)
+static int mlxsw_pci_napi_devs_init(struct mlxsw_pci *mlxsw_pci)
 {
-	init_dummy_netdev(&mlxsw_pci->napi_dev_tx);
-	strscpy(mlxsw_pci->napi_dev_tx.name, "mlxsw_tx",
-		sizeof(mlxsw_pci->napi_dev_tx.name));
+	int err;
+
+	mlxsw_pci->napi_dev_tx = alloc_netdev_dummy(0);
+	if (!mlxsw_pci->napi_dev_tx)
+		return -ENOMEM;
+	strscpy(mlxsw_pci->napi_dev_tx->name, "mlxsw_tx",
+		sizeof(mlxsw_pci->napi_dev_tx->name));
+
+	mlxsw_pci->napi_dev_rx = alloc_netdev_dummy(0);
+	if (!mlxsw_pci->napi_dev_rx) {
+		err = -ENOMEM;
+		goto err_alloc_rx;
+	}
+	strscpy(mlxsw_pci->napi_dev_rx->name, "mlxsw_rx",
+		sizeof(mlxsw_pci->napi_dev_rx->name));
+	dev_set_threaded(mlxsw_pci->napi_dev_rx, true);
+
+	return 0;
 
-	init_dummy_netdev(&mlxsw_pci->napi_dev_rx);
-	strscpy(mlxsw_pci->napi_dev_rx.name, "mlxsw_rx",
-		sizeof(mlxsw_pci->napi_dev_rx.name));
-	dev_set_threaded(&mlxsw_pci->napi_dev_rx, true);
+err_alloc_rx:
+	free_netdev(mlxsw_pci->napi_dev_tx);
+	return err;
+}
+
+static void mlxsw_pci_napi_devs_fini(struct mlxsw_pci *mlxsw_pci)
+{
+	free_netdev(mlxsw_pci->napi_dev_rx);
+	free_netdev(mlxsw_pci->napi_dev_tx);
 }
 
 static char *__mlxsw_pci_queue_elem_get(struct mlxsw_pci_queue *q,
@@ -804,11 +824,11 @@ static void mlxsw_pci_cq_napi_setup(struct mlxsw_pci_queue *q,
 
 	switch (cq_type) {
 	case MLXSW_PCI_CQ_SDQ:
-		netif_napi_add(&mlxsw_pci->napi_dev_tx, &q->u.cq.napi,
+		netif_napi_add(mlxsw_pci->napi_dev_tx, &q->u.cq.napi,
 			       mlxsw_pci_napi_poll_cq_tx);
 		break;
 	case MLXSW_PCI_CQ_RDQ:
-		netif_napi_add(&mlxsw_pci->napi_dev_rx, &q->u.cq.napi,
+		netif_napi_add(mlxsw_pci->napi_dev_rx, &q->u.cq.napi,
 			       mlxsw_pci_napi_poll_cq_rx);
 		break;
 	}
@@ -1793,7 +1813,10 @@ static int mlxsw_pci_init(void *bus_priv, struct mlxsw_core *mlxsw_core,
 	if (err)
 		goto err_requery_resources;
 
-	mlxsw_pci_napi_devs_init(mlxsw_pci);
+	err = mlxsw_pci_napi_devs_init(mlxsw_pci);
+	if (err)
+		goto err_napi_devs_init;
+
 	err = mlxsw_pci_aqs_init(mlxsw_pci, mbox);
 	if (err)
 		goto err_aqs_init;
@@ -1811,6 +1834,8 @@ static int mlxsw_pci_init(void *bus_priv, struct mlxsw_core *mlxsw_core,
 err_request_eq_irq:
 	mlxsw_pci_aqs_fini(mlxsw_pci);
 err_aqs_init:
+	mlxsw_pci_napi_devs_fini(mlxsw_pci);
+err_napi_devs_init:
 err_requery_resources:
 err_config_profile:
 err_cqe_v_check:
@@ -1838,6 +1863,7 @@ static void mlxsw_pci_fini(void *bus_priv)
 
 	free_irq(pci_irq_vector(mlxsw_pci->pdev, 0), mlxsw_pci);
 	mlxsw_pci_aqs_fini(mlxsw_pci);
+	mlxsw_pci_napi_devs_fini(mlxsw_pci);
 	mlxsw_pci_fw_area_fini(mlxsw_pci);
 	mlxsw_pci_free_irq_vectors(mlxsw_pci);
 }

On Wed, Apr 10, 2024 at 02:10:04PM +0300, Ido Schimmel wrote:
> On Tue, Apr 09, 2024 at 05:57:16AM -0700, Breno Leitao wrote:
> > It is impossible to use init_dummy_netdev together with alloc_netdev()
> > as the 'setup' argument.
> >
> > This is because alloc_netdev() initializes some fields in the net_device
> > structure, and later init_dummy_netdev() memzero them all. This causes
> > some problems as reported here:
> >
> > https://lore.kernel.org/all/20240322082336.49f110cc@kernel.org/
> >
> > Split the init_dummy_netdev() function in two. Create a new function called
> > init_dummy_netdev_core() that does not memzero the net_device structure.
> > Then have init_dummy_netdev() memzero-ing and calling
> > init_dummy_netdev_core(), keeping the old behaviour.
> >
> > init_dummy_netdev_core() is the new function that could be called as an
> > argument for alloc_netdev().
> >
> > Also, create a helper to allocate and initialize dummy net devices,
> > leveraging init_dummy_netdev_core() as the setup argument. This function
> > basically simplify the allocation of dummy devices, by allocating and
> > initializing it. Freeing the device continue to be done through
> > free_netdev()
> >
> > Suggested-by: Jakub Kicinski <kuba@kernel.org>
> > Signed-off-by: Breno Leitao <leitao@debian.org>
>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
>
> We were about to submit another user of init_dummy_netdev() when I
> noticed this patch. Converted the code to use alloc_netdev_dummy() [1]
> and it seems to be working fine. Will submit after your patch is
> accepted.

Thanks. It seems that this patch is close to being accepted. Let's see...

> See a few minor comments below.
>
> [...]
>
> > +/**
> > + * init_dummy_netdev - init a dummy network device for NAPI
> > + * @dev: device to init
> > + *
> > + * This takes a network device structure and initialize the minimum
>
> s/initialize/initializes/
>
> > + * amount of fields so it can be used to schedule NAPI polls without
> > + * registering a full blown interface. This is to be used by drivers
> > + * that need to tie several hardware interfaces to a single NAPI
> > + * poll scheduler due to HW limitations.
> > + */
> > +void init_dummy_netdev(struct net_device *dev)
> > +{
> > +	/* Clear everything. Note we don't initialize spinlocks
> > +	 * are they aren't supposed to be taken by any of the
>
> I assume you meant s/are/as/ ?

Thanks for the feedback, I agree with both comments.

Since these lines were not introduced by this patch, and this patch is
just moving code (and comments) around, I will add a new patch to the
series to fix the grammar errors.