Currently, commands that qualify as throttled can't be used via the
async API. That's due to the fact that the throttle semaphore can sleep
but the async API can't.
This patch allows throttling in the async API by using the tentative
variant of the semaphore and upon failure (semaphore at 0) returns EBUSY
to signal to the caller that they need to wait for the completion of
previously issued commands.
Furthermore, make sure that the semaphore is released in the callback.
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 ++++++++++++++-----
1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 20768ef2e9d2..f69c977c1569 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1882,10 +1882,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
throttle_op = mlx5_cmd_is_throttle_opcode(opcode);
if (throttle_op) {
- /* atomic context may not sleep */
- if (callback)
- return -EINVAL;
- down(&dev->cmd.vars.throttle_sem);
+ if (callback) {
+ if (down_trylock(&dev->cmd.vars.throttle_sem))
+ return -EBUSY;
+ } else {
+ down(&dev->cmd.vars.throttle_sem);
+ }
}
pages_queue = is_manage_pages(in);
@@ -2091,10 +2093,19 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
{
struct mlx5_async_work *work = _work;
struct mlx5_async_ctx *ctx;
+ struct mlx5_core_dev *dev;
+ u16 opcode;
ctx = work->ctx;
- status = cmd_status_err(ctx->dev, status, work->opcode, work->op_mod, work->out);
+ dev = ctx->dev;
+ opcode = work->opcode;
+ status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out);
work->user_callback(status, work);
+ /* Can't access "work" from this point on. It could have been freed in
+ * the callback.
+ */
+ if (mlx5_cmd_is_throttle_opcode(opcode))
+ up(&dev->cmd.vars.throttle_sem);
if (atomic_dec_and_test(&ctx->num_inflight))
complete(&ctx->inflight_done);
}
--
2.45.1
On 16.08.24 11:01, Dragos Tatulea wrote:
> Currently, commands that qualify as throttled can't be used via the
> async API. That's due to the fact that the throttle semaphore can sleep
> but the async API can't.
>
> This patch allows throttling in the async API by using the tentative
> variant of the semaphore and upon failure (semaphore at 0) returns EBUSY
> to signal to the caller that they need to wait for the completion of
> previously issued commands.
>
> Furthermore, make sure that the semaphore is released in the callback.
>
> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
> Cc: Leon Romanovsky <leonro@nvidia.com>
> Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Same reminder as in v1: Tariq is the maintainer for mlx5 so his review
also counts as Acked-by.
Thanks,
Dragos
> ---
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 ++++++++++++++-----
> 1 file changed, 16 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> index 20768ef2e9d2..f69c977c1569 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
> @@ -1882,10 +1882,12 @@ static int cmd_exec(struct mlx5_core_dev *dev, void *in, int in_size, void *out,
>
> throttle_op = mlx5_cmd_is_throttle_opcode(opcode);
> if (throttle_op) {
> - /* atomic context may not sleep */
> - if (callback)
> - return -EINVAL;
> - down(&dev->cmd.vars.throttle_sem);
> + if (callback) {
> + if (down_trylock(&dev->cmd.vars.throttle_sem))
> + return -EBUSY;
> + } else {
> + down(&dev->cmd.vars.throttle_sem);
> + }
> }
>
> pages_queue = is_manage_pages(in);
> @@ -2091,10 +2093,19 @@ static void mlx5_cmd_exec_cb_handler(int status, void *_work)
> {
> struct mlx5_async_work *work = _work;
> struct mlx5_async_ctx *ctx;
> + struct mlx5_core_dev *dev;
> + u16 opcode;
>
> ctx = work->ctx;
> - status = cmd_status_err(ctx->dev, status, work->opcode, work->op_mod, work->out);
> + dev = ctx->dev;
> + opcode = work->opcode;
> + status = cmd_status_err(dev, status, work->opcode, work->op_mod, work->out);
> work->user_callback(status, work);
> + /* Can't access "work" from this point on. It could have been freed in
> + * the callback.
> + */
> + if (mlx5_cmd_is_throttle_opcode(opcode))
> + up(&dev->cmd.vars.throttle_sem);
> if (atomic_dec_and_test(&ctx->num_inflight))
> complete(&ctx->inflight_done);
> }
On Mon, Sep 9, 2024 at 11:33 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: > > > > On 16.08.24 11:01, Dragos Tatulea wrote: > > Currently, commands that qualify as throttled can't be used via the > > async API. That's due to the fact that the throttle semaphore can sleep > > but the async API can't. > > > > This patch allows throttling in the async API by using the tentative > > variant of the semaphore and upon failure (semaphore at 0) returns EBUSY > > to signal to the caller that they need to wait for the completion of > > previously issued commands. > > > > Furthermore, make sure that the semaphore is released in the callback. > > > > Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> > > Cc: Leon Romanovsky <leonro@nvidia.com> > > Reviewed-by: Tariq Toukan <tariqt@nvidia.com> > Same reminder as in v1: Tariq is the maintainer for mlx5 so his review > also counts as Acked-by. > Not sure if it was the case when you send the mail, but this series is already in the maintainer's branch: * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/ * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/commit/?h=vhost&id=691fd851e1bc8ec043798e1ab337305e6291cd6b Thanks!
On 11.09.24 10:00, Eugenio Perez Martin wrote: > On Mon, Sep 9, 2024 at 11:33 AM Dragos Tatulea <dtatulea@nvidia.com> wrote: >> >> >> >> On 16.08.24 11:01, Dragos Tatulea wrote: >>> Currently, commands that qualify as throttled can't be used via the >>> async API. That's due to the fact that the throttle semaphore can sleep >>> but the async API can't. >>> >>> This patch allows throttling in the async API by using the tentative >>> variant of the semaphore and upon failure (semaphore at 0) returns EBUSY >>> to signal to the caller that they need to wait for the completion of >>> previously issued commands. >>> >>> Furthermore, make sure that the semaphore is released in the callback. >>> >>> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> >>> Cc: Leon Romanovsky <leonro@nvidia.com> >>> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> >> Same reminder as in v1: Tariq is the maintainer for mlx5 so his review >> also counts as Acked-by. >> > > Not sure if it was the case when you send the mail, but this series is > already in the maintainer's branch: > * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/ > * https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git/commit/?h=vhost&id=691fd851e1bc8ec043798e1ab337305e6291cd6b It wasn't. Thanks for the notice! Thanks, Dragos
© 2016 - 2026 Red Hat, Inc.