Date: Thu, 17 Jul 2025 19:20:24 +0000
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
Message-ID: <20250717192024.1820931-1-hramamurthy@google.com>
Subject: [PATCH net v2] gve: Fix stuck TX queue for DQ queue format
From: Harshitha Ramamurthy
To: netdev@vger.kernel.org
Cc: jeroendb@google.com, hramamurthy@google.com, andrew+netdev@lunn.ch,
 davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
 pabeni@redhat.com, pkaligineedi@google.com, willemb@google.com,
 joshwash@google.com, ziweixiao@google.com, jfraker@google.com,
 awogbemila@google.com, linux-kernel@vger.kernel.org,
 stable@vger.kernel.org, Tim Hostetler

From: Praveen Kaligineedi

gve_tx_timeout was calculating missed completions in a way that is only
relevant in the GQ queue format. Additionally, it was attempting to
disable device interrupts, which is not needed in either GQ or DQ queue
formats.

As a result, TX timeouts with the DQ queue format likely would have
triggered early resets without kicking the queue at all.

This patch drops the check for pending work altogether and always kicks
the queue after validating the queue has not seen a TX timeout too
recently.

Cc: stable@vger.kernel.org
Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ")
Co-developed-by: Tim Hostetler
Signed-off-by: Tim Hostetler
Signed-off-by: Praveen Kaligineedi
Signed-off-by: Harshitha Ramamurthy
---
Changes in v2:
- Refactor out gve_tx_timeout_try_q_kick to remove goto statements
  (Jakub Kicinski)
---
 drivers/net/ethernet/google/gve/gve_main.c | 67 ++++++++++++++++++++++++-------------------
 1 file changed, 37 insertions(+), 30 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index c3791cf23c87..2fdb58646132 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -1916,49 +1916,56 @@ static void gve_turnup_and_check_status(struct gve_priv *priv)
 	gve_handle_link_status(priv, GVE_DEVICE_STATUS_LINK_STATUS_MASK & status);
 }
 
-static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
+static struct gve_notify_block *gve_get_tx_notify_block(struct gve_priv *priv,
+							unsigned int txqueue)
 {
-	struct gve_notify_block *block;
-	struct gve_tx_ring *tx = NULL;
-	struct gve_priv *priv;
-	u32 last_nic_done;
-	u32 current_time;
 	u32 ntfy_idx;
 
-	netdev_info(dev, "Timeout on tx queue, %d", txqueue);
-	priv = netdev_priv(dev);
 	if (txqueue > priv->tx_cfg.num_queues)
-		goto reset;
+		return NULL;
 
 	ntfy_idx = gve_tx_idx_to_ntfy(priv, txqueue);
 	if (ntfy_idx >= priv->num_ntfy_blks)
-		goto reset;
+		return NULL;
+
+	return &priv->ntfy_blocks[ntfy_idx];
+}
+
+static bool gve_tx_timeout_try_q_kick(struct gve_priv *priv,
+				      unsigned int txqueue)
+{
+	struct gve_notify_block *block;
+	u32 current_time;
 
-	block = &priv->ntfy_blocks[ntfy_idx];
-	tx = block->tx;
+	block = gve_get_tx_notify_block(priv, txqueue);
+
+	if (!block)
+		return false;
 
 	current_time = jiffies_to_msecs(jiffies);
-	if (tx->last_kick_msec + MIN_TX_TIMEOUT_GAP > current_time)
-		goto reset;
+	if (block->tx->last_kick_msec + MIN_TX_TIMEOUT_GAP > current_time)
+		return false;
 
-	/* Check to see if there are missed completions, which will allow us to
-	 * kick the queue.
-	 */
-	last_nic_done = gve_tx_load_event_counter(priv, tx);
-	if (last_nic_done - tx->done) {
-		netdev_info(dev, "Kicking queue %d", txqueue);
-		iowrite32be(GVE_IRQ_MASK, gve_irq_doorbell(priv, block));
-		napi_schedule(&block->napi);
-		tx->last_kick_msec = current_time;
-		goto out;
-	} // Else reset.
+	netdev_info(priv->dev, "Kicking queue %d", txqueue);
+	napi_schedule(&block->napi);
+	block->tx->last_kick_msec = current_time;
+	return true;
+}
 
-reset:
-	gve_schedule_reset(priv);
+static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
+{
+	struct gve_notify_block *block;
+	struct gve_priv *priv;
 
-out:
-	if (tx)
-		tx->queue_timeout++;
+	netdev_info(dev, "Timeout on tx queue, %d", txqueue);
+	priv = netdev_priv(dev);
+
+	if (!gve_tx_timeout_try_q_kick(priv, txqueue))
+		gve_schedule_reset(priv);
+
+	block = gve_get_tx_notify_block(priv, txqueue);
+	if (block)
+		block->tx->queue_timeout++;
 	priv->tx_timeo_cnt++;
 }
 
-- 
2.50.0.727.gbf7dc18ff4-goog
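
For readers who want to try the kick-or-reset policy outside the kernel, here is a minimal user-space sketch of the control flow this patch describes: kick the queue unless it was kicked too recently, otherwise fall back to a reset. It is not driver code. Every `model_*` name and the `MODEL_MIN_TX_TIMEOUT_GAP_MS` value are hypothetical stand-ins; the real `MIN_TX_TIMEOUT_GAP` is defined elsewhere in the driver and does not appear in this patch. The model also folds the two bounds checks in gve_get_tx_notify_block into a single index check and replaces napi_schedule and gve_schedule_reset with counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical placeholder for the driver's MIN_TX_TIMEOUT_GAP, in ms. */
#define MODEL_MIN_TX_TIMEOUT_GAP_MS 10000u

/* Hypothetical, simplified per-queue state (not struct gve_tx_ring). */
struct model_txq {
	uint32_t last_kick_msec;	/* time of the most recent kick */
	uint32_t queue_timeout;		/* timeouts observed on this queue */
};

/* Hypothetical, simplified device state (not struct gve_priv). */
struct model_dev {
	struct model_txq *queues;
	unsigned int num_queues;
	uint32_t tx_timeo_cnt;		/* timeouts observed on the device */
	uint32_t kicks;			/* stands in for napi_schedule() */
	uint32_t resets;		/* stands in for gve_schedule_reset() */
};

/* Return the queue for txqueue, or NULL if the index is out of range. */
static struct model_txq *model_get_txq(struct model_dev *dev,
				       unsigned int txqueue)
{
	if (txqueue >= dev->num_queues)
		return NULL;
	return &dev->queues[txqueue];
}

/* Kick the queue unless it is invalid or was kicked too recently. */
static bool model_try_kick(struct model_dev *dev, unsigned int txqueue,
			   uint32_t now_msec)
{
	struct model_txq *q = model_get_txq(dev, txqueue);

	if (!q)
		return false;
	if (q->last_kick_msec + MODEL_MIN_TX_TIMEOUT_GAP_MS > now_msec)
		return false;

	dev->kicks++;
	q->last_kick_msec = now_msec;
	return true;
}

/* Timeout handler: try a kick first, reset only if the kick is refused. */
static void model_tx_timeout(struct model_dev *dev, unsigned int txqueue,
			     uint32_t now_msec)
{
	struct model_txq *q;

	assert(dev != NULL);

	if (!model_try_kick(dev, txqueue, now_msec))
		dev->resets++;

	q = model_get_txq(dev, txqueue);
	if (q)
		q->queue_timeout++;
	dev->tx_timeo_cnt++;
}
```

With one queue whose last_kick_msec starts at 0, a timeout at 20000 ms kicks the queue, a second timeout at 25000 ms falls inside the gap and records a reset, and a timeout at 31000 ms kicks again. A timeout on an out-of-range queue index always records a reset and still bumps the device-wide counter.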