From nobody Thu Sep 4 03:30:32 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9CEEC54EE9 for ; Tue, 13 Sep 2022 15:10:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235583AbiIMPKk (ORCPT ); Tue, 13 Sep 2022 11:10:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37550 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235683AbiIMPJD (ORCPT ); Tue, 13 Sep 2022 11:09:03 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AEEB5F119; Tue, 13 Sep 2022 07:31:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id B41F1B80D87; Tue, 13 Sep 2022 14:31:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D281C43146; Tue, 13 Sep 2022 14:31:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1663079499; bh=HQHYkKIuu2m483r8bXGwDp3s7/RFPN6AGKU9oMAMuI0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SYTUjjVLZbFYBgX3qaytNrKGYW886JEcdcdWX97PVaJtkUjfpkrb6TpNn25OnWs0q 6BY56dkH2ErBJR6z7isR+Mq8UYPea4klIbtIdtwcgASN/28SC/gco8thlNQdxtTeqR mIIDi9rmZ2zeC9ATq8FjmknLxdjYYaPmrIRm9mgE= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, =?UTF-8?q?Christian=20K=C3=B6nig?= , Zhenneng Li , Alex Deucher , Sasha Levin Subject: [PATCH 4.19 50/79] drm/radeon: add a force flush to delay work when radeon Date: Tue, 13 Sep 2022 16:07:08 +0200 Message-Id: <20220913140351.317693720@linuxfoundation.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20220913140348.835121645@linuxfoundation.org> References: <20220913140348.835121645@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Zhenneng Li [ Upstream commit f461950fdc374a3ada5a63c669d997de4600dffe ] Although radeon card fence and wait for gpu to finish processing current ba= tch rings, there is still a corner case that radeon lockup work queue may not be fully= flushed, and meanwhile the radeon_suspend_kms() function has called pci_set_power_st= ate() to put device in D3hot state. Per PCI spec rev 4.0 on 5.3.1.4.1 D3hot State. > Configuration and Message requests are the only TLPs accepted by a Functi= on in > the D3hot state. All other received Requests must be handled as Unsupport= ed Requests, > and all received Completions may optionally be handled as Unexpected Comp= letions. This issue will happen in following logs: Unable to handle kernel paging request at virtual address 00008800e0008010 CPU 0 kworker/0:3(131): Oops 0 pc =3D [] ra =3D [] ps =3D 0000 Taint= ed: G W pc is at si_gpu_check_soft_reset+0x3c/0x240 ra is at si_dma_is_lockup+0x34/0xd0 v0 =3D 0000000000000000 t0 =3D fff08800e0008010 t1 =3D 0000000000010000 t2 =3D 0000000000008010 t3 =3D fff00007e3c00000 t4 =3D fff00007e3c00258 t5 =3D 000000000000ffff t6 =3D 0000000000000001 t7 =3D fff00007ef078000 s0 =3D fff00007e3c016e8 s1 =3D fff00007e3c00000 s2 =3D fff00007e3c00018 s3 =3D fff00007e3c00000 s4 =3D fff00007fff59d80 s5 =3D 0000000000000000 s6 =3D fff00007ef07bd98 a0 =3D fff00007e3c00000 a1 =3D fff00007e3c016e8 a2 =3D 0000000000000008 a3 =3D 0000000000000001 a4 =3D 8f5c28f5c28f5c29 a5 =3D ffffffff810f4338 t8 =3D 0000000000000275 t9 =3D ffffffff809b66f8 t10 =3D ff6769c5d964b800 t11=3D 000000000000b886 pv =3D ffffffff811bea20 at =3D 0000000000000000 gp =3D ffffffff81d89690 sp =3D 00000000aa814126 Disabling lock debugging due to kernel taint Trace: [] si_dma_is_lockup+0x34/0xd0 [] radeon_fence_check_lockup+0xd0/0x290 [] process_one_work+0x280/0x550 [] worker_thread+0x70/0x7c0 [] worker_thread+0x130/0x7c0 [] kthread+0x200/0x210 [] worker_thread+0x0/0x7c0 [] kthread+0x14c/0x210 [] ret_from_kernel_thread+0x18/0x20 [] kthread+0x0/0x210 Code: ad3e0008 43f0074a ad7e0018 ad9e0020 8c3001e8 40230101 <88210000> 4821ed21 So force lockup work queue flush to fix this problem. Acked-by: Christian K=C3=B6nig Signed-off-by: Zhenneng Li Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/radeon/radeon_device.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeo= n/radeon_device.c index 59c8a6647ff21..cc1c07963116c 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1625,6 +1625,9 @@ int radeon_suspend_kms(struct drm_device *dev, bool s= uspend, if (r) { /* delay GPU reset to resume */ radeon_fence_driver_force_completion(rdev, i); + } else { + /* finish executing delayed work */ + flush_delayed_work(&rdev->fence_drv[i].lockup_work); } } =20 --=20 2.35.1