When a system is being suspended to RAM, the PCI devices are also
suspended and the PPC code ends up calling pseries_msi_compose_msg() and
this triggers the BUG_ON() in __pci_read_msi_msg() because the device at
this point is in reduced power state. In reduced power state, the memory
mapped registers of the PCI device are not accessible.
To replicate the bug:
1. Make sure deep sleep is selected
# cat /sys/power/mem_sleep
s2idle [deep]
2. Make sure console is not suspended (so that dmesg logs are visible)
echo N > /sys/module/printk/parameters/console_suspend
3. Suspend the system
echo mem > /sys/power/state
To fix this behaviour, read the cached msi message of the device when the
device is not in PCI_D0 power state instead of touching the hardware.
Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
---
arch/powerpc/platforms/pseries/msi.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/msi.c b/arch/powerpc/platforms/pseries/msi.c
index fdc2f7f38dc9..458d95c8c755 100644
--- a/arch/powerpc/platforms/pseries/msi.c
+++ b/arch/powerpc/platforms/pseries/msi.c
@@ -525,7 +525,12 @@ static struct msi_domain_info pseries_msi_domain_info = {
 static void pseries_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
 {
-	__pci_read_msi_msg(irq_data_get_msi_desc(data), msg);
+	struct pci_dev *dev = msi_desc_to_pci_dev(irq_data_get_msi_desc(data));
+
+	if (dev->current_state == PCI_D0)
+		__pci_read_msi_msg(irq_data_get_msi_desc(data), msg);
+	else
+		get_cached_msi_msg(data->irq, msg);
 }
 static struct irq_chip pseries_msi_irq_chip = {
--
2.47.0
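For readers following the thread, the guard added by the patch can be illustrated with a small userspace sketch. This is not kernel code: every `mock_*` and `MOCK_*` name below is a hypothetical stand-in invented for illustration, and the `assert()` in `mock_read_msi_msg()` only mimics the BUG_ON() that __pci_read_msi_msg() hits when the device is not in PCI_D0. The idea is that the compose path reads the simulated registers only in D0 and otherwise returns the message cached in memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum mock_power_state { MOCK_D0 = 0, MOCK_D3HOT = 3 };

struct mock_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

struct mock_dev {
	enum mock_power_state current_state;
	struct mock_msi_msg hw_regs;	/* what a register read returns in D0 */
	struct mock_msi_msg cached;	/* last message written, kept in memory */
	unsigned int hw_reads;		/* number of simulated register reads */
};

/* Simulated register read: only legal while the device is in D0. */
static void mock_read_msi_msg(struct mock_dev *dev, struct mock_msi_msg *msg)
{
	assert(dev->current_state == MOCK_D0);	/* mimics the BUG_ON() */
	dev->hw_reads++;
	*msg = dev->hw_regs;
}

/* Returns the in-memory copy without touching the simulated hardware. */
static void mock_get_cached_msi_msg(const struct mock_dev *dev,
				    struct mock_msi_msg *msg)
{
	*msg = dev->cached;
}

/* Same shape as the patched compose callback: hardware in D0, cache otherwise. */
static void mock_compose_msg(struct mock_dev *dev, struct mock_msi_msg *msg)
{
	if (dev->current_state == MOCK_D0)
		mock_read_msi_msg(dev, msg);
	else
		mock_get_cached_msi_msg(dev, msg);
}
```

Calling `mock_compose_msg()` on a device whose state is `MOCK_D3HOT` returns the cached message and leaves `hw_reads` unchanged, which is the behaviour the patch relies on during suspend.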
On Wed, 05 Mar 2025 14:32:36 +0530, Gautam Menghani wrote:
> When a system is being suspended to RAM, the PCI devices are also
> suspended and the PPC code ends up calling pseries_msi_compose_msg() and
> this triggers the BUG_ON() in __pci_read_msi_msg() because the device at
> this point is in reduced power state. In reduced power state, the memory
> mapped registers of the PCI device are not accessible.
>
> To replicate the bug:
> 1. Make sure deep sleep is selected
> # cat /sys/power/mem_sleep
> s2idle [deep]
>
> [...]
Applied to powerpc/next.
[1/1] powerpc/pseries/msi: Avoid reading PCI device registers in reduced power states
https://git.kernel.org/powerpc/c/9cc0eafd28c7faef300822992bb08d79cab2a36c
Thanks
Gautam Menghani <gautam@linux.ibm.com> writes:
> When a system is being suspended to RAM, the PCI devices are also
> suspended and the PPC code ends up calling pseries_msi_compose_msg() and
> this triggers the BUG_ON() in __pci_read_msi_msg() because the device at
> this point is in reduced power state. In reduced power state, the memory
> mapped registers of the PCI device are not accessible.
>
> To replicate the bug:
> 1. Make sure deep sleep is selected
> # cat /sys/power/mem_sleep
> s2idle [deep]
>
> 2. Make sure console is not suspended (so that dmesg logs are visible)
> echo N > /sys/module/printk/parameters/console_suspend
>
> 3. Suspend the system
> echo mem > /sys/power/state
>
> To fix this behaviour, read the cached msi message of the device when the
> device is not in PCI_D0 power state instead of touching the hardware.
>
> Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
> Cc: stable@vger.kernel.org # v5.15+
> Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
LGTM. Hence
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
--
Cheers
~ Vaibhav
On 10/03/25 10:30 am, Vaibhav Jain wrote:
> Gautam Menghani <gautam@linux.ibm.com> writes:
>
>> When a system is being suspended to RAM, the PCI devices are also
>> suspended and the PPC code ends up calling pseries_msi_compose_msg() and
>> this triggers the BUG_ON() in __pci_read_msi_msg() because the device at
>> this point is in reduced power state. In reduced power state, the memory
>> mapped registers of the PCI device are not accessible.
>>
>> To replicate the bug:
>> 1. Make sure deep sleep is selected
>> # cat /sys/power/mem_sleep
>> s2idle [deep]
>>
>> 2. Make sure console is not suspended (so that dmesg logs are visible)
>> echo N > /sys/module/printk/parameters/console_suspend
>>
>> 3. Suspend the system
>> echo mem > /sys/power/state
>>
>> To fix this behaviour, read the cached msi message of the device when the
>> device is not in PCI_D0 power state instead of touching the hardware.
>>
>> Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains")
>> Cc: stable@vger.kernel.org # v5.15+
>> Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
I am able to reproduce this issue without this patch. With this
patch, there is no BUG_ON() in __pci_read_msi_msg(), but I did see kernel
warnings. Not sure if it's a side effect of this patch or a separate issue.
Without this patch:
[ 96.888399] ------------[ cut here ]------------
[ 96.888402] kernel BUG at drivers/pci/msi/msi.c:158!
[ 96.888407] Oops: Exception in kernel mode, sig: 5 [#1]
[ 96.888410] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=8192 NUMA pSeries
[ 96.888414] Modules linked in: nft_compat nf_tables nfnetlink bonding tls rfkill binfmt_misc kmem device_dax pseries_rng vmx_crypto dax_pmem drm drm_panel_orientation_quirks xfs dm_service_time sd_mod sg nd_pmem ibmvfc nd_btt ibmvscsi scsi_transport_fc ibmveth scsi_transport_srp papr_scm libnvdimm tg3 dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
[ 96.888473] CPU: 14 UID: 0 PID: 89 Comm: migration/14 Kdump: loaded Not tainted 6.14.0-auto #3
[ 96.888479] Hardware name: IBM,9009-42A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.A0 (VL950_141) hv:phyp pSeries
[ 96.888481] Stopper: multi_cpu_stop+0x0/0x22c <- __stop_cpus.constprop.0+0x68/0xc0
[ 96.888494] NIP: c000000000995aec LR: c00000000010ec20 CTR: c00000000010ebf8
[ 96.888498] REGS: c00000000680f830 TRAP: 0700 Not tainted (6.14.0-auto)
[ 96.888501] MSR: 8000000002823033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE> CR: 44004208 XER: 00000000
[ 96.888520] CFAR: c00000000010ec1c IRQMASK: 3
GPR00: c00000000010ec20 c00000000680fad0 c000000001668100 c000000006c537e0
GPR04: c00000000680fb80 0000000000000000 0000000000000000 c009ffffff8325f0
GPR08: 0000000000000001 0000000000000001 0000000000000003 0000000000001003
GPR12: c00000000010ebf8 c00000000f7beb00 c0000000001acbe0 c000000004056d40
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000003 000000000000001d c000000002cfaa88
GPR24: c00000006545b800 c000000002cfc080 c000000001126bb0 0000000000000000
GPR28: 0000000000000010 c00000000ce790c8 c00000000680fb80 c000000006c537e0
[ 96.888586] NIP [c000000000995aec] __pci_read_msi_msg+0x48/0x278
[ 96.888592] LR [c00000000010ec20] pseries_msi_compose_msg+0x28/0x3c
[ 96.888599] Call Trace:
[ 96.888600] [c00000000680fad0] [000000000000001d] 0x1d (unreliable)
[ 96.888608] [c00000000680fb20] [c00000006545b820] 0xc00000006545b820
[ 96.888613] [c00000000680fb40] [c00000000023b41c] irq_chip_compose_msi_msg+0x5c/0x90
[ 96.888620] [c00000000680fb60] [c000000000242aec] msi_domain_set_affinity+0xb8/0xf4
[ 96.888627] [c00000000680fbb0] [c000000000234634] irq_do_set_affinity+0x14c/0x25c
[ 96.888633] [c00000000680fc10] [c000000000234870] irq_set_affinity_locked+0x12c/0x1c4
[ 96.888639] [c00000000680fc60] [c000000000234a84] irq_set_affinity+0x64/0xa0
[ 96.888644] [c00000000680fca0] [c0000000000c9d40] xics_migrate_irqs_away+0x27c/0x30c
[ 96.888650] [c00000000680fd60] [c000000000111834] pseries_cpu_disable+0xc8/0xf0
[ 96.888657] [c00000000680fd90] [c0000000000611e0] __cpu_disable+0x54/0xb0
[ 96.888662] [c00000000680fdc0] [c0000000001715e8] take_cpu_down+0x4c/0xcc
[ 96.888669] [c00000000680fe10] [c0000000002ebbc4] multi_cpu_stop+0xd8/0x22c
[ 96.888676] [c00000000680fe80] [c0000000002eb898] cpu_stopper_thread+0x158/0x24c
[ 96.888683] [c00000000680ff30] [c0000000001b7a0c] smpboot_thread_fn+0x1ec/0x25c
[ 96.888691] [c00000000680ff90] [c0000000001acd04] kthread+0x12c/0x14c
[ 96.888697] [c00000000680ffe0] [c00000000000df98] start_kernel_thread+0x14/0x18
[ 96.888703] Code: fba1ffe8 39200001 f821ffb1 7c7f1b78 7c9e2378 e94d0c78 f9410028 39400000 eba30008 815dffd8 2c0a0000 7d20489e <0b090000> a123004c 712a0001 41820168
[ 96.888730] ---[ end trace 0000000000000000 ]---
With this patch: no system crash was observed, but the warning below was seen.
[ 99.450644] ------------[ cut here ]------------
[ 99.450648] WARNING: CPU: 0 PID: 17 at arch/powerpc/sysdev/xics/icp-hv.c:55 icp_hv_eoi+0xc4/0x120
[ 99.450659] Modules linked in: nft_compat nf_tables nfnetlink bonding tls rfkill binfmt_misc kmem device_dax pseries_rng vmx_crypto dax_pmem drm drm_panel_orientation_quirks xfs dm_service_time sd_mod sg nd_pmem ibmvfc nd_btt ibmvscsi scsi_transport_fc ibmveth scsi_transport_srp papr_scm libnvdimm tg3 dm_multipath dm_mirror dm_region_hash dm_log dm_mod fuse
[ 99.450704] CPU: 0 UID: 0 PID: 17 Comm: ksoftirqd/0 Kdump: loaded Not tainted 6.14.0-auto-00001-g03419579f433 #4
[ 99.450712] Hardware name: IBM,9009-42A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW950.A0 (VL950_141) hv:phyp pSeries
[ 99.450717] NIP: c0000000000cadd4 LR: c0000000000cadd0 CTR: 00000000007088ec
[ 99.450722] REGS: c000000004a2fa20 TRAP: 0700 Not tainted (6.14.0-auto-00001-g03419579f433)
[ 99.450727] MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 2804424f XER: 00000010
[ 99.450743] CFAR: c000000000224da8 IRQMASK: 1
GPR00: c0000000000cadd0 c000000004a2fcc0 c000000001668100 000000000000003f
GPR04: c0000007fd907c88 c0000007fd916000 c000000004a2fb08 00000007fb6a0000
GPR08: 0000000000000027 0000000000000000 0000000000000000 0000000000000001
GPR12: c000000002a37d48 c000000003000000 c0000000001acc60 c000000004052080
GPR16: 0000000000000006 0000000000000040 0000000000000006 0000000000000100
GPR20: 0000000004208040 0000000000000000 0000000000000001 c0000000002382c0
GPR24: 0000000000000001 0000000000000000 0000000000000006 0000000000000002
GPR28: c0000007fd9078b8 0000000000000000 c0000000010e69e8 00000000050a0002
[ 99.450802] NIP [c0000000000cadd4] icp_hv_eoi+0xc4/0x120
[ 99.450808] LR [c0000000000cadd0] icp_hv_eoi+0xc0/0x120
[ 99.450814] Call Trace:
[ 99.450816] [c000000004a2fcc0] [c0000000000cadd0] icp_hv_eoi+0xc0/0x120 (unreliable)
[ 99.450824] [c000000004a2fd30] [c000000000239eac] handle_fasteoi_irq+0x16c/0x344
[ 99.450832] [c000000004a2fd70] [c000000000238380] resend_irqs+0xc0/0x188
[ 99.450838] [c000000004a2fdb0] [c00000000017b054] tasklet_action_common+0x154/0x418
[ 99.450845] [c000000004a2fe20] [c00000000017a788] handle_softirqs+0x148/0x3b4
[ 99.450852] [c000000004a2ff10] [c00000000017aa58] run_ksoftirqd+0x64/0xa0
[ 99.450858] [c000000004a2ff30] [c0000000001b7a8c] smpboot_thread_fn+0x1ec/0x25c
[ 99.450866] [c000000004a2ff90] [c0000000001acd84] kthread+0x12c/0x14c
[ 99.450873] [c000000004a2ffe0] [c00000000000df98] start_kernel_thread+0x14/0x18
[ 99.450879] Code: ebe1fff8 7c0803a6 4e800020 3c82ffa8 3c62ffdc 7fc5f378 3fc2ffa8 3884e908 38632268 3bdee8e8 48159f95 60000000 <0fe00000> 7bff4622 38600068 7fe4fb78
[ 99.450900] ---[ end trace 0000000000000000 ]---
Please add below tag:
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Regards,
Venkat.
> LGTM. Hence
> Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
>
Hi Venkat,

Thanks for the report. I looked into this and found that the new warning
you reported can be observed even on current distro kernels, and is not
caused by the patch I've posted.

I was able to observe the same warning with fedora distro kernel
6.13.7-200.fc41

[ 70.294478] icp_hv_set_xirr: bad return code eoi xirr=0x50a0002 returned -4
[ 70.294521] ------------[ cut here ]------------
[ 70.294546] WARNING: CPU: 7 PID: 54 at arch/powerpc/sysdev/xics/icp-hv.c:55 icp_hv_eoi+0xf8/0x120
[ 70.294599] Modules linked in: xt_conntrack xt_MASQUERADE bridge stp llc ip6table_nat ip6table_filter ip6_tables xt_set ip_set iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype iptable_filter ip_tables kvm rpcrdma rdma_cm iw_cm ib_cm ib_core bonding overlay rfkill binfmt_misc vmx_crypto pseries_rng nfsd auth_rpcgss nfs_acl loop dm_multipath lockd grace nfs_localio nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vsock xfs nvme_tcp nvme_fabrics nvme_keyring nvme_core nvme_auth ibmvscsi ibmveth scsi_transport_srp crct10dif_vpmsum crc32c_vpmsum pseries_wdt sunrpc be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls cxgb3i cxgb3 mdio libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi scsi_dh_rdac scsi_dh_emc scsi_dh_alua fuse aes_gcm_p10_crypto crypto_simd cryptd
[ 70.295015] CPU: 7 UID: 0 PID: 54 Comm: ksoftirqd/7 Kdump: loaded Not tainted 6.13.7-200.fc41.ppc64le #1
[ 70.295064] Hardware name: IBM,9080-HEX POWER8 (architected) 0x800200 0xf000004 of:IBM,FW1060.00 (NH1060_022) hv:phyp pSeries
[ 70.295120] NIP: c000000000197c98 LR: c000000000197c94 CTR: 0000000000000000
[ 70.295157] REGS: c000000007dd3a20 TRAP: 0700 Not tainted (6.13.7-200.fc41.ppc64le)
[ 70.295197] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 24004202 XER: 00000001
[ 70.295247] CFAR: c00000000032731c IRQMASK: 1
[ 70.295247] GPR00: c000000000197c94 c000000007dd3cc0 c0000000024daa00 000000000000003f
[ 70.295247] GPR04: 00000000ffff7fff 00000000ffff7fff c000000007dd3ae8 00000007ec8e0000
[ 70.295247] GPR08: 0000000000000027 0000000000000000 0000000000000000 0000000000004000
[ 70.295247] GPR12: 0000000000000000 c00000000ffc6f00 c000000000287ef8 c000000004a51080
[ 70.295247] GPR16: 0000000000000000 0000000004208040 c000000003d62c80 c0000000031faf80
[ 70.295247] GPR20: 00000000ffffa63b 000000000000000a c0000000031e6990 c000000000335f10
[ 70.295247] GPR24: 0000000000000001 0000000000000000 0000000000000006 0000000000000002
[ 70.295247] GPR28: c0000007efac68b8 0000000000000000 00000000050a0002 00000000050a0002
[ 70.295603] NIP [c000000000197c98] icp_hv_eoi+0xf8/0x120
[ 70.295633] LR [c000000000197c94] icp_hv_eoi+0xf4/0x120
[ 70.295661] Call Trace:
[ 70.295675] [c000000007dd3cc0] [c000000000197c94] icp_hv_eoi+0xf4/0x120 (unreliable)
[ 70.295717] [c000000007dd3d40] [c000000000337a5c] handle_fasteoi_irq+0x16c/0x350
[ 70.295757] [c000000007dd3d70] [c000000000335fd0] resend_irqs+0xc0/0x190
[ 70.295793] [c000000007dd3db0] [c000000000254064] tasklet_action_common+0x154/0x440
[ 70.295833] [c000000007dd3e20] [c000000000253458] handle_softirqs+0x168/0x4f0
[ 70.295871] [c000000007dd3f10] [c000000000253848] run_ksoftirqd+0x68/0xb0
[ 70.295912] [c000000007dd3f30] [c000000000292f20] smpboot_thread_fn+0x1d0/0x240
[ 70.295951] [c000000007dd3f90] [c000000000288020] kthread+0x130/0x140
[ 70.295984] [c000000007dd3fe0] [c00000000000ded8] start_kernel_thread+0x14/0x18
[ 70.296022] Code: 48c84251 60000000 e9210068 4bffff98 7c661b78 3c82ff31 3c62ff7d 7fc5f378 38842b40 38639bf8 4818f649 60000000 <0fe00000> 38210080 7be34622 e8010010
[ 70.296104] ---[ end trace 0000000000000000 ]---
[ 70.297273] PM: resume devices took 0.000 seconds
[ 70.297415] OOM killer enabled.
[ 70.297433] Restarting tasks ... done.
[ 70.298959] random: crng reseeded on system resumption
[ 70.299106] PM: suspend exit

This can be tracked as a separate bug, as it is unrelated to the patch.

Thanks,
Gautam