[PATCH] PCI: endpoint: pci-epf-test: Fix sleeping function being called from atomic context

Posted by Bhanu Seshu Kumar Valluri 2 weeks ago
When the Root Complex (RC) triggers a Doorbell MSI interrupt on the Endpoint (EP), the EP kernel
prints the warning below. The pci_endpoint kselftest target was built and used to run the Doorbell test from the RC.

[  474.686193] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:271
[  474.694656] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
[  474.702473] preempt_count: 10001, expected: 0
[  474.706819] RCU nest depth: 0, expected: 0
[  474.710913] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.17.0-rc5-g7aac71907bde #12 PREEMPT
[  474.710926] Hardware name: Texas Instruments AM642 EVM (DT)
[  474.710934] Call trace:
[  474.710940]  show_stack+0x20/0x38 (C)
[  474.710969]  dump_stack_lvl+0x70/0x88
[  474.710984]  dump_stack+0x18/0x28
[  474.710995]  __might_resched+0x130/0x158
[  474.711011]  __might_sleep+0x70/0x88
[  474.711023]  mutex_lock+0x2c/0x80
[  474.711036]  pci_epc_get_msi+0x78/0xd8
[  474.711052]  pci_epf_test_raise_irq.isra.0+0x74/0x138
[  474.711063]  pci_epf_test_doorbell_handler+0x34/0x50
[  474.711072]  __handle_irq_event_percpu+0xac/0x1f0
[  474.711086]  handle_irq_event+0x54/0xb8
[  474.711096]  handle_fasteoi_irq+0x150/0x220
[  474.711110]  handle_irq_desc+0x48/0x68
[  474.711121]  generic_handle_domain_irq+0x24/0x38
[  474.711131]  gic_handle_irq+0x4c/0xc8
[  474.711141]  call_on_irq_stack+0x30/0x70
[  474.711151]  do_interrupt_handler+0x70/0x98
[  474.711163]  el1_interrupt+0x34/0x68
[  474.711176]  el1h_64_irq_handler+0x18/0x28
[  474.711189]  el1h_64_irq+0x6c/0x70
[  474.711198]  default_idle_call+0x10c/0x120 (P)
[  474.711208]  do_idle+0x128/0x268
[  474.711220]  cpu_startup_entry+0x3c/0x48
[  474.711231]  rest_init+0xe0/0xe8
[  474.711240]  start_kernel+0x6d4/0x760
[  474.711255]  __primary_switched+0x88/0x98

The warning can be reproduced with the steps below.
*On the EP side:
1. Configure the pci-epf-test function as follows:
   mount -t configfs none /sys/kernel/config
   cd /sys/kernel/config/pci_ep/
   mkdir functions/pci_epf_test/func1
   echo 0x104c > functions/pci_epf_test/func1/vendorid
   echo 0xb010 > functions/pci_epf_test/func1/deviceid
   echo 32 > functions/pci_epf_test/func1/msi_interrupts
   echo 2048 > functions/pci_epf_test/func1/msix_interrupts
   ln -s functions/pci_epf_test/func1 controllers/f102000.pcie-ep/
   echo 1 > controllers/f102000.pcie-ep/start

*On the RC side:
1. Once the EP side is configured, rescan the PCI bus.
   echo 1 > /sys/bus/pci/rescan
2. Run the Doorbell MSI test using the pci_endpoint_test kselftest application.
   ./pci_endpoint_test -r pcie_ep_doorbell.DOORBELL_TEST
   Note: The kernel is compiled with CONFIG_DEBUG_KERNEL enabled.

The BUG arises because the EP's Doorbell MSI hard IRQ handler indirectly calls
pci_epc_get_msi(), which takes a mutex, from interrupt context.

Convert the hard IRQ handler to a threaded IRQ handler so that it can call
functions that may sleep. The threaded handler is registered with IRQF_ONESHOT,
which keeps the interrupt line disabled until the threaded handler completes.
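
As a rough illustration only, the sketch below is a userspace model (not kernel
code) of why running the handler as the thread function avoids the warning.
Every model_* name is hypothetical. The one behaviour borrowed from the kernel
is that a NULL primary handler is replaced by a default one that only wakes the
IRQ thread; the "atomic" flag and the lock are simplified stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: every model_* name is hypothetical, not a kernel API. */
enum model_irqreturn {
	MODEL_IRQ_NONE,
	MODEL_IRQ_HANDLED,
	MODEL_IRQ_WAKE_THREAD,
};

typedef enum model_irqreturn (*model_handler_t)(int irq, void *data);

struct model_irq_action {
	model_handler_t handler;   /* primary handler, modelled hard-IRQ context */
	model_handler_t thread_fn; /* threaded handler, modelled process context */
	void *data;
};

static bool model_in_atomic; /* true while the primary handler runs */
static int model_sleep_bugs; /* times a sleeping lock was taken atomically */

/* Stand-in for a sleeping lock, like the mutex taken in pci_epc_get_msi(). */
static void model_mutex_lock(void)
{
	if (model_in_atomic)
		model_sleep_bugs++; /* the kernel would print the BUG splat here */
}

/* Stand-in for the doorbell handler: it ends up taking the sleeping lock. */
static enum model_irqreturn model_doorbell_handler(int irq, void *data)
{
	int *raised = data;

	(void)irq;
	model_mutex_lock();
	(*raised)++;
	return MODEL_IRQ_HANDLED;
}

/* Used when no primary handler is given: only wake the thread. */
static enum model_irqreturn model_default_primary(int irq, void *data)
{
	(void)irq;
	(void)data;
	return MODEL_IRQ_WAKE_THREAD;
}

/* Deliver one interrupt: primary part atomically, thread part afterwards. */
static enum model_irqreturn model_deliver_irq(int irq,
					      const struct model_irq_action *act)
{
	model_handler_t primary = act->handler ? act->handler
					       : model_default_primary;
	enum model_irqreturn ret;

	model_in_atomic = true;
	ret = primary(irq, act->data);
	model_in_atomic = false;

	if (ret == MODEL_IRQ_WAKE_THREAD && act->thread_fn)
		ret = act->thread_fn(irq, act->data);
	return ret;
}
```

In this model, an action of { model_doorbell_handler, NULL, &count } mirrors
the old request_irq() registration and bumps model_sleep_bugs, while
{ NULL, model_doorbell_handler, &count } mirrors request_threaded_irq() with a
NULL primary handler and leaves model_sleep_bugs at zero.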

Fixes: eff0c286aa916221a69126 ("PCI: endpoint: pci-epf-test: Add doorbell test support")
Signed-off-by: Bhanu Seshu Kumar Valluri <bhanuseshukumar@gmail.com>
---
 Note: Compiled and tested on a TI AM642 board.

 drivers/pci/endpoint/functions/pci-epf-test.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
index e091193bd..b9c1ad931 100644
--- a/drivers/pci/endpoint/functions/pci-epf-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-test.c
@@ -680,7 +680,7 @@ static void pci_epf_test_raise_irq(struct pci_epf_test *epf_test,
 	}
 }
 
-static irqreturn_t pci_epf_test_doorbell_handler(int irq, void *data)
+static irqreturn_t pci_epf_test_doorbell_irq_thread(int irq, void *data)
 {
 	struct pci_epf_test *epf_test = data;
 	enum pci_barno test_reg_bar = epf_test->test_reg_bar;
@@ -725,8 +725,8 @@ static void pci_epf_test_enable_doorbell(struct pci_epf_test *epf_test,
 	if (bar < BAR_0)
 		goto err_doorbell_cleanup;
 
-	ret = request_irq(epf->db_msg[0].virq, pci_epf_test_doorbell_handler, 0,
-			  "pci-ep-test-doorbell", epf_test);
+	ret = request_threaded_irq(epf->db_msg[0].virq, NULL, pci_epf_test_doorbell_irq_thread,
+				   IRQF_ONESHOT, "pci-ep-test-doorbell", epf_test);
 	if (ret) {
 		dev_err(&epf->dev,
 			"Failed to request doorbell IRQ: %d\n",
-- 
2.34.1
Re: [PATCH] PCI: endpoint: pci-epf-test: Fix sleeping function being called from atomic context
Posted by Manivannan Sadhasivam 2 days, 15 hours ago
On Wed, Sep 17, 2025 at 09:48:17PM +0530, Bhanu Seshu Kumar Valluri wrote:
> When the Root Complex (RC) triggers a Doorbell MSI interrupt on the Endpoint (EP), the EP kernel
> prints the warning below. The pci_endpoint kselftest target was built and used to run the Doorbell test from the RC.
> 
> [  474.686193] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:271
> [  474.694656] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
> [  474.702473] preempt_count: 10001, expected: 0
> [  474.706819] RCU nest depth: 0, expected: 0
> [  474.710913] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.17.0-rc5-g7aac71907bde #12 PREEMPT
> [  474.710926] Hardware name: Texas Instruments AM642 EVM (DT)
> [  474.710934] Call trace:
> [  474.710940]  show_stack+0x20/0x38 (C)
> [  474.710969]  dump_stack_lvl+0x70/0x88
> [  474.710984]  dump_stack+0x18/0x28
> [  474.710995]  __might_resched+0x130/0x158
> [  474.711011]  __might_sleep+0x70/0x88
> [  474.711023]  mutex_lock+0x2c/0x80
> [  474.711036]  pci_epc_get_msi+0x78/0xd8
> [  474.711052]  pci_epf_test_raise_irq.isra.0+0x74/0x138
> [  474.711063]  pci_epf_test_doorbell_handler+0x34/0x50
> [  474.711072]  __handle_irq_event_percpu+0xac/0x1f0
> [  474.711086]  handle_irq_event+0x54/0xb8
> [  474.711096]  handle_fasteoi_irq+0x150/0x220
> [  474.711110]  handle_irq_desc+0x48/0x68
> [  474.711121]  generic_handle_domain_irq+0x24/0x38
> [  474.711131]  gic_handle_irq+0x4c/0xc8
> [  474.711141]  call_on_irq_stack+0x30/0x70
> [  474.711151]  do_interrupt_handler+0x70/0x98
> [  474.711163]  el1_interrupt+0x34/0x68
> [  474.711176]  el1h_64_irq_handler+0x18/0x28
> [  474.711189]  el1h_64_irq+0x6c/0x70
> [  474.711198]  default_idle_call+0x10c/0x120 (P)
> [  474.711208]  do_idle+0x128/0x268
> [  474.711220]  cpu_startup_entry+0x3c/0x48
> [  474.711231]  rest_init+0xe0/0xe8
> [  474.711240]  start_kernel+0x6d4/0x760
> [  474.711255]  __primary_switched+0x88/0x98
> 

You do not need to include the full call trace. Refer to:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.17#n761

> The warning can be reproduced with the steps below.
> *On the EP side:
> 1. Configure the pci-epf-test function as follows:
>    mount -t configfs none /sys/kernel/config
>    cd /sys/kernel/config/pci_ep/
>    mkdir functions/pci_epf_test/func1
>    echo 0x104c > functions/pci_epf_test/func1/vendorid
>    echo 0xb010 > functions/pci_epf_test/func1/deviceid
>    echo 32 > functions/pci_epf_test/func1/msi_interrupts
>    echo 2048 > functions/pci_epf_test/func1/msix_interrupts
>    ln -s functions/pci_epf_test/func1 controllers/f102000.pcie-ep/
>    echo 1 > controllers/f102000.pcie-ep/start
> 
> *On the RC side:
> 1. Once the EP side is configured, rescan the PCI bus.
>    echo 1 > /sys/bus/pci/rescan
> 2. Run the Doorbell MSI test using the pci_endpoint_test kselftest application.
>    ./pci_endpoint_test -r pcie_ep_doorbell.DOORBELL_TEST

This information is already part of the kernel documentation, so it is redundant here.
It could instead go in the comment section (where you added the Note).

>    Note: The kernel is compiled with CONFIG_DEBUG_KERNEL enabled.
> 
> The BUG arises because the EP's Doorbell MSI hard IRQ handler indirectly calls
> pci_epc_get_msi(), which takes a mutex, from interrupt context.
>
> Convert the hard IRQ handler to a threaded IRQ handler so that it can call
> functions that may sleep. The threaded handler is registered with IRQF_ONESHOT,
> which keeps the interrupt line disabled until the threaded handler completes.
> 
> Fixes: eff0c286aa916221a69126 ("PCI: endpoint: pci-epf-test: Add doorbell test support")

Use a 12-character commit SHA.

> Signed-off-by: Bhanu Seshu Kumar Valluri <bhanuseshukumar@gmail.com>
> ---
>  Note: Compiled and tested on a TI AM642 board.
> 
>  drivers/pci/endpoint/functions/pci-epf-test.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pci/endpoint/functions/pci-epf-test.c b/drivers/pci/endpoint/functions/pci-epf-test.c
> index e091193bd..b9c1ad931 100644
> --- a/drivers/pci/endpoint/functions/pci-epf-test.c
> +++ b/drivers/pci/endpoint/functions/pci-epf-test.c
> @@ -680,7 +680,7 @@ static void pci_epf_test_raise_irq(struct pci_epf_test *epf_test,
>  	}
>  }
>  
> -static irqreturn_t pci_epf_test_doorbell_handler(int irq, void *data)
> +static irqreturn_t pci_epf_test_doorbell_irq_thread(int irq, void *data)

No need to change the function name.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
Re: [PATCH] PCI: endpoint: pci-epf-test: Fix sleeping function being called from atomic context
Posted by bhanuseshukumar 2 days, 7 hours ago
On 29/09/25 23:23, Manivannan Sadhasivam wrote:
> On Wed, Sep 17, 2025 at 09:48:17PM +0530, Bhanu Seshu Kumar Valluri wrote:
>> When the Root Complex (RC) triggers a Doorbell MSI interrupt on the Endpoint (EP), the EP kernel
>> prints the warning below. The pci_endpoint kselftest target was built and used to run the Doorbell test from the RC.
>>
>> [  474.686193] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:271
>> [  474.694656] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/0
>> [  474.702473] preempt_count: 10001, expected: 0
>> [  474.706819] RCU nest depth: 0, expected: 0
>> [  474.710913] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.17.0-rc5-g7aac71907bde #12 PREEMPT
>> [  474.710926] Hardware name: Texas Instruments AM642 EVM (DT)
>> [  474.710934] Call trace:
>> [  474.710940]  show_stack+0x20/0x38 (C)
>> [  474.710969]  dump_stack_lvl+0x70/0x88
>> [  474.710984]  dump_stack+0x18/0x28
>> [  474.710995]  __might_resched+0x130/0x158
>> [  474.711011]  __might_sleep+0x70/0x88
>> [  474.711023]  mutex_lock+0x2c/0x80
>> [  474.711036]  pci_epc_get_msi+0x78/0xd8
>> [  474.711052]  pci_epf_test_raise_irq.isra.0+0x74/0x138
>> [  474.711063]  pci_epf_test_doorbell_handler+0x34/0x50
>> [  474.711072]  __handle_irq_event_percpu+0xac/0x1f0
>> [  474.711086]  handle_irq_event+0x54/0xb8
>> [  474.711096]  handle_fasteoi_irq+0x150/0x220
>> [  474.711110]  handle_irq_desc+0x48/0x68
>> [  474.711121]  generic_handle_domain_irq+0x24/0x38
>> [  474.711131]  gic_handle_irq+0x4c/0xc8
>> [  474.711141]  call_on_irq_stack+0x30/0x70
>> [  474.711151]  do_interrupt_handler+0x70/0x98
>> [  474.711163]  el1_interrupt+0x34/0x68
>> [  474.711176]  el1h_64_irq_handler+0x18/0x28
>> [  474.711189]  el1h_64_irq+0x6c/0x70
>> [  474.711198]  default_idle_call+0x10c/0x120 (P)
>> [  474.711208]  do_idle+0x128/0x268
>> [  474.711220]  cpu_startup_entry+0x3c/0x48
>> [  474.711231]  rest_init+0xe0/0xe8
>> [  474.711240]  start_kernel+0x6d4/0x760
>> [  474.711255]  __primary_switched+0x88/0x98
>>
> 
> You do not need to include the full call trace. Refer to:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.17#n761
> 
>> The warning can be reproduced with the steps below.
>> *On the EP side:
>> 1. Configure the pci-epf-test function as follows:
>>    mount -t configfs none /sys/kernel/config
>>    cd /sys/kernel/config/pci_ep/
>>    mkdir functions/pci_epf_test/func1
>>    echo 0x104c > functions/pci_epf_test/func1/vendorid
>>    echo 0xb010 > functions/pci_epf_test/func1/deviceid
>>    echo 32 > functions/pci_epf_test/func1/msi_interrupts
>>    echo 2048 > functions/pci_epf_test/func1/msix_interrupts
>>    ln -s functions/pci_epf_test/func1 controllers/f102000.pcie-ep/
>>    echo 1 > controllers/f102000.pcie-ep/start
>>
>> *On the RC side:
>> 1. Once the EP side is configured, rescan the PCI bus.
>>    echo 1 > /sys/bus/pci/rescan
>> 2. Run the Doorbell MSI test using the pci_endpoint_test kselftest application.
>>    ./pci_endpoint_test -r pcie_ep_doorbell.DOORBELL_TEST
> 
> This information is already part of the kernel documentation, so it is redundant here.
> It could instead go in the comment section (where you added the Note).
> 
>>    Note: The kernel is compiled with CONFIG_DEBUG_KERNEL enabled.
>>
>> The BUG arises because the EP's Doorbell MSI hard IRQ handler indirectly calls
>> pci_epc_get_msi(), which takes a mutex, from interrupt context.
>>
>> Convert the hard IRQ handler to a threaded IRQ handler so that it can call
>> functions that may sleep. The threaded handler is registered with IRQF_ONESHOT,
>> which keeps the interrupt line disabled until the threaded handler completes.
>>
>> Fixes: eff0c286aa916221a69126 ("PCI: endpoint: pci-epf-test: Add doorbell test support")
> 
> Use a 12-character commit SHA.
> 
>> -static irqreturn_t pci_epf_test_doorbell_handler(int irq, void *data)
>> +static irqreturn_t pci_epf_test_doorbell_irq_thread(int irq, void *data)
> 
> No need to change the function name.

Thank you, Mani, for your helpful comments on the patch. I will send a v2 patch to address the above review comments.

-Bhanu Seshu Kumar Valluri