[PATCH v02] mailbox: pcc: report errors for PCC clients

Adam Young posted 1 patch 6 days, 8 hours ago
drivers/mailbox/pcc.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
[PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by Adam Young 6 days, 8 hours ago
The tx_done callback function has a return code (rc) parameter
that the tx_done callback can use to determine how to handle an error.
However the IRQ handler was not setting that value if there is an error.

The following clients are affected:

drivers/acpi/cppc_acpi.c
drivers/i2c/busses/i2c-xgene-slimpro.c
drivers/hwmon/xgene-hwmon.c
drivers/soc/hisilicon/kunpeng_hccs.c
drivers/devfreq/hisi_uncore_freq.c

All of these only use the error code to report, so they
are expecting an error code to come thorugh, but they
do not modify behavior based on this code.

In the case of an error code in the IRQ, the handler was returning
IRQ_NONE which is not correct:  the IRQ handler was matched
to the IRQ.  This mean that multiple error codes returned from
a PCC triggered interrupt would end up disabling the device.

In addition, if the error code IRQ was coming from a Type4 Device that was
expecting an IRQ response, that device would then be hung.

Fixes: c45ded7e1135 ("mailbox: pcc: Add support for PCCT extended PCC subspaces(type 3/4)")
Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>

---
---
 drivers/mailbox/pcc.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
index 636879ae1db7..16b9ce087b9e 100644
--- a/drivers/mailbox/pcc.c
+++ b/drivers/mailbox/pcc.c
@@ -314,6 +314,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
 {
 	struct pcc_chan_info *pchan;
 	struct mbox_chan *chan = p;
+	int rc;
 
 	pchan = chan->con_priv;
 
@@ -327,8 +328,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
 	if (!pcc_mbox_cmd_complete_check(pchan))
 		return IRQ_NONE;
 
-	if (pcc_mbox_error_check_and_clear(pchan))
-		return IRQ_NONE;
+	rc = pcc_mbox_error_check_and_clear(pchan);
 
 	/*
 	 * Clear this flag after updating interrupt ack register and just
@@ -337,8 +337,9 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
 	 * required to avoid any possible race in updatation of this flag.
 	 */
 	pchan->chan_in_use = false;
-	mbox_chan_received_data(chan, NULL);
-	mbox_chan_txdone(chan, 0);
+	if (!rc)
+		mbox_chan_received_data(chan, NULL);
+	mbox_chan_txdone(chan, rc);
 
 	pcc_chan_acknowledge(pchan);
 
-- 
2.43.0
Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by lihuisong (C) 5 days, 13 hours ago
On 5/19/2026 3:30 AM, Adam Young wrote:
> The tx_done callback function has a return code (rc) parameter
> that the tx_done callback can use to determine how to handle an error.
> However the IRQ handler was not setting that value if there is an error.
>
> The following clients are affected:
>
> drivers/acpi/cppc_acpi.c
> drivers/i2c/busses/i2c-xgene-slimpro.c
> drivers/hwmon/xgene-hwmon.c
> drivers/soc/hisilicon/kunpeng_hccs.c
> drivers/devfreq/hisi_uncore_freq.c
>
> All of these only use the error code to report, so they
> are expecting an error code to come thorugh, but they
> do not modify behavior based on this code.
>
> In the case of an error code in the IRQ, the handler was returning
> IRQ_NONE which is not correct:  the IRQ handler was matched
> to the IRQ.  This mean that multiple error codes returned from
> a PCC triggered interrupt would end up disabling the device.
>
> In addition, if the error code IRQ was coming from a Type4 Device that was
> expecting an IRQ response, that device would then be hung.
>
> Fixes: c45ded7e1135 ("mailbox: pcc: Add support for PCCT extended PCC subspaces(type 3/4)")
Not fix above commit.
mbox_chan_txdone() was added in below patch.
Fixes: 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler)
> Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
>
> ---
> ---
>   drivers/mailbox/pcc.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index 636879ae1db7..16b9ce087b9e 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -314,6 +314,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>   {
>   	struct pcc_chan_info *pchan;
>   	struct mbox_chan *chan = p;
> +	int rc;
>   
>   	pchan = chan->con_priv;
>   
> @@ -327,8 +328,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>   	if (!pcc_mbox_cmd_complete_check(pchan))
>   		return IRQ_NONE;
>   
> -	if (pcc_mbox_error_check_and_clear(pchan))
> -		return IRQ_NONE;
> +	rc = pcc_mbox_error_check_and_clear(pchan);

I think it is not necessary. This function just return -EIO on failure.

>   
>   	/*
>   	 * Clear this flag after updating interrupt ack register and just
> @@ -337,8 +337,9 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>   	 * required to avoid any possible race in updatation of this flag.
>   	 */
>   	pchan->chan_in_use = false;
> -	mbox_chan_received_data(chan, NULL);
> -	mbox_chan_txdone(chan, 0);
> +	if (!rc)
> +		mbox_chan_received_data(chan, NULL);
> +	mbox_chan_txdone(chan, rc);
@Sudeep, I have always had doubts about the addition of this line of 
code in the
  commit 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ 
handler).
The patch seems to avoid the timeouts in the mailbox core according to 
its commit log.
Regardless of whether the command succeeds or fails, each mbox client 
driver, like cppc_acpi/acpi_pcc,kunpeng_hccs and so on, is responsible 
to call mbox_chan_txdone() to tell mailbox core.
This is done after executing mbox_chan_received_data(). So I think this 
line in this function is redundant.
Can you take a look at this agian?
>   
>   	pcc_chan_acknowledge(pchan);
>   
Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by Sudeep Holla 5 days, 11 hours ago
On Tue, May 19, 2026 at 09:54:47PM +0800, lihuisong (C) wrote:
> 
> On 5/19/2026 3:30 AM, Adam Young wrote:
> > The tx_done callback function has a return code (rc) parameter
> > that the tx_done callback can use to determine how to handle an error.
> > However the IRQ handler was not setting that value if there is an error.
> > 
> > The following clients are affected:
> > 
> > drivers/acpi/cppc_acpi.c
> > drivers/i2c/busses/i2c-xgene-slimpro.c
> > drivers/hwmon/xgene-hwmon.c
> > drivers/soc/hisilicon/kunpeng_hccs.c
> > drivers/devfreq/hisi_uncore_freq.c
> > 
> > All of these only use the error code to report, so they
> > are expecting an error code to come thorugh, but they
> > do not modify behavior based on this code.
> > 
> > In the case of an error code in the IRQ, the handler was returning
> > IRQ_NONE which is not correct:  the IRQ handler was matched
> > to the IRQ.  This mean that multiple error codes returned from
> > a PCC triggered interrupt would end up disabling the device.
> > 
> > In addition, if the error code IRQ was coming from a Type4 Device that was
> > expecting an IRQ response, that device would then be hung.
> > 
> > Fixes: c45ded7e1135 ("mailbox: pcc: Add support for PCCT extended PCC subspaces(type 3/4)")
> Not fix above commit.
> mbox_chan_txdone() was added in below patch.
> Fixes: 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler)
> > Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
> > 
> > ---
> > ---
> >   drivers/mailbox/pcc.c | 9 +++++----
> >   1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> > index 636879ae1db7..16b9ce087b9e 100644
> > --- a/drivers/mailbox/pcc.c
> > +++ b/drivers/mailbox/pcc.c
> > @@ -314,6 +314,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> >   {
> >   	struct pcc_chan_info *pchan;
> >   	struct mbox_chan *chan = p;
> > +	int rc;
> >   	pchan = chan->con_priv;
> > @@ -327,8 +328,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> >   	if (!pcc_mbox_cmd_complete_check(pchan))
> >   		return IRQ_NONE;
> > -	if (pcc_mbox_error_check_and_clear(pchan))
> > -		return IRQ_NONE;
> > +	rc = pcc_mbox_error_check_and_clear(pchan);
> 
> I think it is not necessary. This function just return -EIO on failure.
> 
> >   	/*
> >   	 * Clear this flag after updating interrupt ack register and just
> > @@ -337,8 +337,9 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
> >   	 * required to avoid any possible race in updatation of this flag.
> >   	 */
> >   	pchan->chan_in_use = false;
> > -	mbox_chan_received_data(chan, NULL);
> > -	mbox_chan_txdone(chan, 0);
> > +	if (!rc)
> > +		mbox_chan_received_data(chan, NULL);
> > +	mbox_chan_txdone(chan, rc);
> @Sudeep, I have always had doubts about the addition of this line of code in
> the
>  commit 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler).
> The patch seems to avoid the timeouts in the mailbox core according to its
> commit log.
> Regardless of whether the command succeeds or fails, each mbox client
> driver, like cppc_acpi/acpi_pcc,kunpeng_hccs and so on, is responsible to
> call mbox_chan_txdone() to tell mailbox core.

Few controller drivers do have mbox_chan_txdone(), so Tx complete is detected
by PCC, so not sure why you think this is not the right place to do. The irq
is to indicate the completion. I am confused as why you think otherwise.
It is defined in include/linux/mailbox_controller.h for the same reason.

The client drivers can you mbox_client_txdone() if they wish to as defined
in include/linux/mailbox_client.h

> This is done after executing mbox_chan_received_data(). So I think this line
> in this function is redundant.

No, I think otherwise, see details above.

-- 
Regards,
Sudeep
Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by lihuisong (C) 4 days, 15 hours ago
On 5/20/2026 12:25 AM, Sudeep Holla wrote:
> On Tue, May 19, 2026 at 09:54:47PM +0800, lihuisong (C) wrote:
>> On 5/19/2026 3:30 AM, Adam Young wrote:
>>> The tx_done callback function has a return code (rc) parameter
>>> that the tx_done callback can use to determine how to handle an error.
>>> However the IRQ handler was not setting that value if there is an error.
>>>
>>> The following clients are affected:
>>>
>>> drivers/acpi/cppc_acpi.c
>>> drivers/i2c/busses/i2c-xgene-slimpro.c
>>> drivers/hwmon/xgene-hwmon.c
>>> drivers/soc/hisilicon/kunpeng_hccs.c
>>> drivers/devfreq/hisi_uncore_freq.c
>>>
>>> All of these only use the error code to report, so they
>>> are expecting an error code to come thorugh, but they
>>> do not modify behavior based on this code.
>>>
>>> In the case of an error code in the IRQ, the handler was returning
>>> IRQ_NONE which is not correct:  the IRQ handler was matched
>>> to the IRQ.  This mean that multiple error codes returned from
>>> a PCC triggered interrupt would end up disabling the device.
>>>
>>> In addition, if the error code IRQ was coming from a Type4 Device that was
>>> expecting an IRQ response, that device would then be hung.
>>>
>>> Fixes: c45ded7e1135 ("mailbox: pcc: Add support for PCCT extended PCC subspaces(type 3/4)")
>> Not fix above commit.
>> mbox_chan_txdone() was added in below patch.
>> Fixes: 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler)
>>> Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
>>>
>>> ---
>>> ---
>>>    drivers/mailbox/pcc.c | 9 +++++----
>>>    1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
>>> index 636879ae1db7..16b9ce087b9e 100644
>>> --- a/drivers/mailbox/pcc.c
>>> +++ b/drivers/mailbox/pcc.c
>>> @@ -314,6 +314,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>>>    {
>>>    	struct pcc_chan_info *pchan;
>>>    	struct mbox_chan *chan = p;
>>> +	int rc;
>>>    	pchan = chan->con_priv;
>>> @@ -327,8 +328,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>>>    	if (!pcc_mbox_cmd_complete_check(pchan))
>>>    		return IRQ_NONE;
>>> -	if (pcc_mbox_error_check_and_clear(pchan))
>>> -		return IRQ_NONE;
>>> +	rc = pcc_mbox_error_check_and_clear(pchan);
>> I think it is not necessary. This function just return -EIO on failure.
>>
>>>    	/*
>>>    	 * Clear this flag after updating interrupt ack register and just
>>> @@ -337,8 +337,9 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>>>    	 * required to avoid any possible race in updatation of this flag.
>>>    	 */
>>>    	pchan->chan_in_use = false;
>>> -	mbox_chan_received_data(chan, NULL);
>>> -	mbox_chan_txdone(chan, 0);
>>> +	if (!rc)
>>> +		mbox_chan_received_data(chan, NULL);
>>> +	mbox_chan_txdone(chan, rc);
>> @Sudeep, I have always had doubts about the addition of this line of code in
>> the
>>   commit 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler).
>> The patch seems to avoid the timeouts in the mailbox core according to its
>> commit log.
>> Regardless of whether the command succeeds or fails, each mbox client
>> driver, like cppc_acpi/acpi_pcc,kunpeng_hccs and so on, is responsible to
>> call mbox_chan_txdone() to tell mailbox core.
> Few controller drivers do have mbox_chan_txdone(), so Tx complete is detected
Which controller driver?
> by PCC, so not sure why you think this is not the right place to do. The irq
Because many mbox client drivers call mbox_chan_txdone() after running 
rx_callback() in mbox_chan_received_data().
These drivers doesn't set chan->cl->tx_block to true.
It seems that the client driver having tx_block need to set 
chan->tx_complete in tx_tick().
Do you add this code for them?
> is to indicate the completion. I am confused as why you think otherwise.
> It is defined in include/linux/mailbox_controller.h for the same reason.
>
> The client drivers can you mbox_client_txdone() if they wish to as defined
> in include/linux/mailbox_client.h
mbox_client_txdone() is used in the case that txdone_method is 
MBOX_TXDONE_BY_ACK.
And mbox clinte driver using IRQ method need to use mbox_chan_txdone().
It seems that all the current client drivers are used in this way.
These interface internal would verify chan->txdone_method.

In addition, I find that you also modify the txdone_irq/poll in the 
commit 3349f800609e (mailbox: pcc: Set txdone_irq/txdone_poll based on 
PCCT flags).
The txdone_method will change from MBOX_TXDONE_BY_ACK to 
MBOX_TXDONE_BY_POLL on the platform using poll mode.
This may lead to the original mbox client driver printing exceptions in 
mbox_client_txdone.
I haven't observed it based on the latest code yet, it's just code 
analysis.
>
>> This is done after executing mbox_chan_received_data(). So I think this line
>> in this function is redundant.
> No, I think otherwise, see details above.
>
Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by Sudeep Holla 4 days, 14 hours ago
On Wed, May 20, 2026 at 07:53:45PM +0800, lihuisong (C) wrote:
> 
> On 5/20/2026 12:25 AM, Sudeep Holla wrote:
> > On Tue, May 19, 2026 at 09:54:47PM +0800, lihuisong (C) wrote:

[...]

> > > @Sudeep, I have always had doubts about the addition of this line of code in
> > > the
> > >   commit 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler).
> > > The patch seems to avoid the timeouts in the mailbox core according to its
> > > commit log.
> > > Regardless of whether the command succeeds or fails, each mbox client
> > > driver, like cppc_acpi/acpi_pcc,kunpeng_hccs and so on, is responsible to
> > > call mbox_chan_txdone() to tell mailbox core.
> > Few controller drivers do have mbox_chan_txdone(), so Tx complete is detected
> Which controller driver?

git grep mbox_chan_txdone drivers/mailbox/

> > by PCC, so not sure why you think this is not the right place to do. The irq
> Because many mbox client drivers call mbox_chan_txdone() after running
> rx_callback() in mbox_chan_received_data().

OK, but why can't the controller hide that for the clients ? What am I missing?

> These drivers doesn't set chan->cl->tx_block to true.
> It seems that the client driver having tx_block need to set
> chan->tx_complete in tx_tick().
> Do you add this code for them?

I don't quite follow you.

> > is to indicate the completion. I am confused as why you think otherwise.
> > It is defined in include/linux/mailbox_controller.h for the same reason.
> > 
> > The client drivers can you mbox_client_txdone() if they wish to as defined
> > in include/linux/mailbox_client.h
> mbox_client_txdone() is used in the case that txdone_method is
> MBOX_TXDONE_BY_ACK.

Yes and agreed.

> And mbox clinte driver using IRQ method need to use mbox_chan_txdone().

Client doesn't handle IRQ its always controller driver and client must have
no business to do that IMO.

> It seems that all the current client drivers are used in this way.
> These interface internal would verify chan->txdone_method.
> 

Yes, sounds wrong to me.

drivers/acpi/acpi_pcc.c
drivers/acpi/cppc_acpi.c
drivers/hwmon/xgene-hwmon.c
drivers/i2c/busses/i2c-xgene-slimpro.c
drivers/soc/hisilicon/kunpeng_hccs.c

It is very clear from the code in mailbox.c, mbox_client_txdone() is for
the client drivers and mbox_chan_txdone() is for the controller. We need
to fix the above list but I need to check if there is anything I am missing
to understand first. Please let me know.

> In addition, I find that you also modify the txdone_irq/poll in the commit
> 3349f800609e (mailbox: pcc: Set txdone_irq/txdone_poll based on PCCT flags).
> The txdone_method will change from MBOX_TXDONE_BY_ACK to MBOX_TXDONE_BY_POLL
> on the platform using poll mode.
> This may lead to the original mbox client driver printing exceptions in
> mbox_client_txdone.
> I haven't observed it based on the latest code yet, it's just code analysis.

Right, I do remember seeing something and wonder if I moved to
mbox_chan_txdone() in drivers/acpi/acpi_pcc.c for that reason. But if the
expectations I have mentioned are correct, then we need to fix the framework
to avoid throwing that warnings.

-- 
Regards,
Sudeep
Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by Adam Young 2 days, 10 hours ago
I am not getting li hui song's messages, only your (Sudeep's) responses.

On 5/20/26 09:32, Sudeep Holla wrote:
> On Wed, May 20, 2026 at 07:53:45PM +0800, lihuisong (C) wrote:
>> On 5/20/2026 12:25 AM, Sudeep Holla wrote:
>>> On Tue, May 19, 2026 at 09:54:47PM +0800, lihuisong (C) wrote:
> [...]
>
>>>> @Sudeep, I have always had doubts about the addition of this line of code in
>>>> the
>>>>    commit 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler).
>>>> The patch seems to avoid the timeouts in the mailbox core according to its
>>>> commit log.
>>>> Regardless of whether the command succeeds or fails, each mbox client
>>>> driver, like cppc_acpi/acpi_pcc,kunpeng_hccs and so on, is responsible to
>>>> call mbox_chan_txdone() to tell mailbox core.
>>> Few controller drivers do have mbox_chan_txdone(), so Tx complete is detected
>> Which controller driver?
> git grep mbox_chan_txdone drivers/mailbox/


These are the only drivers that have a callback defined so far. IN all 
cases, they are only doing error reporting, but no change of behavior.


     drivers/acpi/cppc_acpi.c
     drivers/i2c/busses/i2c-xgene-slimpro.c
     drivers/hwmon/xgene-hwmon.c
     drivers/soc/hisilicon/kunpeng_hccs.c
     drivers/devfreq/hisi_uncore_freq.c


>
>>> by PCC, so not sure why you think this is not the right place to do. The irq
>> Because many mbox client drivers call mbox_chan_txdone() after running
>> rx_callback() in mbox_chan_received_data().
> OK, but why can't the controller hide that for the clients ? What am I missing?
>
>> These drivers doesn't set chan->cl->tx_block to true.
>> It seems that the client driver having tx_block need to set
>> chan->tx_complete in tx_tick().
>> Do you add this code for them?
> I don't quite follow you.
>
>>> is to indicate the completion. I am confused as why you think otherwise.
>>> It is defined in include/linux/mailbox_controller.h for the same reason.
>>>
>>> The client drivers can you mbox_client_txdone() if they wish to as defined
>>> in include/linux/mailbox_client.h
>> mbox_client_txdone() is used in the case that txdone_method is
>> MBOX_TXDONE_BY_ACK.
> Yes and agreed.

I could make this path conditional on that being set.  Something like:

rc = pcc_mbox_error_check_and_clear(pchan);

       if (rc  &&   chan->txdone_method & MBOX_TXDONE_BY_POLL)
-               return IRQ_NONE;

Which lets the ACK and IRQ paths continue.




>
>> And mbox clinte driver using IRQ method need to use mbox_chan_txdone().
> Client doesn't handle IRQ its always controller driver and client must have
> no business to do that IMO.

IN the PCC case, an error in handling a packet (PCC message) is returned 
in the error register and read during the IRQ response. That error 
message needs to propagate to the MCTP network driver so it can free up 
the SKB and not leak memory. We cannot free it before that point as it 
is still in the rbuf/active_request pointer.

>
>> It seems that all the current client drivers are used in this way.
>> These interface internal would verify chan->txdone_method.
>>
> Yes, sounds wrong to me.
>
> drivers/acpi/acpi_pcc.c
> drivers/acpi/cppc_acpi.c
> drivers/hwmon/xgene-hwmon.c
> drivers/i2c/busses/i2c-xgene-slimpro.c
> drivers/soc/hisilicon/kunpeng_hccs.c
>
> It is very clear from the code in mailbox.c, mbox_client_txdone() is for
> the client drivers and mbox_chan_txdone() is for the controller. We need
> to fix the above list but I need to check if there is anything I am missing
> to understand first. Please let me know.
>
>> In addition, I find that you also modify the txdone_irq/poll in the commit
>> 3349f800609e (mailbox: pcc: Set txdone_irq/txdone_poll based on PCCT flags).
>> The txdone_method will change from MBOX_TXDONE_BY_ACK to MBOX_TXDONE_BY_POLL
>> on the platform using poll mode.
>> This may lead to the original mbox client driver printing exceptions in
>> mbox_client_txdone.
>> I haven't observed it based on the latest code yet, it's just code analysis.
> Right, I do remember seeing something and wonder if I moved to
> mbox_chan_txdone() in drivers/acpi/acpi_pcc.c for that reason. But if the
> expectations I have mentioned are correct, then we need to fix the framework
> to avoid throwing that warnings.
>
Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by lihuisong (C) 3 days, 15 hours ago
On 5/20/2026 9:32 PM, Sudeep Holla wrote:
> On Wed, May 20, 2026 at 07:53:45PM +0800, lihuisong (C) wrote:
>> On 5/20/2026 12:25 AM, Sudeep Holla wrote:
>>> On Tue, May 19, 2026 at 09:54:47PM +0800, lihuisong (C) wrote:
> [...]
>
>>>> @Sudeep, I have always had doubts about the addition of this line of code in
>>>> the
>>>>    commit 9c753f7c953c (mailbox: pcc: Mark Tx as complete in PCC IRQ handler).
>>>> The patch seems to avoid the timeouts in the mailbox core according to its
>>>> commit log.
>>>> Regardless of whether the command succeeds or fails, each mbox client
>>>> driver, like cppc_acpi/acpi_pcc,kunpeng_hccs and so on, is responsible to
>>>> call mbox_chan_txdone() to tell mailbox core.
>>> Few controller drivers do have mbox_chan_txdone(), so Tx complete is detected
>> Which controller driver?
> git grep mbox_chan_txdone drivers/mailbox/
Ok
>
>>> by PCC, so not sure why you think this is not the right place to do. The irq
>> Because many mbox client drivers call mbox_chan_txdone() after running
>> rx_callback() in mbox_chan_received_data().
> OK, but why can't the controller hide that for the clients ? What am I missing?
Now I know what you want to do.
It's ok for me to do that in controller on irq scene.
But we also need to fix the mbox_chan_txdone() code in all client drivers.
>
>> These drivers doesn't set chan->cl->tx_block to true.
>> It seems that the client driver having tx_block need to set
>> chan->tx_complete in tx_tick().
>> Do you add this code for them?
> I don't quite follow you.
please ship this.
>
>>> is to indicate the completion. I am confused as why you think otherwise.
>>> It is defined in include/linux/mailbox_controller.h for the same reason.
>>>
>>> The client drivers can you mbox_client_txdone() if they wish to as defined
>>> in include/linux/mailbox_client.h
>> mbox_client_txdone() is used in the case that txdone_method is
>> MBOX_TXDONE_BY_ACK.
> Yes and agreed.
>
>> And mbox clinte driver using IRQ method need to use mbox_chan_txdone().
> Client doesn't handle IRQ its always controller driver and client must have
> no business to do that IMO.
Ack
mbox_chan_txdone should be used by controller as this function comment 
said.
>
>> It seems that all the current client drivers are used in this way.
>> These interface internal would verify chan->txdone_method.
>>
> Yes, sounds wrong to me.
>
> drivers/acpi/acpi_pcc.c
> drivers/acpi/cppc_acpi.c
> drivers/hwmon/xgene-hwmon.c
> drivers/i2c/busses/i2c-xgene-slimpro.c
> drivers/soc/hisilicon/kunpeng_hccs.c
>
> It is very clear from the code in mailbox.c, mbox_client_txdone() is for
> the client drivers and mbox_chan_txdone() is for the controller. We need
> to fix the above list but I need to check if there is anything I am missing
> to understand first. Please let me know.
Agreed.
>
>> In addition, I find that you also modify the txdone_irq/poll in the commit
>> 3349f800609e (mailbox: pcc: Set txdone_irq/txdone_poll based on PCCT flags).
>> The txdone_method will change from MBOX_TXDONE_BY_ACK to MBOX_TXDONE_BY_POLL
>> on the platform using poll mode.
>> This may lead to the original mbox client driver printing exceptions in
>> mbox_client_txdone.
>> I haven't observed it based on the latest code yet, it's just code analysis.
> Right, I do remember seeing something and wonder if I moved to
> mbox_chan_txdone() in drivers/acpi/acpi_pcc.c for that reason. But if the
> expectations I have mentioned are correct, then we need to fix the framework
> to avoid throwing that warnings.
Yeah, we also need to do something for your another commit
3349f800609e (mailbox: pcc: Set txdone_irq/txdone_poll based on PCCT 
flags).
>
Re: [PATCH v02] mailbox: pcc: report errors for PCC clients
Posted by Sudeep Holla 5 days, 14 hours ago
On Mon, May 18, 2026 at 03:30:06PM -0400, Adam Young wrote:
> The tx_done callback function has a return code (rc) parameter
> that the tx_done callback can use to determine how to handle an error.
> However the IRQ handler was not setting that value if there is an error.
> 
> The following clients are affected:
> 
> drivers/acpi/cppc_acpi.c
> drivers/i2c/busses/i2c-xgene-slimpro.c
> drivers/hwmon/xgene-hwmon.c
> drivers/soc/hisilicon/kunpeng_hccs.c
> drivers/devfreq/hisi_uncore_freq.c
> 
> All of these only use the error code to report, so they
> are expecting an error code to come thorugh, but they
> do not modify behavior based on this code.
> 
> In the case of an error code in the IRQ, the handler was returning
> IRQ_NONE which is not correct:  the IRQ handler was matched
> to the IRQ.  This mean that multiple error codes returned from
> a PCC triggered interrupt would end up disabling the device.
> 
> In addition, if the error code IRQ was coming from a Type4 Device that was
> expecting an IRQ response, that device would then be hung.
> 
> Fixes: c45ded7e1135 ("mailbox: pcc: Add support for PCCT extended PCC subspaces(type 3/4)")
> Signed-off-by: Adam Young <admiyo@os.amperecomputing.com>
> 
> ---
> ---
>  drivers/mailbox/pcc.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c
> index 636879ae1db7..16b9ce087b9e 100644
> --- a/drivers/mailbox/pcc.c
> +++ b/drivers/mailbox/pcc.c
> @@ -314,6 +314,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>  {
>  	struct pcc_chan_info *pchan;
>  	struct mbox_chan *chan = p;
> +	int rc;
>  
>  	pchan = chan->con_priv;
>  
> @@ -327,8 +328,7 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>  	if (!pcc_mbox_cmd_complete_check(pchan))
>  		return IRQ_NONE;
>  
> -	if (pcc_mbox_error_check_and_clear(pchan))
> -		return IRQ_NONE;
> +	rc = pcc_mbox_error_check_and_clear(pchan);

I think we may have to skip the check inside pcc_mbox_error_check_and_clear()
for Type 4 channel as the spec expects OSPM to ignore it. It is a separate
fix, just noting that here.

>  
>  	/*
>  	 * Clear this flag after updating interrupt ack register and just
> @@ -337,8 +337,9 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p)
>  	 * required to avoid any possible race in updatation of this flag.
>  	 */
>  	pchan->chan_in_use = false;
> -	mbox_chan_received_data(chan, NULL);
> -	mbox_chan_txdone(chan, 0);
> +	if (!rc)
> +		mbox_chan_received_data(chan, NULL);

Not sure if making this conditional is good as some platforms may expect
it to move the state machine, I am not sure 100% just thinking aloud here.

-- 
Regards,
Sudeep