[PATCH v2 1/2] rpmsg: glink: Handle rejected intent request better

Bjorn Andersson posted 2 patches 1 month ago
Re: [PATCH v2 1/2] rpmsg: glink: Handle rejected intent request better
Posted by Chris Lew 1 month ago

On 10/23/2024 10:24 AM, Bjorn Andersson wrote:
> GLINK operates using pre-allocated buffers, aka intents, where incoming
> messages are aggregated before being passed up the stack. In the case
> that no suitable intents have been announced by the receiver, the sender
> can request an intent to be allocated.
> 
> The initial implementation of the response to such request dealt
> with two outcomes; granted allocations, and all other cases being
> considered -ECANCELLED (likely from "cancelling the operation as the
> remote is going down").
> 
> But on some channels intent allocation is not supported, instead the
> remote will pre-allocate and announce a fixed number of intents for the
> sender to use. If for such channels an rpmsg_send() is being invoked
> before any channels have been announced, an intent request will be
> issued and as this comes back rejected the call fails with -ECANCELED.
> 
> Given that this is reported in the same way as the remote being shut
> down, there's no way for the client to differentiate the two cases.
> 
> In line with the original GLINK design, change the return value to
> -EAGAIN for the case where the remote rejects an intent allocation
> request.
> 
> It's tempting to handle this case in the GLINK core, as we expect
> intents to show up in this case. But there's no way to distinguish
> between this case and a rejection for a too big allocation, nor is it
> possible to predict if a currently used (and seemingly suitable) intent
> will be returned for reuse or not. As such, returning the error to the
> client and allow it to react seems to be the only sensible solution.
> 
> In addition to this, commit 'c05dfce0b89e ("rpmsg: glink: Wait for
> intent, not just request ack")' changed the logic such that the code
> always wait for an intent request response and an intent. This works out
> in most cases, but in the event that an intent request is rejected and no
> further intent arrives (e.g. client asks for a too big intent), the code
> will stall for 10 seconds and then return -ETIMEDOUT; instead of a more
> suitable error.
> 
> This change also resulted in intent requests racing with the shutdown of
> the remote would be exposed to this same problem, unless some intent
> happens to arrive. A patch for this was developed and posted by Sarannya
> S [1], and has been incorporated here.
> 
> To summarize, the intent request can end in 4 ways:
> - Timeout, no response arrived => return -ETIMEDOUT
> - Abort TX, the edge is going away => return -ECANCELLED
> - Intent request was rejected => return -EAGAIN
> - Intent request was accepted, and an intent arrived => return 0
> 
> This patch was developed with input from Sarannya S, Deepak Kumar Singh,
> and Chris Lew.
> 
> [1] https://lore.kernel.org/all/20240925072328.1163183-1-quic_deesin@quicinc.com/
> 
> Fixes: c05dfce0b89e ("rpmsg: glink: Wait for intent, not just request ack")
> Cc: stable@vger.kernel.org
> Tested-by: Johan Hovold <johan+linaro@kernel.org>
> Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com>
> ---

Reviewed-by: Chris Lew <quic_clew@quicinc.com>