drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
There is a possible null-pointer dereference related to the wait-completion
synchronization mechanism in the function chcr_ktls_dev_add().
Consider the following execution scenario:
chcr_ktls_cpl_act_open_rpl() //641
u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //686
if (u_ctx) { //687
complete(&tx_info->completion); //704
The variable u_ctx is checked by an if statement at Line 687, which means
it can be NULL. Then, complete() is called at Line 704, which will wake
up wait_for_completion_xxx().
Consider the wait_for_completion_timeout() in chcr_ktls_dev_add():
chcr_ktls_dev_add() //412
u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //432
wait_for_completion_timeout(&tx_info->completion, 30 * HZ); //551
xa_erase(&u_ctx->tid_list, tx_info->tid); //580
The variable u_ctx is dereferenced without being rechecked at Line 580
after the wait_for_completion_timeout(), which can introduce a null-pointer
dereference. Besides, the variable u_ctx is also checked at Line 442 in
chcr_ktls_dev_add(), which indicates that u_ctx is likely to be NULL in
some execution contexts.
To fix this possible null-pointer dereference, a NULL check is put ahead of
the call to xa_erase().
This potential bug was discovered using an experimental static analysis
tool developed by our team. The tool deduces complete() and
wait_for_completion() pairs using alias analysis. It then applies data
flow analysis to detect null-pointer dereferences across synchronization
points.
Fixes: 65e302a9bd57 ("cxgb4/ch_ktls: Clear resources when pf4 device is removed")
Signed-off-by: Tuo Li <islituo@gmail.com>
---
drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
index e8e460a92e0e..524c8e032bc8 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c
@@ -577,7 +577,8 @@ static int chcr_ktls_dev_add(struct net_device *netdev, struct sock *sk,
cxgb4_remove_tid(&tx_info->adap->tids, tx_info->tx_chan,
tx_info->tid, tx_info->ip_family);
- xa_erase(&u_ctx->tid_list, tx_info->tid);
+ if (u_ctx)
+ xa_erase(&u_ctx->tid_list, tx_info->tid);
put_module:
/* release module refcount */
--
2.43.0
… > Consider the following execution scenario: > > chcr_ktls_cpl_act_open_rpl() //641 > u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //686 > if (u_ctx) { //687 > complete(&tx_info->completion); //704 > > The variable u_ctx is checked by an if statement at Line 687, which means > it can be NULL. Then, complete() is called at Line 704, which will wake > up wait_for_completion_xxx(). … To which software revision would you like to refer here? How does the presented information fit to a statement like the following? https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c#L442 if (u_ctx && u_ctx->detach) goto out; Would you eventually like to trace the control flow back any further for the data structure member “handle”? Regards, Markus
On 2024/11/8 23:00, Markus Elfring wrote: > … >> Consider the following execution scenario: >> >> chcr_ktls_cpl_act_open_rpl() //641 >> u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //686 >> if (u_ctx) { //687 >> complete(&tx_info->completion); //704 >> >> The variable u_ctx is checked by an if statement at Line 687, which means >> it can be NULL. Then, complete() is called at Line 704, which will wake >> up wait_for_completion_xxx(). > … > > To which software revision would you like to refer here? > > How does the presented information fit to a statement like the following? > https://elixir.bootlin.com/linux/v6.12-rc6/source/drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c#L442 > > if (u_ctx && u_ctx->detach) > goto out; We have run our tool on Linux 6.11, and the line numbers correspond to the code in that version. > Would you eventually like to trace the control flow back any further > for the data structure member “handle”? > I have traced the control flow further for the data structure member 'handle,' but I have not found where the member is assigned a NULL value. I am not sure if I might have overlooked something.
> We have run our tool on Linux 6.11, and the line numbers correspond to the > code in that version. Would you like to share any source code analysis results for more recent software versions? Regards, Markus
On 2024/11/14 20:26, Markus Elfring wrote: >> We have run our tool on Linux 6.11, and the line numbers correspond to the >> code in that version. > > Would you like to share any source code analysis results for more recent software versions? Hi Elfring, Thanks for your reply. I ran our tool on Linux 6.12-rc7 (https://elixir.bootlin.com/linux/v6.12-rc7/source), and the same issue persists. The line number is identical to that on Linux 6.11. chcr_ktls_cpl_act_open_rpl() //641 u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //686 if (u_ctx) { //687 complete(&tx_info->completion); //704 chcr_ktls_dev_add() //412 u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //432 wait_for_completion_timeout(&tx_info->completion, 30 * HZ); //551 xa_erase(&u_ctx->tid_list, tx_info->tid); //580 Any further feedback would be appreciated! Sincerely, Tuo Li
… > chcr_ktls_cpl_act_open_rpl() //641 … > chcr_ktls_dev_add() //412… * How do you think about to improve your change description another bit? * Why do you try to refer to two different function implementations here? * Will further adjustment possibilities become interesting? Regards, Markus
On Thu, Oct 31, 2024 at 12:23:52AM +1100, Tuo Li wrote: > There is a possible null-pointer dereference related to the wait-completion > synchronization mechanism in the function chcr_ktls_dev_add(). > > Consider the following execution scenario: > > chcr_ktls_cpl_act_open_rpl() //641 > u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //686 > if (u_ctx) { //687 > complete(&tx_info->completion); //704 > > The variable u_ctx is checked by an if statement at Line 687, which means > it can be NULL. Then, complete() is called at Line 704, which will wake > up wait_for_completion_xxx(). > > Consider the wait_for_completion_timeout() in chcr_ktls_dev_add(): > > chcr_ktls_dev_add() //412 > u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //432 > wait_for_completion_timeout(&tx_info->completion, 30 * HZ); //551 > xa_erase(&u_ctx->tid_list, tx_info->tid); //580 > > The variable u_ctx is dereferenced without being rechecked at Line 580 > after the wait_for_completion_timeout(), which can introduce a null-pointer > dereference. Besides, the variable u_ctx is also checked at Line 442 in > chcr_ktls_dev_add(), which indicates that u_ctx is likely to be NULL in > some execution contexts. > > To fix this possible null-pointer dereference, a NULL check is put ahead of > the call to xa_erase(). > > This potential bug was discovered using an experimental static analysis > tool developed by our team. The tool deduces complete() and > wait_for_completion() pairs using alias analysis. It then applies data > flow analysis to detect null-pointer dereferences across synchronization > points. > > Fixes: 65e302a9bd57 ("cxgb4/ch_ktls: Clear resources when pf4 device is removed") > Signed-off-by: Tuo Li <islituo@gmail.com> Hi Tuo Li, I do see that the checking of u_ctx is inconsistent, but it is not clear to me that is because one part is too defensive or, OTOH, there is a bug as you suggest. And I think that we need more analysis to determine which case it is. Also, if it is the case that there is a bug as you suggest, after a quick search, I think it also exists in at least one other place in this file. ...
Hi Simon, Thanks for your reply! It is very helpful. Any further feedback will be appreciated. Thank you very much! Sincerely, Tuo Li On 2024/11/5 0:07, Simon Horman wrote: > On Thu, Oct 31, 2024 at 12:23:52AM +1100, Tuo Li wrote: >> There is a possible null-pointer dereference related to the wait-completion >> synchronization mechanism in the function chcr_ktls_dev_add(). >> >> Consider the following execution scenario: >> >> chcr_ktls_cpl_act_open_rpl() //641 >> u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //686 >> if (u_ctx) { //687 >> complete(&tx_info->completion); //704 >> >> The variable u_ctx is checked by an if statement at Line 687, which means >> it can be NULL. Then, complete() is called at Line 704, which will wake >> up wait_for_completion_xxx(). >> >> Consider the wait_for_completion_timeout() in chcr_ktls_dev_add(): >> >> chcr_ktls_dev_add() //412 >> u_ctx = adap->uld[CXGB4_ULD_KTLS].handle; //432 >> wait_for_completion_timeout(&tx_info->completion, 30 * HZ); //551 >> xa_erase(&u_ctx->tid_list, tx_info->tid); //580 >> >> The variable u_ctx is dereferenced without being rechecked at Line 580 >> after the wait_for_completion_timeout(), which can introduce a null-pointer >> dereference. Besides, the variable u_ctx is also checked at Line 442 in >> chcr_ktls_dev_add(), which indicates that u_ctx is likely to be NULL in >> some execution contexts. >> >> To fix this possible null-pointer dereference, a NULL check is put ahead of >> the call to xa_erase(). >> >> This potential bug was discovered using an experimental static analysis >> tool developed by our team. The tool deduces complete() and >> wait_for_completion() pairs using alias analysis. It then applies data >> flow analysis to detect null-pointer dereferences across synchronization >> points. >> >> Fixes: 65e302a9bd57 ("cxgb4/ch_ktls: Clear resources when pf4 device is removed") >> Signed-off-by: Tuo Li <islituo@gmail.com> > > Hi Tuo Li, > > I do see that the checking of u_ctx is inconsistent, > but it is not clear to me that is because one part is too defensive > or, OTOH, there is a bug as you suggest. And I think that we need > more analysis to determine which case it is. > > Also, if it is the case that there is a bug as you suggest, after a quick > search, I think it also exists in at least one other place in this file. > > ...
On Tue, Nov 05, 2024 at 07:32:31AM +0800, Tuo Li wrote: > Hi Simon, > > Thanks for your reply! It is very helpful. > Any further feedback will be appreciated. Hi Tuo Li, 1. Please don't top-post on Linux mailing lists. 2. I think you need to do some careful analysis to understand if the condition you are concerned about can occur or not: can u_ctx ever be NULL in these code paths?
On 2024/11/8 21:34, Simon Horman wrote: > On Tue, Nov 05, 2024 at 07:32:31AM +0800, Tuo Li wrote: >> Hi Simon, >> >> Thanks for your reply! It is very helpful. >> Any further feedback will be appreciated. > > Hi Tuo Li, > > 1. Please don't top-post on Linux mailing lists. Apologies for the top-posting; I will avoid that in the future. > 2. I think you need to do some careful analysis to understand > if the condition you are concerned about can occur or not: > can u_ctx ever be NULL in these code paths? I have carefully reviewed the code paths, but I could not find any instance where the variable u_ctx is assigned a NULL value. It might be a defensive check to handle potential NULL values.
© 2016 - 2024 Red Hat, Inc.