[PATCH] staging: Fix atomicity violation in get_serial_info()

Qiu-ji Chen posted 1 patch 1 month, 4 weeks ago
drivers/staging/greybus/uart.c | 2 ++
1 file changed, 2 insertions(+)
[PATCH] staging: Fix atomicity violation in get_serial_info()
Posted by Qiu-ji Chen 1 month, 4 weeks ago
Atomicity violation occurs during consecutive reads of the members of 
gb_tty. Consider a scenario where, because the consecutive reads of gb_tty
members are not protected by a lock, the value of gb_tty may still be 
changing during the read process. 

gb_tty->port.close_delay and gb_tty->port.closing_wait are updated
together, such as in the set_serial_info() function. If during the
read process, gb_tty->port.close_delay and gb_tty->port.closing_wait
are still being updated, it is possible that gb_tty->port.close_delay
is updated while gb_tty->port.closing_wait is not. In this case,
the code first reads gb_tty->port.close_delay and then
gb_tty->port.closing_wait. A new gb_tty->port.close_delay and an
old gb_tty->port.closing_wait could be read. Such values, whether
before or after the update, should not coexist as they represent an
intermediate state.

This could result in a mismatch of the values read for gb_tty->minor, 
gb_tty->port.close_delay, and gb_tty->port.closing_wait, which in turn 
could cause ss->close_delay and ss->closing_wait to be mismatched.

To address this issue, we have enclosed all sequential read operations of 
the gb_tty variable within a lock. This ensures that the value of gb_tty 
remains unchanged throughout the process, guaranteeing its validity.

This possible bug is found by an experimental static analysis tool
developed by our team. This tool analyzes the locking APIs
to extract function pairs that can be concurrently executed, and then
analyzes the instructions in the paired functions to identify possible
concurrency bugs including data races and atomicity violations.

Fixes: b71e571adaa5 ("staging: greybus: uart: fix TIOCSSERIAL jiffies conversions")
Cc: stable@vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com>
---
 drivers/staging/greybus/uart.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/staging/greybus/uart.c b/drivers/staging/greybus/uart.c
index cdf4ebb93b10..8cc18590d97b 100644
--- a/drivers/staging/greybus/uart.c
+++ b/drivers/staging/greybus/uart.c
@@ -595,12 +595,14 @@ static int get_serial_info(struct tty_struct *tty,
 {
 	struct gb_tty *gb_tty = tty->driver_data;
 
+	mutex_lock(&gb_tty->port.mutex);
 	ss->line = gb_tty->minor;
 	ss->close_delay = jiffies_to_msecs(gb_tty->port.close_delay) / 10;
 	ss->closing_wait =
 		gb_tty->port.closing_wait == ASYNC_CLOSING_WAIT_NONE ?
 		ASYNC_CLOSING_WAIT_NONE :
 		jiffies_to_msecs(gb_tty->port.closing_wait) / 10;
+	mutex_unlock(&gb_tty->port.mutex);
 
 	return 0;
 }
-- 
2.34.1
Re: [PATCH] staging: Fix atomicity violation in get_serial_info()
Posted by Johan Hovold 1 month, 4 weeks ago
On Mon, Sep 30, 2024 at 06:14:03PM +0800, Qiu-ji Chen wrote:
> Atomicity violation occurs during consecutive reads of the members of 
> gb_tty. Consider a scenario where, because the consecutive reads of gb_tty
> members are not protected by a lock, the value of gb_tty may still be 
> changing during the read process. 
> 
> gb_tty->port.close_delay and gb_tty->port.closing_wait are updated
> together, such as in the set_serial_info() function. If during the
> read process, gb_tty->port.close_delay and gb_tty->port.closing_wait
> are still being updated, it is possible that gb_tty->port.close_delay
> is updated while gb_tty->port.closing_wait is not. In this case,
> the code first reads gb_tty->port.close_delay and then
> gb_tty->port.closing_wait. A new gb_tty->port.close_delay and an
> old gb_tty->port.closing_wait could be read. Such values, whether
> before or after the update, should not coexist as they represent an
> intermediate state.
> 
> This could result in a mismatch of the values read for gb_tty->minor, 

No, gb_tty minor is only set at probe().

> gb_tty->port.close_delay, and gb_tty->port.closing_wait, which in turn 
> could cause ss->close_delay and ss->closing_wait to be mismatched.

Sure, but that's a pretty minor issue as Dan already pointed out.

> To address this issue, we have enclosed all sequential read operations of 
> the gb_tty variable within a lock. This ensures that the value of gb_tty 
> remains unchanged throughout the process, guaranteeing its validity.
> 
> This possible bug is found by an experimental static analysis tool
> developed by our team. This tool analyzes the locking APIs
> to extract function pairs that can be concurrently executed, and then
> analyzes the instructions in the paired functions to identify possible
> concurrency bugs including data races and atomicity violations.
> 
> Fixes: b71e571adaa5 ("staging: greybus: uart: fix TIOCSSERIAL jiffies conversions")

And this obviously isn't the correct commit to blame. Please be more
careful.

> Cc: stable@vger.kernel.org

Since this is unlikely to cause any issues for a user, I don't think
stable backport is warranted either.

> Signed-off-by: Qiu-ji Chen <chenqiuji666@gmail.com>
> ---
>  drivers/staging/greybus/uart.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/staging/greybus/uart.c b/drivers/staging/greybus/uart.c
> index cdf4ebb93b10..8cc18590d97b 100644
> --- a/drivers/staging/greybus/uart.c
> +++ b/drivers/staging/greybus/uart.c
> @@ -595,12 +595,14 @@ static int get_serial_info(struct tty_struct *tty,
>  {
>  	struct gb_tty *gb_tty = tty->driver_data;
>  
> +	mutex_lock(&gb_tty->port.mutex);
>  	ss->line = gb_tty->minor;

gb_tty is not protected by the port mutex.

>  	ss->close_delay = jiffies_to_msecs(gb_tty->port.close_delay) / 10;
>  	ss->closing_wait =
>  		gb_tty->port.closing_wait == ASYNC_CLOSING_WAIT_NONE ?
>  		ASYNC_CLOSING_WAIT_NONE :
>  		jiffies_to_msecs(gb_tty->port.closing_wait) / 10;
> +	mutex_unlock(&gb_tty->port.mutex);
>  
>  	return 0;
>  }

Johan
Re: [PATCH] staging: Fix atomicity violation in get_serial_info()
Posted by Dan Carpenter 1 month, 4 weeks ago
On Mon, Sep 30, 2024 at 06:14:03PM +0800, Qiu-ji Chen wrote:
> Atomicity violation occurs during consecutive reads of the members of 
> gb_tty. Consider a scenario where, because the consecutive reads of gb_tty
> members are not protected by a lock, the value of gb_tty may still be 
> changing during the read process. 
> 
> gb_tty->port.close_delay and gb_tty->port.closing_wait are updated
> together, such as in the set_serial_info() function. If during the
> read process, gb_tty->port.close_delay and gb_tty->port.closing_wait
> are still being updated, it is possible that gb_tty->port.close_delay
> is updated while gb_tty->port.closing_wait is not. In this case,
> the code first reads gb_tty->port.close_delay and then
> gb_tty->port.closing_wait. A new gb_tty->port.close_delay and an
> old gb_tty->port.closing_wait could be read. Such values, whether
> before or after the update, should not coexist as they represent an
> intermediate state.
> 
> This could result in a mismatch of the values read for gb_tty->minor, 
> gb_tty->port.close_delay, and gb_tty->port.closing_wait, which in turn 
> could cause ss->close_delay and ss->closing_wait to be mismatched.
> 
> To address this issue, we have enclosed all sequential read operations of 
> the gb_tty variable within a lock. This ensures that the value of gb_tty 
> remains unchanged throughout the process, guaranteeing its validity.
> 
> This possible bug is found by an experimental static analysis tool
> developed by our team. This tool analyzes the locking APIs
> to extract function pairs that can be concurrently executed, and then
> analyzes the instructions in the paired functions to identify possible
> concurrency bugs including data races and atomicity violations.
> 

Ideally a commit message should say what the bug looks like to the user.
Obviously when you're doing static analysis and not using the code, it's more
difficult to tell the impact.

I would say that this commit message is confusing and makes it seem like a
bigger deal than it is.  The "ss" struct is information that we're going to send
to the user.  It's not used again in the kernel.

Could you re-write the commit message to say something like, "Our static checker
found a bug where set serial takes a mutex and get serial doesn't.  Fortunately,
the impact of this is relatively minor.  It doesn't cause a crash or anything.
If the user races set serial and get serial there is a chance that the get
serial information will be garbage."

regards,
dan carpenter