[PATCH v5] tty: tty_port: add workqueue to flip tty buffer

Xin Zhao posted 1 patch 2 weeks ago
There is a newer version of this series
drivers/tty/pty.c          | 19 +++++++++++++++----
drivers/tty/tty_buffer.c   |  8 ++++----
drivers/tty/tty_io.c       | 19 +++++++++++++++++++
drivers/tty/tty_port.c     | 19 +++++++++++++++++++
include/linux/tty_buffer.h |  1 +
include/linux/tty_driver.h |  7 +++++++
include/linux/tty_port.h   |  9 +++++++++
7 files changed, 74 insertions(+), 8 deletions(-)
[PATCH v5] tty: tty_port: add workqueue to flip tty buffer
Posted by Xin Zhao 2 weeks ago
On the embedded platform, certain critical data, such as IMU data, is
transmitted through UART. The tty_flip_buffer_push interface in the TTY
layer uses system_unbound_wq to handle the flipping of the TTY buffer.
Although the unbound workqueue can create new threads on demand and wake
up the kworker thread on an idle CPU, it may be preeempted by real-time
tasks or other high-prio tasks.
In flush_to_ldisc, when executing n_tty_receive_buf_common, it wakes up
other tasks. __wake_up_common_lock calls spin_lock_irqsave, which does
not disable preemption but disable migration in RT-Linux. This prevents
the kworker thread from being migrated to other cores by CPU's balancing
logic, resulting in long delays.
In our system, the processing interval for each frame of IMU data
transmitted via UART can experience significant jitter due to this issue.
Instead of the expected 10 to 15 ms frame processing interval, we see
spikes up to 30 to 35 ms. Moreover, in just one or two hours, there can
be 2 to 3 occurrences of such high jitter, which is quite frequent. This
jitter exceeds the software's tolerable limit of 20 ms.
Introduce flip_wq in tty_port which can be set by tty_port_link_wq or as
default linked to workqueue allocated when tty_register_driver using flag
WQ_SYSFS, so that cpumask and nice can be set dynamically.
Introduce TTY_DRIVER_CUSTOM_WORKQUEUE flag meaning not to create the
default single tty_driver workqueue.
We set the cpumask to the same cpu where the IMU data is handled and has
less long-time high-prio jobs, and then set nice to -20, the frame
processing interval remains between 10 and 15ms, no jitter occurs.

---
Change in v5:
- Do not allocate workqueue twice when CONFIG_UNIX98_PTYS and
  CONFIG_LEGACY_PTYS are all enabled.

Change in v4:
- Simplify the logic for creating and releasing the workqueue,
  as suggested by Tejun Heo.
- Allocate single workqueue of one tty_driver as default, link it to
  port when tty_port register device or tty_driver.
- Introduce tty_port_link_wq to link specific workqueue to port.
- Add driver flag TTY_DRIVER_CUSTOM_WORKQUEUE meaning not to create the
  default single tty_driver workqueue.
- Link to v4: https://lore.kernel.org/all/202512041303.7192024b-lkp@intel.com/T/#t

Change in v3:
- Add tty flip workqueue for all tty ports, as suggested by Greg KH.
  Every tty port use an individual flip workqueue, while all pty ports
  share the same workqueue created in pty_flip_wq_init.
- Modify the commit log to describe the reason for latency spikes in
  RT-Linux.
- Link to v3: https://lore.kernel.org/all/20251027060929.394053-1-jackzxcui1989@163.com/

Change in v2:
- Do not add new module parameters
  as suggested by Greg KH
- Set WQ_SYSFS to allow properties changes from userspace
  as suggested by Tejun Heo
- Link to v2: https://lore.kernel.org/all/20251024155534.2302590-1-jackzxcui1989@163.com

Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
---
 drivers/tty/pty.c          | 19 +++++++++++++++----
 drivers/tty/tty_buffer.c   |  8 ++++----
 drivers/tty/tty_io.c       | 19 +++++++++++++++++++
 drivers/tty/tty_port.c     | 19 +++++++++++++++++++
 include/linux/tty_buffer.h |  1 +
 include/linux/tty_driver.h |  7 +++++++
 include/linux/tty_port.h   |  9 +++++++++
 7 files changed, 74 insertions(+), 8 deletions(-)

diff --git a/drivers/tty/pty.c b/drivers/tty/pty.c
index 8bb1a01fe..e51579393 100644
--- a/drivers/tty/pty.c
+++ b/drivers/tty/pty.c
@@ -44,6 +44,8 @@ static struct tty_driver *pts_driver;
 static DEFINE_MUTEX(devpts_mutex);
 #endif
 
+static struct workqueue_struct *pty_flip_wq;
+
 static void pty_close(struct tty_struct *tty, struct file *filp)
 {
 	if (tty->driver->subtype == PTY_TYPE_MASTER)
@@ -407,6 +409,8 @@ static int pty_common_install(struct tty_driver *driver, struct tty_struct *tty,
 	o_tty->link = tty;
 	tty_port_init(ports[0]);
 	tty_port_init(ports[1]);
+	tty_port_link_wq(ports[0], pty_flip_wq);
+	tty_port_link_wq(ports[1], pty_flip_wq);
 	tty_buffer_set_limit(ports[0], 8192);
 	tty_buffer_set_limit(ports[1], 8192);
 	o_tty->port = ports[0];
@@ -536,14 +540,16 @@ static void __init legacy_pty_init(void)
 	pty_driver = tty_alloc_driver(legacy_count,
 			TTY_DRIVER_RESET_TERMIOS |
 			TTY_DRIVER_REAL_RAW |
-			TTY_DRIVER_DYNAMIC_ALLOC);
+			TTY_DRIVER_DYNAMIC_ALLOC |
+			TTY_DRIVER_CUSTOM_WORKQUEUE);
 	if (IS_ERR(pty_driver))
 		panic("Couldn't allocate pty driver");
 
 	pty_slave_driver = tty_alloc_driver(legacy_count,
 			TTY_DRIVER_RESET_TERMIOS |
 			TTY_DRIVER_REAL_RAW |
-			TTY_DRIVER_DYNAMIC_ALLOC);
+			TTY_DRIVER_DYNAMIC_ALLOC |
+			TTY_DRIVER_CUSTOM_WORKQUEUE);
 	if (IS_ERR(pty_slave_driver))
 		panic("Couldn't allocate pty slave driver");
 
@@ -877,7 +883,8 @@ static void __init unix98_pty_init(void)
 			TTY_DRIVER_REAL_RAW |
 			TTY_DRIVER_DYNAMIC_DEV |
 			TTY_DRIVER_DEVPTS_MEM |
-			TTY_DRIVER_DYNAMIC_ALLOC);
+			TTY_DRIVER_DYNAMIC_ALLOC |
+			TTY_DRIVER_CUSTOM_WORKQUEUE);
 	if (IS_ERR(ptm_driver))
 		panic("Couldn't allocate Unix98 ptm driver");
 	pts_driver = tty_alloc_driver(NR_UNIX98_PTY_MAX,
@@ -885,7 +892,8 @@ static void __init unix98_pty_init(void)
 			TTY_DRIVER_REAL_RAW |
 			TTY_DRIVER_DYNAMIC_DEV |
 			TTY_DRIVER_DEVPTS_MEM |
-			TTY_DRIVER_DYNAMIC_ALLOC);
+			TTY_DRIVER_DYNAMIC_ALLOC |
+			TTY_DRIVER_CUSTOM_WORKQUEUE);
 	if (IS_ERR(pts_driver))
 		panic("Couldn't allocate Unix98 pts driver");
 
@@ -940,6 +948,9 @@ static inline void unix98_pty_init(void) { }
 
 static int __init pty_init(void)
 {
+	pty_flip_wq = alloc_workqueue("pty-flip-wq", WQ_UNBOUND | WQ_SYSFS, 0);
+	if (!pty_flip_wq)
+		panic("Couldn't allocate pty flip workqueue");
 	legacy_pty_init();
 	unix98_pty_init();
 	return 0;
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c
index 67271fc0b..86e1e7178 100644
--- a/drivers/tty/tty_buffer.c
+++ b/drivers/tty/tty_buffer.c
@@ -76,7 +76,7 @@ void tty_buffer_unlock_exclusive(struct tty_port *port)
 	mutex_unlock(&buf->lock);
 
 	if (restart)
-		queue_work(system_unbound_wq, &buf->work);
+		queue_work(buf->flip_wq, &buf->work);
 }
 EXPORT_SYMBOL_GPL(tty_buffer_unlock_exclusive);
 
@@ -530,7 +530,7 @@ void tty_flip_buffer_push(struct tty_port *port)
 	struct tty_bufhead *buf = &port->buf;
 
 	tty_flip_buffer_commit(buf->tail);
-	queue_work(system_unbound_wq, &buf->work);
+	queue_work(buf->flip_wq, &buf->work);
 }
 EXPORT_SYMBOL(tty_flip_buffer_push);
 
@@ -560,7 +560,7 @@ int tty_insert_flip_string_and_push_buffer(struct tty_port *port,
 		tty_flip_buffer_commit(buf->tail);
 	spin_unlock_irqrestore(&port->lock, flags);
 
-	queue_work(system_unbound_wq, &buf->work);
+	queue_work(buf->flip_wq, &buf->work);
 
 	return size;
 }
@@ -613,7 +613,7 @@ void tty_buffer_set_lock_subclass(struct tty_port *port)
 
 bool tty_buffer_restart_work(struct tty_port *port)
 {
-	return queue_work(system_unbound_wq, &port->buf.work);
+	return queue_work(port->buf.flip_wq, &port->buf.work);
 }
 
 bool tty_buffer_cancel_work(struct tty_port *port)
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index e2d92cf70..a3e019873 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -3452,6 +3452,20 @@ int tty_register_driver(struct tty_driver *driver)
 			goto err_unreg_char;
 	}
 
+	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE)) {
+		driver->flip_wq = alloc_workqueue("%s-flip-wq",
+						WQ_UNBOUND | WQ_SYSFS,
+						0, driver->name);
+		if (!driver->flip_wq) {
+			error = -ENOMEM;
+			goto err_unreg_char;
+		}
+		for (i = 0; i < driver->num; i++) {
+			if (driver->ports[i] && !driver->ports[i]->buf.flip_wq)
+				tty_port_link_driver_wq(driver->ports[i], driver);
+		}
+	}
+
 	scoped_guard(mutex, &tty_mutex)
 		list_add(&driver->tty_drivers, &tty_drivers);
 
@@ -3475,6 +3489,9 @@ int tty_register_driver(struct tty_driver *driver)
 	scoped_guard(mutex, &tty_mutex)
 		list_del(&driver->tty_drivers);
 
+	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE))
+		destroy_workqueue(driver->flip_wq);
+
 err_unreg_char:
 	unregister_chrdev_region(dev, driver->num);
 err:
@@ -3494,6 +3511,8 @@ void tty_unregister_driver(struct tty_driver *driver)
 				driver->num);
 	scoped_guard(mutex, &tty_mutex)
 		list_del(&driver->tty_drivers);
+	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE))
+		destroy_workqueue(driver->flip_wq);
 }
 EXPORT_SYMBOL(tty_unregister_driver);
 
diff --git a/drivers/tty/tty_port.c b/drivers/tty/tty_port.c
index 5b4d5fb99..b8ca89e45 100644
--- a/drivers/tty/tty_port.c
+++ b/drivers/tty/tty_port.c
@@ -103,6 +103,22 @@ void tty_port_init(struct tty_port *port)
 }
 EXPORT_SYMBOL(tty_port_init);
 
+/**
+ * tty_port_link_wq - link tty_port and flip workqueue
+ * @port: tty_port of the device
+ * @flip_wq: workqueue to queue flip buffer work on
+ *
+ * Assign a specific workqueue to a certain port, instead of using the
+ * workqueue allocated in tty_register_driver when TTY_DRIVER_CUSTOM_WORKQUEUE
+ *
+ * Note tty port api will not destroy the workqueue in tty_port_destroy.
+ */
+void tty_port_link_wq(struct tty_port *port, struct workqueue_struct *flip_wq)
+{
+	port->buf.flip_wq = flip_wq;
+}
+EXPORT_SYMBOL(tty_port_link_wq);
+
 /**
  * tty_port_link_device - link tty and tty_port
  * @port: tty_port of the device
@@ -161,6 +177,7 @@ struct device *tty_port_register_device_attr(struct tty_port *port,
 		const struct attribute_group **attr_grp)
 {
 	tty_port_link_device(port, driver, index);
+	tty_port_link_driver_wq(port, driver);
 	return tty_register_device_attr(driver, index, device, drvdata,
 			attr_grp);
 }
@@ -187,6 +204,7 @@ struct device *tty_port_register_device_attr_serdev(struct tty_port *port,
 	struct device *dev;
 
 	tty_port_link_device(port, driver, index);
+	tty_port_link_driver_wq(port, driver);
 
 	dev = serdev_tty_port_register(port, host, parent, driver, index);
 	if (PTR_ERR(dev) != -ENODEV) {
@@ -718,6 +736,7 @@ int tty_port_install(struct tty_port *port, struct tty_driver *driver,
 		struct tty_struct *tty)
 {
 	tty->port = port;
+	tty_port_link_driver_wq(port, driver);
 	return tty_standard_install(driver, tty);
 }
 EXPORT_SYMBOL_GPL(tty_port_install);
diff --git a/include/linux/tty_buffer.h b/include/linux/tty_buffer.h
index 31125e3be..48adcb0e8 100644
--- a/include/linux/tty_buffer.h
+++ b/include/linux/tty_buffer.h
@@ -34,6 +34,7 @@ static inline u8 *flag_buf_ptr(struct tty_buffer *b, unsigned int ofs)
 
 struct tty_bufhead {
 	struct tty_buffer *head;	/* Queue head */
+	struct workqueue_struct *flip_wq;
 	struct work_struct work;
 	struct mutex	   lock;
 	atomic_t	   priority;
diff --git a/include/linux/tty_driver.h b/include/linux/tty_driver.h
index 188ee9b76..cd93345bd 100644
--- a/include/linux/tty_driver.h
+++ b/include/linux/tty_driver.h
@@ -69,6 +69,10 @@ struct serial_struct;
  *	Do not create numbered ``/dev`` nodes. For example, create
  *	``/dev/ttyprintk`` and not ``/dev/ttyprintk0``. Applicable only when a
  *	driver for a single tty device is being allocated.
+ *
+ * @TTY_DRIVER_CUSTOM_WORKQUEUE:
+ *	Do not create workqueue when tty_register_driver. Set flip buffer
+ *	workqueue by tty_port_link_wq every port.
  */
 enum tty_driver_flag {
 	TTY_DRIVER_INSTALLED		= BIT(0),
@@ -79,6 +83,7 @@ enum tty_driver_flag {
 	TTY_DRIVER_HARDWARE_BREAK	= BIT(5),
 	TTY_DRIVER_DYNAMIC_ALLOC	= BIT(6),
 	TTY_DRIVER_UNNUMBERED_NODE	= BIT(7),
+	TTY_DRIVER_CUSTOM_WORKQUEUE	= BIT(8),
 };
 
 enum tty_driver_type {
@@ -506,6 +511,7 @@ struct tty_operations {
  * @flags: tty driver flags (%TTY_DRIVER_)
  * @proc_entry: proc fs entry, used internally
  * @other: driver of the linked tty; only used for the PTY driver
+ * @flip_wq: workqueue to queue flip buffer work on
  * @ttys: array of active &struct tty_struct, set by tty_standard_install()
  * @ports: array of &struct tty_port; can be set during initialization by
  *	   tty_port_link_device() and similar
@@ -539,6 +545,7 @@ struct tty_driver {
 	unsigned long	flags;
 	struct proc_dir_entry *proc_entry;
 	struct tty_driver *other;
+	struct workqueue_struct *flip_wq;
 
 	/*
 	 * Pointer to the tty data structures
diff --git a/include/linux/tty_port.h b/include/linux/tty_port.h
index 332ddb936..86e01bd51 100644
--- a/include/linux/tty_port.h
+++ b/include/linux/tty_port.h
@@ -138,6 +138,7 @@ struct tty_port {
 					   kernel */
 
 void tty_port_init(struct tty_port *port);
+void tty_port_link_wq(struct tty_port *port, struct workqueue_struct *flip_wq);
 void tty_port_link_device(struct tty_port *port, struct tty_driver *driver,
 		unsigned index);
 struct device *tty_port_register_device(struct tty_port *port,
@@ -165,6 +166,14 @@ static inline struct tty_port *tty_port_get(struct tty_port *port)
 	return NULL;
 }
 
+/* No effect when TTY_DRIVER_CUSTOM_WORKQUEUE, as driver->flip_wq is NULL */
+static inline void tty_port_link_driver_wq(struct tty_port *port,
+					   struct tty_driver *driver)
+{
+	if (!port->buf.flip_wq)
+		port->buf.flip_wq = driver->flip_wq;
+}
+
 /* If the cts flow control is enabled, return true. */
 static inline bool tty_port_cts_enabled(const struct tty_port *port)
 {
-- 
2.34.1
Re: [PATCH v5] tty: tty_port: add workqueue to flip tty buffer
Posted by Jiri Slaby 1 week, 4 days ago
Hi,

On 05. 12. 25, 4:08, Xin Zhao wrote:
> On the embedded platform, certain critical data, such as IMU data, is
> transmitted through UART. The tty_flip_buffer_push interface in the TTY

In commit logs, we tend to add () after function names.

> layer uses system_unbound_wq to handle the flipping of the TTY buffer.
> Although the unbound workqueue can create new threads on demand and wake
> up the kworker thread on an idle CPU, it may be preeempted by real-time

Too many 'e's in preeempted :).

> tasks or other high-prio tasks.
> In flush_to_ldisc, when executing n_tty_receive_buf_common,
> it wakes up

What is this "it" supposed to refer to? In English, this "it" here 
actually refers to n_tty_receive_buf_common().

> other tasks. __wake_up_common_lock calls spin_lock_irqsave, which does
> not disable preemption but disable migration in RT-Linux. This prevents

"disables"

> the kworker thread from being migrated to other cores by CPU's balancing
> logic, resulting in long delays.

Here should be another \n to separate paragraphs.

> In our system, the processing interval for each frame of IMU data
> transmitted via UART can experience significant jitter due to this issue.
> Instead of the expected 10 to 15 ms frame processing interval, we see
> spikes up to 30 to 35 ms. Moreover, in just one or two hours, there can
> be 2 to 3 occurrences of such high jitter, which is quite frequent. This
> jitter exceeds the software's tolerable limit of 20 ms.

One more \n here too.

> Introduce flip_wq in tty_port which can be set by tty_port_link_wq or as
> default linked to workqueue allocated when tty_register_driver using flag
> WQ_SYSFS, so that cpumask and nice can be set dynamically.

This is a heavy sentence. Split it.

> Introduce TTY_DRIVER_CUSTOM_WORKQUEUE flag meaning not to create the
> default single tty_driver workqueue.
> We set the cpumask to the same cpu where the IMU data is handled and has
> less long-time high-prio jobs, and then set nice to -20, the frame

Period after 20, not comma.

> processing interval remains between 10 and 15ms, no jitter occurs.

You can perhaps use some LLM to rephrase the text?

> ---
> Change in v5:
> - Do not allocate workqueue twice when CONFIG_UNIX98_PTYS and
>    CONFIG_LEGACY_PTYS are all enabled.
> 
> Change in v4:
> - Simplify the logic for creating and releasing the workqueue,
>    as suggested by Tejun Heo.
> - Allocate single workqueue of one tty_driver as default, link it to
>    port when tty_port register device or tty_driver.
> - Introduce tty_port_link_wq to link specific workqueue to port.
> - Add driver flag TTY_DRIVER_CUSTOM_WORKQUEUE meaning not to create the
>    default single tty_driver workqueue.
> - Link to v4: https://lore.kernel.org/all/202512041303.7192024b-lkp@intel.com/T/#t
> 
> Change in v3:
> - Add tty flip workqueue for all tty ports, as suggested by Greg KH.
>    Every tty port use an individual flip workqueue, while all pty ports
>    share the same workqueue created in pty_flip_wq_init.
> - Modify the commit log to describe the reason for latency spikes in
>    RT-Linux.
> - Link to v3: https://lore.kernel.org/all/20251027060929.394053-1-jackzxcui1989@163.com/
> 
> Change in v2:
> - Do not add new module parameters
>    as suggested by Greg KH
> - Set WQ_SYSFS to allow properties changes from userspace
>    as suggested by Tejun Heo
> - Link to v2: https://lore.kernel.org/all/20251024155534.2302590-1-jackzxcui1989@163.com
> 
> Signed-off-by: Xin Zhao <jackzxcui1989@163.com>

This S-O-B is too late -- it would be dropped. You have to add it before 
the first "---".

> --- a/drivers/tty/pty.c
> +++ b/drivers/tty/pty.c
> @@ -44,6 +44,8 @@ static struct tty_driver *pts_driver;
>   static DEFINE_MUTEX(devpts_mutex);
>   #endif
>   
> +static struct workqueue_struct *pty_flip_wq;

It's not clear to me, why ptys need a separate wq. IOW: you should 
describe this in the commit log.

> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -3452,6 +3452,20 @@ int tty_register_driver(struct tty_driver *driver)
>   			goto err_unreg_char;
>   	}
>   
> +	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE)) {
> +		driver->flip_wq = alloc_workqueue("%s-flip-wq",
> +						WQ_UNBOUND | WQ_SYSFS,
> +						0, driver->name);

Do you have to wrap the line here?

> +		if (!driver->flip_wq) {
> +			error = -ENOMEM;
> +			goto err_unreg_char;

Who is going to free cdevs in this fail path?

> +		}
> +		for (i = 0; i < driver->num; i++) {
> +			if (driver->ports[i] && !driver->ports[i]->buf.flip_wq)

You test it here and again in tty_port_link_driver_wq().

> +				tty_port_link_driver_wq(driver->ports[i], driver);

There are not many drivers having tty ports set at this point. Why are 
you doing this here, actually? Given you do this later again in 
register_device().

> +		}
> +	}
> +
>   	scoped_guard(mutex, &tty_mutex)
>   		list_add(&driver->tty_drivers, &tty_drivers);
>   
> @@ -3475,6 +3489,9 @@ int tty_register_driver(struct tty_driver *driver)
>   	scoped_guard(mutex, &tty_mutex)
>   		list_del(&driver->tty_drivers);
>   
> +	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE))
> +		destroy_workqueue(driver->flip_wq);
> +
>   err_unreg_char:
>   	unregister_chrdev_region(dev, driver->num);
>   err:
...

> --- a/drivers/tty/tty_port.c
> +++ b/drivers/tty/tty_port.c
> @@ -103,6 +103,22 @@ void tty_port_init(struct tty_port *port)
>   }
>   EXPORT_SYMBOL(tty_port_init);
>   
> +/**
> + * tty_port_link_wq - link tty_port and flip workqueue
> + * @port: tty_port of the device
> + * @flip_wq: workqueue to queue flip buffer work on
> + *
> + * Assign a specific workqueue to a certain port, instead of using the
> + * workqueue allocated in tty_register_driver when TTY_DRIVER_CUSTOM_WORKQUEUE

The same as for commit logs: functions end with (). Furthermore, 
constants start with % IIRC. A period is missing too. It should sound like:
"when TTY_DRIVER_CUSTOM_WORKQUEUE is used."

> + *
> + * Note tty port api will not destroy the workqueue in tty_port_destroy.

the TTY port API

> + */
> +void tty_port_link_wq(struct tty_port *port, struct workqueue_struct *flip_wq)
> +{
> +	port->buf.flip_wq = flip_wq;
> +}
> +EXPORT_SYMBOL(tty_port_link_wq);

_GPL likely?

> +
>   /**
>    * tty_port_link_device - link tty and tty_port
>    * @port: tty_port of the device
> @@ -161,6 +177,7 @@ struct device *tty_port_register_device_attr(struct tty_port *port,
>   		const struct attribute_group **attr_grp)
>   {
>   	tty_port_link_device(port, driver, index);
> +	tty_port_link_driver_wq(port, driver);
>   	return tty_register_device_attr(driver, index, device, drvdata,
>   			attr_grp);
>   }
> @@ -187,6 +204,7 @@ struct device *tty_port_register_device_attr_serdev(struct tty_port *port,
>   	struct device *dev;
>   
>   	tty_port_link_device(port, driver, index);
> +	tty_port_link_driver_wq(port, driver);
>   
>   	dev = serdev_tty_port_register(port, host, parent, driver, index);
>   	if (PTR_ERR(dev) != -ENODEV) {
> @@ -718,6 +736,7 @@ int tty_port_install(struct tty_port *port, struct tty_driver *driver,
>   		struct tty_struct *tty)
>   {
>   	tty->port = port;
> +	tty_port_link_driver_wq(port, driver);
>   	return tty_standard_install(driver, tty);
>   }
>   EXPORT_SYMBOL_GPL(tty_port_install);
> diff --git a/include/linux/tty_buffer.h b/include/linux/tty_buffer.h
> index 31125e3be..48adcb0e8 100644
> --- a/include/linux/tty_buffer.h
> +++ b/include/linux/tty_buffer.h
> @@ -34,6 +34,7 @@ static inline u8 *flag_buf_ptr(struct tty_buffer *b, unsigned int ofs)
>   
>   struct tty_bufhead {
>   	struct tty_buffer *head;	/* Queue head */
> +	struct workqueue_struct *flip_wq;
>   	struct work_struct work;
>   	struct mutex	   lock;
>   	atomic_t	   priority;
> diff --git a/include/linux/tty_driver.h b/include/linux/tty_driver.h
> index 188ee9b76..cd93345bd 100644
> --- a/include/linux/tty_driver.h
> +++ b/include/linux/tty_driver.h
> @@ -69,6 +69,10 @@ struct serial_struct;
>    *	Do not create numbered ``/dev`` nodes. For example, create
>    *	``/dev/ttyprintk`` and not ``/dev/ttyprintk0``. Applicable only when a
>    *	driver for a single tty device is being allocated.
> + *
> + * @TTY_DRIVER_CUSTOM_WORKQUEUE:
> + *	Do not create workqueue when tty_register_driver. Set flip buffer
> + *	workqueue by tty_port_link_wq every port.

Sorry, parser error.

>    */
>   enum tty_driver_flag {
>   	TTY_DRIVER_INSTALLED		= BIT(0),
> @@ -79,6 +83,7 @@ enum tty_driver_flag {
>   	TTY_DRIVER_HARDWARE_BREAK	= BIT(5),
>   	TTY_DRIVER_DYNAMIC_ALLOC	= BIT(6),
>   	TTY_DRIVER_UNNUMBERED_NODE	= BIT(7),
> +	TTY_DRIVER_CUSTOM_WORKQUEUE	= BIT(8),
>   };
>   
>   enum tty_driver_type {
> @@ -506,6 +511,7 @@ struct tty_operations {
>    * @flags: tty driver flags (%TTY_DRIVER_)
>    * @proc_entry: proc fs entry, used internally
>    * @other: driver of the linked tty; only used for the PTY driver
> + * @flip_wq: workqueue to queue flip buffer work on
>    * @ttys: array of active &struct tty_struct, set by tty_standard_install()
>    * @ports: array of &struct tty_port; can be set during initialization by
>    *	   tty_port_link_device() and similar
> @@ -539,6 +545,7 @@ struct tty_driver {
>   	unsigned long	flags;
>   	struct proc_dir_entry *proc_entry;
>   	struct tty_driver *other;
> +	struct workqueue_struct *flip_wq;
>   
>   	/*
>   	 * Pointer to the tty data structures
> diff --git a/include/linux/tty_port.h b/include/linux/tty_port.h
> index 332ddb936..86e01bd51 100644
> --- a/include/linux/tty_port.h
> +++ b/include/linux/tty_port.h
> @@ -138,6 +138,7 @@ struct tty_port {
>   					   kernel */
>   
>   void tty_port_init(struct tty_port *port);
> +void tty_port_link_wq(struct tty_port *port, struct workqueue_struct *flip_wq);
>   void tty_port_link_device(struct tty_port *port, struct tty_driver *driver,
>   		unsigned index);
>   struct device *tty_port_register_device(struct tty_port *port,
> @@ -165,6 +166,14 @@ static inline struct tty_port *tty_port_get(struct tty_port *port)
>   	return NULL;
>   }
>   
> +/* No effect when TTY_DRIVER_CUSTOM_WORKQUEUE, as driver->flip_wq is NULL */
> +static inline void tty_port_link_driver_wq(struct tty_port *port,
> +					   struct tty_driver *driver)

I am not sure why you introduce two interfaces: 
tty_port_link_driver_wq() and tty_port_link_wq(). Can't you add the if 
to the latter and drop the former? To me at least, the latter is confusing.

> +{
> +	if (!port->buf.flip_wq)
> +		port->buf.flip_wq = driver->flip_wq;
> +}
> +
>   /* If the cts flow control is enabled, return true. */
>   static inline bool tty_port_cts_enabled(const struct tty_port *port)
>   {

thanks,
-- 
js
suse labs
Re: [PATCH v5] tty: tty_port: add workqueue to flip tty buffer
Posted by Xin Zhao 1 week, 2 days ago
Dear Jiri,

On Mon, 8 Dec 2025 08:26:51 +0100 Jiri Slaby <jirislaby@kernel.org> wrote:

> > +	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE)) {
> > +		driver->flip_wq = alloc_workqueue("%s-flip-wq",
> > +						WQ_UNBOUND | WQ_SYSFS,
> > +						0, driver->name);
> 
> Do you have to wrap the line here?

It looks like I need to add line breaks:

./scripts/checkpatch.pl 0001-tty-tty_port-add-workqueue-to-flip-tty-buffer.patch

WARNING: line length of 104 exceeds 100 columns
#227: FILE: drivers/tty/tty_io.c:3450:
+               driver->flip_wq = alloc_workqueue("%s-flip-wq", WQ_UNBOUND | WQ_SYSFS, 0, driver->name);

--
Xin Zhao
Re: [PATCH v5] tty: tty_port: add workqueue to flip tty buffer
Posted by Xin Zhao 1 week, 3 days ago
Dear Jiri,

On Mon, 8 Dec 2025 08:26:51 +0100 Jiri Slaby <jirislaby@kernel.org> wrote:

> > On the embedded platform, certain critical data, such as IMU data, is
> > transmitted through UART. The tty_flip_buffer_push interface in the TTY
> 
> In commit logs, we tend to add () after function names.

OK, I will revise it.

> > up the kworker thread on an idle CPU, it may be preeempted by real-time
> 
> Too many 'e's in preeempted :).

It’s my fault.

> > In flush_to_ldisc, when executing n_tty_receive_buf_common,
> > it wakes up
> 
> What is this "it" supposed to refer to? In English, this "it" here 
> actually refers to n_tty_receive_buf_common().

Here I mean n_tty_receive_buf_common(). I will add a dump stack info in v5
like the following:
flush_to_ldisc() needs to wake up the relevant data handle thread. When 
executing __wake_up_common_lock, it calls spin_lock_irqsave(), which does
not disable preemption but disables migration in RT-Linux. This prevents
the kworker thread from being migrated to other cores by CPU's balancing
logic, resulting in long delays. The call trace is as follows:
    __wake_up_common_lock
    __wake_up
    ep_poll_callback
    __wake_up_common
    __wake_up_common_lock
    __wake_up
    n_tty_receive_buf_common
    n_tty_receive_buf2
    tty_ldisc_receive_buf
    tty_port_default_receive_buf
    flush_to_ldisc

> > not disable preemption but disable migration in RT-Linux. This prevents
> 
> "disables"

It’s my fault.

> > the kworker thread from being migrated to other cores by CPU's balancing
> > logic, resulting in long delays.
> 
> Here should be another \n to separate paragraphs.
> 
> > In our system, the processing interval for each frame of IMU data

OK.

> > jitter exceeds the software's tolerable limit of 20 ms.
> 
> One more \n here too.
> 
> > Introduce flip_wq in tty_port which can be set by tty_port_link_wq or as

OK.

> > Introduce flip_wq in tty_port which can be set by tty_port_link_wq or as
> > default linked to workqueue allocated when tty_register_driver using flag
> > WQ_SYSFS, so that cpumask and nice can be set dynamically.
> 
> This is a heavy sentence. Split it.

OK. I may change it as the following:
Introduce flip_wq in tty_port which can be set by tty_port_link_wq() or as
default linked to default workqueue allocated when tty_register_driver().
The default workqueue is allocated with flag WQ_SYSFS, so that cpumask and
nice can be set dynamically. The execution timing of tty_port_link_wq() is
not clearly restricted. The newly added function tty_port_link_driver_wq()
checks whether the flip_wq of the tty_port has already been assigned when
linking the default tty_driver's workqueue to the port. After the user has
set a custom workqueue for a certain tty_port using tty_port_link_wq(), the
system will only use this custom workqueue, even if tty_driver does not
have %TTY_DRIVER_CUSTOM_WORKQUEUE flag.

> > Introduce TTY_DRIVER_CUSTOM_WORKQUEUE flag meaning not to create the
> > default single tty_driver workqueue.
> > We set the cpumask to the same cpu where the IMU data is handled and has
> > less long-time high-prio jobs, and then set nice to -20, the frame
> 
> Period after 20, not comma.
> 
> > processing interval remains between 10 and 15ms, no jitter occurs.
> 
> You can perhaps use some LLM to rephrase the text?
>
> > ---

OK. I may change it as the following:
Introduce TTY_DRIVER_CUSTOM_WORKQUEUE flag meaning not to create the
default single tty_driver workqueue. Two reasons why need to introduce the
TTY_DRIVER_CUSTOM_WORKQUEUE flag:
1. If the WQ_SYSFS parameter is enabled, workqueue_sysfs_register() will
fail when trying to create a workqueue with the same name. The pty is an
example of this; if both CONFIG_LEGACY_PTYS and CONFIG_UNIX98_PTYS are
enabled, the call to tty_register_driver() in unix98_pty_init() will fail.
2. Different tty ports may be used for different tasks, which may require
separate core binding control via workqueues. In this case, the workqueue
created by default in the tty driver is unnecessary. Enabling this flag
prevents the creation of this redundant workqueue.

After applying this patch, we can set the related UART tty flip buffer
workqueue by sysfs. We set the cpumask to CPU cores associated with the
IMU tasks, and set the nice to -20. Testing has shown significant
improvement in the previously described issue, with almost no stuttering
occurring anymore.

Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
---

> > Signed-off-by: Xin Zhao <jackzxcui1989@163.com>
> 
> This S-O-B is too late -- it would be dropped. You have to add it before 
> the first "---".

OK. I will revise it. Thanks.

> It's not clear to me, why ptys need a separate wq. IOW: you should 
> describe this in the commit log.
> 
> > --- a/drivers/tty/tty_io.c
> > +++ b/drivers/tty/tty_io.c
> > @@ -3452,6 +3452,20 @@ int tty_register_driver(struct tty_driver *driver)
> >   			goto err_unreg_char;
> >   	}
> > 
> > +	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE)) {
> > +		driver->flip_wq = alloc_workqueue("%s-flip-wq",
> > +						WQ_UNBOUND | WQ_SYSFS,
> > +						0, driver->name);

OK. Like the following:
Introduce TTY_DRIVER_CUSTOM_WORKQUEUE flag meaning not to create the
default single tty_driver workqueue. Two reasons why need to introduce the
TTY_DRIVER_CUSTOM_WORKQUEUE flag:
1. If the WQ_SYSFS parameter is enabled, workqueue_sysfs_register() will
fail when trying to create a workqueue with the same name. The pty is an
example of this; if both CONFIG_LEGACY_PTYS and CONFIG_UNIX98_PTYS are
enabled, the call to tty_register_driver() in unix98_pty_init() will fail.
2. Different tty ports may be used for different tasks, which may require
separate core binding control via workqueues. In this case, the workqueue
created by default in the tty driver is unnecessary. Enabling this flag
prevents the creation of this redundant workqueue.

> > +	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE)) {
> > +		driver->flip_wq = alloc_workqueue("%s-flip-wq",
> > +						WQ_UNBOUND | WQ_SYSFS,
> > +						0, driver->name);
> 
> Do you have to wrap the line here?

OK, it hasn't exceeded 100 characters, so I won't break it into a new line.

> > +		if (!driver->flip_wq) {
> > +			error = -ENOMEM;
> > +			goto err_unreg_char;
> 
> Who is going to free cdevs in this fail path?

I overlooked the fact that tty_unregister_device performs the cdev_del
release action. I will move the alloc_workqueue before tty_cdev_add() like
the following:

	if (error < 0)
		goto err;

	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE)) {
		driver->flip_wq = alloc_workqueue("%s-flip-wq", WQ_UNBOUND | WQ_SYSFS, 0, driver->name);
		if (!driver->flip_wq) {
			error = -ENOMEM;
			goto err;
		}
        ...
	}

	if (driver->flags & TTY_DRIVER_DYNAMIC_ALLOC) {
		error = tty_cdev_add(driver, dev, 0, driver->num);
		if (error)
			goto err_destroy_wq;
	}
    ...

err_unreg_devs:
	for (i--; i >= 0; i--)
		tty_unregister_device(driver, i);

	scoped_guard(mutex, &tty_mutex)
		list_del(&driver->tty_drivers);

err_destroy_wq:
	if (!(driver->flags & TTY_DRIVER_CUSTOM_WORKQUEUE))
		destroy_workqueue(driver->flip_wq);

> > +		}
> > +		for (i = 0; i < driver->num; i++) {
> > +			if (driver->ports[i] && !driver->ports[i]->buf.flip_wq)
> 
> You test it here and again in tty_port_link_driver_wq().

It's my fault. I will change it as follows:
		for (i = 0; i < driver->num; i++) {
			if (driver->ports[i])
				tty_port_link_driver_wq(driver->ports[i], driver);
		}

> > +				tty_port_link_driver_wq(driver->ports[i], driver);
> 
> There are not many drivers having tty ports set at this point. Why are 
> you doing this here, actually? Given you do this later again in 
> register_device().

I do not perform the tty_port_link_wq() in tty_register_device_attr(). I
execute tty_port_link_wq() in tty_register_driver().
In my previous version, I created the workqueue in tty_port_init, but
since tty_port_init is a void-return function, it cannot handle errors
related to workqueue allocation failure. So, I looked for other suitable
functions to handle workqueue allocation and found that both
tty_register_driver() and tty_register_device_attr() were potential
options. However, I realized that tty_port_register_device_attr_serdev()
may not call tty_register_device_attr, while every tty-related driver
would call tty_register_driver. Therefore, I decided to perform the
workqueue allocation in tty_register_driver.
I did not directly perform the tty_port_link_wq() action in
tty_port_link_device() because there are several usage scenarios that
follow this execution order:
    tty_alloc_driver()
    driver->name = xxx
    tty_port_link_device()
    tty_register_driver
In such cases, if I were to directly perform the tty_port_link_wq() action
in tty_port_link_device(), I would need to create the workqueue in
tty_alloc_driver, but at that point, the driver’s name is not yet known,
which prevents the creation of a workqueue with WQ_SYSFS. Therefore, for
this scenario, I can only perform the workqueue creation in
tty_register_driver before executing the link action.
Additionally, I have confirmed that every call to tty_port_link_device() in
the current kernel code is accompanied by a call to tty_register_driver().
The tty_port_link_device() interface is also used in tty_port.c, and I have
added the tty_port_link_driver_wq() call at every location where the
tty_port_link_device() interface is called.

> > --- a/drivers/tty/tty_port.c
> > +++ b/drivers/tty/tty_port.c
> > @@ -103,6 +103,22 @@ void tty_port_init(struct tty_port *port)
> >   }
> >   EXPORT_SYMBOL(tty_port_init);
> > 
> > +/**
> > + * tty_port_link_wq - link tty_port and flip workqueue
> > + * @port: tty_port of the device
> > + * @flip_wq: workqueue to queue flip buffer work on
> > + *
> > + * Assign a specific workqueue to a certain port, instead of using the
> > + * workqueue allocated in tty_register_driver when TTY_DRIVER_CUSTOM_WORKQUEUE
> 
> The same as for commit logs: functions end with (). Furthermore, 
> constants start with % IIRC. A period is missing too. It should sound like:
> "when TTY_DRIVER_CUSTOM_WORKQUEUE is used."
> 
> > + *
> > + * Note tty port api will not destroy the workqueue in tty_port_destroy.
> 
> the TTY port API
> 
> > + */
> > +void tty_port_link_wq(struct tty_port *port, struct workqueue_struct *flip_wq)
> > +{
> > +	port->buf.flip_wq = flip_wq;
> > +}
> > +EXPORT_SYMBOL(tty_port_link_wq);
> 
> _GPL likely?

OK, I will change it as follows:
/**
 * tty_port_link_wq - link tty_port and flip workqueue
 * @port: tty_port of the device
 * @flip_wq: workqueue to queue flip buffer work on
 *
 * When %TTY_DRIVER_CUSTOM_WORKQUEUE is used, you must link every tty port to
 * workqueue manually by this function, otherwise tty_flip_buffer_push() will
 * see NULL flip_wq pointer when queue_work.
 * When %TTY_DRIVER_CUSTOM_WORKQUEUE is NOT used, you can also use the function
 * to link a certain port to a specific workqueue, instead of using the
 * workqueue allocated in tty_register_driver().
 *
 * Note tty port api will not destroy the workqueue in the TTY port API.
 */
void tty_port_link_wq(struct tty_port *port, struct workqueue_struct *flip_wq)
{
	port->buf.flip_wq = flip_wq;
}
EXPORT_SYMBOL_GPL(tty_port_link_wq);

> > + * @TTY_DRIVER_CUSTOM_WORKQUEUE:
> > + *	Do not create workqueue when tty_register_driver. Set flip buffer
> > + *	workqueue by tty_port_link_wq every port.
> 
> Sorry, parser error.

Are you saying to prefix the macro with @? I see that in the current code,
they are also prefixed with %, as shown in the comments in the tty_driver.h
file below:
 *	If the driver sets %TTY_DRIVER_HARDWARE_BREAK in tty_alloc_driver(),
 *	then the interface will also be called with actual times and the
 *	hardware is expected to do the delay work itself. 0 and -1 are still
 *	used for on/off.
I'll change as following using () to mean function:
 * @TTY_DRIVER_CUSTOM_WORKQUEUE:
 *	Do not create workqueue when tty_register_driver(). In the case, you must
 *	set flip buffer workqueue by tty_port_link_wq() every port.
 */

> > +/* No effect when TTY_DRIVER_CUSTOM_WORKQUEUE, as driver->flip_wq is NULL */
> > +static inline void tty_port_link_driver_wq(struct tty_port *port,
> > +					   struct tty_driver *driver)
> 
> I am not sure why you introduce two interfaces: 
> tty_port_link_driver_wq() and tty_port_link_wq(). Can't you add the if 
> to the latter and drop the former? To me at least, the latter is confusing.

It is indeed confusing, I will add the description of tty_port_link_driver_wq()
in the commit log and before tty_port_link_driver_wq().
In the commit log:
...
nice can be set dynamically. The execution timing of tty_port_link_wq() is
not clearly restricted. The newly added function tty_port_link_driver_wq()
checks whether the flip_wq of the tty_port has already been assigned when
linking the default tty_driver's workqueue to the port. After the user has
set a custom workqueue for a certain tty_port using tty_port_link_wq(), the
system will only use this custom workqueue, even if tty_driver does not
have %TTY_DRIVER_CUSTOM_WORKQUEUE flag.
Before tty_port_link_driver_wq():
/*
 * Never overwrite the workqueue set by tty_port_link_wq().
 * No effect when TTY_DRIVER_CUSTOM_WORKQUEUE, as driver->flip_wq is NULL.
 */
static inline void tty_port_link_driver_wq(struct tty_port *port,
					   struct tty_driver *driver)

My thought is that even if the TTY_DRIVER_CUSTOM_WORKQUEUE flag is not set,
the user can still call tty_port_link_wq to link a specific tty port to a
custom workqueue, thus overriding the existing workqueue in the current
tty_port. Therefore, I cannot check for NULL in the tty_port_link_wq(), as
it must unconditionally set the workqueue.

--
Xin Zhao