[PATCH v5 5/6] drm/log: Implement suspend/resume

Jocelyn Falempe posted 6 patches 1 month ago
There is a newer version of this series
[PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Jocelyn Falempe 1 month ago
The console is already suspended in printk.c.
Just make sure we don't write to the framebuffer while the graphic
driver is suspended.
It may lose a few messages between graphic suspend and console
suspend.

Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
---
 drivers/gpu/drm/drm_log.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/drm_log.c b/drivers/gpu/drm/drm_log.c
index 635dff7b37ce5..07d1513001468 100644
--- a/drivers/gpu/drm/drm_log.c
+++ b/drivers/gpu/drm/drm_log.c
@@ -50,6 +50,7 @@ struct drm_log {
 	struct drm_client_dev client;
 	struct console con;
 	bool probed;
+	bool suspended;
 	u32 n_scanout;
 	struct drm_log_scanout *scanout;
 };
@@ -310,10 +311,32 @@ static int drm_log_client_hotplug(struct drm_client_dev *client)
 	return 0;
 }
 
+static int drm_log_client_suspend(struct drm_client_dev *client, bool _console_lock)
+{
+	struct drm_log *dlog = client_to_drm_log(client);
+
+	mutex_lock(&dlog->lock);
+	dlog->suspended = true;
+	mutex_unlock(&dlog->lock);
+	return 0;
+}
+
+static int drm_log_client_resume(struct drm_client_dev *client, bool _console_lock)
+{
+	struct drm_log *dlog = client_to_drm_log(client);
+
+	mutex_lock(&dlog->lock);
+	dlog->suspended = false;
+	mutex_unlock(&dlog->lock);
+	return 0;
+}
+
 static const struct drm_client_funcs drm_log_client_funcs = {
 	.owner		= THIS_MODULE,
 	.unregister	= drm_log_client_unregister,
 	.hotplug	= drm_log_client_hotplug,
+	.suspend	= drm_log_client_suspend,
+	.resume		= drm_log_client_resume,
 };
 
 static void drm_log_write_thread(struct console *con, struct nbcon_write_context *wctxt)
@@ -321,6 +344,9 @@ static void drm_log_write_thread(struct console *con, struct nbcon_write_context
 	struct drm_log *dlog = console_to_drm_log(con);
 	int i;
 
+	if (dlog->suspended)
+		return;
+
 	if (!dlog->probed)
 		drm_log_init_client(dlog);
 
-- 
2.47.0
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Petr Mladek 1 month ago
On Wed 2024-10-23 14:00:13, Jocelyn Falempe wrote:
> The console is already suspended in printk.c.

Does this mean that drm_log_client_suspend() is called
after suspend_console(), please?

By other words, does it mean that "dlog->suspended == true" is set
only when CON_SUSPENDED is already set in the related con->flags?

> Just make sure we don't write to the framebuffer while the graphic
> driver is suspended.
> It may lose a few messages between graphic suspend and console
> suspend.

The messages should not get lost when the console is properly
suspended by suspend_console(), set CON_SUSPENDED.

Or maybe, I do not understand it correctly. Maybe you want to say
that it should work correctly even without this patch. And this
patch creates just a safeguard to make sure that nothing wrong
happens even when suspend_console() was not called from some
reasons.


Note: I tried to check the order by reading the code. But
      drm_log_client_suspend() was called via too many layers.
      And I was not able to find where exactly it was called,
      for example, from hibernate() in kernel/power/hibernate.c


> --- a/drivers/gpu/drm/drm_log.c
> +++ b/drivers/gpu/drm/drm_log.c
> @@ -310,10 +311,32 @@ static int drm_log_client_hotplug(struct drm_client_dev *client)
>  	return 0;
>  }
>  
> +static int drm_log_client_suspend(struct drm_client_dev *client, bool _console_lock)
> +{
> +	struct drm_log *dlog = client_to_drm_log(client);
> +
> +	mutex_lock(&dlog->lock);
> +	dlog->suspended = true;
> +	mutex_unlock(&dlog->lock);

It might also be possible to explicitly set the CON_SUSPENDED flag
here to be always on the safe side. We could create variant of
suspend_console() just for one console. Something like:

void suspend_one_console(struct console *con)
{
	struct console *con;

	if (!console_suspend_enabled)
		return;

	pr_info("Suspending console(%s) (use no_console_suspend to debug)\n");
	pr_flush(1000, true);

	console_list_lock();
	if (con && console_is_registered_locked(con))
		console_srcu_write_flags(con, con->flags | CON_SUSPENDED);
	console_list_unlock();

	/*
	 * Ensure that all SRCU list walks have completed. All printing
	 * contexts must be able to see that they are suspended so that it
	 * is guaranteed that all printing has stopped when this function
	 * completes.
	 */
	synchronize_srcu(&console_srcu);
}

and call here:

	suspend_one_console(dlog->con);


But this is not needed when the console is already supposed to be
disabled here. If this is the case then it might be better
to just check and warn when it does not happen. Something like:

void assert_console_suspended(struct console *con)
{
	int cookie;

	cookie = console_srcu_read_lock();

	/* Do not care about unregistered console */
	if (!con || hlist_unhashed_lockless(&con->node))
		goto out;

	if (WARN_ON_ONCE(!(console_srcu_read_flags(con) & CON_SUSPENDED)))
		pr_flush(1000, true);
out:
	console_srcu_read_unlock(cookie);
}

> +	return 0;
> +}


Best Regards,
Petr

PS: I have vacation the following week and might not be able to
    follow the discussion before I am back.
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Jocelyn Falempe 1 month ago
On 24/10/2024 16:34, Petr Mladek wrote:
> On Wed 2024-10-23 14:00:13, Jocelyn Falempe wrote:
>> The console is already suspended in printk.c.
> 
> Does this mean that drm_log_client_suspend() is called
> after suspend_console(), please?

To be honest, I wasn't able to tell which one is called first, and if 
the order is enforced (I didn't check if drivers can be suspended in 
parallel, or if it's all sequential)..

I then checked if it's possible to suspend the console, but didn't found 
an easy API to do so, so I went with this lazy patch, just ensuring 
we're not writing to a suspended graphic driver.

> 
> By other words, does it mean that "dlog->suspended == true" is set
> only when CON_SUSPENDED is already set in the related con->flags?
> 8
>> Just make sure we don't write to the framebuffer while the graphic
>> driver is suspended.
>> It may lose a few messages between graphic suspend and console
>> suspend.
> 
> The messages should not get lost when the console is properly
> suspended by suspend_console(), set CON_SUSPENDED.
> 
> Or maybe, I do not understand it correctly. Maybe you want to say
> that it should work correctly even without this patch. And this
> patch creates just a safeguard to make sure that nothing wrong
> happens even when suspend_console() was not called from some
> reasons.

I mean that with this patch if the console is suspended after the 
graphic driver, then the message between the suspend of the graphic 
driver and the suspend of the console won't be drawn. I don't see that 
as a big problem, if you debug suspend/resume with drm_log, and the 
screen goes blank, you won't see much anyway. And using dmesg when the 
system is resumed, would have all the messages.

Without this patch, it may crash if the framebuffer is no more 
accessible, and drm_log tries to draw a new line on it.
> 
> 
> Note: I tried to check the order by reading the code. But
>        drm_log_client_suspend() was called via too many layers.
>        And I was not able to find where exactly it was called,
>        for example, from hibernate() in kernel/power/hibernate.c
> 
> 
>> --- a/drivers/gpu/drm/drm_log.c
>> +++ b/drivers/gpu/drm/drm_log.c
>> @@ -310,10 +311,32 @@ static int drm_log_client_hotplug(struct drm_client_dev *client)
>>   	return 0;
>>   }
>>   
>> +static int drm_log_client_suspend(struct drm_client_dev *client, bool _console_lock)
>> +{
>> +	struct drm_log *dlog = client_to_drm_log(client);
>> +
>> +	mutex_lock(&dlog->lock);
>> +	dlog->suspended = true;
>> +	mutex_unlock(&dlog->lock);
> 
> It might also be possible to explicitly set the CON_SUSPENDED flag
> here to be always on the safe side. We could create variant of
> suspend_console() just for one console. Something like:
> 
> void suspend_one_console(struct console *con)
> {
> 	struct console *con;
> 
> 	if (!console_suspend_enabled)
> 		return;
> 
> 	pr_info("Suspending console(%s) (use no_console_suspend to debug)\n");
> 	pr_flush(1000, true);
> 
> 	console_list_lock();
> 	if (con && console_is_registered_locked(con))
> 		console_srcu_write_flags(con, con->flags | CON_SUSPENDED);
> 	console_list_unlock();
> 
> 	/*
> 	 * Ensure that all SRCU list walks have completed. All printing
> 	 * contexts must be able to see that they are suspended so that it
> 	 * is guaranteed that all printing has stopped when this function
> 	 * completes.
> 	 */
> 	synchronize_srcu(&console_srcu);
> }
> 
> and call here:
> 
> 	suspend_one_console(dlog->con);
> 
> 
> But this is not needed when the console is already supposed to be
> disabled here. If this is the case then it might be better
> to just check and warn when it does not happen. Something like:
> 
> void assert_console_suspended(struct console *con)
> {
> 	int cookie;
> 
> 	cookie = console_srcu_read_lock();
> 
> 	/* Do not care about unregistered console */
> 	if (!con || hlist_unhashed_lockless(&con->node))
> 		goto out;
> 
> 	if (WARN_ON_ONCE(!(console_srcu_read_flags(con) & CON_SUSPENDED)))
> 		pr_flush(1000, true);
> out:
> 	console_srcu_read_unlock(cookie);
> }
> 
>> +	return 0;
>> +}
> 
> 

Thanks for this two suggestions, this is really what I was looking for.
I will run some tests on real hardware, to see which one is suspended first.

Best regards,

-- 

Jocelyn

> Best Regards,
> Petr
> 
> PS: I have vacation the following week and might not be able to
>      follow the discussion before I am back.
>
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Jocelyn Falempe 1 month ago
On 25/10/2024 01:11, Jocelyn Falempe wrote:
> On 24/10/2024 16:34, Petr Mladek wrote:
>> On Wed 2024-10-23 14:00:13, Jocelyn Falempe wrote:
>>> The console is already suspended in printk.c.
>>
>> Does this mean that drm_log_client_suspend() is called
>> after suspend_console(), please?
> 
> To be honest, I wasn't able to tell which one is called first, and if 
> the order is enforced (I didn't check if drivers can be suspended in 
> parallel, or if it's all sequential)..
> 
> I then checked if it's possible to suspend the console, but didn't found 
> an easy API to do so, so I went with this lazy patch, just ensuring 
> we're not writing to a suspended graphic driver.

I've run some tests on my hardware, and the console is suspended before 
the graphic driver:

[   56.409604] printk: Suspending console(s) (use no_console_suspend to 
debug)
[   56.411430] serial 00:05: disabled
[   56.421877] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   56.421954] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
[   56.422545] ata1.00: Entering standby power mode
[   56.422793] DRM log suspend

But because there is the "no_console_suspend" parameter, and we should 
make sure to not draw when the graphic driver is suspended, I think this 
patch is needed, and good enough.
I will just rephrase the commit message, to make it clear, that some 
message won't be drawn, only if "no_console_suspend" is set.

Best regards,

-- 

Jocelyn

> 
>>
>> By other words, does it mean that "dlog->suspended == true" is set
>> only when CON_SUSPENDED is already set in the related con->flags?
>> 8
>>> Just make sure we don't write to the framebuffer while the graphic
>>> driver is suspended.
>>> It may lose a few messages between graphic suspend and console
>>> suspend.
>>
>> The messages should not get lost when the console is properly
>> suspended by suspend_console(), set CON_SUSPENDED.
>>
>> Or maybe, I do not understand it correctly. Maybe you want to say
>> that it should work correctly even without this patch. And this
>> patch creates just a safeguard to make sure that nothing wrong
>> happens even when suspend_console() was not called from some
>> reasons.
> 
> I mean that with this patch if the console is suspended after the 
> graphic driver, then the message between the suspend of the graphic 
> driver and the suspend of the console won't be drawn. I don't see that 
> as a big problem, if you debug suspend/resume with drm_log, and the 
> screen goes blank, you won't see much anyway. And using dmesg when the 
> system is resumed, would have all the messages.
> 
> Without this patch, it may crash if the framebuffer is no more 
> accessible, and drm_log tries to draw a new line on it.
>>
>>
>> Note: I tried to check the order by reading the code. But
>>        drm_log_client_suspend() was called via too many layers.
>>        And I was not able to find where exactly it was called,
>>        for example, from hibernate() in kernel/power/hibernate.c
>>
>>
>>> --- a/drivers/gpu/drm/drm_log.c
>>> +++ b/drivers/gpu/drm/drm_log.c
>>> @@ -310,10 +311,32 @@ static int drm_log_client_hotplug(struct 
>>> drm_client_dev *client)
>>>       return 0;
>>>   }
>>> +static int drm_log_client_suspend(struct drm_client_dev *client, 
>>> bool _console_lock)
>>> +{
>>> +    struct drm_log *dlog = client_to_drm_log(client);
>>> +
>>> +    mutex_lock(&dlog->lock);
>>> +    dlog->suspended = true;
>>> +    mutex_unlock(&dlog->lock);
>>
>> It might also be possible to explicitly set the CON_SUSPENDED flag
>> here to be always on the safe side. We could create variant of
>> suspend_console() just for one console. Something like:
>>
>> void suspend_one_console(struct console *con)
>> {
>>     struct console *con;
>>
>>     if (!console_suspend_enabled)
>>         return;
>>
>>     pr_info("Suspending console(%s) (use no_console_suspend to 
>> debug)\n");
>>     pr_flush(1000, true);
>>
>>     console_list_lock();
>>     if (con && console_is_registered_locked(con))
>>         console_srcu_write_flags(con, con->flags | CON_SUSPENDED);
>>     console_list_unlock();
>>
>>     /*
>>      * Ensure that all SRCU list walks have completed. All printing
>>      * contexts must be able to see that they are suspended so that it
>>      * is guaranteed that all printing has stopped when this function
>>      * completes.
>>      */
>>     synchronize_srcu(&console_srcu);
>> }
>>
>> and call here:
>>
>>     suspend_one_console(dlog->con);
>>
>>
>> But this is not needed when the console is already supposed to be
>> disabled here. If this is the case then it might be better
>> to just check and warn when it does not happen. Something like:
>>
>> void assert_console_suspended(struct console *con)
>> {
>>     int cookie;
>>
>>     cookie = console_srcu_read_lock();
>>
>>     /* Do not care about unregistered console */
>>     if (!con || hlist_unhashed_lockless(&con->node))
>>         goto out;
>>
>>     if (WARN_ON_ONCE(!(console_srcu_read_flags(con) & CON_SUSPENDED)))
>>         pr_flush(1000, true);
>> out:
>>     console_srcu_read_unlock(cookie);
>> }
>>
>>> +    return 0;
>>> +}
>>
>>
> 
> Thanks for this two suggestions, this is really what I was looking for.
> I will run some tests on real hardware, to see which one is suspended 
> first.
> 
> Best regards,
> 

Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Petr Mladek 3 weeks, 3 days ago
On Fri 2024-10-25 11:46:16, Jocelyn Falempe wrote:
> On 25/10/2024 01:11, Jocelyn Falempe wrote:
> > On 24/10/2024 16:34, Petr Mladek wrote:
> > > On Wed 2024-10-23 14:00:13, Jocelyn Falempe wrote:
> > > > The console is already suspended in printk.c.
> > > 
> > > Does this mean that drm_log_client_suspend() is called
> > > after suspend_console(), please?
> > 
> > To be honest, I wasn't able to tell which one is called first, and if
> > the order is enforced (I didn't check if drivers can be suspended in
> > parallel, or if it's all sequential)..
> > 
> > I then checked if it's possible to suspend the console, but didn't found
> > an easy API to do so, so I went with this lazy patch, just ensuring
> > we're not writing to a suspended graphic driver.
> 
> I've run some tests on my hardware, and the console is suspended before the
> graphic driver:
> 
> [   56.409604] printk: Suspending console(s) (use no_console_suspend to
> debug)
> [   56.411430] serial 00:05: disabled
> [   56.421877] sd 0:0:0:0: [sda] Synchronizing SCSI cache
> [   56.421954] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
> [   56.422545] ata1.00: Entering standby power mode
> [   56.422793] DRM log suspend
> 
> But because there is the "no_console_suspend" parameter, and we should make
> sure to not draw when the graphic driver is suspended, I think this patch is
> needed, and good enough.
> I will just rephrase the commit message, to make it clear, that some message
> won't be drawn, only if "no_console_suspend" is set.

Ah, I forgot about the "no_console_suspend" parameter. The problem
with this patch is that it would quietly drop all pending messages.

drm_log_write_thread() does not have any return value.
nbcon_emit_next_record() would assume that the message was printed.
And the kthread would continue emitting next message...

In compare, CON_SUSPENDED would cause that console_is_usable()
returns false. As a result, nbcon_kthread_func() would not try
to emit any message and go into a sleep.

If we set CON_SUSPENDED then the pending messages will get printed
after the resume. If we use this patch, the messages would get lost.


This is why I am not happy with this patch. I would prefer to
block the console. I see three better solutions:

  1. Set CON_SUSPENDED from drm_log_client_suspend even when
     "no_console_suspend" is used.

     It is a bit dirty and might cause some confusion.


  2. Add a new flag, e.g. CON_BLOCKED or CON_DRIVER_BLOCKED,
     which might be used for this purpose.


  3. Allow con->write_thread() to return an error code.

     The question is how exactly the error should be handled.
     The kthread would not know when the printing might succeed
     again.


I personally prefer the 2nd variant.


Best Regards,
Petr

PS: I am sorry for the late reply. I had vacation...
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Jocelyn Falempe 3 weeks ago
On 01/11/2024 17:00, Petr Mladek wrote:
> On Fri 2024-10-25 11:46:16, Jocelyn Falempe wrote:
>> On 25/10/2024 01:11, Jocelyn Falempe wrote:
>>> On 24/10/2024 16:34, Petr Mladek wrote:
>>>> On Wed 2024-10-23 14:00:13, Jocelyn Falempe wrote:
>>>>> The console is already suspended in printk.c.
>>>>
>>>> Does this mean that drm_log_client_suspend() is called
>>>> after suspend_console(), please?
>>>
>>> To be honest, I wasn't able to tell which one is called first, and if
>>> the order is enforced (I didn't check if drivers can be suspended in
>>> parallel, or if it's all sequential)..
>>>
>>> I then checked if it's possible to suspend the console, but didn't found
>>> an easy API to do so, so I went with this lazy patch, just ensuring
>>> we're not writing to a suspended graphic driver.
>>
>> I've run some tests on my hardware, and the console is suspended before the
>> graphic driver:
>>
>> [   56.409604] printk: Suspending console(s) (use no_console_suspend to
>> debug)
>> [   56.411430] serial 00:05: disabled
>> [   56.421877] sd 0:0:0:0: [sda] Synchronizing SCSI cache
>> [   56.421954] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
>> [   56.422545] ata1.00: Entering standby power mode
>> [   56.422793] DRM log suspend
>>
>> But because there is the "no_console_suspend" parameter, and we should make
>> sure to not draw when the graphic driver is suspended, I think this patch is
>> needed, and good enough.
>> I will just rephrase the commit message, to make it clear, that some message
>> won't be drawn, only if "no_console_suspend" is set.
> 
> Ah, I forgot about the "no_console_suspend" parameter. The problem
> with this patch is that it would quietly drop all pending messages.
> 
> drm_log_write_thread() does not have any return value.
> nbcon_emit_next_record() would assume that the message was printed.
> And the kthread would continue emitting next message...
> 
> In compare, CON_SUSPENDED would cause that console_is_usable()
> returns false. As a result, nbcon_kthread_func() would not try
> to emit any message and go into a sleep.
> 
> If we set CON_SUSPENDED then the pending messages will get printed
> after the resume. If we use this patch, the messages would get lost.
> 
> 
> This is why I am not happy with this patch. I would prefer to
> block the console. I see three better solutions:
> 
>    1. Set CON_SUSPENDED from drm_log_client_suspend even when
>       "no_console_suspend" is used.
> 
>       It is a bit dirty and might cause some confusion.
> 
> 
>    2. Add a new flag, e.g. CON_BLOCKED or CON_DRIVER_BLOCKED,
>       which might be used for this purpose.
> 
> 
>    3. Allow con->write_thread() to return an error code.
> 
>       The question is how exactly the error should be handled.
>       The kthread would not know when the printing might succeed
>       again.
> 
> 
> I personally prefer the 2nd variant.

I looked at what serial drivers are doing, because they can also have 
their clock gated in suspend.

Would calling console_stop() in the suspend and console_start() in 
resume work ?

https://elixir.bootlin.com/linux/v6.11.6/source/drivers/tty/serial/serial_core.c#L2462

https://elixir.bootlin.com/linux/v6.11.6/source/kernel/printk/printk.c#L3323

It looks like it should do exactly what we need ?

Best regards,

-- 

Jocelyn

> 
> 
> Best Regards,
> Petr
> 
> PS: I am sorry for the late reply. I had vacation...
>
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by John Ogness 3 weeks ago
On 2024-11-04, Jocelyn Falempe <jfalempe@redhat.com> wrote:
> I looked at what serial drivers are doing, because they can also have 
> their clock gated in suspend.
>
> Would calling console_stop() in the suspend and console_start() in 
> resume work ?

Yes. That is what it is for.

John Ogness
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Petr Mladek 3 weeks ago
On Mon 2024-11-04 11:52:33, John Ogness wrote:
> On 2024-11-04, Jocelyn Falempe <jfalempe@redhat.com> wrote:
> > I looked at what serial drivers are doing, because they can also have 
> > their clock gated in suspend.
> >
> > Would calling console_stop() in the suspend and console_start() in 
> > resume work ?
> 
> Yes. That is what it is for.

It seems that you are right. I have never really investigated the purpose
of this API /o\

One problem with this API is that it does not check whether the
console is registered. I wonder whether it might cause problems.

For example, we should not set the CON_ENABLE flag when the console is not
registered. Doing so would cause register_console() to always enable
the console, even when it is not preferred.

Additionally, nbcon_kthread_wake() uses con->rcuwait, which is initialized
by nbcon_alloc() called from register_console(). Fortunately, nbcon_alloc()
is always called even if the console is not enabled in the end, but this
might change in the future and cause subtle errors.

[ After even more thinking ]

I wonder whether console_start()/console_stop() should really
manipulate CON_ENABLE flag. It might be historical solution when
@console_suspended was a global variable.

But it has changed with the commit 9e70a5e109a4a2336 ("printk: Add
per-console suspended state").

It might make more sense when console_start()/console_stop()
manipulates CON_SUSPENDED flag. Then it would make sense
to rename them suspend_this_console()/resume_this_console().

What do you think?

Best Regards,
Petr
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by John Ogness 3 weeks ago
On 2024-11-04, Petr Mladek <pmladek@suse.com> wrote:
> I wonder whether console_start()/console_stop() should really
> manipulate CON_ENABLE flag. It might be historical solution when
> @console_suspended was a global variable.
>
> But it has changed with the commit 9e70a5e109a4a2336 ("printk: Add
> per-console suspended state").
>
> It might make more sense when console_start()/console_stop()
> manipulates CON_SUSPENDED flag. Then it would make sense
> to rename them suspend_this_console()/resume_this_console().

I worry about letting console drivers and printk.c both modify this flag
during normal runtime. One might clear CON_SUSPENDED too soon and cause
trouble.

CON_ENABLE and @console_suspended were always orthogonal. Moving
@console_suspended to CON_SUSPENDED did not change that relationship.

IMHO we should continue to keep them separate. But your point about the
console not being registered is a good one. We should update
console_stop()/console_start() to only operate on @console if it is
registered. Since those functions take the console_list_lock anyway, it
would be a simple change.

John
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Petr Mladek 2 weeks, 6 days ago
On Mon 2024-11-04 16:38:53, John Ogness wrote:
> On 2024-11-04, Petr Mladek <pmladek@suse.com> wrote:
> > I wonder whether console_start()/console_stop() should really
> > manipulate CON_ENABLE flag. It might be historical solution when
> > @console_suspended was a global variable.
> >
> > But it has changed with the commit 9e70a5e109a4a2336 ("printk: Add
> > per-console suspended state").
> >
> > It might make more sense when console_start()/console_stop()
> > manipulates CON_SUSPENDED flag. Then it would make sense
> > to rename them suspend_this_console()/resume_this_console().
> 
> I worry about letting console drivers and printk.c both modify this flag
> during normal runtime. One might clear CON_SUSPENDED too soon and cause
> trouble.
> 
> CON_ENABLE and @console_suspended were always orthogonal. Moving
> @console_suspended to CON_SUSPENDED did not change that relationship.
> 
> IMHO we should continue to keep them separate. But your point about the
> console not being registered is a good one. We should update
> console_stop()/console_start() to only operate on @console if it is
> registered. Since those functions take the console_list_lock anyway, it
> would be a simple change.

First, I am fine with using console_start()/console_stop() in this
patchset. I agree that this API was created for this purpose
and should still work fine.

But I think that the API is a bit messy and would deserve a clean up.
We should do it in a separate patchset.


History:

  + commit 56dafec3913935c997 ("Import 2.1.71") in v2.1.71, Nov 2007 [1]

    This version introduced "console=" parameter which allowed to
    choose the consoles on the commandline. Before, they were
    selected at build time.

    The @flags and CON_ENABLED flag were added here as well.
    It looks to me like all available console drivers were registered
    but only consoles with CON_ENABLE flag printed the messages.


  + commit 33c0d1b0c3ebb61243d9b ("[PATCH] Serial driver stuff")
    in v2.5.28, Jul 2002 [1]

    Added generic serial_core. The CON_ENABLED flag was re-used
    to disable console when suspending the serial drivers.


  + commit 557240b48e2dc4f6fa878 ("Add support for suspending and
    resuming the whole console subsystem") in v2.6.18, Jun 2006

    Added @console_suspended global variable. It was used as a big hammer
    to block all console drivers and avoid subtle problems during suspend.


  + commit 9e70a5e109a4a233678 ("printk: Add per-console suspended state")
    in v6.6, Jul 2023

    Replaced the global @console_supended global variable with
    per-console CON_SUSPENDED flag.

    The motivation seems to be to remove dependency on console_lock.
    The per-CPU flag allows to query the state via SRCU.

    But the flag is set for all consoles at the same time in
    console_suspend()/console_resume()

	=> it still works as the big hammer.


Observation:

  + CON_ENABLED is not needed for the original purpose. Only enabled
    consoles are added into @console_list.

  + CON_ENABLED is still used to explicitely block the console driver
    during suspend by console_stop()/console_start() in serial_core.c.

    It is not bad. But it is a bit confusing because we have
    CON_SUSPENDED flag now and this is about suspend/resume.


  + CON_SUSPENDED is per-console flag but it is set synchronously
    for all consoles.

    IMHO, a global variable would make more sense for the big hammer
    purpose.


Big question:

  Does the driver really needs to call console_stop()/console_start()
  when there is the big hammer?

  I would preserve it because it makes the design more robust.

  Anyway, the driver-specific handling looks like the right solution.
  The big hammer feels like a workaround.


Reasonable semantic:

  1. Rename:

	console_suspend() -> console_suspend_all()
	console_resume()  -> console_resume_all()

     and manipulate a global @consoles_suspended variable agagin.
     It is the big hammer API.


  2. Rename:

	console_stop(con)  -> console_suspend(con)
	console_start(con) -> console_resume(con)

     and manipulare the per-console CON_SUSPENDED flag here.


   3. Get rid of the ambiguous CON_ENABLED flag. It won't longer
      have any purpose.

      Except that it is also used to force console registration.
      But it can be done a better way, e.g. by introducing
      register_console_force() API.


As I said, we could/should this clean up in a separate patchset.
Like printk-people should fix the printk-mess.


[1] pre-git linux kernel history:
    git://git.kernel.org/pub/scm/linux/kernel/git/history/history.git

Best Regards,
Petr
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by John Ogness 2 weeks, 6 days ago
On 2024-11-05, Petr Mladek <pmladek@suse.com> wrote:
> Observation:
>
>   + CON_ENABLED is not needed for the original purpose. Only enabled
>     consoles are added into @console_list.
>
>   + CON_ENABLED is still used to explicitely block the console driver
>     during suspend by console_stop()/console_start() in serial_core.c.
>
>     It is not bad. But it is a bit confusing because we have
>     CON_SUSPENDED flag now and this is about suspend/resume.

Also note that CON_ENABLED is used to gate ->unblank(). It should
probably consider CON_SUSPENDED as well.

>   + CON_SUSPENDED is per-console flag but it is set synchronously
>     for all consoles.
>
>     IMHO, a global variable would make more sense for the big hammer
>     purpose.
>
>
> Big question:
>
>   Does the driver really needs to call console_stop()/console_start()
>   when there is the big hammer?
>
>   I would preserve it because it makes the design more robust.

Agreed. They serve different purposes.

console_stop()/console_start() is a method for _drivers_ to communicate
that they must not be called because their hardware is not
available/functioning.

console_suspend()/console_resume() is a method for the _system_ to
communicate that consoles should be silent because they are annoying or
we do not trust that they won't cause problems.

>   Anyway, the driver-specific handling looks like the right solution.
>   The big hammer feels like a workaround.

Agreed. Do the 6 call sites even really need the big hammer? I am
guessing yes because there are probably console drivers that do not use
console_stop()/console_start() in their suspend/resume and thus rely on
the whole subsystem being disabled.

> Reasonable semantic:
>
>   1. Rename:
>
> 	console_suspend() -> console_suspend_all()
> 	console_resume()  -> console_resume_all()
>
>      and manipulate a global @consoles_suspended variable agagin.
>      It is the big hammer API.

Agreed. As a global variable, it can still rely on SRCU for
synchronization.

>   2. Rename:
>
> 	console_stop(con)  -> console_suspend(con)
> 	console_start(con) -> console_resume(con)
>
>      and manipulare the per-console CON_SUSPENDED flag here.

Agreed.

>    3. Get rid of the ambiguous CON_ENABLED flag. It won't longer
>       have any purpose.
>
>       Except that it is also used to force console registration.
>       But it can be done a better way, e.g. by introducing
>       register_console_force() API.

Agreed.

John
Re: [PATCH v5 5/6] drm/log: Implement suspend/resume
Posted by Jocelyn Falempe 3 weeks ago
On 04/11/2024 15:15, Petr Mladek wrote:
> On Mon 2024-11-04 11:52:33, John Ogness wrote:
>> On 2024-11-04, Jocelyn Falempe <jfalempe@redhat.com> wrote:
>>> I looked at what serial drivers are doing, because they can also have
>>> their clock gated in suspend.
>>>
>>> Would calling console_stop() in the suspend and console_start() in
>>> resume work ?
>>
>> Yes. That is what it is for.
> 
> It seems that you are right. I have never really investigated the purpose
> of this API /o\
> 
Thanks, I will send a v6 with that change.

> One problem with this API is that it does not check whether the
> console is registered. I wonder whether it might cause problems.

At least for drm_log, register_console() will always be called before.
> 
> For example, we should not set the CON_ENABLE flag when the console is not
> registered. Doing so would cause register_console() to always enable
> the console, even when it is not preferred.
> 
> Additionally, nbcon_kthread_wake() uses con->rcuwait, which is initialized
> by nbcon_alloc() called from register_console(). Fortunately, nbcon_alloc()
> is always called even if the console is not enabled in the end, but this
> might change in the future and cause subtle errors.
> 
> [ After even more thinking ]
> 
> I wonder whether console_start()/console_stop() should really
> manipulate CON_ENABLE flag. It might be historical solution when
> @console_suspended was a global variable.
> 
> But it has changed with the commit 9e70a5e109a4a2336 ("printk: Add
> per-console suspended state").
> 
> It might make more sense when console_start()/console_stop()
> manipulates CON_SUSPENDED flag. Then it would make sense
> to rename them suspend_this_console()/resume_this_console().
> 
> What do you think?

Maybe when registering the console, having a flag to say "I want this 
console to be suspended with the console subsystem" or "I want to handle 
suspend/resume on my own, and call the relevant functions" would be better ?

That would avoid having the same console being suspended/resumed twice, 
and making clear what to expect.

Of course "no_console_suspend" won't really work for drivers handling 
suspend/resume themselves.

-- 

Jocelyn
> 
> Best Regards,
> Petr
>