[PATCH] xen/console: do not drop serial output from the hardware domain

Roger Pau Monne posted 1 patch 2 years, 5 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20220610150651.29933-1-roger.pau@citrix.com
xen/drivers/char/console.c | 3 +++
1 file changed, 3 insertions(+)
[PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monne 2 years, 5 months ago
Prevent dropping console output from the hardware domain, since it's
likely important to have all the output if the boot fails without
having to resort to sync_console (which also affects the output from
other guests).

Do so by pairing the console_serial_puts() with
serial_{start,end}_log_everything(), so that no output is dropped.

Note that such calls are placed inside of a section already protected
by the console_lock so there are no concurrent callers that could
abuse of the setting of serial_start_log_everything().

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/drivers/char/console.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index f9937c5134..13207f4d88 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -614,7 +614,10 @@ static long guest_console_write(XEN_GUEST_HANDLE_PARAM(char) buffer,
             /* Use direct console output as it could be interactive */
             spin_lock_irq(&console_lock);
 
+            serial_start_log_everything(sercon_handle);
             console_serial_puts(kbuf, kcount);
+            serial_end_log_everything(sercon_handle);
+
             video_puts(kbuf, kcount);
 
 #ifdef CONFIG_X86
-- 
2.36.1


Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 10.06.2022 17:06, Roger Pau Monne wrote:
> Prevent dropping console output from the hardware domain, since it's
> likely important to have all the output if the boot fails without
> having to resort to sync_console (which also affects the output from
> other guests).
> 
> Do so by pairing the console_serial_puts() with
> serial_{start,end}_log_everything(), so that no output is dropped.

While I can see the goal, why would Dom0 output be (effectively) more
important than Xen's own one (which isn't "forced")? And with this
aiming at boot output only, wouldn't you want to stop the overriding
once boot has completed (of which, if I'm not mistaken, we don't
really have any signal coming from Dom0)? And even during boot I'm
not convinced we'd want to let through everything, but perhaps just
Dom0's kernel messages?

I'm also a little puzzled by the sync_console argument: If boot
fails, other guests aren't really involved yet, are they?

Finally, what about (if such configured) output from a Xenstore
domain? That's kind of importantish as well, I'd say.

Jan
Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> On 10.06.2022 17:06, Roger Pau Monne wrote:
> > Prevent dropping console output from the hardware domain, since it's
> > likely important to have all the output if the boot fails without
> > having to resort to sync_console (which also affects the output from
> > other guests).
> > 
> > Do so by pairing the console_serial_puts() with
> > serial_{start,end}_log_everything(), so that no output is dropped.
> 
> While I can see the goal, why would Dom0 output be (effectively) more
> important than Xen's own one (which isn't "forced")? And with this
> aiming at boot output only, wouldn't you want to stop the overriding
> once boot has completed (of which, if I'm not mistaken, we don't
> really have any signal coming from Dom0)? And even during boot I'm
> not convinced we'd want to let through everything, but perhaps just
> Dom0's kernel messages?

I normally use sync_console on all the boxes I'm doing dev work, so
this request is something that come up internally.

Didn't realize Xen output wasn't forced, since we already have rate
limiting based on log levels I was assuming that non-ratelimited
messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
triggered) output shouldn't be rate limited either.

> I'm also a little puzzled by the sync_console argument: If boot
> fails, other guests aren't really involved yet, are they?

No, but it would useful to get all relevant info without having to ask
users to use sync_console.

> Finally, what about (if such configured) output from a Xenstore
> domain? That's kind of importantish as well, I'd say.

I would be less inclined to do so.  Xenstore domains can use a regular
PV console, which shouldn't be affected by the rate limiting applied to
the serial.  Also that would give the xenstore domain a way to trigger
DoS attacks.

Thanks, Roger.
Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 13.06.2022 10:21, Roger Pau Monné wrote:
> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>> Prevent dropping console output from the hardware domain, since it's
>>> likely important to have all the output if the boot fails without
>>> having to resort to sync_console (which also affects the output from
>>> other guests).
>>>
>>> Do so by pairing the console_serial_puts() with
>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>
>> While I can see the goal, why would Dom0 output be (effectively) more
>> important than Xen's own one (which isn't "forced")? And with this
>> aiming at boot output only, wouldn't you want to stop the overriding
>> once boot has completed (of which, if I'm not mistaken, we don't
>> really have any signal coming from Dom0)? And even during boot I'm
>> not convinced we'd want to let through everything, but perhaps just
>> Dom0's kernel messages?
> 
> I normally use sync_console on all the boxes I'm doing dev work, so
> this request is something that come up internally.
> 
> Didn't realize Xen output wasn't forced, since we already have rate
> limiting based on log levels I was assuming that non-ratelimited
> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> triggered) output shouldn't be rate limited either.

Which would raise the question of why we have log levels for non-guest
messages.

>> Finally, what about (if such configured) output from a Xenstore
>> domain? That's kind of importantish as well, I'd say.
> 
> I would be less inclined to do so.  Xenstore domains can use a regular
> PV console, which shouldn't be affected by the rate limiting applied to
> the serial.

Fair point.

>  Also that would give the xenstore domain a way to trigger
> DoS attacks.

I guess a Xenstore domain can do so anyway, by simply refusing to
fulfill its job.

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> On 13.06.2022 10:21, Roger Pau Monné wrote:
> > On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>> Prevent dropping console output from the hardware domain, since it's
> >>> likely important to have all the output if the boot fails without
> >>> having to resort to sync_console (which also affects the output from
> >>> other guests).
> >>>
> >>> Do so by pairing the console_serial_puts() with
> >>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>
> >> While I can see the goal, why would Dom0 output be (effectively) more
> >> important than Xen's own one (which isn't "forced")? And with this
> >> aiming at boot output only, wouldn't you want to stop the overriding
> >> once boot has completed (of which, if I'm not mistaken, we don't
> >> really have any signal coming from Dom0)? And even during boot I'm
> >> not convinced we'd want to let through everything, but perhaps just
> >> Dom0's kernel messages?
> > 
> > I normally use sync_console on all the boxes I'm doing dev work, so
> > this request is something that come up internally.
> > 
> > Didn't realize Xen output wasn't forced, since we already have rate
> > limiting based on log levels I was assuming that non-ratelimited
> > messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> > triggered) output shouldn't be rate limited either.
> 
> Which would raise the question of why we have log levels for non-guest
> messages.

Hm, maybe I'm confused, but I don't see a direct relation between log
levels and rate limiting.  If I set log level to WARNING I would
expect to not loose _any_ non-guest log messages with level WARNING or
above.  It's still useful to have log levels for non-guest messages,
since user might want to filter out DEBUG non-guest messages for
example.

> >  Also that would give the xenstore domain a way to trigger
> > DoS attacks.
> 
> I guess a Xenstore domain can do so anyway, by simply refusing to
> fulfill its job.

Right, but that's IMO a DoS strictly related to the purpose of the
domain.

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 13.06.2022 11:04, Roger Pau Monné wrote:
> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
>> On 13.06.2022 10:21, Roger Pau Monné wrote:
>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>>>> Prevent dropping console output from the hardware domain, since it's
>>>>> likely important to have all the output if the boot fails without
>>>>> having to resort to sync_console (which also affects the output from
>>>>> other guests).
>>>>>
>>>>> Do so by pairing the console_serial_puts() with
>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>>>
>>>> While I can see the goal, why would Dom0 output be (effectively) more
>>>> important than Xen's own one (which isn't "forced")? And with this
>>>> aiming at boot output only, wouldn't you want to stop the overriding
>>>> once boot has completed (of which, if I'm not mistaken, we don't
>>>> really have any signal coming from Dom0)? And even during boot I'm
>>>> not convinced we'd want to let through everything, but perhaps just
>>>> Dom0's kernel messages?
>>>
>>> I normally use sync_console on all the boxes I'm doing dev work, so
>>> this request is something that come up internally.
>>>
>>> Didn't realize Xen output wasn't forced, since we already have rate
>>> limiting based on log levels I was assuming that non-ratelimited
>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
>>> triggered) output shouldn't be rate limited either.
>>
>> Which would raise the question of why we have log levels for non-guest
>> messages.
> 
> Hm, maybe I'm confused, but I don't see a direct relation between log
> levels and rate limiting.  If I set log level to WARNING I would
> expect to not loose _any_ non-guest log messages with level WARNING or
> above.  It's still useful to have log levels for non-guest messages,
> since user might want to filter out DEBUG non-guest messages for
> example.

It was me who was confused, because of the two log-everything variants
we have (console and serial). You're right that your change is unrelated
to log levels. However, when there are e.g. many warnings or when an
admin has lowered the log level, what you (would) do is effectively
force sync_console mode transiently (for a subset of messages, but
that's secondary, especially because the "forced" output would still
be waiting for earlier output to make it out). We strongly advise against
use of sync_console in production environments, so I'm afraid I have
trouble seeing how using this mode transiently can be safe. This is quite
different from forcing all output to appear when e.g. we're about to
crash Xen.

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> On 13.06.2022 11:04, Roger Pau Monné wrote:
> > On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>> likely important to have all the output if the boot fails without
> >>>>> having to resort to sync_console (which also affects the output from
> >>>>> other guests).
> >>>>>
> >>>>> Do so by pairing the console_serial_puts() with
> >>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>
> >>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>> important than Xen's own one (which isn't "forced")? And with this
> >>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>> not convinced we'd want to let through everything, but perhaps just
> >>>> Dom0's kernel messages?
> >>>
> >>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>> this request is something that come up internally.
> >>>
> >>> Didn't realize Xen output wasn't forced, since we already have rate
> >>> limiting based on log levels I was assuming that non-ratelimited
> >>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>> triggered) output shouldn't be rate limited either.
> >>
> >> Which would raise the question of why we have log levels for non-guest
> >> messages.
> > 
> > Hm, maybe I'm confused, but I don't see a direct relation between log
> > levels and rate limiting.  If I set log level to WARNING I would
> > expect to not loose _any_ non-guest log messages with level WARNING or
> > above.  It's still useful to have log levels for non-guest messages,
> > since user might want to filter out DEBUG non-guest messages for
> > example.
> 
> It was me who was confused, because of the two log-everything variants
> we have (console and serial). You're right that your change is unrelated
> to log levels. However, when there are e.g. many warnings or when an
> admin has lowered the log level, what you (would) do is effectively
> force sync_console mode transiently (for a subset of messages, but
> that's secondary, especially because the "forced" output would still
> be waiting for earlier output to make it out).

Right, it would have to wait for any previous output on the buffer to
go out first.  In any case we can guarantee that no more output will
be added to the buffer while Xen waits for it to be flushed.

So for the hardware domain it might make sense to wait for the TX
buffers to be half empty (the current tx_quench logic) by preempting
the hypercall.  That however could cause issues if guests manage to
keep filling the buffer while the hardware domain is being preempted.

Alternatively we could always reserve half of the buffer for the
hardware domain, and allow it to be preempted while waiting for space
(since it's guaranteed non hardware domains won't be able to steal the
allocation from the hardware domain).

For Xen it's not trivial to prevent messages from being dropped. At
least during Xen boot (system_state < SYS_STATE_active) we could also
active the sync mode and make the spin wait in __serial_putc process
softirqs.

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 13.06.2022 14:32, Roger Pau Monné wrote:
> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
>> On 13.06.2022 11:04, Roger Pau Monné wrote:
>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>>>>>> Prevent dropping console output from the hardware domain, since it's
>>>>>>> likely important to have all the output if the boot fails without
>>>>>>> having to resort to sync_console (which also affects the output from
>>>>>>> other guests).
>>>>>>>
>>>>>>> Do so by pairing the console_serial_puts() with
>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>>>>>
>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
>>>>>> important than Xen's own one (which isn't "forced")? And with this
>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
>>>>>> really have any signal coming from Dom0)? And even during boot I'm
>>>>>> not convinced we'd want to let through everything, but perhaps just
>>>>>> Dom0's kernel messages?
>>>>>
>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
>>>>> this request is something that come up internally.
>>>>>
>>>>> Didn't realize Xen output wasn't forced, since we already have rate
>>>>> limiting based on log levels I was assuming that non-ratelimited
>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
>>>>> triggered) output shouldn't be rate limited either.
>>>>
>>>> Which would raise the question of why we have log levels for non-guest
>>>> messages.
>>>
>>> Hm, maybe I'm confused, but I don't see a direct relation between log
>>> levels and rate limiting.  If I set log level to WARNING I would
>>> expect to not loose _any_ non-guest log messages with level WARNING or
>>> above.  It's still useful to have log levels for non-guest messages,
>>> since user might want to filter out DEBUG non-guest messages for
>>> example.
>>
>> It was me who was confused, because of the two log-everything variants
>> we have (console and serial). You're right that your change is unrelated
>> to log levels. However, when there are e.g. many warnings or when an
>> admin has lowered the log level, what you (would) do is effectively
>> force sync_console mode transiently (for a subset of messages, but
>> that's secondary, especially because the "forced" output would still
>> be waiting for earlier output to make it out).
> 
> Right, it would have to wait for any previous output on the buffer to
> go out first.  In any case we can guarantee that no more output will
> be added to the buffer while Xen waits for it to be flushed.
> 
> So for the hardware domain it might make sense to wait for the TX
> buffers to be half empty (the current tx_quench logic) by preempting
> the hypercall.  That however could cause issues if guests manage to
> keep filling the buffer while the hardware domain is being preempted.
> 
> Alternatively we could always reserve half of the buffer for the
> hardware domain, and allow it to be preempted while waiting for space
> (since it's guaranteed non hardware domains won't be able to steal the
> allocation from the hardware domain).

Getting complicated it seems. I have to admit that I wonder whether we
wouldn't be better off leaving the current logic as is.

> For Xen it's not trivial to prevent messages from being dropped. At
> least during Xen boot (system_state < SYS_STATE_active) we could also
> active the sync mode and make the spin wait in __serial_putc process
> softirqs.

Yeah, that would seem doable _and_ safe (enough).

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> On 13.06.2022 14:32, Roger Pau Monné wrote:
> > On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>>>> likely important to have all the output if the boot fails without
> >>>>>>> having to resort to sync_console (which also affects the output from
> >>>>>>> other guests).
> >>>>>>>
> >>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>
> >>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>> Dom0's kernel messages?
> >>>>>
> >>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>> this request is something that come up internally.
> >>>>>
> >>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>>>> triggered) output shouldn't be rate limited either.
> >>>>
> >>>> Which would raise the question of why we have log levels for non-guest
> >>>> messages.
> >>>
> >>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>> levels and rate limiting.  If I set log level to WARNING I would
> >>> expect to not loose _any_ non-guest log messages with level WARNING or
> >>> above.  It's still useful to have log levels for non-guest messages,
> >>> since user might want to filter out DEBUG non-guest messages for
> >>> example.
> >>
> >> It was me who was confused, because of the two log-everything variants
> >> we have (console and serial). You're right that your change is unrelated
> >> to log levels. However, when there are e.g. many warnings or when an
> >> admin has lowered the log level, what you (would) do is effectively
> >> force sync_console mode transiently (for a subset of messages, but
> >> that's secondary, especially because the "forced" output would still
> >> be waiting for earlier output to make it out).
> > 
> > Right, it would have to wait for any previous output on the buffer to
> > go out first.  In any case we can guarantee that no more output will
> > be added to the buffer while Xen waits for it to be flushed.
> > 
> > So for the hardware domain it might make sense to wait for the TX
> > buffers to be half empty (the current tx_quench logic) by preempting
> > the hypercall.  That however could cause issues if guests manage to
> > keep filling the buffer while the hardware domain is being preempted.
> > 
> > Alternatively we could always reserve half of the buffer for the
> > hardware domain, and allow it to be preempted while waiting for space
> > (since it's guaranteed non hardware domains won't be able to steal the
> > allocation from the hardware domain).
> 
> Getting complicated it seems. I have to admit that I wonder whether we
> wouldn't be better off leaving the current logic as is.

Another possible solution (more like a band aid) is to increase the
buffer size from 4 pages to 8 or 16.  That would likely allow to cope
fine with the high throughput of boot messages.

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 14.06.2022 08:52, Roger Pau Monné wrote:
> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
>> On 13.06.2022 14:32, Roger Pau Monné wrote:
>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
>>>>>>>>> likely important to have all the output if the boot fails without
>>>>>>>>> having to resort to sync_console (which also affects the output from
>>>>>>>>> other guests).
>>>>>>>>>
>>>>>>>>> Do so by pairing the console_serial_puts() with
>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>>>>>>>
>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
>>>>>>>> not convinced we'd want to let through everything, but perhaps just
>>>>>>>> Dom0's kernel messages?
>>>>>>>
>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
>>>>>>> this request is something that come up internally.
>>>>>>>
>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
>>>>>>> limiting based on log levels I was assuming that non-ratelimited
>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
>>>>>>> triggered) output shouldn't be rate limited either.
>>>>>>
>>>>>> Which would raise the question of why we have log levels for non-guest
>>>>>> messages.
>>>>>
>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
>>>>> levels and rate limiting.  If I set log level to WARNING I would
>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
>>>>> above.  It's still useful to have log levels for non-guest messages,
>>>>> since user might want to filter out DEBUG non-guest messages for
>>>>> example.
>>>>
>>>> It was me who was confused, because of the two log-everything variants
>>>> we have (console and serial). You're right that your change is unrelated
>>>> to log levels. However, when there are e.g. many warnings or when an
>>>> admin has lowered the log level, what you (would) do is effectively
>>>> force sync_console mode transiently (for a subset of messages, but
>>>> that's secondary, especially because the "forced" output would still
>>>> be waiting for earlier output to make it out).
>>>
>>> Right, it would have to wait for any previous output on the buffer to
>>> go out first.  In any case we can guarantee that no more output will
>>> be added to the buffer while Xen waits for it to be flushed.
>>>
>>> So for the hardware domain it might make sense to wait for the TX
>>> buffers to be half empty (the current tx_quench logic) by preempting
>>> the hypercall.  That however could cause issues if guests manage to
>>> keep filling the buffer while the hardware domain is being preempted.
>>>
>>> Alternatively we could always reserve half of the buffer for the
>>> hardware domain, and allow it to be preempted while waiting for space
>>> (since it's guaranteed non hardware domains won't be able to steal the
>>> allocation from the hardware domain).
>>
>> Getting complicated it seems. I have to admit that I wonder whether we
>> wouldn't be better off leaving the current logic as is.
> 
> Another possible solution (more like a band aid) is to increase the
> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> fine with the high throughput of boot messages.

You mean the buffer whose size is controlled by serial_tx_buffer? On
large systems one may want to simply make use of the command line
option then; I don't think the built-in default needs changing. Or
if so, then perhaps not statically at build time, but taking into
account system properties (like CPU count).

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> On 14.06.2022 08:52, Roger Pau Monné wrote:
> > On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>>>>>> likely important to have all the output if the boot fails without
> >>>>>>>>> having to resort to sync_console (which also affects the output from
> >>>>>>>>> other guests).
> >>>>>>>>>
> >>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>>>
> >>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>>>> Dom0's kernel messages?
> >>>>>>>
> >>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>>>> this request is something that come up internally.
> >>>>>>>
> >>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>
> >>>>>> Which would raise the question of why we have log levels for non-guest
> >>>>>> messages.
> >>>>>
> >>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>>>> levels and rate limiting.  If I set log level to WARNING I would
> >>>>> expect to not loose _any_ non-guest log messages with level WARNING or
> >>>>> above.  It's still useful to have log levels for non-guest messages,
> >>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>> example.
> >>>>
> >>>> It was me who was confused, because of the two log-everything variants
> >>>> we have (console and serial). You're right that your change is unrelated
> >>>> to log levels. However, when there are e.g. many warnings or when an
> >>>> admin has lowered the log level, what you (would) do is effectively
> >>>> force sync_console mode transiently (for a subset of messages, but
> >>>> that's secondary, especially because the "forced" output would still
> >>>> be waiting for earlier output to make it out).
> >>>
> >>> Right, it would have to wait for any previous output on the buffer to
> >>> go out first.  In any case we can guarantee that no more output will
> >>> be added to the buffer while Xen waits for it to be flushed.
> >>>
> >>> So for the hardware domain it might make sense to wait for the TX
> >>> buffers to be half empty (the current tx_quench logic) by preempting
> >>> the hypercall.  That however could cause issues if guests manage to
> >>> keep filling the buffer while the hardware domain is being preempted.
> >>>
> >>> Alternatively we could always reserve half of the buffer for the
> >>> hardware domain, and allow it to be preempted while waiting for space
> >>> (since it's guaranteed non hardware domains won't be able to steal the
> >>> allocation from the hardware domain).
> >>
> >> Getting complicated it seems. I have to admit that I wonder whether we
> >> wouldn't be better off leaving the current logic as is.
> > 
> > Another possible solution (more like a band aid) is to increase the
> > buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> > fine with the high throughput of boot messages.
> 
> You mean the buffer whose size is controlled by serial_tx_buffer?

Yes.

> On
> large systems one may want to simply make use of the command line
> option then; I don't think the built-in default needs changing. Or
> if so, then perhaps not statically at build time, but taking into
> account system properties (like CPU count).

So how about we use:

min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))

Maybe we should also take CPU frequency into account, but that seems
too complex for the purpose.

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 14.06.2022 10:32, Roger Pau Monné wrote:
> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
>> On 14.06.2022 08:52, Roger Pau Monné wrote:
>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
>>>>>>>>>>> likely important to have all the output if the boot fails without
>>>>>>>>>>> having to resort to sync_console (which also affects the output from
>>>>>>>>>>> other guests).
>>>>>>>>>>>
>>>>>>>>>>> Do so by pairing the console_serial_puts() with
>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>>>>>>>>>
>>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
>>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
>>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
>>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
>>>>>>>>>> Dom0's kernel messages?
>>>>>>>>>
>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
>>>>>>>>> this request is something that come up internally.
>>>>>>>>>
>>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
>>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
>>>>>>>>> triggered) output shouldn't be rate limited either.
>>>>>>>>
>>>>>>>> Which would raise the question of why we have log levels for non-guest
>>>>>>>> messages.
>>>>>>>
>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
>>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
>>>>>>> above.  It's still useful to have log levels for non-guest messages,
>>>>>>> since user might want to filter out DEBUG non-guest messages for
>>>>>>> example.
>>>>>>
>>>>>> It was me who was confused, because of the two log-everything variants
>>>>>> we have (console and serial). You're right that your change is unrelated
>>>>>> to log levels. However, when there are e.g. many warnings or when an
>>>>>> admin has lowered the log level, what you (would) do is effectively
>>>>>> force sync_console mode transiently (for a subset of messages, but
>>>>>> that's secondary, especially because the "forced" output would still
>>>>>> be waiting for earlier output to make it out).
>>>>>
>>>>> Right, it would have to wait for any previous output on the buffer to
>>>>> go out first.  In any case we can guarantee that no more output will
>>>>> be added to the buffer while Xen waits for it to be flushed.
>>>>>
>>>>> So for the hardware domain it might make sense to wait for the TX
>>>>> buffers to be half empty (the current tx_quench logic) by preempting
>>>>> the hypercall.  That however could cause issues if guests manage to
>>>>> keep filling the buffer while the hardware domain is being preempted.
>>>>>
>>>>> Alternatively we could always reserve half of the buffer for the
>>>>> hardware domain, and allow it to be preempted while waiting for space
>>>>> (since it's guaranteed non hardware domains won't be able to steal the
>>>>> allocation from the hardware domain).
>>>>
>>>> Getting complicated it seems. I have to admit that I wonder whether we
>>>> wouldn't be better off leaving the current logic as is.
>>>
>>> Another possible solution (more like a band aid) is to increase the
>>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
>>> fine with the high throughput of boot messages.
>>
>> You mean the buffer whose size is controlled by serial_tx_buffer?
> 
> Yes.
> 
>> On
>> large systems one may want to simply make use of the command line
>> option then; I don't think the built-in default needs changing. Or
>> if so, then perhaps not statically at build time, but taking into
>> account system properties (like CPU count).
> 
> So how about we use:
> 
> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))

That would _reduce_ size on small systems, wouldn't it? Originally
you were after increasing the default size. But if you had meant
max(), then I'd fear on very large systems this may grow a little
too large.

> Maybe we should also take CPU frequency into account, but that seems
> too complex for the purpose.

Why would frequency matter? Other aspects I could see mattering is
node count and maybe memory size.

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
> On 14.06.2022 10:32, Roger Pau Monné wrote:
> > On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> >> On 14.06.2022 08:52, Roger Pau Monné wrote:
> >>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>>>>>>>> likely important to have all the output if the boot fails without
> >>>>>>>>>>> having to resort to sync_console (which also affects the output from
> >>>>>>>>>>> other guests).
> >>>>>>>>>>>
> >>>>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>>>>>
> >>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>>>>>> Dom0's kernel messages?
> >>>>>>>>>
> >>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>>>>>> this request is something that come up internally.
> >>>>>>>>>
> >>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>>>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>>>
> >>>>>>>> Which would raise the question of why we have log levels for non-guest
> >>>>>>>> messages.
> >>>>>>>
> >>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>>>>>> levels and rate limiting.  If I set log level to WARNING I would
> >>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
> >>>>>>> above.  It's still useful to have log levels for non-guest messages,
> >>>>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>>>> example.
> >>>>>>
> >>>>>> It was me who was confused, because of the two log-everything variants
> >>>>>> we have (console and serial). You're right that your change is unrelated
> >>>>>> to log levels. However, when there are e.g. many warnings or when an
> >>>>>> admin has lowered the log level, what you (would) do is effectively
> >>>>>> force sync_console mode transiently (for a subset of messages, but
> >>>>>> that's secondary, especially because the "forced" output would still
> >>>>>> be waiting for earlier output to make it out).
> >>>>>
> >>>>> Right, it would have to wait for any previous output on the buffer to
> >>>>> go out first.  In any case we can guarantee that no more output will
> >>>>> be added to the buffer while Xen waits for it to be flushed.
> >>>>>
> >>>>> So for the hardware domain it might make sense to wait for the TX
> >>>>> buffers to be half empty (the current tx_quench logic) by preempting
> >>>>> the hypercall.  That however could cause issues if guests manage to
> >>>>> keep filling the buffer while the hardware domain is being preempted.
> >>>>>
> >>>>> Alternatively we could always reserve half of the buffer for the
> >>>>> hardware domain, and allow it to be preempted while waiting for space
> >>>>> (since it's guaranteed non hardware domains won't be able to steal the
> >>>>> allocation from the hardware domain).
> >>>>
> >>>> Getting complicated it seems. I have to admit that I wonder whether we
> >>>> wouldn't be better off leaving the current logic as is.
> >>>
> >>> Another possible solution (more like a band aid) is to increase the
> >>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> >>> fine with the high throughput of boot messages.
> >>
> >> You mean the buffer whose size is controlled by serial_tx_buffer?
> > 
> > Yes.
> > 
> >> On
> >> large systems one may want to simply make use of the command line
> >> option then; I don't think the built-in default needs changing. Or
> >> if so, then perhaps not statically at build time, but taking into
> >> account system properties (like CPU count).
> > 
> > So how about we use:
> > 
> > min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
> 
> That would _reduce_ size on small systems, wouldn't it? Originally
> you were after increasing the default size. But if you had meant
> max(), then I'd fear on very large systems this may grow a little
> too large.

See previous followup about my mistake of using min() instead of
max().

On a system with 512 CPUs that would be 512KB, I don't think that's a
lot of memory, specially taking into account that a system with 512
CPUs should have a matching amount of memory I would expect.

It's true however that I very much doubt we would fill a 512K buffer,
so limiting to 64K might be a sensible starting point?

> > Maybe we should also take CPU frequency into account, but that seems
> > too complex for the purpose.
> 
> Why would frequency matter? Other aspects I could see mattering is
> node count and maybe memory size.

Higher frequency likely means faster boot, and faster buffer fill,
because the baudrate of the console is constant.

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 14.06.2022 11:38, Roger Pau Monné wrote:
> On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
>> On 14.06.2022 10:32, Roger Pau Monné wrote:
>>> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
>>>> On 14.06.2022 08:52, Roger Pau Monné wrote:
>>>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
>>>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
>>>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
>>>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
>>>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
>>>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
>>>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>>>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>>>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
>>>>>>>>>>>>> likely important to have all the output if the boot fails without
>>>>>>>>>>>>> having to resort to sync_console (which also affects the output from
>>>>>>>>>>>>> other guests).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do so by pairing the console_serial_puts() with
>>>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>>>>>>>>>>>
>>>>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
>>>>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
>>>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
>>>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
>>>>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
>>>>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
>>>>>>>>>>>> Dom0's kernel messages?
>>>>>>>>>>>
>>>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
>>>>>>>>>>> this request is something that come up internally.
>>>>>>>>>>>
>>>>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
>>>>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
>>>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
>>>>>>>>>>> triggered) output shouldn't be rate limited either.
>>>>>>>>>>
>>>>>>>>>> Which would raise the question of why we have log levels for non-guest
>>>>>>>>>> messages.
>>>>>>>>>
>>>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
>>>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
>>>>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
>>>>>>>>> above.  It's still useful to have log levels for non-guest messages,
>>>>>>>>> since user might want to filter out DEBUG non-guest messages for
>>>>>>>>> example.
>>>>>>>>
>>>>>>>> It was me who was confused, because of the two log-everything variants
>>>>>>>> we have (console and serial). You're right that your change is unrelated
>>>>>>>> to log levels. However, when there are e.g. many warnings or when an
>>>>>>>> admin has lowered the log level, what you (would) do is effectively
>>>>>>>> force sync_console mode transiently (for a subset of messages, but
>>>>>>>> that's secondary, especially because the "forced" output would still
>>>>>>>> be waiting for earlier output to make it out).
>>>>>>>
>>>>>>> Right, it would have to wait for any previous output on the buffer to
>>>>>>> go out first.  In any case we can guarantee that no more output will
>>>>>>> be added to the buffer while Xen waits for it to be flushed.
>>>>>>>
>>>>>>> So for the hardware domain it might make sense to wait for the TX
>>>>>>> buffers to be half empty (the current tx_quench logic) by preempting
>>>>>>> the hypercall.  That however could cause issues if guests manage to
>>>>>>> keep filling the buffer while the hardware domain is being preempted.
>>>>>>>
>>>>>>> Alternatively we could always reserve half of the buffer for the
>>>>>>> hardware domain, and allow it to be preempted while waiting for space
>>>>>>> (since it's guaranteed non hardware domains won't be able to steal the
>>>>>>> allocation from the hardware domain).
>>>>>>
>>>>>> Getting complicated it seems. I have to admit that I wonder whether we
>>>>>> wouldn't be better off leaving the current logic as is.
>>>>>
>>>>> Another possible solution (more like a band aid) is to increase the
>>>>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
>>>>> fine with the high throughput of boot messages.
>>>>
>>>> You mean the buffer whose size is controlled by serial_tx_buffer?
>>>
>>> Yes.
>>>
>>>> On
>>>> large systems one may want to simply make use of the command line
>>>> option then; I don't think the built-in default needs changing. Or
>>>> if so, then perhaps not statically at build time, but taking into
>>>> account system properties (like CPU count).
>>>
>>> So how about we use:
>>>
>>> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
>>
>> That would _reduce_ size on small systems, wouldn't it? Originally
>> you were after increasing the default size. But if you had meant
>> max(), then I'd fear on very large systems this may grow a little
>> too large.
> 
> See previous followup about my mistake of using min() instead of
> max().
> 
> On a system with 512 CPUs that would be 512KB, I don't think that's a
> lot of memory, specially taking into account that a system with 512
> CPUs should have a matching amount of memory I would expect.
> 
> It's true however that I very much doubt we would fill a 512K buffer,
> so limiting to 64K might be a sensible starting point?

Yeah, 64k could be a value to compromise on. What total size of
output have you observed to trigger the making of this patch? Xen
alone doesn't even manage to fill 16k on most of my systems ...

>>> Maybe we should also take CPU frequency into account, but that seems
>>> too complex for the purpose.
>>
>> Why would frequency matter? Other aspects I could see mattering is
>> node count and maybe memory size.
> 
> Higher frequency likely means faster boot, and faster buffer fill,
> because the baudrate of the console is constant.

Hmm, yes. But remember there are may serializing actions. Bringing
up of APs for example is relatively slow _and_ not producing a lot
of output by default. As to the baudrate - modern chipsets allow
for higher than the 115200 value, so we may want to modernize our
drivers to know of such vendor specific aspects.

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Tue, Jun 14, 2022 at 11:45:54AM +0200, Jan Beulich wrote:
> On 14.06.2022 11:38, Roger Pau Monné wrote:
> > On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
> >> On 14.06.2022 10:32, Roger Pau Monné wrote:
> >>> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> >>>> On 14.06.2022 08:52, Roger Pau Monné wrote:
> >>>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >>>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>>>>>>>>>> likely important to have all the output if the boot fails without
> >>>>>>>>>>>>> having to resort to sync_console (which also affects the output from
> >>>>>>>>>>>>> other guests).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>>>>>>>
> >>>>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>>>>>>>> Dom0's kernel messages?
> >>>>>>>>>>>
> >>>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>>>>>>>> this request is something that come up internally.
> >>>>>>>>>>>
> >>>>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>>>>>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>>>>>
> >>>>>>>>>> Which would raise the question of why we have log levels for non-guest
> >>>>>>>>>> messages.
> >>>>>>>>>
> >>>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
> >>>>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
> >>>>>>>>> above.  It's still useful to have log levels for non-guest messages,
> >>>>>>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>>>>>> example.
> >>>>>>>>
> >>>>>>>> It was me who was confused, because of the two log-everything variants
> >>>>>>>> we have (console and serial). You're right that your change is unrelated
> >>>>>>>> to log levels. However, when there are e.g. many warnings or when an
> >>>>>>>> admin has lowered the log level, what you (would) do is effectively
> >>>>>>>> force sync_console mode transiently (for a subset of messages, but
> >>>>>>>> that's secondary, especially because the "forced" output would still
> >>>>>>>> be waiting for earlier output to make it out).
> >>>>>>>
> >>>>>>> Right, it would have to wait for any previous output on the buffer to
> >>>>>>> go out first.  In any case we can guarantee that no more output will
> >>>>>>> be added to the buffer while Xen waits for it to be flushed.
> >>>>>>>
> >>>>>>> So for the hardware domain it might make sense to wait for the TX
> >>>>>>> buffers to be half empty (the current tx_quench logic) by preempting
> >>>>>>> the hypercall.  That however could cause issues if guests manage to
> >>>>>>> keep filling the buffer while the hardware domain is being preempted.
> >>>>>>>
> >>>>>>> Alternatively we could always reserve half of the buffer for the
> >>>>>>> hardware domain, and allow it to be preempted while waiting for space
> >>>>>>> (since it's guaranteed non hardware domains won't be able to steal the
> >>>>>>> allocation from the hardware domain).
> >>>>>>
> >>>>>> Getting complicated it seems. I have to admit that I wonder whether we
> >>>>>> wouldn't be better off leaving the current logic as is.
> >>>>>
> >>>>> Another possible solution (more like a band aid) is to increase the
> >>>>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> >>>>> fine with the high throughput of boot messages.
> >>>>
> >>>> You mean the buffer whose size is controlled by serial_tx_buffer?
> >>>
> >>> Yes.
> >>>
> >>>> On
> >>>> large systems one may want to simply make use of the command line
> >>>> option then; I don't think the built-in default needs changing. Or
> >>>> if so, then perhaps not statically at build time, but taking into
> >>>> account system properties (like CPU count).
> >>>
> >>> So how about we use:
> >>>
> >>> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
> >>
> >> That would _reduce_ size on small systems, wouldn't it? Originally
> >> you were after increasing the default size. But if you had meant
> >> max(), then I'd fear on very large systems this may grow a little
> >> too large.
> > 
> > See previous followup about my mistake of using min() instead of
> > max().
> > 
> > On a system with 512 CPUs that would be 512KB, I don't think that's a
> > lot of memory, specially taking into account that a system with 512
> > CPUs should have a matching amount of memory I would expect.
> > 
> > It's true however that I very much doubt we would fill a 512K buffer,
> > so limiting to 64K might be a sensible starting point?
> 
> Yeah, 64k could be a value to compromise on. What total size of
> output have you observed to trigger the making of this patch? Xen
> alone doesn't even manage to fill 16k on most of my systems ...

I've tried on one of the affected systems now, it's a 8 CPU Kaby Lake
at 3,5GHz, and manages to fill the buffer while booting Linux.

My proposed formula won't fix this use case, so what about just
bumping the buffer to 32K by default, which does fix it?

Or alternatively use the proposed formula, but adjust the buffer to be
between [32K,64K].

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 16.06.2022 13:31, Roger Pau Monné wrote:
> On Tue, Jun 14, 2022 at 11:45:54AM +0200, Jan Beulich wrote:
>> On 14.06.2022 11:38, Roger Pau Monné wrote:
>>> On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
>>>> On 14.06.2022 10:32, Roger Pau Monné wrote:
>>>>> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
>>>>>> On 14.06.2022 08:52, Roger Pau Monné wrote:
>>>>>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
>>>>>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
>>>>>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
>>>>>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
>>>>>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
>>>>>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
>>>>>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>>>>>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>>>>>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
>>>>>>>>>>>>>>> likely important to have all the output if the boot fails without
>>>>>>>>>>>>>>> having to resort to sync_console (which also affects the output from
>>>>>>>>>>>>>>> other guests).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do so by pairing the console_serial_puts() with
>>>>>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
>>>>>>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
>>>>>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
>>>>>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
>>>>>>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
>>>>>>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
>>>>>>>>>>>>>> Dom0's kernel messages?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
>>>>>>>>>>>>> this request is something that come up internally.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
>>>>>>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
>>>>>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
>>>>>>>>>>>>> triggered) output shouldn't be rate limited either.
>>>>>>>>>>>>
>>>>>>>>>>>> Which would raise the question of why we have log levels for non-guest
>>>>>>>>>>>> messages.
>>>>>>>>>>>
>>>>>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
>>>>>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
>>>>>>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
>>>>>>>>>>> above.  It's still useful to have log levels for non-guest messages,
>>>>>>>>>>> since user might want to filter out DEBUG non-guest messages for
>>>>>>>>>>> example.
>>>>>>>>>>
>>>>>>>>>> It was me who was confused, because of the two log-everything variants
>>>>>>>>>> we have (console and serial). You're right that your change is unrelated
>>>>>>>>>> to log levels. However, when there are e.g. many warnings or when an
>>>>>>>>>> admin has lowered the log level, what you (would) do is effectively
>>>>>>>>>> force sync_console mode transiently (for a subset of messages, but
>>>>>>>>>> that's secondary, especially because the "forced" output would still
>>>>>>>>>> be waiting for earlier output to make it out).
>>>>>>>>>
>>>>>>>>> Right, it would have to wait for any previous output on the buffer to
>>>>>>>>> go out first.  In any case we can guarantee that no more output will
>>>>>>>>> be added to the buffer while Xen waits for it to be flushed.
>>>>>>>>>
>>>>>>>>> So for the hardware domain it might make sense to wait for the TX
>>>>>>>>> buffers to be half empty (the current tx_quench logic) by preempting
>>>>>>>>> the hypercall.  That however could cause issues if guests manage to
>>>>>>>>> keep filling the buffer while the hardware domain is being preempted.
>>>>>>>>>
>>>>>>>>> Alternatively we could always reserve half of the buffer for the
>>>>>>>>> hardware domain, and allow it to be preempted while waiting for space
>>>>>>>>> (since it's guaranteed non hardware domains won't be able to steal the
>>>>>>>>> allocation from the hardware domain).
>>>>>>>>
>>>>>>>> Getting complicated it seems. I have to admit that I wonder whether we
>>>>>>>> wouldn't be better off leaving the current logic as is.
>>>>>>>
>>>>>>> Another possible solution (more like a band aid) is to increase the
>>>>>>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
>>>>>>> fine with the high throughput of boot messages.
>>>>>>
>>>>>> You mean the buffer whose size is controlled by serial_tx_buffer?
>>>>>
>>>>> Yes.
>>>>>
>>>>>> On
>>>>>> large systems one may want to simply make use of the command line
>>>>>> option then; I don't think the built-in default needs changing. Or
>>>>>> if so, then perhaps not statically at build time, but taking into
>>>>>> account system properties (like CPU count).
>>>>>
>>>>> So how about we use:
>>>>>
>>>>> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
>>>>
>>>> That would _reduce_ size on small systems, wouldn't it? Originally
>>>> you were after increasing the default size. But if you had meant
>>>> max(), then I'd fear on very large systems this may grow a little
>>>> too large.
>>>
>>> See previous followup about my mistake of using min() instead of
>>> max().
>>>
>>> On a system with 512 CPUs that would be 512KB, I don't think that's a
>>> lot of memory, specially taking into account that a system with 512
>>> CPUs should have a matching amount of memory I would expect.
>>>
>>> It's true however that I very much doubt we would fill a 512K buffer,
>>> so limiting to 64K might be a sensible starting point?
>>
>> Yeah, 64k could be a value to compromise on. What total size of
>> output have you observed to trigger the making of this patch? Xen
>> alone doesn't even manage to fill 16k on most of my systems ...
> 
> I've tried on one of the affected systems now, it's a 8 CPU Kaby Lake
> at 3,5GHz, and manages to fill the buffer while booting Linux.
> 
> My proposed formula won't fix this use case, so what about just
> bumping the buffer to 32K by default, which does fix it?

As said, suitably explained I could also agree with going to 64k. The
question though is in how far 32k, 64k, or ...

> Or alternatively use the proposed formula, but adjust the buffer to be
> between [32K,64K].

... this formula would cover a wide range of contemporary systems.
Without such I can't really see what good a bump would do, as then
many people may still find themselves in need of using the command
line option to put in place a larger buffer.

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Wed, Jun 22, 2022 at 10:04:19AM +0200, Jan Beulich wrote:
> On 16.06.2022 13:31, Roger Pau Monné wrote:
> > On Tue, Jun 14, 2022 at 11:45:54AM +0200, Jan Beulich wrote:
> >> On 14.06.2022 11:38, Roger Pau Monné wrote:
> >>> On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
> >>>> On 14.06.2022 10:32, Roger Pau Monné wrote:
> >>>>> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> >>>>>> On 14.06.2022 08:52, Roger Pau Monné wrote:
> >>>>>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >>>>>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>>>>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>>>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>>>>>>>>>>>> likely important to have all the output if the boot fails without
> >>>>>>>>>>>>>>> having to resort to sync_console (which also affects the output from
> >>>>>>>>>>>>>>> other guests).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>>>>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>>>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>>>>>>>>>> Dom0's kernel messages?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>>>>>>>>>> this request is something that come up internally.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>>>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>>>>>>>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Which would raise the question of why we have log levels for non-guest
> >>>>>>>>>>>> messages.
> >>>>>>>>>>>
> >>>>>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>>>>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
> >>>>>>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
> >>>>>>>>>>> above.  It's still useful to have log levels for non-guest messages,
> >>>>>>>>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>>>>>>>> example.
> >>>>>>>>>>
> >>>>>>>>>> It was me who was confused, because of the two log-everything variants
> >>>>>>>>>> we have (console and serial). You're right that your change is unrelated
> >>>>>>>>>> to log levels. However, when there are e.g. many warnings or when an
> >>>>>>>>>> admin has lowered the log level, what you (would) do is effectively
> >>>>>>>>>> force sync_console mode transiently (for a subset of messages, but
> >>>>>>>>>> that's secondary, especially because the "forced" output would still
> >>>>>>>>>> be waiting for earlier output to make it out).
> >>>>>>>>>
> >>>>>>>>> Right, it would have to wait for any previous output on the buffer to
> >>>>>>>>> go out first.  In any case we can guarantee that no more output will
> >>>>>>>>> be added to the buffer while Xen waits for it to be flushed.
> >>>>>>>>>
> >>>>>>>>> So for the hardware domain it might make sense to wait for the TX
> >>>>>>>>> buffers to be half empty (the current tx_quench logic) by preempting
> >>>>>>>>> the hypercall.  That however could cause issues if guests manage to
> >>>>>>>>> keep filling the buffer while the hardware domain is being preempted.
> >>>>>>>>>
> >>>>>>>>> Alternatively we could always reserve half of the buffer for the
> >>>>>>>>> hardware domain, and allow it to be preempted while waiting for space
> >>>>>>>>> (since it's guaranteed non hardware domains won't be able to steal the
> >>>>>>>>> allocation from the hardware domain).
> >>>>>>>>
> >>>>>>>> Getting complicated it seems. I have to admit that I wonder whether we
> >>>>>>>> wouldn't be better off leaving the current logic as is.
> >>>>>>>
> >>>>>>> Another possible solution (more like a band aid) is to increase the
> >>>>>>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> >>>>>>> fine with the high throughput of boot messages.
> >>>>>>
> >>>>>> You mean the buffer whose size is controlled by serial_tx_buffer?
> >>>>>
> >>>>> Yes.
> >>>>>
> >>>>>> On
> >>>>>> large systems one may want to simply make use of the command line
> >>>>>> option then; I don't think the built-in default needs changing. Or
> >>>>>> if so, then perhaps not statically at build time, but taking into
> >>>>>> account system properties (like CPU count).
> >>>>>
> >>>>> So how about we use:
> >>>>>
> >>>>> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
> >>>>
> >>>> That would _reduce_ size on small systems, wouldn't it? Originally
> >>>> you were after increasing the default size. But if you had meant
> >>>> max(), then I'd fear on very large systems this may grow a little
> >>>> too large.
> >>>
> >>> See previous followup about my mistake of using min() instead of
> >>> max().
> >>>
> >>> On a system with 512 CPUs that would be 512KB, I don't think that's a
> >>> lot of memory, specially taking into account that a system with 512
> >>> CPUs should have a matching amount of memory I would expect.
> >>>
> >>> It's true however that I very much doubt we would fill a 512K buffer,
> >>> so limiting to 64K might be a sensible starting point?
> >>
> >> Yeah, 64k could be a value to compromise on. What total size of
> >> output have you observed to trigger the making of this patch? Xen
> >> alone doesn't even manage to fill 16k on most of my systems ...
> > 
> > I've tried on one of the affected systems now, it's a 8 CPU Kaby Lake
> > at 3,5GHz, and manages to fill the buffer while booting Linux.
> > 
> > My proposed formula won't fix this use case, so what about just
> > bumping the buffer to 32K by default, which does fix it?
> 
> As said, suitably explained I could also agree with going to 64k. The
> question though is in how far 32k, 64k, or ...
> 
> > Or alternatively use the proposed formula, but adjust the buffer to be
> > between [32K,64K].
> 
> ... this formula would cover a wide range of contemporary systems.
> Without such I can't really see what good a bump would do, as then
> many people may still find themselves in need of using the command
> line option to put in place a larger buffer.

I'm afraid I don't know how to make progress with this.

The current value is clearly too low for at least one of my systems.
I don't think it's feasible for me to propose a value or formula that
I can confirm will be suitable for all systems, hence I would suggest
increasing the buffer value to 32K as that does fix the problem on
that specific system (without claiming it's a value that would suit
all setups).

I agree that many people could still find themselves in the need of
using the command line option, but I can assure that new buffer value
would fix the issue on at least one system, which should be enough as
a justification.

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Jan Beulich 2 years, 5 months ago
On 22.06.2022 11:09, Roger Pau Monné wrote:
> On Wed, Jun 22, 2022 at 10:04:19AM +0200, Jan Beulich wrote:
>> On 16.06.2022 13:31, Roger Pau Monné wrote:
>>> On Tue, Jun 14, 2022 at 11:45:54AM +0200, Jan Beulich wrote:
>>>> On 14.06.2022 11:38, Roger Pau Monné wrote:
>>>>> On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
>>>>>> On 14.06.2022 10:32, Roger Pau Monné wrote:
>>>>>>> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
>>>>>>>> On 14.06.2022 08:52, Roger Pau Monné wrote:
>>>>>>>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
>>>>>>>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
>>>>>>>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
>>>>>>>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
>>>>>>>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
>>>>>>>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
>>>>>>>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
>>>>>>>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
>>>>>>>>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
>>>>>>>>>>>>>>>>> likely important to have all the output if the boot fails without
>>>>>>>>>>>>>>>>> having to resort to sync_console (which also affects the output from
>>>>>>>>>>>>>>>>> other guests).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Do so by pairing the console_serial_puts() with
>>>>>>>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
>>>>>>>>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
>>>>>>>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
>>>>>>>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
>>>>>>>>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
>>>>>>>>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
>>>>>>>>>>>>>>>> Dom0's kernel messages?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
>>>>>>>>>>>>>>> this request is something that come up internally.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
>>>>>>>>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
>>>>>>>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
>>>>>>>>>>>>>>> triggered) output shouldn't be rate limited either.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Which would raise the question of why we have log levels for non-guest
>>>>>>>>>>>>>> messages.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
>>>>>>>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
>>>>>>>>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
>>>>>>>>>>>>> above.  It's still useful to have log levels for non-guest messages,
>>>>>>>>>>>>> since user might want to filter out DEBUG non-guest messages for
>>>>>>>>>>>>> example.
>>>>>>>>>>>>
>>>>>>>>>>>> It was me who was confused, because of the two log-everything variants
>>>>>>>>>>>> we have (console and serial). You're right that your change is unrelated
>>>>>>>>>>>> to log levels. However, when there are e.g. many warnings or when an
>>>>>>>>>>>> admin has lowered the log level, what you (would) do is effectively
>>>>>>>>>>>> force sync_console mode transiently (for a subset of messages, but
>>>>>>>>>>>> that's secondary, especially because the "forced" output would still
>>>>>>>>>>>> be waiting for earlier output to make it out).
>>>>>>>>>>>
>>>>>>>>>>> Right, it would have to wait for any previous output on the buffer to
>>>>>>>>>>> go out first.  In any case we can guarantee that no more output will
>>>>>>>>>>> be added to the buffer while Xen waits for it to be flushed.
>>>>>>>>>>>
>>>>>>>>>>> So for the hardware domain it might make sense to wait for the TX
>>>>>>>>>>> buffers to be half empty (the current tx_quench logic) by preempting
>>>>>>>>>>> the hypercall.  That however could cause issues if guests manage to
>>>>>>>>>>> keep filling the buffer while the hardware domain is being preempted.
>>>>>>>>>>>
>>>>>>>>>>> Alternatively we could always reserve half of the buffer for the
>>>>>>>>>>> hardware domain, and allow it to be preempted while waiting for space
>>>>>>>>>>> (since it's guaranteed non hardware domains won't be able to steal the
>>>>>>>>>>> allocation from the hardware domain).
>>>>>>>>>>
>>>>>>>>>> Getting complicated it seems. I have to admit that I wonder whether we
>>>>>>>>>> wouldn't be better off leaving the current logic as is.
>>>>>>>>>
>>>>>>>>> Another possible solution (more like a band aid) is to increase the
>>>>>>>>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
>>>>>>>>> fine with the high throughput of boot messages.
>>>>>>>>
>>>>>>>> You mean the buffer whose size is controlled by serial_tx_buffer?
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>>> On
>>>>>>>> large systems one may want to simply make use of the command line
>>>>>>>> option then; I don't think the built-in default needs changing. Or
>>>>>>>> if so, then perhaps not statically at build time, but taking into
>>>>>>>> account system properties (like CPU count).
>>>>>>>
>>>>>>> So how about we use:
>>>>>>>
>>>>>>> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
>>>>>>
>>>>>> That would _reduce_ size on small systems, wouldn't it? Originally
>>>>>> you were after increasing the default size. But if you had meant
>>>>>> max(), then I'd fear on very large systems this may grow a little
>>>>>> too large.
>>>>>
>>>>> See previous followup about my mistake of using min() instead of
>>>>> max().
>>>>>
>>>>> On a system with 512 CPUs that would be 512KB, I don't think that's a
>>>>> lot of memory, specially taking into account that a system with 512
>>>>> CPUs should have a matching amount of memory I would expect.
>>>>>
>>>>> It's true however that I very much doubt we would fill a 512K buffer,
>>>>> so limiting to 64K might be a sensible starting point?
>>>>
>>>> Yeah, 64k could be a value to compromise on. What total size of
>>>> output have you observed to trigger the making of this patch? Xen
>>>> alone doesn't even manage to fill 16k on most of my systems ...
>>>
>>> I've tried on one of the affected systems now, it's a 8 CPU Kaby Lake
>>> at 3,5GHz, and manages to fill the buffer while booting Linux.
>>>
>>> My proposed formula won't fix this use case, so what about just
>>> bumping the buffer to 32K by default, which does fix it?
>>
>> As said, suitably explained I could also agree with going to 64k. The
>> question though is in how far 32k, 64k, or ...
>>
>>> Or alternatively use the proposed formula, but adjust the buffer to be
>>> between [32K,64K].
>>
>> ... this formula would cover a wide range of contemporary systems.
>> Without such I can't really see what good a bump would do, as then
>> many people may still find themselves in need of using the command
>> line option to put in place a larger buffer.
> 
> I'm afraid I don't know how to make progress with this.
> 
> The current value is clearly too low for at least one of my systems.
> I don't think it's feasible for me to propose a value or formula that
> I can confirm will be suitable for all systems, hence I would suggest
> increasing the buffer value to 32K as that does fix the problem on
> that specific system (without claiming it's a value that would suit
> all setups).
> 
> I agree that many people could still find themselves in the need of
> using the command line option, but I can assure that new buffer value
> would fix the issue on at least one system, which should be enough as
> a justification.

I'm afraid I view this differently. Dealing with individual systems is
imo not a reason to change a default, when there is a command line
option to adjust the value in question. And when, at the same time,
the higher default might cause waste of resources on at least on other
system. As said before, I'm not going to object to bumping to 32k or
even 64k, provided this has wider benefit and limited downsides. But
with a justification of "this fixes one system" I'm not going to ack
(but also not nak) such a change.

Jan

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Wed, Jun 22, 2022 at 11:33:32AM +0200, Jan Beulich wrote:
> On 22.06.2022 11:09, Roger Pau Monné wrote:
> > On Wed, Jun 22, 2022 at 10:04:19AM +0200, Jan Beulich wrote:
> >> On 16.06.2022 13:31, Roger Pau Monné wrote:
> >>> On Tue, Jun 14, 2022 at 11:45:54AM +0200, Jan Beulich wrote:
> >>>> On 14.06.2022 11:38, Roger Pau Monné wrote:
> >>>>> On Tue, Jun 14, 2022 at 11:13:07AM +0200, Jan Beulich wrote:
> >>>>>> On 14.06.2022 10:32, Roger Pau Monné wrote:
> >>>>>>> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> >>>>>>>> On 14.06.2022 08:52, Roger Pau Monné wrote:
> >>>>>>>>> On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> >>>>>>>>>> On 13.06.2022 14:32, Roger Pau Monné wrote:
> >>>>>>>>>>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> >>>>>>>>>>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> >>>>>>>>>>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> >>>>>>>>>>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> >>>>>>>>>>>>>>>>> Prevent dropping console output from the hardware domain, since it's
> >>>>>>>>>>>>>>>>> likely important to have all the output if the boot fails without
> >>>>>>>>>>>>>>>>> having to resort to sync_console (which also affects the output from
> >>>>>>>>>>>>>>>>> other guests).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Do so by pairing the console_serial_puts() with
> >>>>>>>>>>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> >>>>>>>>>>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> >>>>>>>>>>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> >>>>>>>>>>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> >>>>>>>>>>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> >>>>>>>>>>>>>>>> not convinced we'd want to let through everything, but perhaps just
> >>>>>>>>>>>>>>>> Dom0's kernel messages?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> >>>>>>>>>>>>>>> this request is something that come up internally.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> >>>>>>>>>>>>>>> limiting based on log levels I was assuming that non-ratelimited
> >>>>>>>>>>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> >>>>>>>>>>>>>>> triggered) output shouldn't be rate limited either.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Which would raise the question of why we have log levels for non-guest
> >>>>>>>>>>>>>> messages.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> >>>>>>>>>>>>> levels and rate limiting.  If I set log level to WARNING I would
> >>>>>>>>>>>>> expect to not loose _any_ non-guest log messages with level WARNING or
> >>>>>>>>>>>>> above.  It's still useful to have log levels for non-guest messages,
> >>>>>>>>>>>>> since user might want to filter out DEBUG non-guest messages for
> >>>>>>>>>>>>> example.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It was me who was confused, because of the two log-everything variants
> >>>>>>>>>>>> we have (console and serial). You're right that your change is unrelated
> >>>>>>>>>>>> to log levels. However, when there are e.g. many warnings or when an
> >>>>>>>>>>>> admin has lowered the log level, what you (would) do is effectively
> >>>>>>>>>>>> force sync_console mode transiently (for a subset of messages, but
> >>>>>>>>>>>> that's secondary, especially because the "forced" output would still
> >>>>>>>>>>>> be waiting for earlier output to make it out).
> >>>>>>>>>>>
> >>>>>>>>>>> Right, it would have to wait for any previous output on the buffer to
> >>>>>>>>>>> go out first.  In any case we can guarantee that no more output will
> >>>>>>>>>>> be added to the buffer while Xen waits for it to be flushed.
> >>>>>>>>>>>
> >>>>>>>>>>> So for the hardware domain it might make sense to wait for the TX
> >>>>>>>>>>> buffers to be half empty (the current tx_quench logic) by preempting
> >>>>>>>>>>> the hypercall.  That however could cause issues if guests manage to
> >>>>>>>>>>> keep filling the buffer while the hardware domain is being preempted.
> >>>>>>>>>>>
> >>>>>>>>>>> Alternatively we could always reserve half of the buffer for the
> >>>>>>>>>>> hardware domain, and allow it to be preempted while waiting for space
> >>>>>>>>>>> (since it's guaranteed non hardware domains won't be able to steal the
> >>>>>>>>>>> allocation from the hardware domain).
> >>>>>>>>>>
> >>>>>>>>>> Getting complicated it seems. I have to admit that I wonder whether we
> >>>>>>>>>> wouldn't be better off leaving the current logic as is.
> >>>>>>>>>
> >>>>>>>>> Another possible solution (more like a band aid) is to increase the
> >>>>>>>>> buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> >>>>>>>>> fine with the high throughput of boot messages.
> >>>>>>>>
> >>>>>>>> You mean the buffer whose size is controlled by serial_tx_buffer?
> >>>>>>>
> >>>>>>> Yes.
> >>>>>>>
> >>>>>>>> On
> >>>>>>>> large systems one may want to simply make use of the command line
> >>>>>>>> option then; I don't think the built-in default needs changing. Or
> >>>>>>>> if so, then perhaps not statically at build time, but taking into
> >>>>>>>> account system properties (like CPU count).
> >>>>>>>
> >>>>>>> So how about we use:
> >>>>>>>
> >>>>>>> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))
> >>>>>>
> >>>>>> That would _reduce_ size on small systems, wouldn't it? Originally
> >>>>>> you were after increasing the default size. But if you had meant
> >>>>>> max(), then I'd fear on very large systems this may grow a little
> >>>>>> too large.
> >>>>>
> >>>>> See previous followup about my mistake of using min() instead of
> >>>>> max().
> >>>>>
> >>>>> On a system with 512 CPUs that would be 512KB, I don't think that's a
> >>>>> lot of memory, specially taking into account that a system with 512
> >>>>> CPUs should have a matching amount of memory I would expect.
> >>>>>
> >>>>> It's true however that I very much doubt we would fill a 512K buffer,
> >>>>> so limiting to 64K might be a sensible starting point?
> >>>>
> >>>> Yeah, 64k could be a value to compromise on. What total size of
> >>>> output have you observed to trigger the making of this patch? Xen
> >>>> alone doesn't even manage to fill 16k on most of my systems ...
> >>>
> >>> I've tried on one of the affected systems now, it's a 8 CPU Kaby Lake
> >>> at 3,5GHz, and manages to fill the buffer while booting Linux.
> >>>
> >>> My proposed formula won't fix this use case, so what about just
> >>> bumping the buffer to 32K by default, which does fix it?
> >>
> >> As said, suitably explained I could also agree with going to 64k. The
> >> question though is in how far 32k, 64k, or ...
> >>
> >>> Or alternatively use the proposed formula, but adjust the buffer to be
> >>> between [32K,64K].
> >>
> >> ... this formula would cover a wide range of contemporary systems.
> >> Without such I can't really see what good a bump would do, as then
> >> many people may still find themselves in need of using the command
> >> line option to put in place a larger buffer.
> > 
> > I'm afraid I don't know how to make progress with this.
> > 
> > The current value is clearly too low for at least one of my systems.
> > I don't think it's feasible for me to propose a value or formula that
> > I can confirm will be suitable for all systems, hence I would suggest
> > increasing the buffer value to 32K as that does fix the problem on
> > that specific system (without claiming it's a value that would suit
> > all setups).
> > 
> > I agree that many people could still find themselves in the need of
> > using the command line option, but I can assure that new buffer value
> > would fix the issue on at least one system, which should be enough as
> > a justification.
> 
> I'm afraid I view this differently. Dealing with individual systems is
> imo not a reason to change a default, when there is a command line
> option to adjust the value in question. And when, at the same time,
> the higher default might cause waste of resources on at least on other
> system. As said before, I'm not going to object to bumping to 32k or
> even 64k, provided this has wider benefit and limited downsides. But
> with a justification of "this fixes one system" I'm not going to ack
> (but also not nak) such a change.

Sorry, I certainly have a different view on this, as I think we should
aim to provide sane defaults that allow for proper functioning of Xen,
unless it turns out those defaults could cause issues on other
systems.  In this case I don't see how bumping the default console
ring from 16K to 32K is going to cause issues elsewhere.

Will prepare a patch to do that and send to the list, in case anyone
else would like to Ack it.

Thanks, Roger.

Re: [PATCH] xen/console: do not drop serial output from the hardware domain
Posted by Roger Pau Monné 2 years, 5 months ago
On Tue, Jun 14, 2022 at 10:32:53AM +0200, Roger Pau Monné wrote:
> On Tue, Jun 14, 2022 at 10:10:03AM +0200, Jan Beulich wrote:
> > On 14.06.2022 08:52, Roger Pau Monné wrote:
> > > On Mon, Jun 13, 2022 at 03:56:54PM +0200, Jan Beulich wrote:
> > >> On 13.06.2022 14:32, Roger Pau Monné wrote:
> > >>> On Mon, Jun 13, 2022 at 11:18:49AM +0200, Jan Beulich wrote:
> > >>>> On 13.06.2022 11:04, Roger Pau Monné wrote:
> > >>>>> On Mon, Jun 13, 2022 at 10:29:43AM +0200, Jan Beulich wrote:
> > >>>>>> On 13.06.2022 10:21, Roger Pau Monné wrote:
> > >>>>>>> On Mon, Jun 13, 2022 at 09:30:06AM +0200, Jan Beulich wrote:
> > >>>>>>>> On 10.06.2022 17:06, Roger Pau Monne wrote:
> > >>>>>>>>> Prevent dropping console output from the hardware domain, since it's
> > >>>>>>>>> likely important to have all the output if the boot fails without
> > >>>>>>>>> having to resort to sync_console (which also affects the output from
> > >>>>>>>>> other guests).
> > >>>>>>>>>
> > >>>>>>>>> Do so by pairing the console_serial_puts() with
> > >>>>>>>>> serial_{start,end}_log_everything(), so that no output is dropped.
> > >>>>>>>>
> > >>>>>>>> While I can see the goal, why would Dom0 output be (effectively) more
> > >>>>>>>> important than Xen's own one (which isn't "forced")? And with this
> > >>>>>>>> aiming at boot output only, wouldn't you want to stop the overriding
> > >>>>>>>> once boot has completed (of which, if I'm not mistaken, we don't
> > >>>>>>>> really have any signal coming from Dom0)? And even during boot I'm
> > >>>>>>>> not convinced we'd want to let through everything, but perhaps just
> > >>>>>>>> Dom0's kernel messages?
> > >>>>>>>
> > >>>>>>> I normally use sync_console on all the boxes I'm doing dev work, so
> > >>>>>>> this request is something that come up internally.
> > >>>>>>>
> > >>>>>>> Didn't realize Xen output wasn't forced, since we already have rate
> > >>>>>>> limiting based on log levels I was assuming that non-ratelimited
> > >>>>>>> messages wouldn't be dropped.  But yes, I agree that Xen (non-guest
> > >>>>>>> triggered) output shouldn't be rate limited either.
> > >>>>>>
> > >>>>>> Which would raise the question of why we have log levels for non-guest
> > >>>>>> messages.
> > >>>>>
> > >>>>> Hm, maybe I'm confused, but I don't see a direct relation between log
> > >>>>> levels and rate limiting.  If I set log level to WARNING I would
> > >>>>> expect to not loose _any_ non-guest log messages with level WARNING or
> > >>>>> above.  It's still useful to have log levels for non-guest messages,
> > >>>>> since user might want to filter out DEBUG non-guest messages for
> > >>>>> example.
> > >>>>
> > >>>> It was me who was confused, because of the two log-everything variants
> > >>>> we have (console and serial). You're right that your change is unrelated
> > >>>> to log levels. However, when there are e.g. many warnings or when an
> > >>>> admin has lowered the log level, what you (would) do is effectively
> > >>>> force sync_console mode transiently (for a subset of messages, but
> > >>>> that's secondary, especially because the "forced" output would still
> > >>>> be waiting for earlier output to make it out).
> > >>>
> > >>> Right, it would have to wait for any previous output on the buffer to
> > >>> go out first.  In any case we can guarantee that no more output will
> > >>> be added to the buffer while Xen waits for it to be flushed.
> > >>>
> > >>> So for the hardware domain it might make sense to wait for the TX
> > >>> buffers to be half empty (the current tx_quench logic) by preempting
> > >>> the hypercall.  That however could cause issues if guests manage to
> > >>> keep filling the buffer while the hardware domain is being preempted.
> > >>>
> > >>> Alternatively we could always reserve half of the buffer for the
> > >>> hardware domain, and allow it to be preempted while waiting for space
> > >>> (since it's guaranteed non hardware domains won't be able to steal the
> > >>> allocation from the hardware domain).
> > >>
> > >> Getting complicated it seems. I have to admit that I wonder whether we
> > >> wouldn't be better off leaving the current logic as is.
> > > 
> > > Another possible solution (more like a band aid) is to increase the
> > > buffer size from 4 pages to 8 or 16.  That would likely allow to cope
> > > fine with the high throughput of boot messages.
> > 
> > You mean the buffer whose size is controlled by serial_tx_buffer?
> 
> Yes.
> 
> > On
> > large systems one may want to simply make use of the command line
> > option then; I don't think the built-in default needs changing. Or
> > if so, then perhaps not statically at build time, but taking into
> > account system properties (like CPU count).
> 
> So how about we use:
> 
> min(16384, ROUNDUP(1024 * num_possible_cpus(), 4096))

Er, sorry, that should be max(...) instead.

Thanks, Roger.