Hello,
With Xen 4.18, a PV domain running under pvshim doesn't get console input.
This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is
hardwired to 0), so console_input_domain() will never select that domain
as input.
The attached patch fixes it by translating 0 to the real domain id for
pvshim, but there may be a better way to do this.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--- drivers/char/console.c.orig 2023-10-18 12:24:57.221891748 +0200
+++ drivers/char/console.c 2023-10-18 12:30:26.072844802 +0200
@@ -478,14 +478,20 @@
/* Make sure to rcu_unlock_domain after use */
struct domain *console_input_domain(void)
{
+ domid_t domid;
if ( console_rx == 0 )
return NULL;
- return rcu_lock_domain_by_id(console_rx - 1);
+ if (console_rx == 1 && pv_shim)
+ domid = get_initial_domain_id();
+ else
+ domid = console_rx - 1;
+ return rcu_lock_domain_by_id(domid);
}
static void switch_serial_input(void)
{
unsigned int next_rx = console_rx;
+ domid_t domid;
/*
* Rotate among Xen, dom0 and boot-time created domUs while skipping
@@ -502,7 +508,11 @@
break;
}
- d = rcu_lock_domain_by_id(next_rx - 1);
+ if (next_rx == 1 && pv_shim)
+ domid = get_initial_domain_id();
+ else
+ domid = next_rx - 1;
+ d = rcu_lock_domain_by_id(domid);
if ( d )
{
rcu_unlock_domain(d);
On 18.10.2023 12:38, Manuel Bouyer wrote: > Hello, > With Xen 4.18, a PV domain running under pvshim doesn't get console input. > This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is > hardwired to 0), so console_input_domain() will never select that domain > as input. > > The attached patch fixes it by translating 0 to the real domain id for > pvshim, but there may be a better way to do this. My primary observation with the patch is that it presumably won't build for other than x86. There are also indentation and other style issues, no S-o-b, and no description. But I wonder whether a different approach doesn't want taking: Wouldn't it help if max_init_domid was 1 in the shim case, with no need for any other changes? Also Cc-ing Michal as the author of the (possibly) offending patch. Jan
Hi, On 18/10/2023 15:29, Jan Beulich wrote: > > > On 18.10.2023 12:38, Manuel Bouyer wrote: >> Hello, >> With Xen 4.18, a PV domain running under pvshim doesn't get console input. >> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is >> hardwired to 0), so console_input_domain() will never select that domain >> as input. >> >> The attached patch fixes it by translating 0 to the real domain id for >> pvshim, but there may be a better way to do this. > > My primary observation with the patch is that it presumably won't build for > other than x86. There are also indentation and other style issues, no S-o-b, > and no description. But I wonder whether a different approach doesn't want > taking: Wouldn't it help if max_init_domid was 1 in the shim case, with no > need for any other changes? > > Also Cc-ing Michal as the author of the (possibly) offending patch. What if we set max_init_domid in pvshim case to the value returned by get_initial_domain_id() in create_dom0()? The drawback is that we would waste some time in a loop if the value is large. ~Michal
On 18.10.2023 16:18, Michal Orzel wrote: > On 18/10/2023 15:29, Jan Beulich wrote: >> On 18.10.2023 12:38, Manuel Bouyer wrote: >>> Hello, >>> With Xen 4.18, a PV domain running under pvshim doesn't get console input. >>> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is >>> hardwired to 0), so console_input_domain() will never select that domain >>> as input. >>> >>> The attached patch fixes it by translating 0 to the real domain id for >>> pvshim, but there may be a better way to do this. >> >> My primary observation with the patch is that it presumably won't build for >> other than x86. There are also indentation and other style issues, no S-o-b, >> and no description. But I wonder whether a different approach doesn't want >> taking: Wouldn't it help if max_init_domid was 1 in the shim case, with no >> need for any other changes? >> >> Also Cc-ing Michal as the author of the (possibly) offending patch. > What if we set max_init_domid in pvshim case to the value returned by get_initial_domain_id() > in create_dom0()? The drawback is that we would waste some time in a loop if the value is large. I'd like to avoid specifically that. Jan
On Wed, Oct 18, 2023 at 04:18:26PM +0200, Michal Orzel wrote:
> Hi,
>
> On 18/10/2023 15:29, Jan Beulich wrote:
> >
> >
> > On 18.10.2023 12:38, Manuel Bouyer wrote:
> >> Hello,
> >> With Xen 4.18, a PV domain running under pvshim doesn't get console input.
> >> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is
> >> hardwired to 0), so console_input_domain() will never select that domain
> >> as input.
> >>
> >> The attached patch fixes it by translating 0 to the real domain id for
> >> pvshim, but there may be a better way to do this.
> >
> > My primary observation with the patch is that it presumably won't build for
> > other than x86. There are also indentation and other style issues, no S-o-b,
> > and no description. But I wonder whether a different approach doesn't want
> > taking: Wouldn't it help if max_init_domid was 1 in the shim case, with no
> > need for any other changes?
> >
> > Also Cc-ing Michal as the author of the (possibly) offending patch.
> What if we set max_init_domid in pvshim case to the value returned by get_initial_domain_id()
> in create_dom0()? The drawback is that we would waste some time in a loop if the value is large.
I considered this too, but max_init_domid is a constant on x86; it looked
like a more intrusive change.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
On Wed, Oct 18, 2023 at 03:29:08PM +0200, Jan Beulich wrote:
> On 18.10.2023 12:38, Manuel Bouyer wrote:
> > Hello,
> > With Xen 4.18, a PV domain running under pvshim doesn't get console input.
> > This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is
> > hardwired to 0), so console_input_domain() will never select that domain
> > as input.
> >
> > The attached patch fixes it by translating 0 to the real domain id for
> > pvshim, but there may be a better way to do this.
>
> My primary observation with the patch is that it presumably won't build for
> other than x86.
It's possible, I don't have other platform to test
> There are also indentation and other style issues, no S-o-b,
> and no description.
I'll let whoever commit it deal with this
> But I wonder whether a different approach doesn't want
> taking: Wouldn't it help if max_init_domid was 1 in the shim case, with no
> need for any other changes?
max_init_domid=1 won't help, because we're using the real domid from the
real hypervisor here. So it can be arbitrary large (in my tests I did a
printk here, and the domid was matching the id from 'xl list')
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
On 18/10/2023 11:38 am, Manuel Bouyer wrote: > Hello, > With Xen 4.18, a PV domain running under pvshim doesn't get console input. > This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is > hardwired to 0), so console_input_domain() will never select that domain > as input. > > The attached patch fixes it by translating 0 to the real domain id for > pvshim, but there may be a better way to do this. > Thankyou for the report. First, CC'ing Henry as 4.18 release manager. There have been changes in how this works recently in 4.18, notably c/s c2581c58bec96. However, it's not obvious whether this worked in 4.17 or not. i.e. whether it's a regression in 4.18, or whether it's been broken since PV Shim was introduced. ~Andrew
On Wed, Oct 18, 2023 at 11:44:22AM +0100, Andrew Cooper wrote:
> On 18/10/2023 11:38 am, Manuel Bouyer wrote:
> > Hello,
> > With Xen 4.18, a PV domain running under pvshim doesn't get console input.
> > This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is
> > hardwired to 0), so console_input_domain() will never select that domain
> > as input.
> >
> > The attached patch fixes it by translating 0 to the real domain id for
> > pvshim, but there may be a better way to do this.
> >
>
> Thankyou for the report.
>
> First, CC'ing Henry as 4.18 release manager.
>
> There have been changes in how this works recently in 4.18, notably c/s
> c2581c58bec96.
Yes, it looks like this one introduced the problem.
Before this, we would switch console_rx to 1 without checking if
domain (console_rx - 1) exists, and console_rx == 1 is a special case
in __serial_rx()
>
> However, it's not obvious whether this worked in 4.17 or not. i.e.
> whether it's a regression in 4.18, or whether it's been broken since PV
> Shim was introduced.
I don't know for 4.16 or 4.17 but I can tell that it's working in 4.15
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
On 18.10.2023 13:20, Manuel Bouyer wrote: > On Wed, Oct 18, 2023 at 11:44:22AM +0100, Andrew Cooper wrote: >> On 18/10/2023 11:38 am, Manuel Bouyer wrote: >>> Hello, >>> With Xen 4.18, a PV domain running under pvshim doesn't get console input. >>> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is >>> hardwired to 0), so console_input_domain() will never select that domain >>> as input. >>> >>> The attached patch fixes it by translating 0 to the real domain id for >>> pvshim, but there may be a better way to do this. >>> >> >> Thankyou for the report. >> >> First, CC'ing Henry as 4.18 release manager. >> >> There have been changes in how this works recently in 4.18, notably c/s >> c2581c58bec96. > > Yes, it looks like this one introduced the problem. > Before this, we would switch console_rx to 1 without checking if > domain (console_rx - 1) exists, and console_rx == 1 is a special case > in __serial_rx() > >> >> However, it's not obvious whether this worked in 4.17 or not. i.e. >> whether it's a regression in 4.18, or whether it's been broken since PV >> Shim was introduced. > > I don't know for 4.16 or 4.17 but I can tell that it's working in 4.15 From looking at the code, it doesn't look like it would: There switch_serial_input() toggles console_rx between 0 and 1 afaict, and console_input_domain() handles value 0 as "Xen" (like in 4.18), while otherwise it calls rcu_lock_domain_by_id(console_rx - 1) (i.e. trying to get hold of Dom0's domain structure, not Dom1's). Jan
On Wed, Oct 18, 2023 at 03:24:22PM +0200, Jan Beulich wrote:
> On 18.10.2023 13:20, Manuel Bouyer wrote:
> > On Wed, Oct 18, 2023 at 11:44:22AM +0100, Andrew Cooper wrote:
> >> On 18/10/2023 11:38 am, Manuel Bouyer wrote:
> >>> Hello,
> >>> With Xen 4.18, a PV domain running under pvshim doesn't get console input.
> >>> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is
> >>> hardwired to 0), so console_input_domain() will never select that domain
> >>> as input.
> >>>
> >>> The attached patch fixes it by translating 0 to the real domain id for
> >>> pvshim, but there may be a better way to do this.
> >>>
> >>
> >> Thankyou for the report.
> >>
> >> First, CC'ing Henry as 4.18 release manager.
> >>
> >> There have been changes in how this works recently in 4.18, notably c/s
> >> c2581c58bec96.
> >
> > Yes, it looks like this one introduced the problem.
> > Before this, we would switch console_rx to 1 without checking if
> > domain (console_rx - 1) exists, and console_rx == 1 is a special case
> > in __serial_rx()
> >
> >>
> >> However, it's not obvious whether this worked in 4.17 or not. i.e.
> >> whether it's a regression in 4.18, or whether it's been broken since PV
> >> Shim was introduced.
> >
> > I don't know for 4.16 or 4.17 but I can tell that it's working in 4.15
>
> >From looking at the code, it doesn't look like it would: There
> switch_serial_input() toggles console_rx between 0 and 1 afaict, and
> console_input_domain() handles value 0 as "Xen" (like in 4.18), while
> otherwise it calls rcu_lock_domain_by_id(console_rx - 1) (i.e. trying
> to get hold of Dom0's domain structure, not Dom1's).
Well, I have a 4.15.5 installation in production and I can tell you that
with PV+PVSHIM, the console is working, for sure.
AFAIK console_input_domain() is called only on ARM, from
vpl011_write_data_xen(). It's never called for x86.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
On 18.10.2023 15:36, Manuel Bouyer wrote: > On Wed, Oct 18, 2023 at 03:24:22PM +0200, Jan Beulich wrote: >> On 18.10.2023 13:20, Manuel Bouyer wrote: >>> On Wed, Oct 18, 2023 at 11:44:22AM +0100, Andrew Cooper wrote: >>>> On 18/10/2023 11:38 am, Manuel Bouyer wrote: >>>>> Hello, >>>>> With Xen 4.18, a PV domain running under pvshim doesn't get console input. >>>>> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is >>>>> hardwired to 0), so console_input_domain() will never select that domain >>>>> as input. >>>>> >>>>> The attached patch fixes it by translating 0 to the real domain id for >>>>> pvshim, but there may be a better way to do this. >>>>> >>>> >>>> Thankyou for the report. >>>> >>>> First, CC'ing Henry as 4.18 release manager. >>>> >>>> There have been changes in how this works recently in 4.18, notably c/s >>>> c2581c58bec96. >>> >>> Yes, it looks like this one introduced the problem. >>> Before this, we would switch console_rx to 1 without checking if >>> domain (console_rx - 1) exists, and console_rx == 1 is a special case >>> in __serial_rx() >>> >>>> >>>> However, it's not obvious whether this worked in 4.17 or not. i.e. >>>> whether it's a regression in 4.18, or whether it's been broken since PV >>>> Shim was introduced. >>> >>> I don't know for 4.16 or 4.17 but I can tell that it's working in 4.15 >> >> >From looking at the code, it doesn't look like it would: There >> switch_serial_input() toggles console_rx between 0 and 1 afaict, and >> console_input_domain() handles value 0 as "Xen" (like in 4.18), while >> otherwise it calls rcu_lock_domain_by_id(console_rx - 1) (i.e. trying >> to get hold of Dom0's domain structure, not Dom1's). > > Well, I have a 4.15.5 installation in production and I can tell you that > with PV+PVSHIM, the console is working, for sure. > > AFAIK console_input_domain() is called only on ARM, from > vpl011_write_data_xen(). It's never called for x86. Oh, indeed. I took your patch touching the function as meaning it's used on x86. This then means my earlier suggestion won't work, as we need console_rx to have the value 1 for input to be accepted from Dom1. Which in turn means your change to switch_serial_input(), suitably cleaned up, is then likely the best we can do, despite me not really liking the shim special casing. As to cleaning up - besides the build, indentation, and style issues I think you want to drop the "&& pv_shim" part of the condition (as get_initial_domain_id() takes care of that already), and ideally you'd also move the new "domid" variable into the more narrow scope. If you don't feel like providing a proper (updated) patch, I'm happy to take over, but to retain indication of your initial work I'd need you to permit me to add your S-o-b (alongside mine). Jan
On Wed, Oct 18, 2023 at 03:51:54PM +0200, Jan Beulich wrote:
> On 18.10.2023 15:36, Manuel Bouyer wrote:
> > On Wed, Oct 18, 2023 at 03:24:22PM +0200, Jan Beulich wrote:
> >> On 18.10.2023 13:20, Manuel Bouyer wrote:
> >>> On Wed, Oct 18, 2023 at 11:44:22AM +0100, Andrew Cooper wrote:
> >>>> On 18/10/2023 11:38 am, Manuel Bouyer wrote:
> >>>>> Hello,
> >>>>> With Xen 4.18, a PV domain running under pvshim doesn't get console input.
> >>>>> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is
> >>>>> hardwired to 0), so console_input_domain() will never select that domain
> >>>>> as input.
> >>>>>
> >>>>> The attached patch fixes it by translating 0 to the real domain id for
> >>>>> pvshim, but there may be a better way to do this.
> >>>>>
> >>>>
> >>>> Thankyou for the report.
> >>>>
> >>>> First, CC'ing Henry as 4.18 release manager.
> >>>>
> >>>> There have been changes in how this works recently in 4.18, notably c/s
> >>>> c2581c58bec96.
> >>>
> >>> Yes, it looks like this one introduced the problem.
> >>> Before this, we would switch console_rx to 1 without checking if
> >>> domain (console_rx - 1) exists, and console_rx == 1 is a special case
> >>> in __serial_rx()
> >>>
> >>>>
> >>>> However, it's not obvious whether this worked in 4.17 or not. i.e.
> >>>> whether it's a regression in 4.18, or whether it's been broken since PV
> >>>> Shim was introduced.
> >>>
> >>> I don't know for 4.16 or 4.17 but I can tell that it's working in 4.15
> >>
> >> >From looking at the code, it doesn't look like it would: There
> >> switch_serial_input() toggles console_rx between 0 and 1 afaict, and
> >> console_input_domain() handles value 0 as "Xen" (like in 4.18), while
> >> otherwise it calls rcu_lock_domain_by_id(console_rx - 1) (i.e. trying
> >> to get hold of Dom0's domain structure, not Dom1's).
> >
> > Well, I have a 4.15.5 installation in production and I can tell you that
> > with PV+PVSHIM, the console is working, for sure.
> >
> > AFAIK console_input_domain() is called only on ARM, from
> > vpl011_write_data_xen(). It's never called for x86.
>
> Oh, indeed. I took your patch touching the function as meaning it's used
> on x86. This then means my earlier suggestion won't work, as we need
> console_rx to have the value 1 for input to be accepted from Dom1. Which
> in turn means your change to switch_serial_input(), suitably cleaned up,
> is then likely the best we can do, despite me not really liking the shim
> special casing.
>
> As to cleaning up - besides the build, indentation, and style issues I
> think you want to drop the "&& pv_shim" part of the condition (as
> get_initial_domain_id() takes care of that already), and ideally you'd
> also move the new "domid" variable into the more narrow scope. If you
> don't feel like providing a proper (updated) patch, I'm happy to take
> over, but to retain indication of your initial work I'd need you to
> permit me to add your S-o-b (alongside mine).
No problem, please do !
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
On 18/10/2023 12:20 pm, Manuel Bouyer wrote: > On Wed, Oct 18, 2023 at 11:44:22AM +0100, Andrew Cooper wrote: >> On 18/10/2023 11:38 am, Manuel Bouyer wrote: >>> Hello, >>> With Xen 4.18, a PV domain running under pvshim doesn't get console input. >>> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is >>> hardwired to 0), so console_input_domain() will never select that domain >>> as input. >>> >>> The attached patch fixes it by translating 0 to the real domain id for >>> pvshim, but there may be a better way to do this. >>> >> Thankyou for the report. >> >> First, CC'ing Henry as 4.18 release manager. >> >> There have been changes in how this works recently in 4.18, notably c/s >> c2581c58bec96. > Yes, it looks like this one introduced the problem. > Before this, we would switch console_rx to 1 without checking if > domain (console_rx - 1) exists, and console_rx == 1 is a special case > in __serial_rx() > >> However, it's not obvious whether this worked in 4.17 or not. i.e. >> whether it's a regression in 4.18, or whether it's been broken since PV >> Shim was introduced. > I don't know for 4.16 or 4.17 but I can tell that it's working in 4.15 > That commit is new in 4.18, so Henry - this is a release critical/blocker owing to it being a regression vs 4.17. I'll try and put some brainpower towards it when I get some other 4.18 work sorted. ~Andrew
Hi Andrew, > On Oct 18, 2023, at 19:23, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > > On 18/10/2023 12:20 pm, Manuel Bouyer wrote: >> On Wed, Oct 18, 2023 at 11:44:22AM +0100, Andrew Cooper wrote: >>> On 18/10/2023 11:38 am, Manuel Bouyer wrote: >>>> Hello, >>>> With Xen 4.18, a PV domain running under pvshim doesn't get console input. >>>> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is >>>> hardwired to 0), so console_input_domain() will never select that domain >>>> as input. >>>> >>>> The attached patch fixes it by translating 0 to the real domain id for >>>> pvshim, but there may be a better way to do this. >>>> >>> Thankyou for the report. >>> >>> First, CC'ing Henry as 4.18 release manager. >>> >>> There have been changes in how this works recently in 4.18, notably c/s >>> c2581c58bec96. >> Yes, it looks like this one introduced the problem. >> Before this, we would switch console_rx to 1 without checking if >> domain (console_rx - 1) exists, and console_rx == 1 is a special case >> in __serial_rx() >> >>> However, it's not obvious whether this worked in 4.17 or not. i.e. >>> whether it's a regression in 4.18, or whether it's been broken since PV >>> Shim was introduced. >> I don't know for 4.16 or 4.17 but I can tell that it's working in 4.15 >> > > That commit is new in 4.18, so Henry - this is a release > critical/blocker owing to it being a regression vs 4.17. Noted down. Out of curiosity, do we have maintainers for that specific driver? It would be good to looping them in. Kind regards, Henry > > I'll try and put some brainpower towards it when I get some other 4.18 > work sorted. > > ~Andrew
On Wed, Oct 18, 2023 at 12:38:57PM +0200, Manuel Bouyer wrote:
> Hello,
> With Xen 4.18, a PV domain running under pvshim doesn't get console input.
> This is because the domain id in pvshim isn't 0 (and on x86 max_init_domid is
> hardwired to 0), so console_input_domain() will never select that domain
> as input.
>
> The attached patch fixes it by translating 0 to the real domain id for
> pvshim, but there may be a better way to do this.
Small improvement, print the real domain ID instead of DOM0
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--- drivers/char/console.c.orig 2023-10-18 12:24:57.221891748 +0200
+++ drivers/char/console.c 2023-10-18 12:41:10.513003560 +0200
@@ -478,14 +478,20 @@
/* Make sure to rcu_unlock_domain after use */
struct domain *console_input_domain(void)
{
+ domid_t domid;
if ( console_rx == 0 )
return NULL;
- return rcu_lock_domain_by_id(console_rx - 1);
+ if (console_rx == 1 && pv_shim)
+ domid = get_initial_domain_id();
+ else
+ domid = console_rx - 1;
+ return rcu_lock_domain_by_id(domid);
}
static void switch_serial_input(void)
{
unsigned int next_rx = console_rx;
+ domid_t domid;
/*
* Rotate among Xen, dom0 and boot-time created domUs while skipping
@@ -502,12 +508,16 @@
break;
}
- d = rcu_lock_domain_by_id(next_rx - 1);
+ if (next_rx == 1 && pv_shim)
+ domid = get_initial_domain_id();
+ else
+ domid = next_rx - 1;
+ d = rcu_lock_domain_by_id(domid);
if ( d )
{
rcu_unlock_domain(d);
console_rx = next_rx;
- printk("*** Serial input to DOM%u", next_rx - 1);
+ printk("*** Serial input to DOM%u", domid);
break;
} else {
printk("next_rx %d not domain", next_rx);
© 2016 - 2026 Red Hat, Inc.