The type not being used in do_bug_frame() is suspicious. Apparently
that's solely because the type uses a pointer-to-const parameter,
when run_in_exception_handler() wants functions taking pointer-to-non-
const. Drop the const, in turn requiring Arm's do_bug_frame() to also
have its const dropped. This then brings that function also closer to
the common one, with Arm's use of vaddr_t remaining as a difference.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
This is an alternative proposal to
https://lists.xen.org/archives/html/xen-devel/2023-12/msg01385.html,
albeit without paving a road towards Andrew's desire of getting rid of
show_execution_state_nonconst() again. Retaining (and propagating) the
const would imply the need to cast away the const-ness somewhere on (at
least) the path to invoking gdb stub code. Personally I'm averse to such
casting away of const-ness ...
--- a/xen/arch/arm/include/asm/traps.h
+++ b/xen/arch/arm/include/asm/traps.h
@@ -69,7 +69,7 @@ void do_cp(struct cpu_user_regs *regs, c
void do_trap_smc(struct cpu_user_regs *regs, const union hsr hsr);
void do_trap_hvc_smccc(struct cpu_user_regs *regs);
-int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc);
+int do_bug_frame(struct cpu_user_regs *regs, vaddr_t pc);
void noreturn do_unexpected_trap(const char *msg,
const struct cpu_user_regs *regs);
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -1202,7 +1202,7 @@ void do_unexpected_trap(const char *msg,
panic("CPU%d: Unexpected Trap: %s\n", smp_processor_id(), msg);
}
-int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc)
+int do_bug_frame(struct cpu_user_regs *regs, vaddr_t pc)
{
const struct bug_frame *bug = NULL;
const char *prefix = "", *filename, *predicate;
--- a/xen/common/bug.c
+++ b/xen/common/bug.c
@@ -63,14 +63,10 @@ int do_bug_frame(struct cpu_user_regs *r
if ( id == BUGFRAME_run_fn )
{
- void (*fn)(struct cpu_user_regs *) = bug_ptr(bug);
+ bug_fn_t *fn = bug_ptr(bug);
fn(regs);
- /* Re-enforce consistent types, because of the casts involved. */
- if ( false )
- run_in_exception_handler(fn);
-
return id;
}
--- a/xen/include/xen/bug.h
+++ b/xen/include/xen/bug.h
@@ -101,12 +101,11 @@ struct bug_frame {
#endif
struct cpu_user_regs;
-typedef void bug_fn_t(const struct cpu_user_regs *regs);
+typedef void bug_fn_t(struct cpu_user_regs *regs);
#ifndef run_in_exception_handler
-static void always_inline run_in_exception_handler(
- void (*fn)(struct cpu_user_regs *regs))
+static void always_inline run_in_exception_handler(bug_fn_t *fn)
{
BUG_FRAME(BUGFRAME_run_fn, 0, fn, 0, NULL);
}
Hi Jan, Andrew, I managed to get back to read the mailing list and noticed this patch. Is it still relevant and needs to be reviewed? Are there any outstanding disagreements between maintainers on the approach to take here? Or should I just go ahead and review it? On Tue, 9 Jan 2024, Jan Beulich wrote: > The type not being used in do_bug_frame() is suspicious. Apparently > that's solely because the type uses a pointer-to-const parameter, > when run_in_exception_handler() wants functions taking pointer-to-non- > const. Drop the const, in turn requiring Arm's do_bug_frame() to also > have its const dropped. This then brings that function also closer to > the common one, with Arm's use of vaddr_t remaining as a difference. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > --- > This is an alternative proposal to > https://lists.xen.org/archives/html/xen-devel/2023-12/msg01385.html, > albeit without paving a road towards Andrew's desire of getting rid of > show_execution_state_nonconst() again. Retaining (and propagating) the > const would imply the need to cast away the const-ness somewhere on (at > least) the path to invoking gdb stub code. Personally I'm averse to such > casting away of const-ness ... > > --- a/xen/arch/arm/include/asm/traps.h > +++ b/xen/arch/arm/include/asm/traps.h > @@ -69,7 +69,7 @@ void do_cp(struct cpu_user_regs *regs, c > void do_trap_smc(struct cpu_user_regs *regs, const union hsr hsr); > void do_trap_hvc_smccc(struct cpu_user_regs *regs); > > -int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc); > +int do_bug_frame(struct cpu_user_regs *regs, vaddr_t pc); > > void noreturn do_unexpected_trap(const char *msg, > const struct cpu_user_regs *regs); > --- a/xen/arch/arm/traps.c > +++ b/xen/arch/arm/traps.c > @@ -1202,7 +1202,7 @@ void do_unexpected_trap(const char *msg, > panic("CPU%d: Unexpected Trap: %s\n", smp_processor_id(), msg); > } > > -int do_bug_frame(const struct cpu_user_regs *regs, vaddr_t pc) > +int do_bug_frame(struct cpu_user_regs *regs, vaddr_t pc) > { > const struct bug_frame *bug = NULL; > const char *prefix = "", *filename, *predicate; > --- a/xen/common/bug.c > +++ b/xen/common/bug.c > @@ -63,14 +63,10 @@ int do_bug_frame(struct cpu_user_regs *r > > if ( id == BUGFRAME_run_fn ) > { > - void (*fn)(struct cpu_user_regs *) = bug_ptr(bug); > + bug_fn_t *fn = bug_ptr(bug); > > fn(regs); > > - /* Re-enforce consistent types, because of the casts involved. */ > - if ( false ) > - run_in_exception_handler(fn); > - > return id; > } > > --- a/xen/include/xen/bug.h > +++ b/xen/include/xen/bug.h > @@ -101,12 +101,11 @@ struct bug_frame { > #endif > > struct cpu_user_regs; > -typedef void bug_fn_t(const struct cpu_user_regs *regs); > +typedef void bug_fn_t(struct cpu_user_regs *regs); > > #ifndef run_in_exception_handler > > -static void always_inline run_in_exception_handler( > - void (*fn)(struct cpu_user_regs *regs)) > +static void always_inline run_in_exception_handler(bug_fn_t *fn) > { > BUG_FRAME(BUGFRAME_run_fn, 0, fn, 0, NULL); > } >
On 24.01.2024 02:34, Stefano Stabellini wrote: > I managed to get back to read the mailing list and noticed this patch. > > Is it still relevant and needs to be reviewed? > > Are there any outstanding disagreements between maintainers on the > approach to take here? Or should I just go ahead and review it? It is still relevant from my pov, and everything that may be controversial is said ... > On Tue, 9 Jan 2024, Jan Beulich wrote: >> The type not being used in do_bug_frame() is suspicious. Apparently >> that's solely because the type uses a pointer-to-const parameter, >> when run_in_exception_handler() wants functions taking pointer-to-non- >> const. Drop the const, in turn requiring Arm's do_bug_frame() to also >> have its const dropped. This then brings that function also closer to >> the common one, with Arm's use of vaddr_t remaining as a difference. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> --- >> This is an alternative proposal to >> https://lists.xen.org/archives/html/xen-devel/2023-12/msg01385.html, >> albeit without paving a road towards Andrew's desire of getting rid of >> show_execution_state_nonconst() again. Retaining (and propagating) the >> const would imply the need to cast away the const-ness somewhere on (at >> least) the path to invoking gdb stub code. Personally I'm averse to such >> casting away of const-ness ... ... here. Without Andrew commenting, I'm afraid it's not actually clear whether he objects to this approach, or is meaning to tolerate it silently, or actually views it as a step in a good direction, even if not quite getting where earlier on he thought we may want to move to. Jan
On 24/01/2024 7:28 am, Jan Beulich wrote: > On 24.01.2024 02:34, Stefano Stabellini wrote: >> I managed to get back to read the mailing list and noticed this patch. >> >> Is it still relevant and needs to be reviewed? >> >> Are there any outstanding disagreements between maintainers on the >> approach to take here? Or should I just go ahead and review it? > It is still relevant from my pov, and everything that may be controversial > is said ... BUGFRAME_* cannot legitimately modify the interrupted context. Two are fatal paths, and other two are side-effect-less as far as C can tell. So the infrastructure ought to take a const pointer. The reason why this pointer is non-const is to do with the interaction of the serial and keyhandler infrastructures. Because we're adjusting that for other reasons, I was hoping it would subsequently be easy to switch Xen to being properly const in this regard. Turns out it is: https://gitlab.com/xen-project/people/andyhhp/xen/-/commit/4f857075005da1d28632e4f9198c2e7d0f404b9a with a couple of caveats. (Only the buster-gcc-ibt run failed, so I've got some cf_check-ing to adjust, but all the other builds worked fine). To make the serial code compile, I ended up having to revert patch 2 of the regs series, which I believe is safe to do following patch 3-5 which un-plumb the regs pointer deeper in the call chain. If this is turns out to be true, then the patch ought to be added and reverted in the same series so it isn't left hanging about after the fact. The _$X_poll() functions are used in timer context, which means there's an outer regs context already latched, and that's arguably a better context to use anyway for 'd'. This in turn allows us to remove a #UD from a fast(ish) path, and remove some per-cpu or static variables which are just used for non-standard parameter passing because run_in_exception_handler() doesn't let you pass any. This leaves the '%' debugger infrastructure. Being a debugger, it's making arbitrary changes anyway and I'd much rather cast away constness for a debugger, than to keep everything else mutable when it oughtn't to be. If absolutely nothing else, registration and handling '%' ought to be from x86 code rather than common code, which would remove the do_debugger_trap_fatal() layering violation. But, the more I look into the gdbstub the more I'm convinced that it doesn't work. For example, this gem: /* Resuming after we've stopped used to work, but more through luck than any actual intention. It doesn't at the moment. */ From c/s b69f92f3012 in July 2004, and more specifically the commit which added the gdbstub functionality to begin with. I.e. it doesn't appear to have ever supported more than "poke around in the crashed state". In the 2 decades that noone has fixed this, we've gained far better technologies for doing this, such as running it in a VM. I am going to submit some patches deleting gdbstub. It clearly had not much value to begin with, and is not definitely not worth the problems it is creating in adjacent code these days. ~Andrew
On 25.01.2024 02:14, Andrew Cooper wrote: > On 24/01/2024 7:28 am, Jan Beulich wrote: >> On 24.01.2024 02:34, Stefano Stabellini wrote: >>> I managed to get back to read the mailing list and noticed this patch. >>> >>> Is it still relevant and needs to be reviewed? >>> >>> Are there any outstanding disagreements between maintainers on the >>> approach to take here? Or should I just go ahead and review it? >> It is still relevant from my pov, and everything that may be controversial >> is said ... > > BUGFRAME_* cannot legitimately modify the interrupted context. Two are > fatal paths, and other two are side-effect-less as far as C can tell. > > So the infrastructure ought to take a const pointer. > > The reason why this pointer is non-const is to do with the interaction > of the serial and keyhandler infrastructures. Because we're adjusting > that for other reasons, I was hoping it would subsequently be easy to > switch Xen to being properly const in this regard. > > Turns out it is: > > > https://gitlab.com/xen-project/people/andyhhp/xen/-/commit/4f857075005da1d28632e4f9198c2e7d0f404b9a > > with a couple of caveats. (Only the buster-gcc-ibt run failed, so I've > got some cf_check-ing to adjust, but all the other builds worked fine). > > > To make the serial code compile, I ended up having to revert patch 2 of > the regs series, which I believe is safe to do following patch 3-5 which > un-plumb the regs pointer deeper in the call chain. If this is turns > out to be true, then the patch ought to be added and reverted in the > same series so it isn't left hanging about after the fact. Looking further into this, I can't see how reverting ought to be possible, even less so specifically after patch 5. Patches 4 and 5 merely eliminate now unused parameters. Hence imo if it could be reverted after 5, it also ought to be fine to revert after 3. Which in turn it would mean it's not needed at all. Which I simply cannot see (yet?). Jan
On 25.01.2024 02:14, Andrew Cooper wrote: > On 24/01/2024 7:28 am, Jan Beulich wrote: >> On 24.01.2024 02:34, Stefano Stabellini wrote: >>> I managed to get back to read the mailing list and noticed this patch. >>> >>> Is it still relevant and needs to be reviewed? >>> >>> Are there any outstanding disagreements between maintainers on the >>> approach to take here? Or should I just go ahead and review it? >> It is still relevant from my pov, and everything that may be controversial >> is said ... > > BUGFRAME_* cannot legitimately modify the interrupted context. Two are > fatal paths, and other two are side-effect-less as far as C can tell. > > So the infrastructure ought to take a const pointer. > > The reason why this pointer is non-const is to do with the interaction > of the serial and keyhandler infrastructures. Because we're adjusting > that for other reasons, I was hoping it would subsequently be easy to > switch Xen to being properly const in this regard. > > Turns out it is: > > > https://gitlab.com/xen-project/people/andyhhp/xen/-/commit/4f857075005da1d28632e4f9198c2e7d0f404b9a > > with a couple of caveats. (Only the buster-gcc-ibt run failed, so I've > got some cf_check-ing to adjust, but all the other builds worked fine). > > > To make the serial code compile, I ended up having to revert patch 2 of > the regs series, which I believe is safe to do following patch 3-5 which > un-plumb the regs pointer deeper in the call chain. If this is turns > out to be true, then the patch ought to be added and reverted in the > same series so it isn't left hanging about after the fact. Hmm, I'm not sure I see how reverting that would end up working. However, aiui you need to revert primarily for the non-const-ness of the pointers involved in [gs]et_irq_regs(). I wonder whether, if we followed your underlying thought here, those shouldn't be const-ified then anyway. > The _$X_poll() functions are used in timer context, which means there's > an outer regs context already latched, and that's arguably a better > context to use anyway for 'd'. If the timer happens to run on an idle vCPU, what "outer regs context" would we have there? > This in turn allows us to remove a #UD from a fast(ish) path, and remove > some per-cpu or static variables which are just used for non-standard > parameter passing because run_in_exception_handler() doesn't let you > pass any. > > > This leaves the '%' debugger infrastructure. Being a debugger, it's > making arbitrary changes anyway and I'd much rather cast away constness > for a debugger, than to keep everything else mutable when it oughtn't to be. > > If absolutely nothing else, registration and handling '%' ought to be > from x86 code rather than common code, which would remove the > do_debugger_trap_fatal() layering violation. > > But, the more I look into the gdbstub the more I'm convinced that it > doesn't work. For example, this gem: > > /* Resuming after we've stopped used to work, but more through luck than > any actual intention. It doesn't at the moment. */ > > From c/s b69f92f3012 in July 2004, and more specifically the commit > which added the gdbstub functionality to begin with. I.e. it doesn't > appear to have ever supported more than "poke around in the crashed > state". In the 2 decades that noone has fixed this, we've gained far > better technologies for doing this, such as running it in a VM. > > I am going to submit some patches deleting gdbstub. It clearly had not > much value to begin with, and is not definitely not worth the problems > it is creating in adjacent code these days. All fine. Still I wonder whether in the meantime this patch isn't an improvement on its own, and hence whether the const couldn't sensibly be added subsequently. Jan
Answering out of order: On 25/01/2024 8:15 am, Jan Beulich wrote: > All fine. Still I wonder whether in the meantime this patch isn't an > improvement on its own, and hence whether the const couldn't sensibly > be added subsequently. We have a while until 4.19 goes out. I would prefer to try and get this untangled properly, because half of your patch is in the opposite direction for getting the const-ness working. If we start to hit rc1 and the const-ness isn't complete, we can revisit. Regarding the removal of gdb-stub. I'd like to get that done, and rebase the remainder of the IRQ series over it, because it will reduce the churn in your IRQ series. I'm happy to do the rebase if you want. > On 25.01.2024 02:14, Andrew Cooper wrote: >> To make the serial code compile, I ended up having to revert patch 2 of >> the regs series, which I believe is safe to do following patch 3-5 which >> un-plumb the regs pointer deeper in the call chain. If this is turns >> out to be true, then the patch ought to be added and reverted in the >> same series so it isn't left hanging about after the fact. > Hmm, I'm not sure I see how reverting that would end up working. However, > aiui you need to revert primarily for the non-const-ness of the pointers > involved in [gs]et_irq_regs(). I wonder whether, if we followed your > underlying thought here, those shouldn't be const-ified then anyway. > >> The _$X_poll() functions are used in timer context, which means there's >> an outer regs context already latched, and that's arguably a better >> context to use anyway for 'd'. > If the timer happens to run on an idle vCPU, what "outer regs context" > would we have there? The only reason the serial infrastructure was setting up a fake IRQ context was because it was using run_in_exception_handler(). But I (think I have) removed that fully (and it simplifies things more than I was hoping). With '%' deleted, it's only 'd' that cares, isn't it? And that's "dump the current regs", rather than wanting something else. ~Andrew
On 25.01.2024 13:10, Andrew Cooper wrote: > Answering out of order: > > On 25/01/2024 8:15 am, Jan Beulich wrote: >> All fine. Still I wonder whether in the meantime this patch isn't an >> improvement on its own, and hence whether the const couldn't sensibly >> be added subsequently. > > We have a while until 4.19 goes out. I would prefer to try and get this > untangled properly, because half of your patch is in the opposite > direction for getting the const-ness working. > > If we start to hit rc1 and the const-ness isn't complete, we can revisit. > > > Regarding the removal of gdb-stub. I'd like to get that done, and > rebase the remainder of the IRQ series over it, because it will reduce > the churn in your IRQ series. I'm happy to do the rebase if you want. Shouldn't be overly difficult, so I guess I can as well do it before (eventually) re-submitting. >> On 25.01.2024 02:14, Andrew Cooper wrote: >>> To make the serial code compile, I ended up having to revert patch 2 of >>> the regs series, which I believe is safe to do following patch 3-5 which >>> un-plumb the regs pointer deeper in the call chain. If this is turns >>> out to be true, then the patch ought to be added and reverted in the >>> same series so it isn't left hanging about after the fact. >> Hmm, I'm not sure I see how reverting that would end up working. However, >> aiui you need to revert primarily for the non-const-ness of the pointers >> involved in [gs]et_irq_regs(). I wonder whether, if we followed your >> underlying thought here, those shouldn't be const-ified then anyway. >> >>> The _$X_poll() functions are used in timer context, which means there's >>> an outer regs context already latched, and that's arguably a better >>> context to use anyway for 'd'. >> If the timer happens to run on an idle vCPU, what "outer regs context" >> would we have there? > > The only reason the serial infrastructure was setting up a fake IRQ > context was because it was using run_in_exception_handler(). > > But I (think I have) removed that fully (and it simplifies things more > than I was hoping). > > With '%' deleted, it's only 'd' that cares, isn't it? Yes. > And that's "dump the current regs", rather than wanting something else. Question remains though: What are "current regs" when on an idle vCPU? But perhaps that will clarify itself once I see how you "removed that fully". Jan
© 2016 - 2024 Red Hat, Inc.