PGC_static, PGC_extra and PGC_broken need to be preserved when assigning a
page. Define a new macro that groups those flags and use it instead of or'ing
every time.
To make preserved flags even more meaningful, they are kept also when
switching state in mark_page_free().
Enforce the removal of PGC_extra before freeing domain pages as this is
considered an error and can cause ASSERT violations.
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v10:
- fixed commit message
v9:
- add PGC_broken to PGC_preserved
- clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set
v8:
- fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing PGC_extra
before freeing
v7:
- PGC_preserved used also in mark_page_free()
v6:
- preserved_flags renamed to PGC_preserved
- PGC_preserved is used only in assign_pages()
v5:
- new patch
---
xen/common/page_alloc.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 7b911b5ed9..34cd473150 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -160,6 +160,7 @@
#endif
#define PGC_no_buddy_merge PGC_static
+#define PGC_preserved (PGC_extra | PGC_static | PGC_broken)
#ifndef PGT_TYPE_INFO_INITIALIZER
#define PGT_TYPE_INFO_INITIALIZER 0
@@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn)
{
case PGC_state_inuse:
BUG_ON(pg->count_info & PGC_broken);
- pg->count_info = PGC_state_free;
+ pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved);
break;
case PGC_state_offlining:
- pg->count_info = (pg->count_info & PGC_broken) |
- PGC_state_offlined;
+ pg->count_info = (pg->count_info & PGC_preserved) | PGC_state_offlined;
pg_offlined = true;
break;
@@ -2366,7 +2366,7 @@ int assign_pages(
for ( i = 0; i < nr; i++ )
{
- ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
+ ASSERT(!(pg[i].count_info & ~PGC_preserved));
if ( pg[i].count_info & PGC_extra )
extra_pages++;
}
@@ -2426,7 +2426,7 @@ int assign_pages(
page_set_owner(&pg[i], d);
smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
pg[i].count_info =
- (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
+ (pg[i].count_info & PGC_preserved) | PGC_allocated | 1;
page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
}
@@ -2485,6 +2485,14 @@ struct page_info *alloc_domheap_pages(
}
if ( assign_page(pg, order, d, memflags) )
{
+ if ( memflags & MEMF_no_refcount )
+ {
+ unsigned long i;
+
+ for ( i = 0; i < (1UL << order); i++ )
+ pg[i].count_info &= ~PGC_extra;
+ }
+
free_heap_pages(pg, order, memflags & MEMF_no_scrub);
return NULL;
}
@@ -2539,6 +2547,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
{
ASSERT(d->extra_pages);
d->extra_pages--;
+ pg[i].count_info &= ~PGC_extra;
}
}
--
2.43.0
On 19.11.2024 15:13, Carlo Nonato wrote: > PGC_static, PGC_extra and PGC_broken need to be preserved when assigning a > page. Define a new macro that groups those flags and use it instead of or'ing > every time. > > To make preserved flags even more meaningful, they are kept also when > switching state in mark_page_free(). > Enforce the removal of PGC_extra before freeing domain pages as this is > considered an error and can cause ASSERT violations. > > Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> > --- > v10: > - fixed commit message > v9: > - add PGC_broken to PGC_preserved > - clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set > v8: > - fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing PGC_extra > before freeing > v7: > - PGC_preserved used also in mark_page_free() > v6: > - preserved_flags renamed to PGC_preserved > - PGC_preserved is used only in assign_pages() > v5: > - new patch > --- > xen/common/page_alloc.c | 19 ++++++++++++++----- > 1 file changed, 14 insertions(+), 5 deletions(-) > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > index 7b911b5ed9..34cd473150 100644 > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -160,6 +160,7 @@ > #endif > > #define PGC_no_buddy_merge PGC_static > +#define PGC_preserved (PGC_extra | PGC_static | PGC_broken) > > #ifndef PGT_TYPE_INFO_INITIALIZER > #define PGT_TYPE_INFO_INITIALIZER 0 > @@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn) > { > case PGC_state_inuse: > BUG_ON(pg->count_info & PGC_broken); > - pg->count_info = PGC_state_free; > + pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved); > break; PGC_extra doesn't want preserving here. Since it's mark_page_free(), and since PGC_extra is removed before freeing, the state change is apparently fine. But an assertion may want adding, for documentation purposes if nothing else. Alternatively it may make sense to indeed exclude PGC_extra here. In fact PGC_static doesn't need using here either, as unprepare_staticmem_pages() will explicitly set it again anyway. Hence I wonder whether the change here really is necessary (one will then be needed in the next patch aiui, when PGC_colored is introduced). Which would then eliminate the need for the final two hunks of the patch, I think. > case PGC_state_offlining: > - pg->count_info = (pg->count_info & PGC_broken) | > - PGC_state_offlined; > + pg->count_info = (pg->count_info & PGC_preserved) | PGC_state_offlined; > pg_offlined = true; > break; I'm similarly unconvinced that anything other than PGC_broken (and subsequently perhaps PGC_colored) would need preserving here. Jan
Hi Jan, On Thu, Nov 28, 2024 at 12:05 PM Jan Beulich <jbeulich@suse.com> wrote: > > On 19.11.2024 15:13, Carlo Nonato wrote: > > PGC_static, PGC_extra and PGC_broken need to be preserved when assigning a > > page. Define a new macro that groups those flags and use it instead of or'ing > > every time. > > > > To make preserved flags even more meaningful, they are kept also when > > switching state in mark_page_free(). > > Enforce the removal of PGC_extra before freeing domain pages as this is > > considered an error and can cause ASSERT violations. > > > > Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> > > --- > > v10: > > - fixed commit message > > v9: > > - add PGC_broken to PGC_preserved > > - clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set > > v8: > > - fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing PGC_extra > > before freeing > > v7: > > - PGC_preserved used also in mark_page_free() > > v6: > > - preserved_flags renamed to PGC_preserved > > - PGC_preserved is used only in assign_pages() > > v5: > > - new patch > > --- > > xen/common/page_alloc.c | 19 ++++++++++++++----- > > 1 file changed, 14 insertions(+), 5 deletions(-) > > > > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > > index 7b911b5ed9..34cd473150 100644 > > --- a/xen/common/page_alloc.c > > +++ b/xen/common/page_alloc.c > > @@ -160,6 +160,7 @@ > > #endif > > > > #define PGC_no_buddy_merge PGC_static > > +#define PGC_preserved (PGC_extra | PGC_static | PGC_broken) > > > > #ifndef PGT_TYPE_INFO_INITIALIZER > > #define PGT_TYPE_INFO_INITIALIZER 0 > > @@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn) > > { > > case PGC_state_inuse: > > BUG_ON(pg->count_info & PGC_broken); > > - pg->count_info = PGC_state_free; > > + pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved); > > break; > > PGC_extra doesn't want preserving here. Since it's mark_page_free(), and > since PGC_extra is removed before freeing, the state change is apparently > fine. But an assertion may want adding, for documentation purposes if > nothing else. > > Alternatively it may make sense to indeed exclude PGC_extra here. In fact > PGC_static doesn't need using here either, as unprepare_staticmem_pages() > will explicitly set it again anyway. > > Hence I wonder whether the change here really is necessary (one will then > be needed in the next patch aiui, when PGC_colored is introduced). Which > would then eliminate the need for the final two hunks of the patch, I > think. > > > case PGC_state_offlining: > > - pg->count_info = (pg->count_info & PGC_broken) | > > - PGC_state_offlined; > > + pg->count_info = (pg->count_info & PGC_preserved) | PGC_state_offlined; > > pg_offlined = true; > > break; > > I'm similarly unconvinced that anything other than PGC_broken (and > subsequently perhaps PGC_colored) would need preserving here. Yes, we (re)checked the code and also believe that the introduction of PGC_preserved is generating more confusion (and code) then the actual logical help it provides. We'll remove this patch and integrate PGC_colored explicitly in the flags to be preserved. This avoid the clumsy logic of preserving something (extra) when it's not needed and then handling the special case to remove it explicitly. Basically my goal is to go back to v4 where this patch didn't exist. > Jan Thanks. - Carlo
On 29.11.2024 10:32, Carlo Nonato wrote: > Hi Jan, > > On Thu, Nov 28, 2024 at 12:05 PM Jan Beulich <jbeulich@suse.com> wrote: >> >> On 19.11.2024 15:13, Carlo Nonato wrote: >>> PGC_static, PGC_extra and PGC_broken need to be preserved when assigning a >>> page. Define a new macro that groups those flags and use it instead of or'ing >>> every time. >>> >>> To make preserved flags even more meaningful, they are kept also when >>> switching state in mark_page_free(). >>> Enforce the removal of PGC_extra before freeing domain pages as this is >>> considered an error and can cause ASSERT violations. >>> >>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> >>> --- >>> v10: >>> - fixed commit message >>> v9: >>> - add PGC_broken to PGC_preserved >>> - clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set >>> v8: >>> - fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing PGC_extra >>> before freeing >>> v7: >>> - PGC_preserved used also in mark_page_free() >>> v6: >>> - preserved_flags renamed to PGC_preserved >>> - PGC_preserved is used only in assign_pages() >>> v5: >>> - new patch >>> --- >>> xen/common/page_alloc.c | 19 ++++++++++++++----- >>> 1 file changed, 14 insertions(+), 5 deletions(-) >>> >>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c >>> index 7b911b5ed9..34cd473150 100644 >>> --- a/xen/common/page_alloc.c >>> +++ b/xen/common/page_alloc.c >>> @@ -160,6 +160,7 @@ >>> #endif >>> >>> #define PGC_no_buddy_merge PGC_static >>> +#define PGC_preserved (PGC_extra | PGC_static | PGC_broken) >>> >>> #ifndef PGT_TYPE_INFO_INITIALIZER >>> #define PGT_TYPE_INFO_INITIALIZER 0 >>> @@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn) >>> { >>> case PGC_state_inuse: >>> BUG_ON(pg->count_info & PGC_broken); >>> - pg->count_info = PGC_state_free; >>> + pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved); >>> break; >> >> PGC_extra doesn't want preserving here. Since it's mark_page_free(), and >> since PGC_extra is removed before freeing, the state change is apparently >> fine. But an assertion may want adding, for documentation purposes if >> nothing else. >> >> Alternatively it may make sense to indeed exclude PGC_extra here. In fact >> PGC_static doesn't need using here either, as unprepare_staticmem_pages() >> will explicitly set it again anyway. >> >> Hence I wonder whether the change here really is necessary (one will then >> be needed in the next patch aiui, when PGC_colored is introduced). Which >> would then eliminate the need for the final two hunks of the patch, I >> think. >> >>> case PGC_state_offlining: >>> - pg->count_info = (pg->count_info & PGC_broken) | >>> - PGC_state_offlined; >>> + pg->count_info = (pg->count_info & PGC_preserved) | PGC_state_offlined; >>> pg_offlined = true; >>> break; >> >> I'm similarly unconvinced that anything other than PGC_broken (and >> subsequently perhaps PGC_colored) would need preserving here. > > Yes, we (re)checked the code and also believe that the introduction of > PGC_preserved is generating more confusion (and code) then the actual logical > help it provides. > > We'll remove this patch and integrate PGC_colored explicitly in the flags to > be preserved. This avoid the clumsy logic of preserving something (extra) > when it's not needed and then handling the special case to remove it > explicitly. > Basically my goal is to go back to v4 where this patch didn't exist. Hmm, no, I don't think I said anything in the direction of removing PGC_preserved again. I merely think you went too far in where it actually wants using. Jan
Hi, On 29/11/2024 12:09, Jan Beulich wrote: > On 29.11.2024 10:32, Carlo Nonato wrote: >> Hi Jan, >> >> On Thu, Nov 28, 2024 at 12:05 PM Jan Beulich <jbeulich@suse.com> wrote: >>> >>> On 19.11.2024 15:13, Carlo Nonato wrote: >>>> PGC_static, PGC_extra and PGC_broken need to be preserved when assigning a >>>> page. Define a new macro that groups those flags and use it instead of or'ing >>>> every time. >>>> >>>> To make preserved flags even more meaningful, they are kept also when >>>> switching state in mark_page_free(). >>>> Enforce the removal of PGC_extra before freeing domain pages as this is >>>> considered an error and can cause ASSERT violations. >>>> >>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> >>>> --- >>>> v10: >>>> - fixed commit message >>>> v9: >>>> - add PGC_broken to PGC_preserved >>>> - clear PGC_extra in alloc_domheap_pages() only if MEMF_no_refcount is set >>>> v8: >>>> - fixed PGC_extra ASSERT fail in alloc_domheap_pages() by removing PGC_extra >>>> before freeing >>>> v7: >>>> - PGC_preserved used also in mark_page_free() >>>> v6: >>>> - preserved_flags renamed to PGC_preserved >>>> - PGC_preserved is used only in assign_pages() >>>> v5: >>>> - new patch >>>> --- >>>> xen/common/page_alloc.c | 19 ++++++++++++++----- >>>> 1 file changed, 14 insertions(+), 5 deletions(-) >>>> >>>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c >>>> index 7b911b5ed9..34cd473150 100644 >>>> --- a/xen/common/page_alloc.c >>>> +++ b/xen/common/page_alloc.c >>>> @@ -160,6 +160,7 @@ >>>> #endif >>>> >>>> #define PGC_no_buddy_merge PGC_static >>>> +#define PGC_preserved (PGC_extra | PGC_static | PGC_broken) >>>> >>>> #ifndef PGT_TYPE_INFO_INITIALIZER >>>> #define PGT_TYPE_INFO_INITIALIZER 0 >>>> @@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn) >>>> { >>>> case PGC_state_inuse: >>>> BUG_ON(pg->count_info & PGC_broken); >>>> - pg->count_info = PGC_state_free; >>>> + pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved); >>>> break; >>> >>> PGC_extra doesn't want preserving here. Since it's mark_page_free(), and >>> since PGC_extra is removed before freeing, the state change is apparently >>> fine. But an assertion may want adding, for documentation purposes if >>> nothing else. >>> >>> Alternatively it may make sense to indeed exclude PGC_extra here. In fact >>> PGC_static doesn't need using here either, as unprepare_staticmem_pages() >>> will explicitly set it again anyway. >>> >>> Hence I wonder whether the change here really is necessary (one will then >>> be needed in the next patch aiui, when PGC_colored is introduced). Which >>> would then eliminate the need for the final two hunks of the patch, I >>> think. >>> >>>> case PGC_state_offlining: >>>> - pg->count_info = (pg->count_info & PGC_broken) | >>>> - PGC_state_offlined; >>>> + pg->count_info = (pg->count_info & PGC_preserved) | PGC_state_offlined; >>>> pg_offlined = true; >>>> break; >>> >>> I'm similarly unconvinced that anything other than PGC_broken (and >>> subsequently perhaps PGC_colored) would need preserving here. >> >> Yes, we (re)checked the code and also believe that the introduction of >> PGC_preserved is generating more confusion (and code) then the actual logical >> help it provides. >> >> We'll remove this patch and integrate PGC_colored explicitly in the flags to >> be preserved. This avoid the clumsy logic of preserving something (extra) >> when it's not needed and then handling the special case to remove it >> explicitly. >> Basically my goal is to go back to v4 where this patch didn't exist. > > Hmm, no, I don't think I said anything in the direction of removing PGC_preserved > again. I merely think you went too far in where it actually wants using. Let me try to better detail the intention of what we have in mind. Here again parts of this patch: > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > index 7b911b5ed9..34cd473150 100644 > --- a/xen/common/page_alloc.c > +++ b/xen/common/page_alloc.c > @@ -160,6 +160,7 @@ > #endif > > #define PGC_no_buddy_merge PGC_static > +#define PGC_preserved (PGC_extra | PGC_static | PGC_broken) The granularity of preserved as implemented by v10 is different than what you require... > > #ifndef PGT_TYPE_INFO_INITIALIZER > #define PGT_TYPE_INFO_INITIALIZER 0 > @@ -1427,12 +1428,11 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn) > { > case PGC_state_inuse: > BUG_ON(pg->count_info & PGC_broken); > - pg->count_info = PGC_state_free; > + pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved); here (no need), > break; > > case PGC_state_offlining: > - pg->count_info = (pg->count_info & PGC_broken) | > - PGC_state_offlined; > + pg->count_info = (pg->count_info & PGC_preserved) | PGC_state_offlined; here (only broken), > pg_offlined = true; > break; > > @@ -2366,7 +2366,7 @@ int assign_pages( > > for ( i = 0; i < nr; i++ ) > { > - ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static))); > + ASSERT(!(pg[i].count_info & ~PGC_preserved)); here (no broken), > if ( pg[i].count_info & PGC_extra ) > extra_pages++; > } > @@ -2426,7 +2426,7 @@ int assign_pages( > page_set_owner(&pg[i], d); > smp_wmb(); /* Domain pointer must be visible before updating refcnt. */ > pg[i].count_info = > - (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1; > + (pg[i].count_info & PGC_preserved) | PGC_allocated | 1; here (no broken). And it also leads (as noted) to the special management of extra... > > page_list_add_tail(&pg[i], page_to_list(d, &pg[i])); > } > @@ -2485,6 +2485,14 @@ struct page_info *alloc_domheap_pages( > } > if ( assign_page(pg, order, d, memflags) ) > { > + if ( memflags & MEMF_no_refcount ) > + { > + unsigned long i; > + > + for ( i = 0; i < (1UL << order); i++ ) > + pg[i].count_info &= ~PGC_extra; > + } here (likely not need, should be impossible to trigger), > + > free_heap_pages(pg, order, memflags & MEMF_no_scrub); > return NULL; > } > @@ -2539,6 +2547,7 @@ void free_domheap_pages(struct page_info *pg, unsigned int order) > { > ASSERT(d->extra_pages); > d->extra_pages--; > + pg[i].count_info &= ~PGC_extra; and here. The proposal would be to go back to v9, which reduces (for the PGC) the management to colored only: > diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c ... > @@ -2382,7 +2556,7 @@ int assign_pages( > > for ( i = 0; i < nr; i++ ) > { > - ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static))); > + ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static | PGC_colored))); > if ( pg[i].count_info & PGC_extra ) > extra_pages++; > } > @@ -2442,7 +2616,8 @@ int assign_pages( > page_set_owner(&pg[i], d); > smp_wmb(); /* Domain pointer must be visible before updating refcnt. */ > pg[i].count_info = > - (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1; > + (pg[i].count_info & (PGC_extra | PGC_static | PGC_colored)) | > + PGC_allocated | 1; > > page_list_add_tail(&pg[i], page_to_list(d, &pg[i])); Thanks, Andrea
On 02.12.2024 13:51, Andrea Bastoni wrote: > The proposal would be to go back to v9, which reduces (for the PGC) > the management to colored only: > >> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c > ... >> @@ -2382,7 +2556,7 @@ int assign_pages( >> >> for ( i = 0; i < nr; i++ ) >> { >> - ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static))); >> + ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static | PGC_colored))); >> if ( pg[i].count_info & PGC_extra ) >> extra_pages++; >> } >> @@ -2442,7 +2616,8 @@ int assign_pages( >> page_set_owner(&pg[i], d); >> smp_wmb(); /* Domain pointer must be visible before updating refcnt. */ >> pg[i].count_info = >> - (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1; >> + (pg[i].count_info & (PGC_extra | PGC_static | PGC_colored)) | >> + PGC_allocated | 1; >> >> page_list_add_tail(&pg[i], page_to_list(d, &pg[i])); So what's wrong with having PGC_preserved _just_ for these two instances? Jan
© 2016 - 2024 Red Hat, Inc.