With the processing done linearly (rather than recursively), checking
whether any of the features was previously seen is wrong: That would
e.g. trigger for this simple set of dependencies

    X: [A, B]
    A: [C]
    B: [C]

(observed in reality when making AMX-AVX512 dependent upon both
AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
of dependents). But checking the whole accumulated set also isn't
necessary - just checking the feature we're processing dependents of is
sufficient. We may detect a cycle later that way (only once the outer
loop reaches a feature which is itself part of the cycle), but we still
will detect it. What we need to avoid is adding a feature again when
we've already seen it.

As a result, seeding "seen[]" with "feat" isn't necessary anymore.

Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Doing AMX-AVX512's dependencies like mentioned above still isn't quite
right; we really need AVX512F || AVX10, which can't be expressed right
now. I'm now handling this by some custom code in the AVX10 series.
This contextually collides with patch 2 of "x86/cpu-policy: minor
adjustments", posted almost 2 years ago and still pending (afair) any
kind of feedback.
---
v2: Adjust an error message. Reduce diff / indentation some.
--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -366,7 +366,7 @@ def crunch_numbers(state):
     for feat in deep_features:
-        seen = [feat]
+        seen = []
         to_process = list(deps[feat])
         while len(to_process):
@@ -379,14 +379,17 @@ def crunch_numbers(state):
             f = to_process.pop(0)
+            if f == feat:
+                raise Fail("ERROR: Cycle found when processing %s" % \
+                           (state.names[f], ))
+
             if f in seen:
-                raise Fail("ERROR: Cycle found with %s when processing %s"
-                           % (state.names[f], state.names[feat]))
+                continue
             seen.append(f)
             to_process = list(set(to_process + deps.get(f, [])))
-        state.deep_deps[feat] = seen[1:]
+        state.deep_deps[feat] = seen
     state.deep_features = deps.keys()
     state.nr_deep_deps = len(state.deep_deps.keys())
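
The false positive described in the commit message can be reproduced
outside of gen-cpuid.py. What follows is only a minimal sketch, not the
script's code: deep_deps_old() and deep_deps_new() are hypothetical
helper names, features are plain strings rather than bit numbers, Fail
is a local stand-in exception, and the queue is extended with a plain
list (instead of the set() based merge in the script) so that the
processing order is deterministic.

```python
class Fail(Exception):
    """Local stand-in for the error type used by the generator script."""


def deep_deps_old(deps, feat):
    # Old check: any feature reached twice is reported as a cycle.
    seen = [feat]
    to_process = list(deps[feat])
    while to_process:
        f = to_process.pop(0)
        if f in seen:
            raise Fail("Cycle found with %s when processing %s" % (f, feat))
        seen.append(f)
        # Assumption: plain extend, to keep the order deterministic.
        to_process.extend(deps.get(f, []))
    return seen[1:]


def deep_deps_new(deps, feat):
    # New check: only getting back to "feat" itself is a cycle; a feature
    # reached a second time via another path is simply skipped.
    seen = []
    to_process = list(deps[feat])
    while to_process:
        f = to_process.pop(0)
        if f == feat:
            raise Fail("Cycle found when processing %s" % (feat,))
        if f in seen:
            continue
        seen.append(f)
        to_process.extend(deps.get(f, []))
    return seen


# The dependency set from the commit message: C is reachable via A and B.
diamond = {"X": ["A", "B"], "A": ["C"], "B": ["C"]}

# A cycle which does not involve X: it is skipped over while processing X
# and reported once A (or B) is the feature being processed.
cycle = {"X": ["A"], "A": ["B"], "B": ["A"]}

if __name__ == "__main__":
    print(deep_deps_new(diamond, "X"))  # ['A', 'B', 'C']
    try:
        deep_deps_old(diamond, "X")
    except Fail as e:
        print("old check:", e)
```

With the "cycle" input, deep_deps_new(cycle, "X") returns normally,
while deep_deps_new(cycle, "A") raises Fail. This is the "detect it
later" case: the loop over all features with dependents still gets to
A eventually.
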
On 01/09/2025 9:56 am, Jan Beulich wrote:
> With the processing done linearly (rather than recursively), checking
> whether any of the features was previously seen is wrong: That would
> e.g. trigger for this simple set of dependencies
>
> X: [A, B]
> A: [C]
> B: [C]
>
> (observed in reality when making AMX-AVX512 dependent upon both
> AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
> of dependents). But checking the whole accumulated set also isn't
> necessary - just checking the feature we're processing dependents of is
> sufficient. We may detect a cycle later that way, but we still will
> detect it. What we need to avoid is adding a feature again when we've
> already seen it.
>
> As a result, seeding "seen[]" with "feat" isn't necessary anymore.
>
> Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, with one further
minor adjustment.

> --- a/xen/tools/gen-cpuid.py
> +++ b/xen/tools/gen-cpuid.py
> @@ -379,14 +379,17 @@ def crunch_numbers(state):
>
> f = to_process.pop(0)
>
> + if f == feat:
> + raise Fail("ERROR: Cycle found when processing %s" % \

No need for the \ here.

~Andrew

On 01.09.2025 12:31, Andrew Cooper wrote:
> On 01/09/2025 9:56 am, Jan Beulich wrote:
>> With the processing done linearly (rather than recursively), checking
>> whether any of the features was previously seen is wrong: That would
>> e.g. trigger for this simple set of dependencies
>>
>> X: [A, B]
>> A: [C]
>> B: [C]
>>
>> (observed in reality when making AMX-AVX512 dependent upon both
>> AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
>> of dependents). But checking the whole accumulated set also isn't
>> necessary - just checking the feature we're processing dependents of is
>> sufficient. We may detect a cycle later that way, but we still will
>> detect it. What we need to avoid is adding a feature again when we've
>> already seen it.
>>
>> As a result, seeding "seen[]" with "feat" isn't necessary anymore.
>>
>> Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, with one further
> minor adjustment.

Thanks.

>> --- a/xen/tools/gen-cpuid.py
>> +++ b/xen/tools/gen-cpuid.py
>> @@ -379,14 +379,17 @@ def crunch_numbers(state):
>>
>> f = to_process.pop(0)
>>
>> + if f == feat:
>> + raise Fail("ERROR: Cycle found when processing %s" % \
>
> No need for the \ here.

Okay, but then why is there one in the commented out code you touch in the
other patch?

Jan

On 01/09/2025 12:02 pm, Jan Beulich wrote:
> On 01.09.2025 12:31, Andrew Cooper wrote:
>> On 01/09/2025 9:56 am, Jan Beulich wrote:
>>> With the processing done linearly (rather than recursively), checking
>>> whether any of the features was previously seen is wrong: That would
>>> e.g. trigger for this simple set of dependencies
>>>
>>> X: [A, B]
>>> A: [C]
>>> B: [C]
>>>
>>> (observed in reality when making AMX-AVX512 dependent upon both
>>> AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
>>> of dependents). But checking the whole accumulated set also isn't
>>> necessary - just checking the feature we're processing dependents of is
>>> sufficient. We may detect a cycle later that way, but we still will
>>> detect it. What we need to avoid is adding a feature again when we've
>>> already seen it.
>>>
>>> As a result, seeding "seen[]" with "feat" isn't necessary anymore.
>>>
>>> Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, with one further
>> minor adjustment.
> Thanks.
>
>>> --- a/xen/tools/gen-cpuid.py
>>> +++ b/xen/tools/gen-cpuid.py
>>> @@ -379,14 +379,17 @@ def crunch_numbers(state):
>>>
>>> f = to_process.pop(0)
>>>
>>> + if f == feat:
>>> + raise Fail("ERROR: Cycle found when processing %s" % \
>> No need for the \ here.
> Okay, but then why is there one in the commented out code you touch in the
> other patch?

Oh, that's wrong too.

That will have originally been a print statement (no brackets in py2,
thus needing the line continuation) which I refactored to
sys.stderr.write() (has brackets) and didn't clean up correctly.

I'll adjust it in my patch, as I'm dropping the trailing whitespace as well.

~Andrew
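
For reference, the point about the line continuation can be checked in
isolation. This is a small sketch with a made-up feature name, not code
from either patch: inside brackets Python joins lines implicitly, so
the trailing backslash is harmless but redundant.

```python
name = "AVX512F"  # made-up value, for illustration only

# Explicit continuation inside parentheses: accepted, but redundant.
with_backslash = ("ERROR: Cycle found when processing %s" % \
                  (name, ))

# Implicit continuation: the open parenthesis already joins the lines.
without_backslash = ("ERROR: Cycle found when processing %s" %
                     (name, ))

assert with_backslash == without_backslash
```
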