[PATCH v2] x86/gen-cpuid: correct cycle detection

Jan Beulich posted 1 patch 1 month, 4 weeks ago
Failed in applying to current master (apply log)
[PATCH v2] x86/gen-cpuid: correct cycle detection
Posted by Jan Beulich 1 month, 4 weeks ago
With the processing done linearly (rather than recursively), checking
whether any of the features was previously seen is wrong: That would
e.g. trigger for this simple set of dependencies

    X: [A, B]
    A: [C]
    B: [C]

(observed in reality when making AMX-AVX512 dependent upon both
AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
of dependents). But checking the whole accumulated set also isn't
necessary - just checking the feature we're processing dependents of is
sufficient. We may detect a cycle later that way, but we still will
detect it. What we need to avoid is adding a feature again when we've
already seen it.

As a result, seeding "seen[]" with "feat" isn't necessary anymore.

Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Doing AMX-AVX512's dependencies like mentioned above still isn't quite
right; we really need AVX512F || AVX10, which can't be expressed right
now. I'm now handling this by some custom code in the AVX10 series.

This contextually collides with patch 2 of "x86/cpu-policy: minor
adjustments", posted almost 2 years ago and still pending (afair) any
kind of feedback.
---
v2: Adjust an error message. Reduce diff / indentation some.

--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -366,7 +366,7 @@ def crunch_numbers(state):
 
     for feat in deep_features:
 
-        seen = [feat]
+        seen = []
         to_process = list(deps[feat])
 
         while len(to_process):
@@ -379,14 +379,17 @@ def crunch_numbers(state):
 
             f = to_process.pop(0)
 
+            if f == feat:
+                raise Fail("ERROR: Cycle found when processing %s" % \
+                           (state.names[f], ))
+
             if f in seen:
-                raise Fail("ERROR: Cycle found with %s when processing %s"
-                           % (state.names[f], state.names[feat]))
+                continue
 
             seen.append(f)
             to_process = list(set(to_process + deps.get(f, [])))
 
-        state.deep_deps[feat] = seen[1:]
+        state.deep_deps[feat] = seen
 
     state.deep_features = deps.keys()
     state.nr_deep_deps = len(state.deep_deps.keys())
Re: [PATCH v2] x86/gen-cpuid: correct cycle detection
Posted by Andrew Cooper 1 month, 4 weeks ago
On 01/09/2025 9:56 am, Jan Beulich wrote:
> With the processing done linearly (rather than recursively), checking
> whether any of the features was previously seen is wrong: That would
> e.g. trigger for this simple set of dependencies
>
>     X: [A, B]
>     A: [C]
>     B: [C]
>
> (observed in reality when making AMX-AVX512 dependent upon both
> AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
> of dependents). But checking the whole accumulated set also isn't
> necessary - just checking the feature we're processing dependents of is
> sufficient. We may detect a cycle later that way, but we still will
> detect it. What we need to avoid is adding a feature again when we've
> already seen it.
>
> As a result, seeding "seen[]" with "feat" isn't necessary anymore.
>
> Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, with one further
minor adjustment.

> --- a/xen/tools/gen-cpuid.py
> +++ b/xen/tools/gen-cpuid.py
> @@ -379,14 +379,17 @@ def crunch_numbers(state):
>  
>              f = to_process.pop(0)
>  
> +            if f == feat:
> +                raise Fail("ERROR: Cycle found when processing %s" % \

No need for the \ here.

~Andrew
Re: [PATCH v2] x86/gen-cpuid: correct cycle detection
Posted by Jan Beulich 1 month, 4 weeks ago
On 01.09.2025 12:31, Andrew Cooper wrote:
> On 01/09/2025 9:56 am, Jan Beulich wrote:
>> With the processing done linearly (rather than recursively), checking
>> whether any of the features was previously seen is wrong: That would
>> e.g. trigger for this simple set of dependencies
>>
>>     X: [A, B]
>>     A: [C]
>>     B: [C]
>>
>> (observed in reality when making AMX-AVX512 dependent upon both
>> AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
>> of dependents). But checking the whole accumulated set also isn't
>> necessary - just checking the feature we're processing dependents of is
>> sufficient. We may detect a cycle later that way, but we still will
>> detect it. What we need to avoid is adding a feature again when we've
>> already seen it.
>>
>> As a result, seeding "seen[]" with "feat" isn't necessary anymore.
>>
>> Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, with one further
> minor adjustment.

Thanks.

>> --- a/xen/tools/gen-cpuid.py
>> +++ b/xen/tools/gen-cpuid.py
>> @@ -379,14 +379,17 @@ def crunch_numbers(state):
>>  
>>              f = to_process.pop(0)
>>  
>> +            if f == feat:
>> +                raise Fail("ERROR: Cycle found when processing %s" % \
> 
> No need for the \ here.

Okay, but then why is there one in the commented out code you touch in the
other patch?

Jan
Re: [PATCH v2] x86/gen-cpuid: correct cycle detection
Posted by Andrew Cooper 1 month, 4 weeks ago
On 01/09/2025 12:02 pm, Jan Beulich wrote:
> On 01.09.2025 12:31, Andrew Cooper wrote:
>> On 01/09/2025 9:56 am, Jan Beulich wrote:
>>> With the processing done linearly (rather than recursively), checking
>>> whether any of the features was previously seen is wrong: That would
>>> e.g. trigger for this simple set of dependencies
>>>
>>>     X: [A, B]
>>>     A: [C]
>>>     B: [C]
>>>
>>> (observed in reality when making AMX-AVX512 dependent upon both
>>> AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
>>> of dependents). But checking the whole accumulated set also isn't
>>> necessary - just checking the feature we're processing dependents of is
>>> sufficient. We may detect a cycle later that way, but we still will
>>> detect it. What we need to avoid is adding a feature again when we've
>>> already seen it.
>>>
>>> As a result, seeding "seen[]" with "feat" isn't necessary anymore.
>>>
>>> Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>, with one further
>> minor adjustment.
> Thanks.
>
>>> --- a/xen/tools/gen-cpuid.py
>>> +++ b/xen/tools/gen-cpuid.py
>>> @@ -379,14 +379,17 @@ def crunch_numbers(state):
>>>  
>>>              f = to_process.pop(0)
>>>  
>>> +            if f == feat:
>>> +                raise Fail("ERROR: Cycle found when processing %s" % \
>> No need for the \ here.
> Okay, but then why is there one in the commented out code you touch in the
> other patch?

Oh, that's wrong too.

That will have originally been a print statement (no brackets in py2,
thus needing the line continuation) which I refactored to
sys.stderr.write() (has brackets) and didn't clean up correctly.

I'll adjust it in my patch, as I'm dropping the trailing whitespace as well.

~Andrew