[PATCH] x86/gen-cpuid: correct cycle detection

Jan Beulich posted 1 patch 3 months, 2 weeks ago
Failed in applying to current master (apply log)
There is a newer version of this series
[PATCH] x86/gen-cpuid: correct cycle detection
Posted by Jan Beulich 3 months, 2 weeks ago
With the processing done linearly (rather than recursively), checking
whether any of the features was previously seen is wrong: That would
e.g. trigger for this simple set of dependencies

    X: [A, B]
    A: [C]
    B: [C]

(observed in reality when making AMX-AVX512 dependent upon both
AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
of dependents). But checking the whole accumulated set also isn't
necessary - just checking the feature we're processing dependents of is
sufficient. We may detect a cycle later that way, but we still will
detect it. What we need to avoid is adding a feature again when we've
already seen it.

As a result, seeding "seen[]" with "feat" isn't necessary anymore.

Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
Doing AMX-AVX512's dependencies like mentioned above still isn't quite
right; we really need AVX512F || AVX10, which can't be expressed right
now.

This contextually collides with patch 2 of "x86/cpu-policy: minor
adjustments", posted almost 2 years ago and still pending (afair) any
kind of feedback.

I'd like to note that the commented out code in the loop (sitting
between the two hunks beklow) doesn't really work for ARCH_CAPS: The
first unused bit (between XAPIC_STATUS and OVRCLK_STATUS) triggers

Traceback (most recent call last):
  File ".../xen/../xen/tools/gen-cpuid.py", line 608, in <module>
    sys.exit(main())
  File ".../xen/../xen/tools/gen-cpuid.py", line 602, in main
    crunch_numbers(state)
  File ".../xen/../xen/tools/gen-cpuid.py", line 382, in crunch_numbers
    (state.names[feat], repl(seen)))
  File ".../xen/../xen/tools/gen-cpuid.py", line 378, in repl
    return "[" + ", ".join((state.names[x] for x in l)) + "]"
  File ".../xen/../xen/tools/gen-cpuid.py", line 378, in <genexpr>
    return "[" + ", ".join((state.names[x] for x in l)) + "]"
KeyError: 534

(line numbers slightly shifted due to other debugging code I had added).
My Python clearly isn't good enough to try to guess how to fix that.

--- a/xen/tools/gen-cpuid.py
+++ b/xen/tools/gen-cpuid.py
@@ -350,7 +350,7 @@ def crunch_numbers(state):
 
     for feat in deep_features:
 
-        seen = [feat]
+        seen = []
         to_process = list(deps[feat])
 
         while len(to_process):
@@ -363,14 +363,14 @@ def crunch_numbers(state):
 
             f = to_process.pop(0)
 
-            if f in seen:
-                raise Fail("ERROR: Cycle found with %s when processing %s"
-                           % (state.names[f], state.names[feat]))
+            if f == feat:
+                raise Fail("ERROR: Cycle found with %s" % (state.names[f], ))
 
-            seen.append(f)
-            to_process = list(set(to_process + deps.get(f, [])))
+            if not (f in seen):
+                seen.append(f)
+                to_process = list(set(to_process + deps.get(f, [])))
 
-        state.deep_deps[feat] = seen[1:]
+        state.deep_deps[feat] = seen
 
     state.deep_features = deps.keys()
     state.nr_deep_deps = len(state.deep_deps.keys())
Re: [PATCH] x86/gen-cpuid: correct cycle detection
Posted by Andrew Cooper 2 months ago
On 16/07/2025 7:59 am, Jan Beulich wrote:
> With the processing done linearly (rather than recursively), checking
> whether any of the features was previously seen is wrong: That would
> e.g. trigger for this simple set of dependencies
>
>     X: [A, B]
>     A: [C]
>     B: [C]
>
> (observed in reality when making AMX-AVX512 dependent upon both
> AMX-TILE and AVX512F, causing XSAVE to see AMX-AVX512 twice in its list
> of dependents). But checking the whole accumulated set also isn't
> necessary - just checking the feature we're processing dependents of is
> sufficient. We may detect a cycle later that way, but we still will
> detect it. What we need to avoid is adding a feature again when we've
> already seen it.
>
> As a result, seeding "seen[]" with "feat" isn't necessary anymore.
>
> Fixes: fe4408d180f4 ("xen/x86: Generate deep dependencies of features")
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> Doing AMX-AVX512's dependencies like mentioned above still isn't quite
> right; we really need AVX512F || AVX10, which can't be expressed right
> now.
>
> This contextually collides with patch 2 of "x86/cpu-policy: minor
> adjustments", posted almost 2 years ago and still pending (afair) any
> kind of feedback.
>
> I'd like to note that the commented out code in the loop (sitting
> between the two hunks beklow) doesn't really work for ARCH_CAPS: The
> first unused bit (between XAPIC_STATUS and OVRCLK_STATUS) triggers
>
> Traceback (most recent call last):
>   File ".../xen/../xen/tools/gen-cpuid.py", line 608, in <module>
>     sys.exit(main())
>   File ".../xen/../xen/tools/gen-cpuid.py", line 602, in main
>     crunch_numbers(state)
>   File ".../xen/../xen/tools/gen-cpuid.py", line 382, in crunch_numbers
>     (state.names[feat], repl(seen)))
>   File ".../xen/../xen/tools/gen-cpuid.py", line 378, in repl
>     return "[" + ", ".join((state.names[x] for x in l)) + "]"
>   File ".../xen/../xen/tools/gen-cpuid.py", line 378, in <genexpr>
>     return "[" + ", ".join((state.names[x] for x in l)) + "]"
> KeyError: 534
>
> (line numbers slightly shifted due to other debugging code I had added).
> My Python clearly isn't good enough to try to guess how to fix that.

I've posted a fix for this.


>
> --- a/xen/tools/gen-cpuid.py
> +++ b/xen/tools/gen-cpuid.py
> @@ -350,7 +350,7 @@ def crunch_numbers(state):
>  
>      for feat in deep_features:
>  
> -        seen = [feat]
> +        seen = []
>          to_process = list(deps[feat])
>  
>          while len(to_process):
> @@ -363,14 +363,14 @@ def crunch_numbers(state):
>  
>              f = to_process.pop(0)
>  
> -            if f in seen:
> -                raise Fail("ERROR: Cycle found with %s when processing %s"
> -                           % (state.names[f], state.names[feat]))
> +            if f == feat:
> +                raise Fail("ERROR: Cycle found with %s" % (state.names[f], ))

Despite f and feat being the same now, I think this wants to keep the
other part of the sentence.  i.e. "Cycle found when processing %s".

It's a little awkward that there's no sensible way to reverse engineer
the cycle and print it, but it's also been far too long since I last did
graph theory.

>  
> -            seen.append(f)
> -            to_process = list(set(to_process + deps.get(f, [])))
> +            if not (f in seen):
> +                seen.append(f)
> +                to_process = list(set(to_process + deps.get(f, [])))

    if f not in seen:

But this will be a simpler patch if you do:

    if f in seen:
        continue

and don't change the indentation of of seen.append()

After this fix goes in, and now because the order is less relevant, I
probably ought to rewrite this to use sets rather than lists.  I have a
suspicion it can be done better than one-at-a-time; all that matters if
we don't see a repeat feature in deps.  We don't need to check the
feature bits outside of deps because they're (by definition) leaf values.

~Andrew