Hi James,
On Wed, Aug 27, 2025 at 06:11:25PM +0100, James Morse wrote:
> Hi Dave,
>
> On 27/08/2025 11:46, Dave Martin wrote:
> > On Fri, Aug 22, 2025 at 03:29:42PM +0000, James Morse wrote:
> >> The MPAM driver identifies caches by id for use with resctrl. It
> >> needs to know the cache-id when probe-ing, but the value isn't set
> >> in cacheinfo until device_initcall().
> >>
> >> Expose the code that generates the cache-id. The parts of the MPAM
> >> driver that run early can use this to set up the resctrl structures
> >> before cacheinfo is ready in device_initcall().
>
> > Why can't the MPAM driver just consume the precomputed cache-id
> > information?
>
> Because it would need to wait until cacheinfo was ready, and it would still
> need a way of getting the cache-id for caches where all the CPUs are offline.
>
> The resctrl glue code has a waitqueue to wait for device_initcall_sync(), but that is
> asynchronous to driver probing, its triggered by the schedule_work() from the cpuhp
> callbacks. This bit is about the driver's use, which just gets probed whenever the core
> code feels like it.
>
> I toyed with always using cacheinfo for everything, and just waiting - but the MPAM driver
> already has to parse the PPTT to find the information it needs on ACPI platforms, so the
> wait would only happen on DT.
>
> It seemed simpler to grab what the value would be, instead of waiting (or probe defer) -
> especially as this is also needed for caches where all the CPUs are offline.
>
> (I'll add the offline-cpus angle to the commit message)
Ack
> > Possible reasons are that the MPAM driver probes too early,
>
> yup,
>
> > or that it
> > must parse the PPTT directly (which is true) and needs to label caches
> > consistently with the way the kernel does it.
>
> It needs to match what will be exposed to user-space from cacheinfo.
> This isn't about the PPTT, its the value that is generated for DT systems.
Right -- confused myself there. From the point of view of this series,
the usage scenario isn't clear at this point.
> The driver has to know if its ACPI or DT to call the appropriate thing to get cache-ids
> before cacheinfo is ready.
I see. This might be worth stating in the commit message.
> >> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> >> index 613410705a47..f6289d142ba9 100644
> >> --- a/drivers/base/cacheinfo.c
> >> +++ b/drivers/base/cacheinfo.c
> >> @@ -207,11 +207,10 @@ static bool match_cache_node(struct device_node *cpu,
> >> #define arch_compact_of_hwid(_x) (_x)
> >> #endif
> >>
> >> -static void cache_of_set_id(struct cacheinfo *this_leaf,
> >> - struct device_node *cache_node)
> >> +unsigned long cache_of_calculate_id(struct device_node *cache_node)
> >> {
> >> struct device_node *cpu;
> >> - u32 min_id = ~0;
> >> + unsigned long min_id = ~0UL;
>
> > Why the change of type here?
>
> This is a hang over from Rob's approach of making the cache-id 64 bit.
Ah, right.
(I have assumed that 0xffffffff is never going to clash with a valid
value.)
> > This does mean that 0xffffffff can now be generated as a valid cache-id,
> > but if that is necessary then this patch is also fixing a bug in the
> > code -- but the commit message doesn't say anything about that.
> >
> > For a patch that is just exposing an internal result, it may be
> > better to keep the original type. ~(u32)0 is already used as an
> > exceptional value.
>
> Yup, I'll fix that.
OK -- it works either way, of course, but this should make the patch a
little less noisy.
Cheers
---Dave