[PATCH v2 00/14] x86,fs/resctrl: Improve resctrl quality and consistency

Reinette Chatre posted 14 patches 2 weeks ago
Changes since v1:
- v1: https://lore.kernel.org/lkml/cover.1772476561.git.reinette.chatre@intel.com/
- To simplify tracking, include patch to MAINTAINERS submitted separately:
  https://lore.kernel.org/lkml/4274c478922c01f9ceebc805acf991f10a95519f.1771442788.git.reinette.chatre@intel.com/
- Follow recent upstream changes to add reference to tip tree handbook in
  MAINTAINERS entry.
- Use a new pattern in resctrl that enables looping over all entries of an
  enum and makes adding new enum entries clearer, all while avoiding warnings
  when compiling with -Wswitch. (Ben)
- Add another last_cmd_status enhancement that appends "[truncated]" to
  output of info/last_cmd_status if the backing buffer overflowed.
- Please see patch changelogs for details of changes.
- Add tags.
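
To illustrate the enum item above, here is a minimal user-space sketch of one
common way to get this effect. It is not taken from the patches: the names
(demo_res_id, DEMO_NUM_RES, demo_res_name(), demo_list_names()) are
hypothetical, and the pattern in the series may differ. The assumption is that
the entry count is kept out of the enum itself, so a switch without a default
case stays clean under -Wswitch while still flagging any enumerator that is
added later and left unhandled.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical resource IDs; illustrative only, not resctrl's real enum. */
enum demo_res_id {
	DEMO_RES_L3,
	DEMO_RES_L2,
	DEMO_RES_MBA,
};

/*
 * The count is a macro tied to the last enumerator rather than an extra
 * enumerator, so no switch ever needs a case for it. When appending a new
 * entry, this is the one line that must be updated.
 */
#define DEMO_NUM_RES	(DEMO_RES_MBA + 1)

static const char *demo_res_name(enum demo_res_id id)
{
	/* No default case: -Wswitch reports any enumerator missing here. */
	switch (id) {
	case DEMO_RES_L3:
		return "L3";
	case DEMO_RES_L2:
		return "L2";
	case DEMO_RES_MBA:
		return "MB";
	}
	return "unknown";	/* only reached for out-of-range values */
}

/* Loop over every entry and write space-separated names into buf. */
static int demo_list_names(char *buf, size_t len)
{
	size_t used = 0;
	int id;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (id = 0; id < DEMO_NUM_RES; id++) {
		int n = snprintf(buf + used, len - used, "%s%s",
				 id ? " " : "",
				 demo_res_name((enum demo_res_id)id));

		if (n < 0 || (size_t)n >= len - used)
			return -1;	/* name list did not fit */
		used += (size_t)n;
	}
	return 0;
}
```

Had the count been a trailing enumerator instead, every switch without a
default case would need a dummy case for it to stay warning free.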

Hi Everybody,

This is a collection of resctrl cleanups assembled together for convenience
and simpler tracking. I'd be happy to split them up if it makes review and/or
handling easier.

Summary of changes:

- Let resctrl pass stricter checks from various tools to provide a cleaner
  baseline, with the goal of promoting healthier contributions:
  - ./tools/docs/kernel-doc -Wall -v <files>
  - Build with W=12
  - ./scripts/coccicheck
  - Static checkers

- Use accurate and consistent type for all uses of resource ID.

- In the unlikely scenario that resctrl picked a wrong CPU to read an event
  from, pass the error through to user space instead of claiming to succeed
  and returning a (wrong) result.

- Since the inception of the last_cmd_status feature there have been mismatches
  between resctrl file operation failures and the contents of
  info/last_cmd_status. This pattern keeps propagating with each new resctrl
  feature. Establish a new baseline with a new pattern that ensures
  info/last_cmd_status contains an accurate failure description that matches
  the most recent resctrl file operation failure.
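
The last_cmd_status behavior described above, together with the "[truncated]"
marker mentioned in the v2 changelog, can be modeled in user space roughly as
follows. This is a sketch under stated assumptions, not the kernel
implementation: struct demo_status and the demo_*() helpers are hypothetical,
the buffer is deliberately tiny, and the exact text and placement of the
marker in the real output may differ. The idea it shows is that every
operation first clears the status, a failure both records a message and
returns an error, and a reader is told when the message did not fit.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define DEMO_STATUS_LEN	32	/* deliberately tiny to make overflow easy */

/* Hypothetical stand-in for the buffer backing info/last_cmd_status. */
struct demo_status {
	char buf[DEMO_STATUS_LEN];
	size_t len;
	bool overflowed;
};

static void demo_status_clear(struct demo_status *s)
{
	s->buf[0] = '\0';
	s->len = 0;
	s->overflowed = false;
}

/* Append a formatted message; remember if some of it had to be dropped. */
static void demo_status_printf(struct demo_status *s, const char *fmt, ...)
{
	size_t room = sizeof(s->buf) - s->len;	/* always >= 1 */
	va_list ap;
	int n;

	if (s->overflowed)
		return;

	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->len, room, fmt, ap);
	va_end(ap);

	if (n < 0) {
		s->buf[s->len] = '\0';	/* formatting error: keep old text */
		return;
	}
	if ((size_t)n >= room) {
		s->len = strlen(s->buf);	/* keep the part that fit */
		s->overflowed = true;
		return;
	}
	s->len += (size_t)n;
}

/* Render the status as a reader of the status file would see it. */
static int demo_status_show(const struct demo_status *s, char *out, size_t len)
{
	return snprintf(out, len, "%s%s",
			s->len ? s->buf : "ok\n",
			s->overflowed ? " [truncated]\n" : "");
}

/* Hypothetical file operation: clear the status, then fail loudly. */
static int demo_write_limit(struct demo_status *s, long value, long max)
{
	demo_status_clear(s);
	if (value > max) {
		demo_status_printf(s, "Value %ld above limit %ld\n", value, max);
		return -EINVAL;
	}
	return 0;
}
```

With this shape, a failing demo_write_limit() always leaves a matching message
behind and a succeeding one leaves the status reading "ok", which is the
pairing between return code and status text that the series aims for.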

One potential open issue:

There remains an inconsistency between resctrl file operations that do not, or
cannot, fail and the contents of info/last_cmd_status. If a resctrl file
operation fails and an informational error is printed to last_cmd_status, then
a subsequent read of a resctrl file (specifically most of the files found in
info/) may succeed while info/last_cmd_status may or may not still return the
error from the previous failure.

Ensuring last_cmd_status is reset on every read carries the cost of taking
rdtgroup_mutex on several more user space initiated paths, thus increasing
contention on rdtgroup_mutex. I opted to not make this change and instead
focus this work on ensuring that last_cmd_status is accurate whenever there is
a failure during any resctrl file operation. Please let me know if you have
opinions in this regard.

Any feedback is appreciated.

Regards,

Reinette

Reinette Chatre (14):
  MAINTAINERS: Update resctrl entry
  fs/resctrl: Add missing return value descriptions
  fs/resctrl: Avoid "may be used uninitialized" warning
  fs/resctrl: Use correct format specifier for printing error pointers
  x86/resctrl: Protect against bad shift
  fs/resctrl: Change pattern used to track number of entries in enum
  fs/resctrl: Use accurate type for rdt_resource::rid
  fs/resctrl: Pass error reading event through to user space
  fs/resctrl: Add last_cmd_status support for writes to
    max_threshold_occupancy
  fs/resctrl: Use accurate and symmetric exit flows
  fs/resctrl: Use stricter checks on input to cpus/cpus_list file
  fs/resctrl: Change last_cmd_status custom during input parsing
  fs/resctrl: Communicate resource group deleted error via
    last_cmd_status
  fs/resctrl: Inform user space when status buffer overflowed

 MAINTAINERS                        |   2 +
 arch/x86/kernel/cpu/resctrl/core.c |   4 +-
 fs/resctrl/ctrlmondata.c           |  70 +++++++++------
 fs/resctrl/monitor.c               |  80 ++++++++++-------
 fs/resctrl/pseudo_lock.c           |   2 +-
 fs/resctrl/rdtgroup.c              | 137 ++++++++++++++++++-----------
 include/linux/resctrl.h            |  11 +--
 7 files changed, 188 insertions(+), 118 deletions(-)

-- 
2.50.1