[PATCH v4 0/8] timers/migration: Fix three possible races and some improvements

Anna-Maria Behnsen posted 8 patches 1 year, 5 months ago
Borislav reported a warning in the timer migration deactivation path:

  https://lore.kernel.org/r/20240612090347.GBZmlkc5PwlVpOG6vT@fat_crate.local

Sadly it doesn't reproduce directly. But with a change in timing (by
adding a trace printk before the warning), it is possible to trigger the
warning reliably, at least in my test setup. The problem here is a racy
check against the group->parent pointer. This check is also used in other
places in the code; fixing this racy usage is addressed by the first patch.
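
As a rough illustration of this class of race (a user-space sketch with C11
atomics, not the code from patch 1): if a pointer that setup code may set
concurrently is read twice, the two reads can disagree. Taking one snapshot
and basing every decision on it avoids that. The names struct group,
struct walk_step, step_racy() and step_snapshot() are hypothetical and only
loosely modelled on the hierarchy described here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a group in the hierarchy. */
struct group {
	/* May be set concurrently when a new level is connected. */
	_Atomic(struct group *) parent;
};

/* Result of looking at one group during a walk towards the top. */
struct walk_step {
	bool is_top;		/* no parent was visible */
	struct group *next;	/* group to continue with, or NULL */
};

/*
 * Racy pattern: two independent loads of ->parent. A concurrent writer
 * can connect a parent between them, so is_top and next may disagree.
 */
struct walk_step step_racy(struct group *g)
{
	struct walk_step s;

	assert(g != NULL);
	s.is_top = atomic_load(&g->parent) == NULL;
	s.next = atomic_load(&g->parent);
	return s;
}

/*
 * Snapshot pattern: load ->parent once and derive both results from
 * that single value, so they are always consistent with each other.
 */
struct walk_step step_snapshot(struct group *g)
{
	struct walk_step s;
	struct group *parent;

	assert(g != NULL);
	parent = atomic_load_explicit(&g->parent, memory_order_acquire);
	s.is_top = (parent == NULL);
	s.next = parent;
	return s;
}
```

Without a concurrent writer both helpers return the same result; the
difference only shows up when ->parent changes between the two loads in
step_racy().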

Two other races in the setup path were reported by Frederic:

  https://lore.kernel.org/r/ZnWOswTMML6ShzYO@localhost.localdomain

  https://lore.kernel.org/r/ZnoIlO22habOyQRe@lothringen

Both of those races are addressed by the change in patch 2.

Some updates/cleanups are provided by patches 3-8 ("timers/migration:
Improve tracing" and "timers/migration: Spare write when nothing changed"
are the same as in v2).

Patches are available here:

  https://git.kernel.org/pub/scm/linux/kernel/git/anna-maria/linux-devel.git timers/misc

---
Changes in v4:
- Update Patch 2: Fix broken cpuhp_setup_state() call for prepare
- Update Patch 2: Activate child during setup only when it is an already
  existing group
- Update Patch 2: Change init into early_initcall() to make use of the
  preparation by an already active CPU.
- Update Patch 2: Move initialization of tmc in tmigr_cpu_prepare() before
  using data of tmc (e.g. by a tracepoint)
- Update Patch 5: Use proper childmask for tmigr_walk in __walk_groups()
- Update Patch 6: Fix missing update of s/childmask/groupmask in
  connect_[cpu|child]_parent tracepoint and update to change of Patch 5
- Link to v3: https://lore.kernel.org/r/20240701-tmigr-fixes-v3-0-25cd5de318fb@linutronix.de

Changes in v3:
- Address the newly reported possible race (childmask and parent pointer)
  together with the existing race (both reported by Frederic).
- New cleanup: Two patches to access childmask and parent pointer only in
  one place
- New cleanup: Rename childmask to parentmask as during discussions there
  was some kind of confusion because of the naming
- New cleanup: Fix typo
- Fix prefix in all patches (s$timer_migration$timers/migration$)
- Link to v2: https://lore.kernel.org/r/20240624-tmigr-fixes-v2-0-3eb4c0604790@linutronix.de

Changes in v2:
- Address another possible race in the setup code (reported by Frederic)
  and therefore recycle one improvement patch
- Change order and move the already existing improvement patch to the end
  of the queue
- Existing patches didn't change
- Link to v1: https://lore.kernel.org/r/20240621-tmigr-fixes-v1-0-8c8a2d8e8d77@linutronix.de

Thanks,

        Anna-Maria

---
Anna-Maria Behnsen (8):
      timers/migration: Do not rely always on group->parent
      timers/migration: Move hierarchy setup into cpuhotplug prepare callback
      timers/migration: Improve tracing
      timers/migration: Use a single struct for hierarchy walk data
      timers/migration: Read childmask and parent pointer in a single place
      timers/migration: Rename childmask by groupmask to make naming more obvious
      timers/migration: Spare write when nothing changed
      timers/migration: Fix grammar in comment

 include/linux/cpuhotplug.h             |   1 +
 include/trace/events/timer_migration.h |  16 +-
 kernel/time/timer_migration.c          | 383 ++++++++++++++++-----------------
 kernel/time/timer_migration.h          |  27 ++-
 4 files changed, 214 insertions(+), 213 deletions(-)