[PATCH v17 28/47] dept: add documentation for dept

Byungchul Park posted 47 patches 2 months, 2 weeks ago
There is a newer version of this series
[PATCH v17 28/47] dept: add documentation for dept
Posted by Byungchul Park 2 months, 2 weeks ago
This document describes the concept and APIs of dept.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
 Documentation/dependency/dept_api.txt | 117 ++++
 2 files changed, 852 insertions(+)
 create mode 100644 Documentation/dependency/dept.txt
 create mode 100644 Documentation/dependency/dept_api.txt

diff --git a/Documentation/dependency/dept.txt b/Documentation/dependency/dept.txt
new file mode 100644
index 000000000000..5dd358b96734
--- /dev/null
+++ b/Documentation/dependency/dept.txt
@@ -0,0 +1,735 @@
+DEPT(DEPendency Tracker)
+========================
+
+Started by Byungchul Park <max.byungchul.park@sk.com>
+
+How lockdep works
+-----------------
+
+Lockdep detects a deadlock by checking lock acquisition order. For
+example, a graph to track acquisition order built by lockdep might look
+like:
+
+   A -> B -
+           \
+            -> E
+           /
+   C -> D -
+
+   where 'A -> B' means that acquisition A is prior to acquisition B
+   with A still held.
+
+Lockdep keeps adding each new acquisition order into the graph in
+runtime. For example, 'E -> C' will be added when the two locks have
+been acquired in the order, E and then C. The graph will look like:
+
+       A -> B -
+               \
+                -> E -
+               /      \
+    -> C -> D -        \
+   /                   /
+   \                  /
+    ------------------
+
+   where 'A -> B' means that acquisition A is prior to acquisition B
+   with A still held.
+
+This graph contains a subgraph that demonstrates a loop like:
+
+                -> E -
+               /      \
+    -> C -> D -        \
+   /                   /
+   \                  /
+    ------------------
+
+   where 'A -> B' means that acquisition A is prior to acquisition B
+   with A still held.
+
+Lockdep reports it as a deadlock on detection of a loop and stops its
+working.
+
+CONCLUSION
+
+Lockdep detects a deadlock by checking if a loop has been created after
+adding a new acquisition order into the graph.
+
+
+Limitation of lockdep
+---------------------
+
+Lockdep deals with a deadlock by typical lock e.g. spinlock and mutex,
+that are supposed to be released within the acquisition context. However,
+when it comes to a deadlock by folio lock that is not supposed to be
+released within the acquisition context or other general synchronization
+mechanisms, lockdep doesn't work.
+
+Can lockdep detect the following deadlock?
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+   folio_lock B
+		   folio_lock B <- DEADLOCK
+				   mutex_lock A <- DEADLOCK
+				   folio_unlock B
+		   folio_unlock B
+		   mutex_unlock A
+				   mutex_unlock A
+
+No. What about the following?
+
+   context X		   context Y
+
+			   mutex_lock A
+   mutex_lock A <- DEADLOCK
+			   wait_for_complete B <- DEADLOCK
+   complete B
+			   mutex_unlock A
+   mutex_unlock A
+
+No.
+
+CONCLUSION
+
+Lockdep cannot detect a deadlock by folio lock or other general
+synchronization mechanisms.
+
+
+What leads a deadlock
+---------------------
+
+A deadlock occurs when one or multi contexts are waiting for events that
+will never happen. For example:
+
+   context X	   context Y	   context Z
+
+   |		   |		   |
+   v		   |		   |
+   1 wait for A    v		   |
+   .		   2 wait for C    v
+   event C	   .		   3 wait for B
+		   event B	   .
+				   event A
+
+Event C cannot be triggered because context X is stuck at 1, event B
+cannot be triggered because context Y is stuck at 2, and event A cannot
+be triggered because context Z is stuck at 3. All the contexts are stuck.
+We call this *deadlock*.
+
+If an event occurrence is a prerequisite to reaching another event, we
+call it *dependency*. In this example:
+
+   Event A occurrence is a prerequisite to reaching event C.
+   Event C occurrence is a prerequisite to reaching event B.
+   Event B occurrence is a prerequisite to reaching event A.
+
+In terms of dependency:
+
+   Event C depends on event A.
+   Event B depends on event C.
+   Event A depends on event B.
+
+Dependency graph reflecting this example will look like:
+
+    -> C -> A -> B -
+   /                \
+   \                /
+    ----------------
+
+   where 'A -> B' means that event A depends on event B.
+
+A circular dependency exists. Such a circular dependency leads a
+deadlock since no waiters can have desired events triggered.
+
+CONCLUSION
+
+A circular dependency of events leads a deadlock.
+
+
+Introduce DEPT
+--------------
+
+DEPT(DEPendency Tracker) tracks wait and event instead of lock
+acquisition order so as to recognize the following situation:
+
+   context X	   context Y	   context Z
+
+   |		   |		   |
+   v		   |		   |
+   wait for A	   v		   |
+   .		   wait for C	   v
+   event C	   .		   wait for B
+		   event B	   .
+				   event A
+
+and builds up a dependency graph in runtime that is similar to lockdep.
+The graph might look like:
+
+    -> C -> A -> B -
+   /                \
+   \                /
+    ----------------
+
+   where 'A -> B' means that event A depends on event B.
+
+DEPT keeps adding each new dependency into the graph in runtime. For
+example, 'B -> D' will be added when event D occurrence is a
+prerequisite to reaching event B like:
+
+   |
+   v
+   wait for D
+   .
+   event B
+
+After the addition, the graph will look like:
+
+                     -> D
+                    /
+    -> C -> A -> B -
+   /                \
+   \                /
+    ----------------
+
+   where 'A -> B' means that event A depends on event B.
+
+DEPT is going to report a deadlock on detection of a new loop.
+
+CONCLUSION
+
+DEPT works on wait and event so as to theoretically detect all the
+potential deadlocks.
+
+
+How DEPT works
+--------------
+
+Let's take a look how DEPT works with the 1st example in the section
+'Limitation of lockdep'.
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+   folio_lock B
+		   folio_lock B <- DEADLOCK
+				   mutex_lock A <- DEADLOCK
+				   folio_unlock B
+		   folio_unlock B
+		   mutex_unlock A
+				   mutex_unlock A
+
+Adding comments to describe DEPT's view in terms of wait and event:
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+		   /* wait for A */
+   folio_lock B
+   /* wait for A */
+   /* start event A context */
+
+		   folio_lock B
+		   /* wait for B */ <- DEADLOCK
+		   /* start event B context */
+
+				   mutex_lock A
+				   /* wait for A */ <- DEADLOCK
+				   /* start event A context */
+
+				   folio_unlock B
+				   /* event B */
+		   folio_unlock B
+		   /* event B */
+
+		   mutex_unlock A
+		   /* event A */
+				   mutex_unlock A
+				   /* event A */
+
+Adding more supplementary comments to describe DEPT's view in detail:
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+		   /* might wait for A */
+		   /* start to take into account event A's context */
+		   /* 1 */
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 2 */
+
+		   folio_lock B
+		   /* might wait for B */ <- DEADLOCK
+		   /* start to take into account event B's context */
+		   /* 3 */
+
+				   mutex_lock A
+				   /* might wait for A */ <- DEADLOCK
+				   /* start to take into account
+				      event A's context */
+				   /* 4 */
+
+				   folio_unlock B
+				   /* event B that's been valid since 2 */
+		   folio_unlock B
+		   /* event B that's been valid since 3 */
+
+		   mutex_unlock A
+		   /* event A that's been valid since 1 */
+
+				   mutex_unlock A
+				   /* event A that's been valid since 4 */
+
+Let's build up dependency graph with this example. Firstly, context X:
+
+   context X
+
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 2 */
+
+There are no events to create dependency. Next, context Y:
+
+   context Y
+
+   mutex_lock A
+   /* might wait for A */
+   /* start to take into account event A's context */
+   /* 1 */
+
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 3 */
+
+   folio_unlock B
+   /* event B that's been valid since 3 */
+
+   mutex_unlock A
+   /* event A that's been valid since 1 */
+
+There are two events. For event B, folio_unlock B, since there are no
+waits between 3 and the event, event B does not create dependency. For
+event A, there is a wait, folio_lock B, between 1 and the event. Which
+means event A cannot be triggered if event B does not wake up the wait.
+Therefore, we can say event A depends on event B, say, 'A -> B'. The
+graph will look like after adding the dependency:
+
+   A -> B
+
+   where 'A -> B' means that event A depends on event B.
+
+Lastly, context Z:
+
+   context Z
+
+   mutex_lock A
+   /* might wait for A */
+   /* start to take into account event A's context */
+   /* 4 */
+
+   folio_unlock B
+   /* event B that's been valid since 2 */
+
+   mutex_unlock A
+   /* event A that's been valid since 4 */
+
+There are also two events. For event B, folio_unlock B, there is a
+wait, mutex_lock A, between 2 and the event - remind 2 is at a very
+start and before the wait in timeline. Which means event B cannot be
+triggered if event A does not wake up the wait. Therefore, we can say
+event B depends on event A, say, 'B -> A'. The graph will look like
+after adding the dependency:
+
+    -> A -> B -
+   /           \
+   \           /
+    -----------
+
+   where 'A -> B' means that event A depends on event B.
+
+A new loop has been created. So DEPT can report it as a deadlock. For
+event A, mutex_unlock A, since there are no waits between 4 and the
+event, event A does not create dependency. That's it.
+
+CONCLUSION
+
+DEPT works well with any general synchronization mechanisms by focusing
+on wait, event and its context.
+
+
+Interpret DEPT report
+---------------------
+
+The following is the example in the section 'How DEPT works'.
+
+   context X	   context Y	   context Z
+
+		   mutex_lock A
+		   /* might wait for A */
+		   /* start to take into account event A's context */
+		   /* 1 */
+   folio_lock B
+   /* might wait for B */
+   /* start to take into account event B's context */
+   /* 2 */
+
+		   folio_lock B
+		   /* might wait for B */ <- DEADLOCK
+		   /* start to take into account event B's context */
+		   /* 3 */
+
+				   mutex_lock A
+				   /* might wait for A */ <- DEADLOCK
+				   /* start to take into account
+				      event A's context */
+				   /* 4 */
+
+				   folio_unlock B
+				   /* event B that's been valid since 2 */
+		   folio_unlock B
+		   /* event B that's been valid since 3 */
+
+		   mutex_unlock A
+		   /* event A that's been valid since 1 */
+
+				   mutex_unlock A
+				   /* event A that's been valid since 4 */
+
+We can Simplify this by replacing each waiting point with [W], each
+point where its event's context starts with [S] and each event with [E].
+This example will look like after the replacement:
+
+   context X	   context Y	   context Z
+
+		   [W][S] mutex_lock A
+   [W][S] folio_lock B
+		   [W][S] folio_lock B <- DEADLOCK
+
+				   [W][S] mutex_lock A <- DEADLOCK
+				   [E] folio_unlock B
+		   [E] folio_unlock B
+		   [E] mutex_unlock A
+				   [E] mutex_unlock A
+
+DEPT uses the symbols [W], [S] and [E] in its report as described above.
+The following is an example reported by DEPT for a real problem.
+
+   Link: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SAKURA.ne.jp/#t
+   Link: https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/
+
+   ===================================================
+   DEPT: Circular dependency has been detected.
+   6.2.0-rc1-00025-gb0c20ebf51ac-dirty #28 Not tainted
+   ---------------------------------------------------
+   summary
+   ---------------------------------------------------
+   *** DEADLOCK ***
+
+   context A
+       [S] lock(&ni->ni_lock:0)
+       [W] folio_wait_bit_common(PG_locked_map:0)
+       [E] unlock(&ni->ni_lock:0)
+
+   context B
+       [S] (unknown)(PG_locked_map:0)
+       [W] lock(&ni->ni_lock:0)
+       [E] folio_unlock(PG_locked_map:0)
+
+   [S]: start of the event context
+   [W]: the wait blocked
+   [E]: the event not reachable
+   ---------------------------------------------------
+   context A's detail
+   ---------------------------------------------------
+   context A
+       [S] lock(&ni->ni_lock:0)
+       [W] folio_wait_bit_common(PG_locked_map:0)
+       [E] unlock(&ni->ni_lock:0)
+
+   [S] lock(&ni->ni_lock:0):
+   [<ffffffff82b396fb>] ntfs3_setattr+0x54b/0xd40
+   stacktrace:
+         ntfs3_setattr+0x54b/0xd40
+         notify_change+0xcb3/0x1430
+         do_truncate+0x149/0x210
+         path_openat+0x21a3/0x2a90
+         do_filp_open+0x1ba/0x410
+         do_sys_openat2+0x16d/0x4e0
+         __x64_sys_creat+0xcd/0x120
+         do_syscall_64+0x41/0xc0
+         entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+   [W] folio_wait_bit_common(PG_locked_map:0):
+   [<ffffffff81b228b0>] truncate_inode_pages_range+0x9b0/0xf20
+   stacktrace:
+         folio_wait_bit_common+0x5e0/0xaf0
+         truncate_inode_pages_range+0x9b0/0xf20
+         truncate_pagecache+0x67/0x90
+         ntfs3_setattr+0x55a/0xd40
+         notify_change+0xcb3/0x1430
+         do_truncate+0x149/0x210
+         path_openat+0x21a3/0x2a90
+         do_filp_open+0x1ba/0x410
+         do_sys_openat2+0x16d/0x4e0
+         __x64_sys_creat+0xcd/0x120
+         do_syscall_64+0x41/0xc0
+         entry_SYSCALL_64_after_hwframe+0x63/0xcd
+
+   [E] unlock(&ni->ni_lock:0):
+   (N/A)
+   ---------------------------------------------------
+   context B's detail
+   ---------------------------------------------------
+   context B
+       [S] (unknown)(PG_locked_map:0)
+       [W] lock(&ni->ni_lock:0)
+       [E] folio_unlock(PG_locked_map:0)
+
+   [S] (unknown)(PG_locked_map:0):
+   (N/A)
+
+   [W] lock(&ni->ni_lock:0):
+   [<ffffffff82b009ec>] attr_data_get_block+0x32c/0x19f0
+   stacktrace:
+         attr_data_get_block+0x32c/0x19f0
+         ntfs_get_block_vbo+0x264/0x1330
+         __block_write_begin_int+0x3bd/0x14b0
+         block_write_begin+0xb9/0x4d0
+         ntfs_write_begin+0x27e/0x480
+         generic_perform_write+0x256/0x570
+         __generic_file_write_iter+0x2ae/0x500
+         ntfs_file_write_iter+0x66d/0x1d70
+         do_iter_readv_writev+0x20b/0x3c0
+         do_iter_write+0x188/0x710
+         vfs_iter_write+0x74/0xa0
+         iter_file_splice_write+0x745/0xc90
+         direct_splice_actor+0x114/0x180
+         splice_direct_to_actor+0x33b/0x8b0
+         do_splice_direct+0x1b7/0x280
+         do_sendfile+0xb49/0x1310
+
+   [E] folio_unlock(PG_locked_map:0):
+   [<ffffffff81f10222>] generic_write_end+0xf2/0x440
+   stacktrace:
+         generic_write_end+0xf2/0x440
+         ntfs_write_end+0x42e/0x980
+         generic_perform_write+0x316/0x570
+         __generic_file_write_iter+0x2ae/0x500
+         ntfs_file_write_iter+0x66d/0x1d70
+         do_iter_readv_writev+0x20b/0x3c0
+         do_iter_write+0x188/0x710
+         vfs_iter_write+0x74/0xa0
+         iter_file_splice_write+0x745/0xc90
+         direct_splice_actor+0x114/0x180
+         splice_direct_to_actor+0x33b/0x8b0
+         do_splice_direct+0x1b7/0x280
+         do_sendfile+0xb49/0x1310
+         __x64_sys_sendfile64+0x1d0/0x210
+         do_syscall_64+0x41/0xc0
+         entry_SYSCALL_64_after_hwframe+0x63/0xcd
+   ---------------------------------------------------
+   information that might be helpful
+   ---------------------------------------------------
+   CPU: 1 PID: 8060 Comm: a.out Not tainted
+	6.2.0-rc1-00025-gb0c20ebf51ac-dirty #28
+   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
+	BIOS Bochs 01/01/2011
+   Call Trace:
+    <TASK>
+    dump_stack_lvl+0xf2/0x169
+    print_circle.cold+0xca4/0xd28
+    ? lookup_dep+0x240/0x240
+    ? extend_queue+0x223/0x300
+    cb_check_dl+0x1e7/0x260
+    bfs+0x27b/0x610
+    ? print_circle+0x240/0x240
+    ? llist_add_batch+0x180/0x180
+    ? extend_queue_rev+0x300/0x300
+    ? __add_dep+0x60f/0x810
+    add_dep+0x221/0x5b0
+    ? __add_idep+0x310/0x310
+    ? add_iecxt+0x1bc/0xa60
+    ? add_iecxt+0x1bc/0xa60
+    ? add_iecxt+0x1bc/0xa60
+    ? add_iecxt+0x1bc/0xa60
+    __dept_wait+0x600/0x1490
+    ? add_iecxt+0x1bc/0xa60
+    ? truncate_inode_pages_range+0x9b0/0xf20
+    ? check_new_class+0x790/0x790
+    ? dept_enirq_transition+0x519/0x9c0
+    dept_wait+0x159/0x3b0
+    ? truncate_inode_pages_range+0x9b0/0xf20
+    folio_wait_bit_common+0x5e0/0xaf0
+    ? filemap_get_folios_contig+0xa30/0xa30
+    ? dept_enirq_transition+0x519/0x9c0
+    ? lock_is_held_type+0x10e/0x160
+    ? lock_is_held_type+0x11e/0x160
+    truncate_inode_pages_range+0x9b0/0xf20
+    ? truncate_inode_partial_folio+0xba0/0xba0
+    ? setattr_prepare+0x142/0xc40
+    truncate_pagecache+0x67/0x90
+    ntfs3_setattr+0x55a/0xd40
+    ? ktime_get_coarse_real_ts64+0x1e5/0x2f0
+    ? ntfs_extend+0x5c0/0x5c0
+    ? mode_strip_sgid+0x210/0x210
+    ? ntfs_extend+0x5c0/0x5c0
+    notify_change+0xcb3/0x1430
+    ? do_truncate+0x149/0x210
+    do_truncate+0x149/0x210
+    ? file_open_root+0x430/0x430
+    ? process_measurement+0x18c0/0x18c0
+    ? ntfs_file_release+0x230/0x230
+    path_openat+0x21a3/0x2a90
+    ? path_lookupat+0x840/0x840
+    ? dept_enirq_transition+0x519/0x9c0
+    ? lock_is_held_type+0x10e/0x160
+    do_filp_open+0x1ba/0x410
+    ? may_open_dev+0xf0/0xf0
+    ? find_held_lock+0x2d/0x110
+    ? lock_release+0x43c/0x830
+    ? dept_ecxt_exit+0x31a/0x590
+    ? _raw_spin_unlock+0x3b/0x50
+    ? alloc_fd+0x2de/0x6e0
+    do_sys_openat2+0x16d/0x4e0
+    ? __ia32_sys_get_robust_list+0x3b0/0x3b0
+    ? build_open_flags+0x6f0/0x6f0
+    ? dept_enirq_transition+0x519/0x9c0
+    ? dept_enirq_transition+0x519/0x9c0
+    ? lock_is_held_type+0x4e/0x160
+    ? lock_is_held_type+0x4e/0x160
+    __x64_sys_creat+0xcd/0x120
+    ? __x64_compat_sys_openat+0x1f0/0x1f0
+    do_syscall_64+0x41/0xc0
+    entry_SYSCALL_64_after_hwframe+0x63/0xcd
+   RIP: 0033:0x7f8b9e4e4469
+   Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48
+   89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48>
+   3d 01 f0 ff ff 73 01 c3 48 8b 0d ff 49 2b 00 f7 d8 64 89 01 48
+   RSP: 002b:00007f8b9eea4ef8 EFLAGS: 00000202 ORIG_RAX: 0000000000000055
+   RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f8b9e4e4469
+   RDX: 0000000000737562 RSI: 0000000000000000 RDI: 0000000020000000
+   RBP: 00007f8b9eea4f20 R08: 0000000000000000 R09: 0000000000000000
+   R10: 0000000000000000 R11: 0000000000000202 R12: 00007fffa75511ee
+   R13: 00007fffa75511ef R14: 00007f8b9ee85000 R15: 0000000000000003
+    </TASK>
+
+Let's take a look at the summary that is the most important part.
+
+   ---------------------------------------------------
+   summary
+   ---------------------------------------------------
+   *** DEADLOCK ***
+
+   context A
+       [S] lock(&ni->ni_lock:0)
+       [W] folio_wait_bit_common(PG_locked_map:0)
+       [E] unlock(&ni->ni_lock:0)
+
+   context B
+       [S] (unknown)(PG_locked_map:0)
+       [W] lock(&ni->ni_lock:0)
+       [E] folio_unlock(PG_locked_map:0)
+
+   [S]: start of the event context
+   [W]: the wait blocked
+   [E]: the event not reachable
+
+The summary shows the following scenario:
+
+   context A	   context B	   context ?(unknown)
+
+				   [S] folio_lock(&f1)
+   [S] lock(&ni->ni_lock:0)
+   [W] folio_wait_bit_common(PG_locked_map:0)
+
+		   [W] lock(&ni->ni_lock:0)
+		   [E] folio_unlock(&f1)
+
+   [E] unlock(&ni->ni_lock:0)
+
+Adding supplementary comments to describe DEPT's view in detail:
+
+   context A	   context B	   context ?(unknown)
+
+				   [S] folio_lock(&f1)
+				   /* start to take into account context
+				      B heading for folio_unlock(&f1) */
+				   /* 1 */
+   [S] lock(&ni->ni_lock:0)
+   /* start to take into account this context heading for
+      unlock(&ni->ni_lock:0) */
+   /* 2 */
+
+   [W] folio_wait_bit_common(PG_locked_map:0) (= folio_lock(&f1))
+   /* might wait for folio_unlock(&f1) */
+
+		   [W] lock(&ni->ni_lock:0)
+		   /* might wait for unlock(&ni->ni_lock:0) */
+
+		   [E] folio_unlock(&f1)
+		   /* event that's been valid since 1 */
+
+   [E] unlock(&ni->ni_lock:0)
+   /* event that's been valid since 2 */
+
+Let's build up dependency graph with this report. Firstly, context A:
+
+   context A
+
+   [S] lock(&ni->ni_lock:0)
+   /* start to take into account this context heading for
+      unlock(&ni->ni_lock:0) */
+   /* 2 */
+
+   [W] folio_wait_bit_common(PG_locked_map:0) (= folio_lock(&f1))
+   /* might wait for folio_unlock(&f1) */
+
+   [E] unlock(&ni->ni_lock:0)
+   /* event that's been valid since 2 */
+
+There is one interesting event, unlock(&ni->ni_lock:0). There is a
+wait, folio_lock(&f1), between 2 and the event. Which means
+unlock(&ni->ni_lock:0) is not reachable if folio_unlock(&f1) does not
+wake up the wait. Therefore, we can say unlock(&ni->ni_lock:0) depends
+on folio_unlock(&f1), say, 'unlock(&ni->ni_lock:0) -> folio_unlock(&f1)'.
+The graph will look like after adding the dependency:
+
+   unlock(&ni->ni_lock:0) -> folio_unlock(&f1)
+
+   where 'A -> B' means that event A depends on event B.
+
+Secondly, context B:
+
+   context B
+
+   [W] lock(&ni->ni_lock:0)
+   /* might wait for unlock(&ni->ni_lock:0) */
+
+   [E] folio_unlock(&f1)
+   /* event that's been valid since 1 */
+
+There is also one interesting event, folio_unlock(&f1). There is a
+wait, lock(&ni->ni_lock:0), between 1 and the event - remind 1 is at a
+very start and before the wait in timeline. Which means folio_unlock(&f1)
+is not reachable if unlock(&ni->ni_lock:0) does not wake up the wait.
+Therefore, we can say folio_unlock(&f1) depends on unlock(&ni->ni_lock:0),
+say, 'folio_unlock(&f1) -> unlock(&ni->ni_lock:0)'. The graph will look
+like after adding the dependency:
+
+    -> unlock(&ni->ni_lock:0) -> folio_unlock(&f1) -
+   /                                                \
+   \                                                /
+    ------------------------------------------------
+
+   where 'A -> B' means that event A depends on event B.
+
+A new loop has been created. So DEPT can report it as a deadlock! Cool!
+
+CONCLUSION
+
+DEPT works awesome!
diff --git a/Documentation/dependency/dept_api.txt b/Documentation/dependency/dept_api.txt
new file mode 100644
index 000000000000..8e0d5a118a46
--- /dev/null
+++ b/Documentation/dependency/dept_api.txt
@@ -0,0 +1,117 @@
+DEPT(DEPendency Tracker) APIs
+=============================
+
+Started by Byungchul Park <max.byungchul.park@sk.com>
+
+SDT(Single-event Dependency Tracker) APIs
+-----------------------------------------
+Use these APIs to annotate on either wait or event.  These have been
+already applied into the existing synchronization primitives e.g.
+waitqueue, swait, wait_for_completion(), dma fence and so on.  The basic
+APIs of SDT are:
+
+   /*
+    * After defining 'struct dept_map map', initialize the instance.
+    */
+   sdt_map_init(map);
+
+   /*
+    * Place just before the interesting wait.
+    */
+   sdt_wait(map);
+
+   /*
+    * Place just before the interesting event.
+    */
+   sdt_event(map);
+
+The advanced APIs of SDT are:
+
+   /*
+    * After defining 'struct dept_map map', initialize the instance
+    * using an external key.
+    */
+   sdt_map_init_key(map, key);
+
+   /*
+    * Place just before the interesting timeout wait.
+    */
+   sdt_wait_timeout(map, time);
+
+   /*
+    * Use sdt_might_sleep_start() and sdt_might_sleep_end() in pair.
+    * Place at the start of the interesting section that might enter
+    * schedule() or its family that needs to be woken up by
+    * try_to_wake_up().
+    */
+   sdt_might_sleep_start(map);
+
+   /*
+    * Use sdt_might_sleep_start_timeout() and sdt_might_sleep_end() in
+    * pair.  Place at the start of the interesting section that might
+    * enter schedule_timeout() or its family that needs to be woken up
+    * by try_to_wake_up().
+    */
+   sdt_might_sleep_start_timeout(map, time);
+
+   /*
+    * Use sdt_might_sleep_start() and sdt_might_sleep_end() in pair.
+    * Place at the end of the interesting section that might enter
+    * schedule(), schedule_timeout() or its family that needs to be
+    * woken up by try_to_wake_up().
+    */
+   sdt_might_sleep_end();
+
+   /*
+    * Use sdt_ecxt_enter() and sdt_ecxt_exit() in pair.  Place at the
+    * start of the interesting section where the interesting event might
+    * be triggered.
+    */
+   sdt_ecxt_enter(map);
+
+   /*
+    * Use sdt_ecxt_enter() and sdt_ecxt_exit() in pair.  Place at the
+    * end of the interesting section where the interesting event might
+    * be triggered.
+    */
+   sdt_ecxt_exit(map);
+
+
+LDT(Lock Dependency Tracker) APIs
+---------------------------------
+Do not use these APIs directly.  These are the wrappers for typical
+locks, that have been already applied into major locks internally e.g.
+spin lock, mutex, rwlock and so on.  The APIs of LDT are:
+
+   ldt_init(map, key, sub, name);
+   ldt_lock(map, sub_local, try, nest, ip);
+   ldt_rlock(map, sub_local, try, nest, ip, queued);
+   ldt_wlock(map, sub_local, try, nest, ip);
+   ldt_unlock(map, ip);
+   ldt_downgrade(map, ip);
+   ldt_set_class(map, name, key, sub_local, ip);
+
+
+Raw APIs
+--------
+Do not use these APIs directly.  The raw APIs of dept are:
+
+   dept_free_range(start, size);
+   dept_map_init(map, key, sub, name);
+   dept_map_reinit(map, key, sub, name);
+   dept_ext_wgen_init(ext_wgen);
+   dept_map_copy(map_to, map_from);
+   dept_wait(map, wait_flags, ip, wait_func, sub_local, time);
+   dept_stage_wait(map, key, ip, wait_func, time);
+   dept_request_event_wait_commit();
+   dept_clean_stage();
+   dept_stage_event(task, ip);
+   dept_ecxt_enter(map, evt_flags, ip, ecxt_func, evt_func, sub_local);
+   dept_ecxt_holding(map, evt_flags);
+   dept_request_event(map, ext_wgen);
+   dept_event(map, evt_flags, ip, evt_func, ext_wgen);
+   dept_ecxt_exit(map, evt_flags, ip);
+   dept_ecxt_enter_nokeep(map);
+   dept_key_init(key);
+   dept_key_destroy(key);
+   dept_map_ecxt_modify(map, cur_evt_flags, key, evt_flags, ip, ecxt_func, evt_func, sub_local);
-- 
2.17.1
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by NeilBrown 2 months, 2 weeks ago
On Thu, 02 Oct 2025, Byungchul Park wrote:
> This document describes the concept and APIs of dept.
> 

Thanks for the documentation.  I've been trying to understand it.


> +How DEPT works
> +--------------
> +
> +Let's take a look how DEPT works with the 1st example in the section
> +'Limitation of lockdep'.
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +   folio_lock B
> +		   folio_lock B <- DEADLOCK
> +				   mutex_lock A <- DEADLOCK
> +				   folio_unlock B
> +		   folio_unlock B
> +		   mutex_unlock A
> +				   mutex_unlock A
> +
> +Adding comments to describe DEPT's view in terms of wait and event:
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +		   /* wait for A */
> +   folio_lock B
> +   /* wait for A */
> +   /* start event A context */
> +
> +		   folio_lock B
> +		   /* wait for B */ <- DEADLOCK
> +		   /* start event B context */
> +
> +				   mutex_lock A
> +				   /* wait for A */ <- DEADLOCK
> +				   /* start event A context */
> +
> +				   folio_unlock B
> +				   /* event B */
> +		   folio_unlock B
> +		   /* event B */
> +
> +		   mutex_unlock A
> +		   /* event A */
> +				   mutex_unlock A
> +				   /* event A */
> +

I can't see the value of the above section.
The first section with no comments is useful as it is easy to see the
deadlock being investigate.  The section below is useful as it add
comments to explain how DEPT sees the situation.  But the above section,
with some but not all of the comments, does seem (to me) to add anything
useful.

> +Adding more supplementary comments to describe DEPT's view in detail:
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +		   /* might wait for A */
> +		   /* start to take into account event A's context */

What do you mean precisely by "context".
You use the word in the heading "context X  context Y  context Z"
so it seems like "context" means "task" or "process".  But then as I
read on, I think - maybe it means something else.  If it does, then you
should use different words.  Maybe "task X ..." in the heading.

If the examples that follow It seems that the "context" for event A
starts at "mutex lock A" when it (possibly) waits for a mutex and ends
at "mutex unlock A" - which are both in the same process.  Clearly
various other events that happen between these two points in the same
process could be seen as the "context" for event A.

However event B starts in "context X" with "folio_lock B" and ends in
"context Z" or "context Y" with "folio_unlock B".  Is that right?

My question then is: how do you decide which, of all the event in all
the processes in all the system, between the start[S] and the end[E] are
considered to be part of the "context" of event A.

I think it would help me if you defined what a "context" is earlier.
What sorts of things appear in a context?

Thanks,
NeilBrown


> +		   /* 1 */
> +   folio_lock B
> +   /* might wait for B */
> +   /* start to take into account event B's context */
> +   /* 2 */
> +
> +		   folio_lock B
> +		   /* might wait for B */ <- DEADLOCK
> +		   /* start to take into account event B's context */
> +		   /* 3 */
> +
> +				   mutex_lock A
> +				   /* might wait for A */ <- DEADLOCK
> +				   /* start to take into account
> +				      event A's context */
> +				   /* 4 */
> +
> +				   folio_unlock B
> +				   /* event B that's been valid since 2 */
> +		   folio_unlock B
> +		   /* event B that's been valid since 3 */
> +
> +		   mutex_unlock A
> +		   /* event A that's been valid since 1 */
> +
> +				   mutex_unlock A
> +				   /* event A that's been valid since 4 */
> +
> +Let's build up dependency graph with this example. Firstly, context X:
> +
> +   context X
> +
> +   folio_lock B
> +   /* might wait for B */
> +   /* start to take into account event B's context */
> +   /* 2 */
> +
> +There are no events to create dependency. Next, context Y:
> +
> +   context Y
> +
> +   mutex_lock A
> +   /* might wait for A */
> +   /* start to take into account event A's context */
> +   /* 1 */
> +
> +   folio_lock B
> +   /* might wait for B */
> +   /* start to take into account event B's context */
> +   /* 3 */
> +
> +   folio_unlock B
> +   /* event B that's been valid since 3 */
> +
> +   mutex_unlock A
> +   /* event A that's been valid since 1 */
> +
> +There are two events. For event B, folio_unlock B, since there are no
> +waits between 3 and the event, event B does not create dependency. For
> +event A, there is a wait, folio_lock B, between 1 and the event. Which
> +means event A cannot be triggered if event B does not wake up the wait.
> +Therefore, we can say event A depends on event B, say, 'A -> B'. The
> +graph will look like after adding the dependency:
> +
> +   A -> B
> +
> +   where 'A -> B' means that event A depends on event B.
> +
> +Lastly, context Z:
> +
> +   context Z
> +
> +   mutex_lock A
> +   /* might wait for A */
> +   /* start to take into account event A's context */
> +   /* 4 */
> +
> +   folio_unlock B
> +   /* event B that's been valid since 2 */
> +
> +   mutex_unlock A
> +   /* event A that's been valid since 4 */
> +
> +There are also two events. For event B, folio_unlock B, there is a
> +wait, mutex_lock A, between 2 and the event - remind 2 is at a very
> +start and before the wait in timeline. Which means event B cannot be
> +triggered if event A does not wake up the wait. Therefore, we can say
> +event B depends on event A, say, 'B -> A'. The graph will look like
> +after adding the dependency:
> +
> +    -> A -> B -
> +   /           \
> +   \           /
> +    -----------
> +
> +   where 'A -> B' means that event A depends on event B.
> +
> +A new loop has been created. So DEPT can report it as a deadlock. For
> +event A, mutex_unlock A, since there are no waits between 4 and the
> +event, event A does not create dependency. That's it.
> +
> +CONCLUSION
> +
> +DEPT works well with any general synchronization mechanisms by focusing
> +on wait, event and its context.
> +
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by Byungchul Park 2 months ago
On Fri, Oct 03, 2025 at 04:55:14PM +1000, NeilBrown wrote:
> On Thu, 02 Oct 2025, Byungchul Park wrote:
> > This document describes the concept and APIs of dept.
> >
> 
> Thanks for the documentation.  I've been trying to understand it.

You're welcome.  Feel free to ask me if you have any questions.

> > +How DEPT works
> > +--------------
> > +
> > +Let's take a look how DEPT works with the 1st example in the section
> > +'Limitation of lockdep'.
> > +
> > +   context X    context Y       context Z
> > +
> > +                mutex_lock A
> > +   folio_lock B
> > +                folio_lock B <- DEADLOCK
> > +                                mutex_lock A <- DEADLOCK
> > +                                folio_unlock B
> > +                folio_unlock B
> > +                mutex_unlock A
> > +                                mutex_unlock A
> > +
> > +Adding comments to describe DEPT's view in terms of wait and event:
> > +
> > +   context X    context Y       context Z
> > +
> > +                mutex_lock A
> > +                /* wait for A */
> > +   folio_lock B
> > +   /* wait for A */
> > +   /* start event A context */
> > +
> > +                folio_lock B
> > +                /* wait for B */ <- DEADLOCK
> > +                /* start event B context */
> > +
> > +                                mutex_lock A
> > +                                /* wait for A */ <- DEADLOCK
> > +                                /* start event A context */
> > +
> > +                                folio_unlock B
> > +                                /* event B */
> > +                folio_unlock B
> > +                /* event B */
> > +
> > +                mutex_unlock A
> > +                /* event A */
> > +                                mutex_unlock A
> > +                                /* event A */
> > +
> 
> I can't see the value of the above section.
> The first section with no comments is useful as it is easy to see the
> deadlock being investigate.  The section below is useful as it add
> comments to explain how DEPT sees the situation.  But the above section,
> with some but not all of the comments, does seem (to me) to add anything
> useful.

I just wanted to convert 'locking terms' to 'wait and event terms' by
one step.  However, I can remove the section you pointed out that you
thought was useless.

> > +Adding more supplementary comments to describe DEPT's view in detail:
> > +
> > +   context X    context Y       context Z
> > +
> > +                mutex_lock A
> > +                /* might wait for A */
> > +                /* start to take into account event A's context */
> 
> What do you mean precisely by "context".

That means one of task context, irq context, wq worker context (even
though it can also be considered as task context), or something.

Of course, in the example above, it must be task context since it showed
a case involving only sleepible ones.  However, I wanted to use general
terms in the document to cover all the types of context e.g. irq, task,
and so on.

> You use the word in the heading "context X  context Y  context Z"
> so it seems like "context" means "task" or "process".  But then as I
> read on, I think - maybe it means something else.  If it does, then you
> should use different words.  Maybe "task X ..." in the heading.

It should cover all the types of context.  What word would you recommend
for that purpose?

> If the examples that follow It seems that the "context" for event A
> starts at "mutex lock A" when it (possibly) waits for a mutex and ends
> at "mutex unlock A" - which are both in the same process.  Clearly
> various other events that happen between these two points in the same
> process could be seen as the "context" for event A.
> 
> However event B starts in "context X" with "folio_lock B" and ends in
> "context Z" or "context Y" with "folio_unlock B".  Is that right?

Right.

> My question then is: how do you decide which, of all the event in all
> the processes in all the system, between the start[S] and the end[E] are
> considered to be part of the "context" of event A.

DEPT can identify the "context" of event A only *once* the event A is
actually executed, and builds dependencies between the event and the
recorded waits in the "context" of event A since [S].

> I think it would help me if you defined what a "context" is earlier.

Sorry if my description was not clear enough.

	Byungchul

> What sorts of things appear in a context?
> 
> Thanks,
> NeilBrown
> 
> 
> > +                /* 1 */
> > +   folio_lock B
> > +   /* might wait for B */
> > +   /* start to take into account event B's context */
> > +   /* 2 */
> > +
> > +                folio_lock B
> > +                /* might wait for B */ <- DEADLOCK
> > +                /* start to take into account event B's context */
> > +                /* 3 */
> > +
> > +                                mutex_lock A
> > +                                /* might wait for A */ <- DEADLOCK
> > +                                /* start to take into account
> > +                                   event A's context */
> > +                                /* 4 */
> > +
> > +                                folio_unlock B
> > +                                /* event B that's been valid since 2 */
> > +                folio_unlock B
> > +                /* event B that's been valid since 3 */
> > +
> > +                mutex_unlock A
> > +                /* event A that's been valid since 1 */
> > +
> > +                                mutex_unlock A
> > +                                /* event A that's been valid since 4 */
> > +
> > +Let's build up dependency graph with this example. Firstly, context X:
> > +
> > +   context X
> > +
> > +   folio_lock B
> > +   /* might wait for B */
> > +   /* start to take into account event B's context */
> > +   /* 2 */
> > +
> > +There are no events to create dependency. Next, context Y:
> > +
> > +   context Y
> > +
> > +   mutex_lock A
> > +   /* might wait for A */
> > +   /* start to take into account event A's context */
> > +   /* 1 */
> > +
> > +   folio_lock B
> > +   /* might wait for B */
> > +   /* start to take into account event B's context */
> > +   /* 3 */
> > +
> > +   folio_unlock B
> > +   /* event B that's been valid since 3 */
> > +
> > +   mutex_unlock A
> > +   /* event A that's been valid since 1 */
> > +
> > +There are two events. For event B, folio_unlock B, since there are no
> > +waits between 3 and the event, event B does not create dependency. For
> > +event A, there is a wait, folio_lock B, between 1 and the event. Which
> > +means event A cannot be triggered if event B does not wake up the wait.
> > +Therefore, we can say event A depends on event B, say, 'A -> B'. The
> > +graph will look like after adding the dependency:
> > +
> > +   A -> B
> > +
> > +   where 'A -> B' means that event A depends on event B.
> > +
> > +Lastly, context Z:
> > +
> > +   context Z
> > +
> > +   mutex_lock A
> > +   /* might wait for A */
> > +   /* start to take into account event A's context */
> > +   /* 4 */
> > +
> > +   folio_unlock B
> > +   /* event B that's been valid since 2 */
> > +
> > +   mutex_unlock A
> > +   /* event A that's been valid since 4 */
> > +
> > +There are also two events. For event B, folio_unlock B, there is a
> > +wait, mutex_lock A, between 2 and the event - remind 2 is at a very
> > +start and before the wait in timeline. Which means event B cannot be
> > +triggered if event A does not wake up the wait. Therefore, we can say
> > +event B depends on event A, say, 'B -> A'. The graph will look like
> > +after adding the dependency:
> > +
> > +    -> A -> B -
> > +   /           \
> > +   \           /
> > +    -----------
> > +
> > +   where 'A -> B' means that event A depends on event B.
> > +
> > +A new loop has been created. So DEPT can report it as a deadlock. For
> > +event A, mutex_unlock A, since there are no waits between 4 and the
> > +event, event A does not create dependency. That's it.
> > +
> > +CONCLUSION
> > +
> > +DEPT works well with any general synchronization mechanisms by focusing
> > +on wait, event and its context.
> > +
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by NeilBrown 2 months ago
On Mon, 13 Oct 2025, Byungchul Park wrote:
> On Fri, Oct 03, 2025 at 04:55:14PM +1000, NeilBrown wrote:
> > On Thu, 02 Oct 2025, Byungchul Park wrote:
> > > This document describes the concept and APIs of dept.
> > >
> > 
> > Thanks for the documentation.  I've been trying to understand it.
> 
> You're welcome.  Feel free to ask me if you have any questions.
> 
> > > +How DEPT works
> > > +--------------
> > > +
> > > +Let's take a look how DEPT works with the 1st example in the section
> > > +'Limitation of lockdep'.
> > > +
> > > +   context X    context Y       context Z
> > > +
> > > +                mutex_lock A
> > > +   folio_lock B
> > > +                folio_lock B <- DEADLOCK
> > > +                                mutex_lock A <- DEADLOCK
> > > +                                folio_unlock B
> > > +                folio_unlock B
> > > +                mutex_unlock A
> > > +                                mutex_unlock A
> > > +
> > > +Adding comments to describe DEPT's view in terms of wait and event:
> > > +
> > > +   context X    context Y       context Z
> > > +
> > > +                mutex_lock A
> > > +                /* wait for A */
> > > +   folio_lock B
> > > +   /* wait for A */
> > > +   /* start event A context */
> > > +
> > > +                folio_lock B
> > > +                /* wait for B */ <- DEADLOCK
> > > +                /* start event B context */
> > > +
> > > +                                mutex_lock A
> > > +                                /* wait for A */ <- DEADLOCK
> > > +                                /* start event A context */
> > > +
> > > +                                folio_unlock B
> > > +                                /* event B */
> > > +                folio_unlock B
> > > +                /* event B */
> > > +
> > > +                mutex_unlock A
> > > +                /* event A */
> > > +                                mutex_unlock A
> > > +                                /* event A */
> > > +
> > 
> > I can't see the value of the above section.
> > The first section with no comments is useful as it is easy to see the
> > deadlock being investigate.  The section below is useful as it add
> > comments to explain how DEPT sees the situation.  But the above section,
> > with some but not all of the comments, does seem (to me) to add anything
> > useful.
> 
> I just wanted to convert 'locking terms' to 'wait and event terms' by
> one step.  However, I can remove the section you pointed out that you
> thought was useless.

But it seems you did it in two steps???

If you think the middle section with some but not all of the comments
adds value (And maybe it does - maybe I just haven't seen it yet), the
please explain what value is being added at each step.

It is currently documented as:

 +Adding comments to describe DEPT's view in terms of wait and event:

then

 +Adding more supplementary comments to describe DEPT's view in detail:

Maybe if you said more DEPT's view so at this point so that when we see
the supplementary comments, we can understand how they relate to DEPT's
view.


> 
> > > +
> > > +   context X    context Y       context Z
> > > +
> > > +                mutex_lock A
> > > +                /* might wait for A */
> > > +                /* start to take into account event A's context */
> > 
> > What do you mean precisely by "context".
> 
> That means one of task context, irq context, wq worker context (even
> though it can also be considered as task context), or something.

OK, that makes sense.  If you provide this definition for "context"
before you use the term, I think that will help the reader.


> > If the examples that follow It seems that the "context" for event A
> > starts at "mutex lock A" when it (possibly) waits for a mutex and ends
> > at "mutex unlock A" - which are both in the same process.  Clearly
> > various other events that happen between these two points in the same
> > process could be seen as the "context" for event A.
> > 
> > However event B starts in "context X" with "folio_lock B" and ends in
> > "context Z" or "context Y" with "folio_unlock B".  Is that right?
> 
> Right.
> 
> > My question then is: how do you decide which, of all the event in all
> > the processes in all the system, between the start[S] and the end[E] are
> > considered to be part of the "context" of event A.
> 
> DEPT can identify the "context" of event A only *once* the event A is
> actually executed, and builds dependencies between the event and the
> recorded waits in the "context" of event A since [S].

So a dependency is an ordered set of pairs of "context" and "wait" or
"context" and "event".  "wait"s and "event"s are linked by some abstract
identifier for the event (like lockdep's lock classes).
How are the contexts abstracted. Is it just "same" or "different"

I'll try reading the document again and see how much further I get.

Thanks,
NeilBrown
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by Byungchul Park 2 months ago
On Tue, Oct 14, 2025 at 05:03:58PM +1100, NeilBrown wrote:
> On Mon, 13 Oct 2025, Byungchul Park wrote:
> > On Fri, Oct 03, 2025 at 04:55:14PM +1000, NeilBrown wrote:
> > > On Thu, 02 Oct 2025, Byungchul Park wrote:
> > > > This document describes the concept and APIs of dept.
> > > >
> > >
> > > Thanks for the documentation.  I've been trying to understand it.
> >
> > You're welcome.  Feel free to ask me if you have any questions.
> >
> > > > +How DEPT works
> > > > +--------------
> > > > +
> > > > +Let's take a look how DEPT works with the 1st example in the section
> > > > +'Limitation of lockdep'.
> > > > +
> > > > +   context X    context Y       context Z
> > > > +
> > > > +                mutex_lock A
> > > > +   folio_lock B
> > > > +                folio_lock B <- DEADLOCK
> > > > +                                mutex_lock A <- DEADLOCK
> > > > +                                folio_unlock B
> > > > +                folio_unlock B
> > > > +                mutex_unlock A
> > > > +                                mutex_unlock A
> > > > +
> > > > +Adding comments to describe DEPT's view in terms of wait and event:
> > > > +
> > > > +   context X    context Y       context Z
> > > > +
> > > > +                mutex_lock A
> > > > +                /* wait for A */
> > > > +   folio_lock B
> > > > +   /* wait for A */
> > > > +   /* start event A context */
> > > > +
> > > > +                folio_lock B
> > > > +                /* wait for B */ <- DEADLOCK
> > > > +                /* start event B context */
> > > > +
> > > > +                                mutex_lock A
> > > > +                                /* wait for A */ <- DEADLOCK
> > > > +                                /* start event A context */
> > > > +
> > > > +                                folio_unlock B
> > > > +                                /* event B */
> > > > +                folio_unlock B
> > > > +                /* event B */
> > > > +
> > > > +                mutex_unlock A
> > > > +                /* event A */
> > > > +                                mutex_unlock A
> > > > +                                /* event A */
> > > > +
> > >
> > > I can't see the value of the above section.
> > > The first section with no comments is useful as it is easy to see the
> > > deadlock being investigate.  The section below is useful as it add
> > > comments to explain how DEPT sees the situation.  But the above section,
> > > with some but not all of the comments, does seem (to me) to add anything
> > > useful.
> >
> > I just wanted to convert 'locking terms' to 'wait and event terms' by
> > one step.  However, I can remove the section you pointed out that you
> > thought was useless.
> 
> But it seems you did it in two steps???
> 
> If you think the middle section with some but not all of the comments
> adds value (And maybe it does - maybe I just haven't seen it yet), the
> please explain what value is being added at each step.
> 
> It is currently documented as:
> 
>  +Adding comments to describe DEPT's view in terms of wait and event:
> 
> then
> 
>  +Adding more supplementary comments to describe DEPT's view in detail:
> 
> Maybe if you said more DEPT's view so at this point so that when we see
> the supplementary comments, we can understand how they relate to DEPT's
> view.

As you pointed out, I'd better remove the middle part so as to simplify
it.  It doesn't give much information I also think.

> > > > +
> > > > +   context X    context Y       context Z
> > > > +
> > > > +                mutex_lock A
> > > > +                /* might wait for A */
> > > > +                /* start to take into account event A's context */
> > >
> > > What do you mean precisely by "context".
> >
> > That means one of task context, irq context, wq worker context (even
> > though it can also be considered as task context), or something.
> 
> OK, that makes sense.  If you provide this definition for "context"
> before you use the term, I think that will help the reader.

Thank you.  I will add it.

> > > If the examples that follow It seems that the "context" for event A
> > > starts at "mutex lock A" when it (possibly) waits for a mutex and ends
> > > at "mutex unlock A" - which are both in the same process.  Clearly
> > > various other events that happen between these two points in the same
> > > process could be seen as the "context" for event A.
> > >
> > > However event B starts in "context X" with "folio_lock B" and ends in
> > > "context Z" or "context Y" with "folio_unlock B".  Is that right?
> >
> > Right.
> >
> > > My question then is: how do you decide which, of all the event in all
> > > the processes in all the system, between the start[S] and the end[E] are
> > > considered to be part of the "context" of event A.
> >
> > DEPT can identify the "context" of event A only *once* the event A is
> > actually executed, and builds dependencies between the event and the
> > recorded waits in the "context" of event A since [S].
> 
> So a dependency is an ordered set of pairs of "context" and "wait" or

I don't get what you were trying to tell here.  FWIW, DEPT focuses on
*event* contexts and, within each event context, it tracks pairs of
waits that appears since [S] and the interesting event that identifies
the event context.

> "context" and "event".  "wait"s and "event"s are linked by some abstract
> identifier for the event (like lockdep's lock classes).

Yeah, kind of.

> How are the contexts abstracted. Is it just "same" or "different"

I don't get this.  Can you explain in more detail?

	Byungchul

> I'll try reading the document again and see how much further I get.
> 
> Thanks,
> NeilBrown
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by Jonathan Corbet 2 months, 2 weeks ago
Byungchul Park <byungchul@sk.com> writes:

> This document describes the concept and APIs of dept.
>
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
>  Documentation/dependency/dept_api.txt | 117 ++++
>  2 files changed, 852 insertions(+)
>  create mode 100644 Documentation/dependency/dept.txt
>  create mode 100644 Documentation/dependency/dept_api.txt

As already suggested, this should be in RST; you're already 95% of the
way there.  Also, please put it under Documentation/dev-tools; we don't
need another top-level directory for this.

Thanks,

jon
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by Byungchul Park 2 months ago
On Thu, Oct 02, 2025 at 11:36:16PM -0600, Jonathan Corbet wrote:
> Byungchul Park <byungchul@sk.com> writes:
> 
> > This document describes the concept and APIs of dept.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
> >  Documentation/dependency/dept_api.txt | 117 ++++
> >  2 files changed, 852 insertions(+)
> >  create mode 100644 Documentation/dependency/dept.txt
> >  create mode 100644 Documentation/dependency/dept_api.txt
> 
> As already suggested, this should be in RST; you're already 95% of the
> way there.  Also, please put it under Documentation/dev-tools; we don't
> need another top-level directory for this.

Sure.  I will.  Thanks!

	Byungchul
> 
> Thanks,
> 
> jon
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by Bagas Sanjaya 2 months, 2 weeks ago
On Thu, Oct 02, 2025 at 05:12:28PM +0900, Byungchul Park wrote:
> This document describes the concept and APIs of dept.
> 
> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
>  Documentation/dependency/dept_api.txt | 117 ++++
>  2 files changed, 852 insertions(+)
>  create mode 100644 Documentation/dependency/dept.txt
>  create mode 100644 Documentation/dependency/dept_api.txt

What about writing dept docs in reST (like the rest of kernel documentation)?

---- >8 ----
diff --git a/Documentation/dependency/dept.txt b/Documentation/locking/dept.rst
similarity index 92%
rename from Documentation/dependency/dept.txt
rename to Documentation/locking/dept.rst
index 5dd358b96734e6..7b90a0d95f0876 100644
--- a/Documentation/dependency/dept.txt
+++ b/Documentation/locking/dept.rst
@@ -8,7 +8,7 @@ How lockdep works
 
 Lockdep detects a deadlock by checking lock acquisition order. For
 example, a graph to track acquisition order built by lockdep might look
-like:
+like::
 
    A -> B -
            \
@@ -16,12 +16,12 @@ like:
            /
    C -> D -
 
-   where 'A -> B' means that acquisition A is prior to acquisition B
-   with A still held.
+where 'A -> B' means that acquisition A is prior to acquisition B
+with A still held.
 
 Lockdep keeps adding each new acquisition order into the graph in
 runtime. For example, 'E -> C' will be added when the two locks have
-been acquired in the order, E and then C. The graph will look like:
+been acquired in the order, E and then C. The graph will look like::
 
        A -> B -
                \
@@ -32,10 +32,10 @@ been acquired in the order, E and then C. The graph will look like:
    \                  /
     ------------------
 
-   where 'A -> B' means that acquisition A is prior to acquisition B
-   with A still held.
+where 'A -> B' means that acquisition A is prior to acquisition B
+with A still held.
 
-This graph contains a subgraph that demonstrates a loop like:
+This graph contains a subgraph that demonstrates a loop like::
 
                 -> E -
                /      \
@@ -67,6 +67,8 @@ mechanisms, lockdep doesn't work.
 
 Can lockdep detect the following deadlock?
 
+::
+
    context X	   context Y	   context Z
 
 		   mutex_lock A
@@ -80,6 +82,8 @@ Can lockdep detect the following deadlock?
 
 No. What about the following?
 
+::
+
    context X		   context Y
 
 			   mutex_lock A
@@ -101,7 +105,7 @@ What leads a deadlock
 ---------------------
 
 A deadlock occurs when one or multi contexts are waiting for events that
-will never happen. For example:
+will never happen. For example::
 
    context X	   context Y	   context Z
 
@@ -121,24 +125,24 @@ We call this *deadlock*.
 If an event occurrence is a prerequisite to reaching another event, we
 call it *dependency*. In this example:
 
-   Event A occurrence is a prerequisite to reaching event C.
-   Event C occurrence is a prerequisite to reaching event B.
-   Event B occurrence is a prerequisite to reaching event A.
+   * Event A occurrence is a prerequisite to reaching event C.
+   * Event C occurrence is a prerequisite to reaching event B.
+   * Event B occurrence is a prerequisite to reaching event A.
 
 In terms of dependency:
 
-   Event C depends on event A.
-   Event B depends on event C.
-   Event A depends on event B.
+   * Event C depends on event A.
+   * Event B depends on event C.
+   * Event A depends on event B.
 
-Dependency graph reflecting this example will look like:
+Dependency graph reflecting this example will look like::
 
     -> C -> A -> B -
    /                \
    \                /
     ----------------
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
 A circular dependency exists. Such a circular dependency leads a
 deadlock since no waiters can have desired events triggered.
@@ -152,7 +156,7 @@ Introduce DEPT
 --------------
 
 DEPT(DEPendency Tracker) tracks wait and event instead of lock
-acquisition order so as to recognize the following situation:
+acquisition order so as to recognize the following situation::
 
    context X	   context Y	   context Z
 
@@ -165,18 +169,18 @@ acquisition order so as to recognize the following situation:
 				   event A
 
 and builds up a dependency graph in runtime that is similar to lockdep.
-The graph might look like:
+The graph might look like::
 
     -> C -> A -> B -
    /                \
    \                /
     ----------------
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
 DEPT keeps adding each new dependency into the graph in runtime. For
 example, 'B -> D' will be added when event D occurrence is a
-prerequisite to reaching event B like:
+prerequisite to reaching event B like::
 
    |
    v
@@ -184,7 +188,7 @@ prerequisite to reaching event B like:
    .
    event B
 
-After the addition, the graph will look like:
+After the addition, the graph will look like::
 
                      -> D
                     /
@@ -209,6 +213,8 @@ How DEPT works
 Let's take a look how DEPT works with the 1st example in the section
 'Limitation of lockdep'.
 
+::
+
    context X	   context Y	   context Z
 
 		   mutex_lock A
@@ -220,7 +226,7 @@ Let's take a look how DEPT works with the 1st example in the section
 		   mutex_unlock A
 				   mutex_unlock A
 
-Adding comments to describe DEPT's view in terms of wait and event:
+Adding comments to describe DEPT's view in terms of wait and event::
 
    context X	   context Y	   context Z
 
@@ -248,7 +254,7 @@ Adding comments to describe DEPT's view in terms of wait and event:
 				   mutex_unlock A
 				   /* event A */
 
-Adding more supplementary comments to describe DEPT's view in detail:
+Adding more supplementary comments to describe DEPT's view in detail::
 
    context X	   context Y	   context Z
 
@@ -283,7 +289,7 @@ Adding more supplementary comments to describe DEPT's view in detail:
 				   mutex_unlock A
 				   /* event A that's been valid since 4 */
 
-Let's build up dependency graph with this example. Firstly, context X:
+Let's build up dependency graph with this example. Firstly, context X::
 
    context X
 
@@ -292,7 +298,7 @@ Let's build up dependency graph with this example. Firstly, context X:
    /* start to take into account event B's context */
    /* 2 */
 
-There are no events to create dependency. Next, context Y:
+There are no events to create dependency. Next, context Y::
 
    context Y
 
@@ -317,13 +323,13 @@ waits between 3 and the event, event B does not create dependency. For
 event A, there is a wait, folio_lock B, between 1 and the event. Which
 means event A cannot be triggered if event B does not wake up the wait.
 Therefore, we can say event A depends on event B, say, 'A -> B'. The
-graph will look like after adding the dependency:
+graph will look like after adding the dependency::
 
    A -> B
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
-Lastly, context Z:
+Lastly, context Z::
 
    context Z
 
@@ -343,7 +349,7 @@ wait, mutex_lock A, between 2 and the event - remind 2 is at a very
 start and before the wait in timeline. Which means event B cannot be
 triggered if event A does not wake up the wait. Therefore, we can say
 event B depends on event A, say, 'B -> A'. The graph will look like
-after adding the dependency:
+after adding the dependency::
 
     -> A -> B -
    /           \
@@ -367,6 +373,8 @@ Interpret DEPT report
 
 The following is the example in the section 'How DEPT works'.
 
+::
+
    context X	   context Y	   context Z
 
 		   mutex_lock A
@@ -402,7 +410,7 @@ The following is the example in the section 'How DEPT works'.
 
 We can Simplify this by replacing each waiting point with [W], each
 point where its event's context starts with [S] and each event with [E].
-This example will look like after the replacement:
+This example will look like after the replacement::
 
    context X	   context Y	   context Z
 
@@ -419,6 +427,8 @@ This example will look like after the replacement:
 DEPT uses the symbols [W], [S] and [E] in its report as described above.
 The following is an example reported by DEPT for a real problem.
 
+::
+
    Link: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a485657d@I-love.SAKURA.ne.jp/#t
    Link: https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.park@lge.com/
 
@@ -620,6 +630,8 @@ The following is an example reported by DEPT for a real problem.
 
 Let's take a look at the summary that is the most important part.
 
+::
+
    ---------------------------------------------------
    summary
    ---------------------------------------------------
@@ -639,7 +651,7 @@ Let's take a look at the summary that is the most important part.
    [W]: the wait blocked
    [E]: the event not reachable
 
-The summary shows the following scenario:
+The summary shows the following scenario::
 
    context A	   context B	   context ?(unknown)
 
@@ -652,7 +664,7 @@ The summary shows the following scenario:
 
    [E] unlock(&ni->ni_lock:0)
 
-Adding supplementary comments to describe DEPT's view in detail:
+Adding supplementary comments to describe DEPT's view in detail::
 
    context A	   context B	   context ?(unknown)
 
@@ -677,7 +689,7 @@ Adding supplementary comments to describe DEPT's view in detail:
    [E] unlock(&ni->ni_lock:0)
    /* event that's been valid since 2 */
 
-Let's build up dependency graph with this report. Firstly, context A:
+Let's build up dependency graph with this report. Firstly, context A::
 
    context A
 
@@ -697,13 +709,13 @@ wait, folio_lock(&f1), between 2 and the event. Which means
 unlock(&ni->ni_lock:0) is not reachable if folio_unlock(&f1) does not
 wake up the wait. Therefore, we can say unlock(&ni->ni_lock:0) depends
 on folio_unlock(&f1), say, 'unlock(&ni->ni_lock:0) -> folio_unlock(&f1)'.
-The graph will look like after adding the dependency:
+The graph will look like after adding the dependency::
 
    unlock(&ni->ni_lock:0) -> folio_unlock(&f1)
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
-Secondly, context B:
+Secondly, context B::
 
    context B
 
@@ -719,14 +731,14 @@ very start and before the wait in timeline. Which means folio_unlock(&f1)
 is not reachable if unlock(&ni->ni_lock:0) does not wake up the wait.
 Therefore, we can say folio_unlock(&f1) depends on unlock(&ni->ni_lock:0),
 say, 'folio_unlock(&f1) -> unlock(&ni->ni_lock:0)'. The graph will look
-like after adding the dependency:
+like after adding the dependency::
 
     -> unlock(&ni->ni_lock:0) -> folio_unlock(&f1) -
    /                                                \
    \                                                /
     ------------------------------------------------
 
-   where 'A -> B' means that event A depends on event B.
+where 'A -> B' means that event A depends on event B.
 
 A new loop has been created. So DEPT can report it as a deadlock! Cool!
 
diff --git a/Documentation/dependency/dept_api.txt b/Documentation/locking/dept_api.rst
similarity index 97%
rename from Documentation/dependency/dept_api.txt
rename to Documentation/locking/dept_api.rst
index 8e0d5a118a460e..96c4d65f4a9a2d 100644
--- a/Documentation/dependency/dept_api.txt
+++ b/Documentation/locking/dept_api.rst
@@ -10,6 +10,8 @@ already applied into the existing synchronization primitives e.g.
 waitqueue, swait, wait_for_completion(), dma fence and so on.  The basic
 APIs of SDT are:
 
+.. code-block:: c
+
    /*
     * After defining 'struct dept_map map', initialize the instance.
     */
@@ -27,6 +29,8 @@ APIs of SDT are:
 
 The advanced APIs of SDT are:
 
+.. code-block:: c
+
    /*
     * After defining 'struct dept_map map', initialize the instance
     * using an external key.
@@ -83,6 +87,8 @@ Do not use these APIs directly.  These are the wrappers for typical
 locks, that have been already applied into major locks internally e.g.
 spin lock, mutex, rwlock and so on.  The APIs of LDT are:
 
+.. code-block:: c
+   
    ldt_init(map, key, sub, name);
    ldt_lock(map, sub_local, try, nest, ip);
    ldt_rlock(map, sub_local, try, nest, ip, queued);
@@ -96,6 +102,8 @@ Raw APIs
 --------
 Do not use these APIs directly.  The raw APIs of dept are:
 
+.. code-block:: c
+
    dept_free_range(start, size);
    dept_map_init(map, key, sub, name);
    dept_map_reinit(map, key, sub, name);
diff --git a/Documentation/locking/index.rst b/Documentation/locking/index.rst
index 6a9ea96c8bcb70..7ec3dce7fee425 100644
--- a/Documentation/locking/index.rst
+++ b/Documentation/locking/index.rst
@@ -24,6 +24,8 @@ Locking
     percpu-rw-semaphore
     robust-futexes
     robust-futex-ABI
+    dept
+    dept_api
 
 .. only::  subproject and html
 

> +Can lockdep detect the following deadlock?
> +
> +   context X	   context Y	   context Z
> +
> +		   mutex_lock A
> +   folio_lock B
> +		   folio_lock B <- DEADLOCK
> +				   mutex_lock A <- DEADLOCK
> +				   folio_unlock B
> +		   folio_unlock B
> +		   mutex_unlock A
> +				   mutex_unlock A
> +
> +No. What about the following?
> +
> +   context X		   context Y
> +
> +			   mutex_lock A
> +   mutex_lock A <- DEADLOCK
> +			   wait_for_complete B <- DEADLOCK
> +   complete B
> +			   mutex_unlock A
> +   mutex_unlock A

Can you explain how DEPT detects deadlock on the second example above (like
the first one being described in "How DEPT works" section)?

Confused...

-- 
An old man doll... just what I always wanted! - Clara
Re: [PATCH v17 28/47] dept: add documentation for dept
Posted by Byungchul Park 2 months ago
On Fri, Oct 03, 2025 at 09:44:28AM +0700, Bagas Sanjaya wrote:
> On Thu, Oct 02, 2025 at 05:12:28PM +0900, Byungchul Park wrote:
> > This document describes the concept and APIs of dept.
> >
> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  Documentation/dependency/dept.txt     | 735 ++++++++++++++++++++++++++
> >  Documentation/dependency/dept_api.txt | 117 ++++
> >  2 files changed, 852 insertions(+)
> >  create mode 100644 Documentation/dependency/dept.txt
> >  create mode 100644 Documentation/dependency/dept_api.txt
> 
> What about writing dept docs in reST (like the rest of kernel documentation)?

Sorry for late reply, but sure.  I should and will.  Thank you!

> ---- >8 ----
> diff --git a/Documentation/dependency/dept.txt b/Documentation/locking/dept.rst
> similarity index 92%
> rename from Documentation/dependency/dept.txt
> rename to Documentation/locking/dept.rst
> index 5dd358b96734e6..7b90a0d95f0876 100644
> --- a/Documentation/dependency/dept.txt
> +++ b/Documentation/locking/dept.rst

However, I'm not sure if dept is about locking.  dept is about general
waits e.g. wait_for_completion(), wait_event(), and so on, rather than
just waits involved in typical locking mechanisms e.g. spin lock, mutex,
and so on.

[snip]

> > +
> > +   context X            context Y
> > +
			     /*
			      * Acquired A.
			      */
> > +                        mutex_lock A

			     /*
			      * Request something that will be handled
			      * through e.g. wq, deamon or any its own
			      * way, and then do 'complete B'.
			      */
			     request_xxx_and_complete_B();
        /*
	 * The request from
	 * context Y has been
	 * done.  So running
	 * toward 'complete B'.
	 */

	/*
	 * Wait for A to be
	 * released, but will
	 * never happen.
	 */
> > +   mutex_lock A <- DEADLOCK
			     /*
			      * Wait for 'complete B' to happen, but
			      * will never happen.
			      */
> > +                        wait_for_complete B <- DEADLOCK
	/*
	 * Never reachable.
	 */
> > +   complete B

			     /*
			      * Never reachable.
			      */
> > +                        mutex_unlock A
> > +   mutex_unlock A
> 
> Can you explain how DEPT detects deadlock on the second example above (like
> the first one being described in "How DEPT works" section)?

Sure.  I added the explanation inline above.  Don't hesitate if you have
any questions.  Thanks.

	Byungchul
> 
> Confused...
> 
> --
> An old man doll... just what I always wanted! - Clara