[RFC PATCH v2] codetag: ensure module memory has been freed

Jackie Liu posted 1 patch 2 months ago
From: Jackie Liu <liuyun01@kylinos.cn>

We found a problem that can be quickly reproduced with the self-test
script [1] (the latest version of the test script no longer releases
the module immediately). A warning is printed saying that the module's
memory has not been freed. In fact, the memory is freed through
kfree_rcu() and will eventually be released, so the warning is
incorrect.
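
To illustrate the ordering described above, here is a minimal userspace
model. It is not kernel code and makes no claim about how alloc_tag or
kfree_rcu() are implemented: tag_bytes, tagged_alloc(), deferred_free(),
drain_deferred() and simulate_unload() are hypothetical stand-ins for the
per-callsite counter, a tagged allocation, a deferred free, the deferred
free finally running, and the unload-time check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-callsite "bytes allocated" counter. */
static size_t tag_bytes;

/* One object whose free has been queued instead of performed. */
struct deferred {
	struct deferred *next;
	void *ptr;
	size_t size;
};

static struct deferred *pending;

/* Allocate and account the bytes against the counter. */
static void *tagged_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		tag_bytes += size;
	return p;
}

/* Queue the free for later; the counter is deliberately left untouched. */
static int deferred_free(void *ptr, size_t size)
{
	struct deferred *d = malloc(sizeof(*d));

	if (!d)
		return -1;
	d->ptr = ptr;
	d->size = size;
	d->next = pending;
	pending = d;
	return 0;
}

/* The deferred work finally runs: free each object and drop the counter. */
static void drain_deferred(void)
{
	while (pending) {
		struct deferred *d = pending;

		pending = d->next;
		assert(tag_bytes >= d->size);
		tag_bytes -= d->size;
		free(d->ptr);
		free(d);
	}
}

/*
 * Sample the counter the way an unload-time check would, once before and
 * once after the deferred frees have run. Returns 0 on success.
 */
static int simulate_unload(size_t size, size_t *before, size_t *after)
{
	void *obj = tagged_alloc(size);

	if (!obj)
		return -1;
	if (deferred_free(obj, size)) {
		tag_bytes -= size;
		free(obj);
		return -1;
	}

	*before = tag_bytes;	/* nonzero: a check here would warn */
	drain_deferred();
	*after = tag_bytes;	/* zero: nothing was actually leaked */
	return 0;
}
```

In this model, simulate_unload(320, &before, &after) leaves before at 320
and after at 0: the object is never leaked, but a check that samples the
counter before the queued free has run still sees it as allocated.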

I don't think this is the correct solution. I tried to synchronize with
rcu_barrier(), but it didn't work. After adding a schedule() call, the
warning disappeared: it appears to let the pending kfree() run, so the
counter is cleared.

The specific error message is as follows:

[   76.756915] ------------[ cut here ]------------
[   76.756921] drivers/net/bonding/bond_main.c:5122 module bonding func:bond_update_slave_arr has 320 allocated at module unload
[   76.756991] WARNING: CPU: 0 PID: 5503 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x1a8/0x238
[   76.757371]  aes_neon_bs aes_ce_blk [last unloaded: bonding]
[   76.757379] CPU: 0 PID: 5503 Comm: modprobe Kdump: loaded Not tainted 6.6.52+ #7
[   76.757383] Source Version: d828af5b77f6a3d3a91203e6d60a02c83ce77d74
[   76.757385] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[   76.757387] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[   76.757390] pc : alloc_tag_module_unload+0x1a8/0x238
[   76.757395] lr : alloc_tag_module_unload+0x1a8/0x238
[   76.757398] sp : ffff800081f07890
[   76.757400] x29: ffff800081f07890 x28: 0000000000000008 x27: ffff6fc980b10000
[   76.757405] x26: ffff800081f07930 x25: ffffb2b6c410ef00 x24: 0000000000001402
[   76.757410] x23: ffffb2b72ed28500 x22: 0000000000000140 x21: ffffb2b72ed23a40
[   76.757415] x20: ffffb2b6c40edca0 x19: ffffb2b6c410ef80 x18: 0000000000000000
[   76.757419] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[   76.757424] x14: 0000000000000000 x13: 0000000000000001 x12: ffff645015cef093
[   76.757428] x11: 1fffe45015cef092 x10: ffff645015cef092 x9 : dfff800000000000
[   76.757433] x8 : 00009bafea310f6e x7 : ffff2280ae778493 x6 : 0000000000000001
[   76.757438] x5 : ffff2280ae778490 x4 : ffff645015cef093 x3 : dfff800000000000
[   76.757442] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2280113be400
[   76.757447] Call trace:
[   76.757452]  alloc_tag_module_unload+0x1a8/0x238
[   76.757455]  codetag_unload_module+0x184/0x218
[   76.757458]  free_module+0x30/0x270
[   76.757470]  __do_sys_delete_module.constprop.0+0x2c4/0x408
[   76.757473]  __arm64_sys_delete_module+0x28/0x40
[   76.757476]  invoke_syscall+0xb0/0x190
[   76.757479]  el0_svc_common.constprop.0+0x80/0x150
[   76.757482]  do_el0_svc+0x38/0x50
[   76.757485]  el0_svc+0x40/0xe0
[   76.757501]  el0t_64_sync_handler+0x100/0x130
[   76.757504]  el0t_64_sync+0x1a4/0x1a8
[   76.757511] Kernel panic - not syncing: kernel: panic_on_warn set ...

I think this problem is not specific to the bonding module; it occurs
because memory allocation profiling does not take the kfree_rcu() case
into account.

[1]: https://elixir.bootlin.com/linux/v6.6.52/source/tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh

Fixes: 47a92dfbe01f ("lib: prevent module unloading if memory is not freed")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
---
 lib/codetag.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/codetag.c b/lib/codetag.c
index afa8a2d4f317..7eab77e99381 100644
--- a/lib/codetag.c
+++ b/lib/codetag.c
@@ -228,6 +228,9 @@ bool codetag_unload_module(struct module *mod)
 	if (!mod)
 		return true;
 
+	/* Make sure the module's RCU-deferred memory has been freed */
+	schedule();
+
 	mutex_lock(&codetag_lock);
 	list_for_each_entry(cttype, &codetag_types, link) {
 		struct codetag_module *found = NULL;
-- 
2.46.2