drivers/char/ipmi/ipmi_msghandler.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
On recent kernels this warning fires:
drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!!
This looks like a fairly simple lockdep trigger, where
list_for_each_entry_rcu and list_for_each_entry_srcu are
functionally identical, but the lockdep annotation in
the former has an extra check.
That extra check is whether the RCU read lock is held,
which is not true when the code uses srcu_read_lock.
Get rid of the warning by using the properly annotated
list traversal macro.
Signed-off-by: Rik van Riel <riel@surriel.com>
---
drivers/char/ipmi/ipmi_msghandler.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index 1e5313748f8b..a2823763fd37 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -1235,7 +1235,7 @@ int ipmi_create_user(unsigned int if_num,
return -ENOMEM;
index = srcu_read_lock(&ipmi_interfaces_srcu);
- list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
+ list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
if (intf->intf_num == if_num)
goto found;
}
--
2.48.1
On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote: > On recent kernels this warning fires: > > drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!! > > This looks like a fairly simple lockdep trigger, where > list_for_each_entry_rcu and list_for_each_entry_srcu are > functionally identical, but the lockdep annotation in > the former has an extra check. > > That extra check is whether the RCU read lock is held, > which is not true when the code uses srcu_read_lock. > > Get rid of the warning by using the properly annotated > list traversal macro. Thanks for looking at this one. There was a discussion about this issue a few years ago, with a different approach that never landed upstream. https://lore.kernel.org/all/20201119123831.GH3710@minyard.net/#r
On Mon, Mar 17, 2025 at 02:33:31AM -0700, Breno Leitao wrote: > On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote: > > On recent kernels this warning fires: > > > > drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!! > > > > This looks like a fairly simple lockdep trigger, where > > list_for_each_entry_rcu and list_for_each_entry_srcu are > > functionally identical, but the lockdep annotation in > > the former has an extra check. > > > > That extra check is whether the RCU read lock is held, > > which is not true when the code uses srcu_read_lock. > > > > Get rid of the warning by using the properly annotated > > list traversal macro. > > Thanks for looking at this one. > > There was a discussion about this issue a few years ago, with > a different approach that never landed upstream. > > https://lore.kernel.org/all/20201119123831.GH3710@minyard.net/#r I thought this looked familiar. Breno, I believe you suggested a change to the patch that sounded reasonable, so I removed the patch, and then nothing happened and I didn't follow up. This is kind of a mess :-(. Let me look at it. -corey
On Mon, Mar 17, 2025 at 07:27:30AM -0500, Corey Minyard wrote: > On Mon, Mar 17, 2025 at 02:33:31AM -0700, Breno Leitao wrote: > > On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote: > > > On recent kernels this warning fires: > > > > > > drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!! > > > > > > This looks like a fairly simple lockdep trigger, where > > > list_for_each_entry_rcu and list_for_each_entry_srcu are > > > functionally identical, but the lockdep annotation in > > > the former has an extra check. > > > > > > That extra check is whether the RCU read lock is held, > > > which is not true when the code uses srcu_read_lock. > > > > > > Get rid of the warning by using the properly annotated > > > list traversal macro. > > > > Thanks for looking at this one. > > > > There was a discussion about this issue a few years ago, with > > a different approach that never landed upstream. > > > > https://lore.kernel.org/all/20201119123831.GH3710@minyard.net/#r > > I thought this looked familiar. > > Breno, I believe you suggested a change to the patch that sounded > reasonable, so I removed the patch, and then nothing happened and I > didn't follow up. > > This is kind of a mess :-(. Let me look at it. I've been working on this, so it's not forgotten. Actually, I've mostly been working on a test framework for the IPMI driver, which has been a fairly big job. The modification was fairly easy. I decided to just pull srcu out of this and use a different method. And I'm not going to do that without some serious testing, thus the delay. Reviewing the code has shown me that the IPMI driver needs some TLC in some places. Anyway, expect some changes in the next few days. -corey
Hi Rik,
kernel test robot noticed the following build errors:
[auto build test ERROR on v6.14-rc6]
[also build test ERROR on linus/master next-20250313]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Rik-van-Riel/ipmi-fix-suspicious-RCU-usage-warning/20250313-013222
base: v6.14-rc6
patch link: https://lore.kernel.org/r/20250312131932.44d901f7%40fangorn
patch subject: [PATCH] ipmi: fix suspicious RCU usage warning
config: i386-randconfig-001-20250313 (https://download.01.org/0day-ci/archive/20250314/202503140113.cTWvIvtK-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250314/202503140113.cTWvIvtK-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202503140113.cTWvIvtK-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
drivers/char/ipmi/ipmi_msghandler.c: In function 'ipmi_create_user':
>> drivers/char/ipmi/ipmi_msghandler.c:1238:62: error: macro "list_for_each_entry_srcu" requires 4 arguments, but only 3 given
1238 | list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
| ^
In file included from include/linux/dcache.h:8,
from include/linux/fs.h:8,
from include/linux/poll.h:10,
from drivers/char/ipmi/ipmi_msghandler.c:20:
include/linux/rculist.h:455: note: macro "list_for_each_entry_srcu" defined here
455 | #define list_for_each_entry_srcu(pos, head, member, cond) \
|
>> drivers/char/ipmi/ipmi_msghandler.c:1238:9: error: 'list_for_each_entry_srcu' undeclared (first use in this function)
1238 | list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
| ^~~~~~~~~~~~~~~~~~~~~~~~
drivers/char/ipmi/ipmi_msghandler.c:1238:9: note: each undeclared identifier is reported only once for each function it appears in
>> drivers/char/ipmi/ipmi_msghandler.c:1238:33: error: expected ';' before '{' token
1238 | list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
| ^ ~
| ;
>> drivers/char/ipmi/ipmi_msghandler.c:1246:2: warning: label 'found' defined but not used [-Wunused-label]
1246 | found:
| ^~~~~
vim +/list_for_each_entry_srcu +1238 drivers/char/ipmi/ipmi_msghandler.c
1203
1204 int ipmi_create_user(unsigned int if_num,
1205 const struct ipmi_user_hndl *handler,
1206 void *handler_data,
1207 struct ipmi_user **user)
1208 {
1209 unsigned long flags;
1210 struct ipmi_user *new_user;
1211 int rv, index;
1212 struct ipmi_smi *intf;
1213
1214 /*
1215 * There is no module usecount here, because it's not
1216 * required. Since this can only be used by and called from
1217 * other modules, they will implicitly use this module, and
1218 * thus this can't be removed unless the other modules are
1219 * removed.
1220 */
1221
1222 if (handler == NULL)
1223 return -EINVAL;
1224
1225 /*
1226 * Make sure the driver is actually initialized, this handles
1227 * problems with initialization order.
1228 */
1229 rv = ipmi_init_msghandler();
1230 if (rv)
1231 return rv;
1232
1233 new_user = vzalloc(sizeof(*new_user));
1234 if (!new_user)
1235 return -ENOMEM;
1236
1237 index = srcu_read_lock(&ipmi_interfaces_srcu);
> 1238 list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
1239 if (intf->intf_num == if_num)
1240 goto found;
1241 }
1242 /* Not found, return an error */
1243 rv = -EINVAL;
1244 goto out_kfree;
1245
> 1246 found:
1247 if (atomic_add_return(1, &intf->nr_users) > max_users) {
1248 rv = -EBUSY;
1249 goto out_kfree;
1250 }
1251
1252 INIT_WORK(&new_user->remove_work, free_user_work);
1253
1254 rv = init_srcu_struct(&new_user->release_barrier);
1255 if (rv)
1256 goto out_kfree;
1257
1258 if (!try_module_get(intf->owner)) {
1259 rv = -ENODEV;
1260 goto out_kfree;
1261 }
1262
1263 /* Note that each existing user holds a refcount to the interface. */
1264 kref_get(&intf->refcount);
1265
1266 atomic_set(&new_user->nr_msgs, 0);
1267 kref_init(&new_user->refcount);
1268 new_user->handler = handler;
1269 new_user->handler_data = handler_data;
1270 new_user->intf = intf;
1271 new_user->gets_events = false;
1272
1273 rcu_assign_pointer(new_user->self, new_user);
1274 spin_lock_irqsave(&intf->seq_lock, flags);
1275 list_add_rcu(&new_user->link, &intf->users);
1276 spin_unlock_irqrestore(&intf->seq_lock, flags);
1277 if (handler->ipmi_watchdog_pretimeout)
1278 /* User wants pretimeouts, so make sure to watch for them. */
1279 smi_add_watch(intf, IPMI_WATCH_MASK_CHECK_WATCHDOG);
1280 srcu_read_unlock(&ipmi_interfaces_srcu, index);
1281 *user = new_user;
1282 return 0;
1283
1284 out_kfree:
1285 atomic_dec(&intf->nr_users);
1286 srcu_read_unlock(&ipmi_interfaces_srcu, index);
1287 vfree(new_user);
1288 return rv;
1289 }
1290 EXPORT_SYMBOL(ipmi_create_user);
1291
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote:
> On recent kernels this warning fires:
>
> drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!!
>
> This looks like a fairly simple lockdep trigger, where
> list_for_each_entry_rcu and list_for_each_entry_srcu are
> functionally identical, but the lockdep annotation in
> the former has an extra check.
>
> That extra check is whether the RCU read lock is held,
> which is not true when the code uses srcu_read_lock.
>
> Get rid of the warning by using the properly annotated
> list traversal macro.
>
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
> drivers/char/ipmi/ipmi_msghandler.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index 1e5313748f8b..a2823763fd37 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -1235,7 +1235,7 @@ int ipmi_create_user(unsigned int if_num,
> return -ENOMEM;
>
> index = srcu_read_lock(&ipmi_interfaces_srcu);
> - list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
> + list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
Doesn't the above line want to be something like this?
+ list_for_each_entry_srcu(intf, &ipmi_interfaces, link,
srcu_read_lock_held(&ipmi_interfaces_srcu)) {
> if (intf->intf_num == if_num)
> goto found;
> }
> --
> 2.48.1
>
On 3/12/25 18:29, Paul E. McKenney wrote:
> On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote:
>> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
>> index 1e5313748f8b..a2823763fd37 100644
>> --- a/drivers/char/ipmi/ipmi_msghandler.c
>> +++ b/drivers/char/ipmi/ipmi_msghandler.c
>> @@ -1235,7 +1235,7 @@ int ipmi_create_user(unsigned int if_num,
>> return -ENOMEM;
>>
>> index = srcu_read_lock(&ipmi_interfaces_srcu);
>> - list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
>> + list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
>
> Doesn't the above line want to be something like this?
>
> + list_for_each_entry_srcu(intf, &ipmi_interfaces, link,
> srcu_read_lock_held(&ipmi_interfaces_srcu)) {
Ouch what a mess. There are multiple occurrences of this,
almost all susceptible to the same warning.
I'd start with:
-#define ipmi_interfaces_mutex_held() \
- lockdep_is_held(&ipmi_interfaces_mutex)
+#define for_each_ipmi_interface(intf) \
+ list_for_each_entry_srcu(intf, &ipmi_interfaces, link,
+ srcu_read_lock_held(&ipmi_interfaces_srcu)
+ || lockdep_is_held(&ipmi_interfaces_mutex)) {
and use the for_each_ipmi_interface() macro throughout the file.
Here is the list... all of them are using _rcu, plus:
- ipmi_smi_watcher_register is using the wrong lockdep_is_held() assertion,
but would warn if fixed
- ipmi_add_smi is using _rcu but otherwise correct
- ipmi_get_smi_info is using _rcu and can warn
- ipmi_timeout is using _rcu and can warn
- panic_event is using _rcu and can warn, and is also not
using any protection around the walk. Taking srcu_read_lock
would be much better
On top of this, intf->users_srcu never does a synchronize_srcu, so I'm
a bit at a loss at how it is protecting the list. The safest change
here is probably:
diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index f2a56c624f54..dc8936254c1b 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -3769,12 +3769,12 @@ void ipmi_unregister_smi(struct ipmi_smi *intf)
intf->in_shutdown = true;
list_del_rcu(&intf->link);
mutex_unlock(&ipmi_interfaces_mutex);
- synchronize_srcu(&ipmi_interfaces_srcu);
/* At this point no users can be added to the interface. */
device_remove_file(intf->si_dev, &intf->nr_msgs_devattr);
device_remove_file(intf->si_dev, &intf->nr_users_devattr);
+ synchronize_srcu(&ipmi_interfaces_srcu);
/*
* Call all the watcher interfaces to tell them that
... plus replacing all uses of intf->users_srcu with ipmi_interfaces_srcu.
A couple more issues:
- in handle_read_event_rsp() there's a lone rcu_read_unlock()
that should become srcu_read_unlock() (currently for intf->users_srcu;
modulo changes like the above)
- while the intf->cmd_rcvrs list is protected by regular RCU,
there are many other occurrences of rcu_read_lock(), typically
followed by
if (intf->in_shutdown) {
rv = -ENODEV;
goto out_err;
}
and I think they should use interfaces_srcu instead.
Paolo
>> if (intf->intf_num == if_num)
>> goto found;
>> }
>> --
>> 2.48.1
>>
>
© 2016 - 2025 Red Hat, Inc.