[PATCH] ipmi: fix suspicious RCU usage warning

Rik van Riel posted 1 patch 9 months, 1 week ago
drivers/char/ipmi/ipmi_msghandler.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] ipmi: fix suspicious RCU usage warning
Posted by Rik van Riel 9 months, 1 week ago
On recent kernels this warning fires:

drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!!

This looks like a fairly simple lockdep trigger, where
list_for_each_entry_rcu and list_for_each_entry_srcu are
functionally identical, but the lockdep annotation in
the former has an extra check.

That extra check is whether the RCU read lock is held,
which is not true when the code uses srcu_read_lock.

Get rid of the warning by using the properly annotated
list traversal macro.

Signed-off-by: Rik van Riel <riel@surriel.com>
---
 drivers/char/ipmi/ipmi_msghandler.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index 1e5313748f8b..a2823763fd37 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -1235,7 +1235,7 @@ int ipmi_create_user(unsigned int          if_num,
 		return -ENOMEM;
 
 	index = srcu_read_lock(&ipmi_interfaces_srcu);
-	list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
+	list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
 		if (intf->intf_num == if_num)
 			goto found;
 	}
-- 
2.48.1
Re: [PATCH] ipmi: fix suspicious RCU usage warning
Posted by Breno Leitao 9 months ago
On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote:
> On recent kernels this warning fires:
> 
> drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!!
> 
> This looks like a fairly simple lockdep trigger, where
> list_for_each_entry_rcu and list_for_each_entry_srcu are
> functionally identical, but the lockdep annotation in
> the former has an extra check.
> 
> That extra check is whether the RCU read lock is held,
> which is not true when the code uses srcu_read_lock.
> 
> Get rid of the warning by using the properly annotated
> list traversal macro.

Thanks for looking at this one.

There was a discussion about this issue a few years ago, with
a different approach that never landed upstream.

	https://lore.kernel.org/all/20201119123831.GH3710@minyard.net/#r
Re: [PATCH] ipmi: fix suspicious RCU usage warning
Posted by Corey Minyard 9 months ago
On Mon, Mar 17, 2025 at 02:33:31AM -0700, Breno Leitao wrote:
> On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote:
> > On recent kernels this warning fires:
> > 
> > drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!!
> > 
> > This looks like a fairly simple lockdep trigger, where
> > list_for_each_entry_rcu and list_for_each_entry_srcu are
> > functionally identical, but the lockdep annotation in
> > the former has an extra check.
> > 
> > That extra check is whether the RCU read lock is held,
> > which is not true when the code uses srcu_read_lock.
> > 
> > Get rid of the warning by using the properly annotated
> > list traversal macro.
> 
> Thanks for looking at this one.
> 
> There was a discussion about this issue a few years ago, with
> a different approach that never landed upstream.
> 
> 	https://lore.kernel.org/all/20201119123831.GH3710@minyard.net/#r

I thought this looked familiar.

Breno, I believe you suggested a change to the patch that sounded
reasonable, so I removed the patch, and then nothing happened and I
didn't follow up.

This is kind of a mess :-(.  Let me look at it.

-corey
Re: [PATCH] ipmi: fix suspicious RCU usage warning
Posted by Corey Minyard 8 months ago
On Mon, Mar 17, 2025 at 07:27:30AM -0500, Corey Minyard wrote:
> On Mon, Mar 17, 2025 at 02:33:31AM -0700, Breno Leitao wrote:
> > On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote:
> > > On recent kernels this warning fires:
> > > 
> > > drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!!
> > > 
> > > This looks like a fairly simple lockdep trigger, where
> > > list_for_each_entry_rcu and list_for_each_entry_srcu are
> > > functionally identical, but the lockdep annotation in
> > > the former has an extra check.
> > > 
> > > That extra check is whether the RCU read lock is held,
> > > which is not true when the code uses srcu_read_lock.
> > > 
> > > Get rid of the warning by using the properly annotated
> > > list traversal macro.
> > 
> > Thanks for looking at this one.
> > 
> > There was a discussion about this issue a few years ago, with
> > a different approach that never landed upstream.
> > 
> > 	https://lore.kernel.org/all/20201119123831.GH3710@minyard.net/#r
> 
> I thought this looked familiar.
> 
> Breno, I believe you suggested a change to the patch that sounded
> reasonable, so I removed the patch, and then nothing happened and I
> didn't follow up.
> 
> This is kind of a mess :-(.  Let me look at it.

I've been working on this, so it's not forgotten.  Actually, I've mostly
been working on a test framework for the IPMI driver, which has been a
fairly big job.

The modification was fairly easy.  I decided to just pull srcu out of
this and use a different method.  And I'm not going to do that without
some serious testing, thus the delay.

Reviewing the code has shown me that the IPMI driver needs some TLC in
some places.

Anyway, expect some changes in the next few days.

-corey
Re: [PATCH] ipmi: fix suspicious RCU usage warning
Posted by kernel test robot 9 months, 1 week ago
Hi Rik,

kernel test robot noticed the following build errors:

[auto build test ERROR on v6.14-rc6]
[also build test ERROR on linus/master next-20250313]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Rik-van-Riel/ipmi-fix-suspicious-RCU-usage-warning/20250313-013222
base:   v6.14-rc6
patch link:    https://lore.kernel.org/r/20250312131932.44d901f7%40fangorn
patch subject: [PATCH] ipmi: fix suspicious RCU usage warning
config: i386-randconfig-001-20250313 (https://download.01.org/0day-ci/archive/20250314/202503140113.cTWvIvtK-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250314/202503140113.cTWvIvtK-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202503140113.cTWvIvtK-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   drivers/char/ipmi/ipmi_msghandler.c: In function 'ipmi_create_user':
>> drivers/char/ipmi/ipmi_msghandler.c:1238:62: error: macro "list_for_each_entry_srcu" requires 4 arguments, but only 3 given
    1238 |         list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
         |                                                              ^
   In file included from include/linux/dcache.h:8,
                    from include/linux/fs.h:8,
                    from include/linux/poll.h:10,
                    from drivers/char/ipmi/ipmi_msghandler.c:20:
   include/linux/rculist.h:455: note: macro "list_for_each_entry_srcu" defined here
     455 | #define list_for_each_entry_srcu(pos, head, member, cond)               \
         | 
>> drivers/char/ipmi/ipmi_msghandler.c:1238:9: error: 'list_for_each_entry_srcu' undeclared (first use in this function)
    1238 |         list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
         |         ^~~~~~~~~~~~~~~~~~~~~~~~
   drivers/char/ipmi/ipmi_msghandler.c:1238:9: note: each undeclared identifier is reported only once for each function it appears in
>> drivers/char/ipmi/ipmi_msghandler.c:1238:33: error: expected ';' before '{' token
    1238 |         list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
         |                                 ^                              ~
         |                                 ;
>> drivers/char/ipmi/ipmi_msghandler.c:1246:2: warning: label 'found' defined but not used [-Wunused-label]
    1246 |  found:
         |  ^~~~~


vim +/list_for_each_entry_srcu +1238 drivers/char/ipmi/ipmi_msghandler.c

  1203	
  1204	int ipmi_create_user(unsigned int          if_num,
  1205			     const struct ipmi_user_hndl *handler,
  1206			     void                  *handler_data,
  1207			     struct ipmi_user      **user)
  1208	{
  1209		unsigned long flags;
  1210		struct ipmi_user *new_user;
  1211		int           rv, index;
  1212		struct ipmi_smi *intf;
  1213	
  1214		/*
  1215		 * There is no module usecount here, because it's not
  1216		 * required.  Since this can only be used by and called from
  1217		 * other modules, they will implicitly use this module, and
  1218		 * thus this can't be removed unless the other modules are
  1219		 * removed.
  1220		 */
  1221	
  1222		if (handler == NULL)
  1223			return -EINVAL;
  1224	
  1225		/*
  1226		 * Make sure the driver is actually initialized, this handles
  1227		 * problems with initialization order.
  1228		 */
  1229		rv = ipmi_init_msghandler();
  1230		if (rv)
  1231			return rv;
  1232	
  1233		new_user = vzalloc(sizeof(*new_user));
  1234		if (!new_user)
  1235			return -ENOMEM;
  1236	
  1237		index = srcu_read_lock(&ipmi_interfaces_srcu);
> 1238		list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
  1239			if (intf->intf_num == if_num)
  1240				goto found;
  1241		}
  1242		/* Not found, return an error */
  1243		rv = -EINVAL;
  1244		goto out_kfree;
  1245	
> 1246	 found:
  1247		if (atomic_add_return(1, &intf->nr_users) > max_users) {
  1248			rv = -EBUSY;
  1249			goto out_kfree;
  1250		}
  1251	
  1252		INIT_WORK(&new_user->remove_work, free_user_work);
  1253	
  1254		rv = init_srcu_struct(&new_user->release_barrier);
  1255		if (rv)
  1256			goto out_kfree;
  1257	
  1258		if (!try_module_get(intf->owner)) {
  1259			rv = -ENODEV;
  1260			goto out_kfree;
  1261		}
  1262	
  1263		/* Note that each existing user holds a refcount to the interface. */
  1264		kref_get(&intf->refcount);
  1265	
  1266		atomic_set(&new_user->nr_msgs, 0);
  1267		kref_init(&new_user->refcount);
  1268		new_user->handler = handler;
  1269		new_user->handler_data = handler_data;
  1270		new_user->intf = intf;
  1271		new_user->gets_events = false;
  1272	
  1273		rcu_assign_pointer(new_user->self, new_user);
  1274		spin_lock_irqsave(&intf->seq_lock, flags);
  1275		list_add_rcu(&new_user->link, &intf->users);
  1276		spin_unlock_irqrestore(&intf->seq_lock, flags);
  1277		if (handler->ipmi_watchdog_pretimeout)
  1278			/* User wants pretimeouts, so make sure to watch for them. */
  1279			smi_add_watch(intf, IPMI_WATCH_MASK_CHECK_WATCHDOG);
  1280		srcu_read_unlock(&ipmi_interfaces_srcu, index);
  1281		*user = new_user;
  1282		return 0;
  1283	
  1284	out_kfree:
  1285		atomic_dec(&intf->nr_users);
  1286		srcu_read_unlock(&ipmi_interfaces_srcu, index);
  1287		vfree(new_user);
  1288		return rv;
  1289	}
  1290	EXPORT_SYMBOL(ipmi_create_user);
  1291	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH] ipmi: fix suspicious RCU usage warning
Posted by Paul E. McKenney 9 months, 1 week ago
On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote:
> On recent kernels this warning fires:
> 
> drivers/char/ipmi/ipmi_msghandler.c:1238 RCU-list traversed in non-reader section!!
> 
> This looks like a fairly simple lockdep trigger, where
> list_for_each_entry_rcu and list_for_each_entry_srcu are
> functionally identical, but the lockdep annotation in
> the former has an extra check.
> 
> That extra check is whether the RCU read lock is held,
> which is not true when the code uses srcu_read_lock.
> 
> Get rid of the warning by using the properly annotated
> list traversal macro.
> 
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index 1e5313748f8b..a2823763fd37 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -1235,7 +1235,7 @@ int ipmi_create_user(unsigned int          if_num,
>  		return -ENOMEM;
>  
>  	index = srcu_read_lock(&ipmi_interfaces_srcu);
> -	list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
> +	list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {

Doesn't the above line want to be something like this?

+	list_for_each_entry_srcu(intf, &ipmi_interfaces, link,
				 srcu_read_lock_held(&ipmi_interfaces_srcu)) {

>  		if (intf->intf_num == if_num)
>  			goto found;
>  	}
> -- 
> 2.48.1
>
Re: [PATCH] ipmi: fix suspicious RCU usage warning
Posted by Paolo Bonzini 9 months, 1 week ago
On 3/12/25 18:29, Paul E. McKenney wrote:
> On Wed, Mar 12, 2025 at 01:19:32PM -0400, Rik van Riel wrote:
>> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
>> index 1e5313748f8b..a2823763fd37 100644
>> --- a/drivers/char/ipmi/ipmi_msghandler.c
>> +++ b/drivers/char/ipmi/ipmi_msghandler.c
>> @@ -1235,7 +1235,7 @@ int ipmi_create_user(unsigned int          if_num,
>>   		return -ENOMEM;
>>   
>>   	index = srcu_read_lock(&ipmi_interfaces_srcu);
>> -	list_for_each_entry_rcu(intf, &ipmi_interfaces, link) {
>> +	list_for_each_entry_srcu(intf, &ipmi_interfaces, link) {
> 
> Doesn't the above line want to be something like this?
> 
> +	list_for_each_entry_srcu(intf, &ipmi_interfaces, link,
> 				 srcu_read_lock_held(&ipmi_interfaces_srcu)) {

Ouch what a mess.  There are multiple occurrences of this,
almost all susceptible to the same warning.

I'd start with:

-#define ipmi_interfaces_mutex_held() \
-	lockdep_is_held(&ipmi_interfaces_mutex)
+#define for_each_ipmi_interface(intf) \
+	list_for_each_entry_srcu(intf, &ipmi_interfaces, link,
+				 srcu_read_lock_held(&ipmi_interfaces_srcu)
+				 || lockdep_is_held(&ipmi_interfaces_mutex)) {

and use the for_each_ipmi_interface() macro throughout the file.

Here is the list... all of them are using _rcu, plus:

- ipmi_smi_watcher_register is using the wrong lockdep_is_held() assertion,
   but would warn if fixed
- ipmi_add_smi is using _rcu but otherwise correct
- ipmi_get_smi_info is using _rcu and can warn
- ipmi_timeout is using _rcu and can warn
- panic_event is using _rcu and can warn, and is also not
   using any protection around the walk.  Taking srcu_read_lock
   would be much better


On top of this, intf->users_srcu never does a synchronize_srcu, so I'm
a bit at a loss at how it is protecting the list.  The safest change
here is probably:

diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index f2a56c624f54..dc8936254c1b 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -3769,12 +3769,12 @@ void ipmi_unregister_smi(struct ipmi_smi *intf)
  	intf->in_shutdown = true;
  	list_del_rcu(&intf->link);
  	mutex_unlock(&ipmi_interfaces_mutex);
-	synchronize_srcu(&ipmi_interfaces_srcu);
  
  	/* At this point no users can be added to the interface. */
  
  	device_remove_file(intf->si_dev, &intf->nr_msgs_devattr);
  	device_remove_file(intf->si_dev, &intf->nr_users_devattr);
+	synchronize_srcu(&ipmi_interfaces_srcu);
  
  	/*
  	 * Call all the watcher interfaces to tell them that

... plus replacing all uses of intf->users_srcu with ipmi_interfaces_srcu.


A couple more issues:

- in handle_read_event_rsp() there's a lone rcu_read_unlock()
that should become srcu_read_unlock() (currently for intf->users_srcu;
modulo changes like the above)

- while the intf->cmd_rcvrs list is protected by regular RCU,
there are many other occurrences of rcu_read_lock(), typically
followed by

         if (intf->in_shutdown) {
                 rv = -ENODEV;
                 goto out_err;
         }

and I think they should use interfaces_srcu instead.

Paolo

>>   		if (intf->intf_num == if_num)
>>   			goto found;
>>   	}
>> -- 
>> 2.48.1
>>
>