A data race can occur when `log_conflicting_inodes()` runs in several
threads at the same time. If one thread writes
`ctx->logging_conflict_inodes` while another thread evaluates
`if (ctx->logging_conflict_inodes)` or writes the field concurrently,
the accesses race.
An atomicity violation may also occur. Suppose thread A evaluates
`if (ctx->logging_conflict_inodes)`, passes the check, and execution
then switches to thread B before A has set
`ctx->logging_conflict_inodes` to true. Thread B passes the check as
well, so multiple threads execute the body of
`log_conflicting_inodes()`.
To address this, add a lock to the `btrfs_log_ctx` structure that
protects `logging_conflict_inodes`, and hold it around both the check
and the assignments. This ensures that the value of
`ctx->logging_conflict_inodes` cannot change between the check and the
update.
Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
---
fs/btrfs/tree-log.c | 7 +++++++
fs/btrfs/tree-log.h | 1 +
2 files changed, 8 insertions(+)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 9637c7cdc0cf..9cdbf280ca9a 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2854,6 +2854,7 @@ void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx, struct btrfs_inode *inode)
INIT_LIST_HEAD(&ctx->conflict_inodes);
ctx->num_conflict_inodes = 0;
ctx->logging_conflict_inodes = false;
+ spin_lock_init(&ctx->logging_conflict_inodes_lock);
ctx->scratch_eb = NULL;
}
@@ -5779,16 +5780,20 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
struct btrfs_log_ctx *ctx)
{
int ret = 0;
+ unsigned long logging_conflict_inodes_flags;
/*
* Conflicting inodes are logged by the first call to btrfs_log_inode(),
* otherwise we could have unbounded recursion of btrfs_log_inode()
* calls. This check guarantees we can have only 1 level of recursion.
*/
+ spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
if (ctx->logging_conflict_inodes)
+ spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
return 0;
ctx->logging_conflict_inodes = true;
+ spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
/*
* New conflicting inodes may be found and added to the list while we
@@ -5869,7 +5874,9 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
break;
}
+ spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
ctx->logging_conflict_inodes = false;
+ spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
if (ret)
free_conflicting_inodes(ctx);
diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
index dc313e6bb2fa..0f862d0c80f2 100644
--- a/fs/btrfs/tree-log.h
+++ b/fs/btrfs/tree-log.h
@@ -44,6 +44,7 @@ struct btrfs_log_ctx {
struct list_head conflict_inodes;
int num_conflict_inodes;
bool logging_conflict_inodes;
+ spinlock_t logging_conflict_inodes_lock;
/*
* Used for fsyncs that need to copy items from the subvolume tree to
* the log tree (full sync flag set or copy everything flag set) to
--
2.34.1
Hi Hao-ran,
kernel test robot noticed the following build errors:
[auto build test ERROR on kdave/for-next]
[also build test ERROR on linus/master v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Hao-ran-Zheng/btrfs-Fix-data-race-in-log_conflicting_inodes/20241101-115429
base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link: https://lore.kernel.org/r/20241101035133.925251-1-zhenghaoran%40buaa.edu.cn
patch subject: [PATCH] btrfs: Fix data race in log_conflicting_inodes
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20241102/202411021448.6pjzV4h1-lkp@intel.com/config)
compiler: clang version 19.1.3 (https://github.com/llvm/llvm-project ab51eccf88f5321e7c60591c5546b254b6afab99)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021448.6pjzV4h1-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021448.6pjzV4h1-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
In file included from fs/btrfs/tree-log.c:8:
In file included from include/linux/blkdev.h:9:
In file included from include/linux/blk_types.h:10:
In file included from include/linux/bvec.h:10:
In file included from include/linux/highmem.h:8:
In file included from include/linux/cacheflush.h:5:
In file included from arch/x86/include/asm/cacheflush.h:5:
In file included from include/linux/mm.h:2213:
include/linux/vmstat.h:504:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
504 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
| ~~~~~~~~~~~~~~~~~~~~~ ^
505 | item];
| ~~~~
include/linux/vmstat.h:511:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
511 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
| ~~~~~~~~~~~~~~~~~~~~~ ^
512 | NR_VM_NUMA_EVENT_ITEMS +
| ~~~~~~~~~~~~~~~~~~~~~~
include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
518 | return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
| ~~~~~~~~~~~ ^ ~~~
include/linux/vmstat.h:524:43: warning: arithmetic between different enumeration types ('enum zone_stat_item' and 'enum numa_stat_item') [-Wenum-enum-conversion]
524 | return vmstat_text[NR_VM_ZONE_STAT_ITEMS +
| ~~~~~~~~~~~~~~~~~~~~~ ^
525 | NR_VM_NUMA_EVENT_ITEMS +
| ~~~~~~~~~~~~~~~~~~~~~~
>> fs/btrfs/tree-log.c:5790:26: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
5790 | spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
include/linux/spinlock.h:381:39: note: expanded from macro 'spin_lock_irqsave'
381 | raw_spin_lock_irqsave(spinlock_check(lock), flags); \
| ^
include/linux/spinlock.h:244:34: note: expanded from macro 'raw_spin_lock_irqsave'
244 | flags = _raw_spin_lock_irqsave(lock); \
| ^
fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
44 | struct list_head conflict_inodes;
| ^
fs/btrfs/tree-log.c:5792:32: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
5792 | spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
44 | struct list_head conflict_inodes;
| ^
>> fs/btrfs/tree-log.c:5793:3: warning: misleading indentation; statement is not part of the previous 'if' [-Wmisleading-indentation]
5793 | return 0;
| ^
fs/btrfs/tree-log.c:5791:2: note: previous statement is here
5791 | if (ctx->logging_conflict_inodes)
| ^
fs/btrfs/tree-log.c:5796:31: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
5796 | spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
44 | struct list_head conflict_inodes;
| ^
fs/btrfs/tree-log.c:5877:26: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
5877 | spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
include/linux/spinlock.h:381:39: note: expanded from macro 'spin_lock_irqsave'
381 | raw_spin_lock_irqsave(spinlock_check(lock), flags); \
| ^
include/linux/spinlock.h:244:34: note: expanded from macro 'raw_spin_lock_irqsave'
244 | flags = _raw_spin_lock_irqsave(lock); \
| ^
fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
44 | struct list_head conflict_inodes;
| ^
fs/btrfs/tree-log.c:5879:31: error: no member named 'conflict_inodes_lock' in 'struct btrfs_log_ctx'; did you mean 'conflict_inodes'?
5879 | spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
fs/btrfs/tree-log.h:44:19: note: 'conflict_inodes' declared here
44 | struct list_head conflict_inodes;
| ^
5 warnings and 5 errors generated.
vim +5790 fs/btrfs/tree-log.c
5777
5778 static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
5779 struct btrfs_root *root,
5780 struct btrfs_log_ctx *ctx)
5781 {
5782 int ret = 0;
5783 unsigned long logging_conflict_inodes_flags;
5784
5785 /*
5786 * Conflicting inodes are logged by the first call to btrfs_log_inode(),
5787 * otherwise we could have unbounded recursion of btrfs_log_inode()
5788 * calls. This check guarantees we can have only 1 level of recursion.
5789 */
> 5790 spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5791 if (ctx->logging_conflict_inodes)
5792 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> 5793 return 0;
5794
5795 ctx->logging_conflict_inodes = true;
5796 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5797
5798 /*
5799 * New conflicting inodes may be found and added to the list while we
5800 * are logging a conflicting inode, so keep iterating while the list is
5801 * not empty.
5802 */
5803 while (!list_empty(&ctx->conflict_inodes)) {
5804 struct btrfs_ino_list *curr;
5805 struct inode *inode;
5806 u64 ino;
5807 u64 parent;
5808
5809 curr = list_first_entry(&ctx->conflict_inodes,
5810 struct btrfs_ino_list, list);
5811 ino = curr->ino;
5812 parent = curr->parent;
5813 list_del(&curr->list);
5814 kfree(curr);
5815
5816 inode = btrfs_iget_logging(ino, root);
5817 /*
5818 * If the other inode that had a conflicting dir entry was
5819 * deleted in the current transaction, we need to log its parent
5820 * directory. See the comment at add_conflicting_inode().
5821 */
5822 if (IS_ERR(inode)) {
5823 ret = PTR_ERR(inode);
5824 if (ret != -ENOENT)
5825 break;
5826
5827 inode = btrfs_iget_logging(parent, root);
5828 if (IS_ERR(inode)) {
5829 ret = PTR_ERR(inode);
5830 break;
5831 }
5832
5833 /*
5834 * Always log the directory, we cannot make this
5835 * conditional on need_log_inode() because the directory
5836 * might have been logged in LOG_INODE_EXISTS mode or
5837 * the dir index of the conflicting inode is not in a
5838 * dir index key range logged for the directory. So we
5839 * must make sure the deletion is recorded.
5840 */
5841 ret = btrfs_log_inode(trans, BTRFS_I(inode),
5842 LOG_INODE_ALL, ctx);
5843 btrfs_add_delayed_iput(BTRFS_I(inode));
5844 if (ret)
5845 break;
5846 continue;
5847 }
5848
5849 /*
5850 * Here we can use need_log_inode() because we only need to log
5851 * the inode in LOG_INODE_EXISTS mode and rename operations
5852 * update the log, so that the log ends up with the new name and
5853 * without the old name.
5854 *
5855 * We did this check at add_conflicting_inode(), but here we do
5856 * it again because if some other task logged the inode after
5857 * that, we can avoid doing it again.
5858 */
5859 if (!need_log_inode(trans, BTRFS_I(inode))) {
5860 btrfs_add_delayed_iput(BTRFS_I(inode));
5861 continue;
5862 }
5863
5864 /*
5865 * We are safe logging the other inode without acquiring its
5866 * lock as long as we log with the LOG_INODE_EXISTS mode. We
5867 * are safe against concurrent renames of the other inode as
5868 * well because during a rename we pin the log and update the
5869 * log with the new name before we unpin it.
5870 */
5871 ret = btrfs_log_inode(trans, BTRFS_I(inode), LOG_INODE_EXISTS, ctx);
5872 btrfs_add_delayed_iput(BTRFS_I(inode));
5873 if (ret)
5874 break;
5875 }
5876
5877 spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5878 ctx->logging_conflict_inodes = false;
5879 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5880 if (ret)
5881 free_conflicting_inodes(ctx);
5882
5883 return ret;
5884 }
5885
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
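The report above flags two separate problems: the lock is declared as `logging_conflict_inodes_lock` but referenced as `conflict_inodes_lock`, and the unlock added under the `if` leaves `return 0;` outside the conditional because the braces are missing. The shape the patch presumably intended for the early-return path can be sketched in plain userspace C as follows. Everything named `demo_*` is hypothetical and exists only for illustration; the mock lock merely counts acquisitions and is not a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mock lock: counts how many times it is currently held. */
struct demo_lock {
	int held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
	l->held++;
}

static void demo_lock_release(struct demo_lock *l)
{
	assert(l->held > 0);	/* releasing an unheld lock is a bug */
	l->held--;
}

/* Hypothetical context with a flag guarded by the mock lock. */
struct demo_ctx {
	struct demo_lock lock;
	bool busy;
};

/* Returns true if the caller may proceed, false if the flag was already set. */
static bool demo_try_enter(struct demo_ctx *ctx)
{
	demo_lock_acquire(&ctx->lock);
	if (ctx->busy) {
		/* Braces keep the unlock and the return in the same branch. */
		demo_lock_release(&ctx->lock);
		return false;
	}
	ctx->busy = true;
	demo_lock_release(&ctx->lock);
	return true;
}
```

Without the braces, only the unlock is conditional and the function returns unconditionally, which is what `-Wmisleading-indentation` points out.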
Hi Hao-ran,
kernel test robot noticed the following build errors:
[auto build test ERROR on kdave/for-next]
[also build test ERROR on linus/master v6.12-rc5 next-20241101]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Hao-ran-Zheng/btrfs-Fix-data-race-in-log_conflicting_inodes/20241101-115429
base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
patch link: https://lore.kernel.org/r/20241101035133.925251-1-zhenghaoran%40buaa.edu.cn
patch subject: [PATCH] btrfs: Fix data race in log_conflicting_inodes
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20241102/202411021443.lsHICRJl-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241102/202411021443.lsHICRJl-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411021443.lsHICRJl-lkp@intel.com/
All error/warnings (new ones prefixed by >>):
In file included from include/linux/sched.h:2145,
from fs/btrfs/tree-log.c:6:
fs/btrfs/tree-log.c: In function 'log_conflicting_inodes':
>> fs/btrfs/tree-log.c:5790:33: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
5790 | spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
include/linux/spinlock.h:244:48: note: in definition of macro 'raw_spin_lock_irqsave'
244 | flags = _raw_spin_lock_irqsave(lock); \
| ^~~~
fs/btrfs/tree-log.c:5790:9: note: in expansion of macro 'spin_lock_irqsave'
5790 | spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~
fs/btrfs/tree-log.c:5792:46: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
5792 | spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
>> fs/btrfs/tree-log.c:5791:9: warning: this 'if' clause does not guard... [-Wmisleading-indentation]
5791 | if (ctx->logging_conflict_inodes)
| ^~
fs/btrfs/tree-log.c:5793:17: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'
5793 | return 0;
| ^~~~~~
fs/btrfs/tree-log.c:5796:38: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
5796 | spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
fs/btrfs/tree-log.c:5877:33: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
5877 | spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
include/linux/spinlock.h:244:48: note: in definition of macro 'raw_spin_lock_irqsave'
244 | flags = _raw_spin_lock_irqsave(lock); \
| ^~~~
fs/btrfs/tree-log.c:5877:9: note: in expansion of macro 'spin_lock_irqsave'
5877 | spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~
fs/btrfs/tree-log.c:5879:38: error: 'struct btrfs_log_ctx' has no member named 'conflict_inodes_lock'; did you mean 'conflict_inodes'?
5879 | spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
| ^~~~~~~~~~~~~~~~~~~~
| conflict_inodes
vim +5790 fs/btrfs/tree-log.c
5777
5778 static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
5779 struct btrfs_root *root,
5780 struct btrfs_log_ctx *ctx)
5781 {
5782 int ret = 0;
5783 unsigned long logging_conflict_inodes_flags;
5784
5785 /*
5786 * Conflicting inodes are logged by the first call to btrfs_log_inode(),
5787 * otherwise we could have unbounded recursion of btrfs_log_inode()
5788 * calls. This check guarantees we can have only 1 level of recursion.
5789 */
> 5790 spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> 5791 if (ctx->logging_conflict_inodes)
5792 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5793 return 0;
5794
5795 ctx->logging_conflict_inodes = true;
5796 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5797
5798 /*
5799 * New conflicting inodes may be found and added to the list while we
5800 * are logging a conflicting inode, so keep iterating while the list is
5801 * not empty.
5802 */
5803 while (!list_empty(&ctx->conflict_inodes)) {
5804 struct btrfs_ino_list *curr;
5805 struct inode *inode;
5806 u64 ino;
5807 u64 parent;
5808
5809 curr = list_first_entry(&ctx->conflict_inodes,
5810 struct btrfs_ino_list, list);
5811 ino = curr->ino;
5812 parent = curr->parent;
5813 list_del(&curr->list);
5814 kfree(curr);
5815
5816 inode = btrfs_iget_logging(ino, root);
5817 /*
5818 * If the other inode that had a conflicting dir entry was
5819 * deleted in the current transaction, we need to log its parent
5820 * directory. See the comment at add_conflicting_inode().
5821 */
5822 if (IS_ERR(inode)) {
5823 ret = PTR_ERR(inode);
5824 if (ret != -ENOENT)
5825 break;
5826
5827 inode = btrfs_iget_logging(parent, root);
5828 if (IS_ERR(inode)) {
5829 ret = PTR_ERR(inode);
5830 break;
5831 }
5832
5833 /*
5834 * Always log the directory, we cannot make this
5835 * conditional on need_log_inode() because the directory
5836 * might have been logged in LOG_INODE_EXISTS mode or
5837 * the dir index of the conflicting inode is not in a
5838 * dir index key range logged for the directory. So we
5839 * must make sure the deletion is recorded.
5840 */
5841 ret = btrfs_log_inode(trans, BTRFS_I(inode),
5842 LOG_INODE_ALL, ctx);
5843 btrfs_add_delayed_iput(BTRFS_I(inode));
5844 if (ret)
5845 break;
5846 continue;
5847 }
5848
5849 /*
5850 * Here we can use need_log_inode() because we only need to log
5851 * the inode in LOG_INODE_EXISTS mode and rename operations
5852 * update the log, so that the log ends up with the new name and
5853 * without the old name.
5854 *
5855 * We did this check at add_conflicting_inode(), but here we do
5856 * it again because if some other task logged the inode after
5857 * that, we can avoid doing it again.
5858 */
5859 if (!need_log_inode(trans, BTRFS_I(inode))) {
5860 btrfs_add_delayed_iput(BTRFS_I(inode));
5861 continue;
5862 }
5863
5864 /*
5865 * We are safe logging the other inode without acquiring its
5866 * lock as long as we log with the LOG_INODE_EXISTS mode. We
5867 * are safe against concurrent renames of the other inode as
5868 * well because during a rename we pin the log and update the
5869 * log with the new name before we unpin it.
5870 */
5871 ret = btrfs_log_inode(trans, BTRFS_I(inode), LOG_INODE_EXISTS, ctx);
5872 btrfs_add_delayed_iput(BTRFS_I(inode));
5873 if (ret)
5874 break;
5875 }
5876
5877 spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5878 ctx->logging_conflict_inodes = false;
5879 spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
5880 if (ret)
5881 free_conflicting_inodes(ctx);
5882
5883 return ret;
5884 }
5885
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
On Fri, Nov 1, 2024 at 3:52 AM Hao-ran Zheng <zhenghaoran@buaa.edu.cn> wrote:
>
> A data race can occur when `log_conflicting_inodes()` runs in several
> threads at the same time. If one thread writes
> `ctx->logging_conflict_inodes` while another thread evaluates
> `if (ctx->logging_conflict_inodes)` or writes the field concurrently,
> the accesses race.
No, there's no problem at all.
A log context is thread local, it's never shared between threads.
>
> An atomicity violation may also occur. Suppose thread A evaluates
> `if (ctx->logging_conflict_inodes)`, passes the check, and execution
> then switches to thread B before A has set
> `ctx->logging_conflict_inodes` to true. Thread B passes the check as
> well, so multiple threads execute the body of
> `log_conflicting_inodes()`.
No. When you make such claims, please provide a sequence diagram that
shows how the tasks interact, what their call stacks are, so that we
can see where the race happens.
But again, this is completely wrong because a log context (struct
btrfs_log_ctx) is never shared between threads.
Thanks.
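The point about the log context being thread local can be illustrated with a small userspace model. This is only a sketch under the assumption stated in the reply, namely that every fsync call owns its own context; all `demo_*` names are hypothetical and the pending counter merely stands in for the `conflict_inodes` list. In this model the flag acts as a recursion guard within one call chain, not as a cross-thread signal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for struct btrfs_log_ctx. */
struct demo_log_ctx {
	bool logging_conflict_inodes;
	int pending;	/* models the length of the conflict_inodes list */
	int calls;	/* how often demo_log_conflicting() was entered */
	int bodies;	/* how often its loop body section actually ran */
};

static int demo_log_inode(struct demo_log_ctx *ctx);

static int demo_log_conflicting(struct demo_log_ctx *ctx)
{
	ctx->calls++;
	/* Recursion guard: nested calls on the same context return early. */
	if (ctx->logging_conflict_inodes)
		return 0;
	ctx->logging_conflict_inodes = true;
	ctx->bodies++;

	while (ctx->pending > 0) {
		ctx->pending--;
		demo_log_inode(ctx);	/* re-enters demo_log_conflicting() */
	}

	ctx->logging_conflict_inodes = false;
	return 0;
}

static int demo_log_inode(struct demo_log_ctx *ctx)
{
	return demo_log_conflicting(ctx);
}

/* Each call builds a private context on its own stack, then logs. */
static int demo_fsync(int conflicts, int *calls_out)
{
	struct demo_log_ctx ctx = { .pending = conflicts };

	demo_log_inode(&ctx);
	assert(!ctx.logging_conflict_inodes);	/* guard is cleared on exit */
	*calls_out = ctx.calls;
	return ctx.bodies;
}
```

Because the context lives on the caller's stack and is only passed down the same call chain, no second thread holds a pointer to it in this model, so the plain `bool` needs no lock.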
>
> To address this, add a lock to the `btrfs_log_ctx` structure that
> protects `logging_conflict_inodes`, and hold it around both the check
> and the assignments. This ensures that the value of
> `ctx->logging_conflict_inodes` cannot change between the check and the
> update.
>
> Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
> ---
> fs/btrfs/tree-log.c | 7 +++++++
> fs/btrfs/tree-log.h | 1 +
> 2 files changed, 8 insertions(+)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 9637c7cdc0cf..9cdbf280ca9a 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -2854,6 +2854,7 @@ void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx, struct btrfs_inode *inode)
> INIT_LIST_HEAD(&ctx->conflict_inodes);
> ctx->num_conflict_inodes = 0;
> ctx->logging_conflict_inodes = false;
> + spin_lock_init(&ctx->logging_conflict_inodes_lock);
> ctx->scratch_eb = NULL;
> }
>
> @@ -5779,16 +5780,20 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
> struct btrfs_log_ctx *ctx)
> {
> int ret = 0;
> + unsigned long logging_conflict_inodes_flags;
>
> /*
> * Conflicting inodes are logged by the first call to btrfs_log_inode(),
> * otherwise we could have unbounded recursion of btrfs_log_inode()
> * calls. This check guarantees we can have only 1 level of recursion.
> */
> + spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
Even if this was remotely correct, why the irqsave? The fsync code is
never called under irq context.
> if (ctx->logging_conflict_inodes)
> + spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> return 0;
>
> ctx->logging_conflict_inodes = true;
> + spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>
> /*
> * New conflicting inodes may be found and added to the list while we
> @@ -5869,7 +5874,9 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
> break;
> }
>
> + spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> ctx->logging_conflict_inodes = false;
> + spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> if (ret)
> free_conflicting_inodes(ctx);
>
> diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
> index dc313e6bb2fa..0f862d0c80f2 100644
> --- a/fs/btrfs/tree-log.h
> +++ b/fs/btrfs/tree-log.h
> @@ -44,6 +44,7 @@ struct btrfs_log_ctx {
> struct list_head conflict_inodes;
> int num_conflict_inodes;
> bool logging_conflict_inodes;
> + spinlock_t logging_conflict_inodes_lock;
> /*
> * Used for fsyncs that need to copy items from the subvolume tree to
> * the log tree (full sync flag set or copy everything flag set) to
> --
> 2.34.1
>
>
On 2024/11/1 14:21, Hao-ran Zheng wrote:
> A data race can occur when `log_conflicting_inodes()` runs in several
> threads at the same time. If one thread writes
> `ctx->logging_conflict_inodes` while another thread evaluates
> `if (ctx->logging_conflict_inodes)` or writes the field concurrently,
> the accesses race.
>
> An atomicity violation may also occur. Suppose thread A evaluates
> `if (ctx->logging_conflict_inodes)`, passes the check, and execution
> then switches to thread B before A has set
> `ctx->logging_conflict_inodes` to true. Thread B passes the check as
> well, so multiple threads execute the body of
> `log_conflicting_inodes()`.
>
> To address this, add a lock to the `btrfs_log_ctx` structure that
> protects `logging_conflict_inodes`, and hold it around both the check
> and the assignments. This ensures that the value of
> `ctx->logging_conflict_inodes` cannot change between the check and the
> update.
>
> Signed-off-by: Hao-ran Zheng <zhenghaoran@buaa.edu.cn>
> ---
> fs/btrfs/tree-log.c | 7 +++++++
> fs/btrfs/tree-log.h | 1 +
> 2 files changed, 8 insertions(+)
>
> diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
> index 9637c7cdc0cf..9cdbf280ca9a 100644
> --- a/fs/btrfs/tree-log.c
> +++ b/fs/btrfs/tree-log.c
> @@ -2854,6 +2854,7 @@ void btrfs_init_log_ctx(struct btrfs_log_ctx *ctx, struct btrfs_inode *inode)
> INIT_LIST_HEAD(&ctx->conflict_inodes);
> ctx->num_conflict_inodes = 0;
> ctx->logging_conflict_inodes = false;
> + spin_lock_init(&ctx->logging_conflict_inodes_lock);
> ctx->scratch_eb = NULL;
> }
>
> @@ -5779,16 +5780,20 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
> struct btrfs_log_ctx *ctx)
> {
> int ret = 0;
> + unsigned long logging_conflict_inodes_flags;
>
> /*
> * Conflicting inodes are logged by the first call to btrfs_log_inode(),
> * otherwise we could have unbounded recursion of btrfs_log_inode()
> * calls. This check guarantees we can have only 1 level of recursion.
> */
> + spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> if (ctx->logging_conflict_inodes)
> + spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
I'm not an expert on the log tree, but in the above case the only thing
the spinlock protects is a bool.
This looks like overkill to me.
Yes, several booleans can be packed into a single byte, which can cause
problems.
But in that case, why not change those booleans into an unsigned long
and use test_bit()/set_bit()/clear_bit(), so that the bit operations are
atomic and there is no need for the extra spinlock?
I haven't checked the other boolean usages, but at least for this
@logging_conflict_inodes variable, an atomic bit operation looks
safe.
Thanks,
Qu
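The suggestion above, replacing the booleans with bits in one word and updating them atomically, can be sketched in userspace C. This is not kernel code: it uses C11 `<stdatomic.h>` as an analog of the kernel bit operations, and the struct, enum and function names are hypothetical. The combined test-and-set step corresponds in spirit to the kernel's `test_and_set_bit()`, which closes the window between the check and the assignment that the patch description worries about.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit index; not a name taken from the btrfs source. */
enum { DEMO_LOGGING_CONFLICT_INODES = 0 };

/* Hypothetical flags word that would replace the individual bools. */
struct demo_log_flags {
	atomic_ulong bits;
};

static void demo_flags_init(struct demo_log_flags *f)
{
	atomic_init(&f->bits, 0UL);
}

/* Atomically set bit nr and report whether it was already set. */
static bool demo_test_and_set(struct demo_log_flags *f, unsigned int nr)
{
	assert(nr < 8 * sizeof(unsigned long));
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(&f->bits, mask) & mask) != 0;
}

/* Atomically clear bit nr. */
static void demo_clear(struct demo_log_flags *f, unsigned int nr)
{
	assert(nr < 8 * sizeof(unsigned long));
	atomic_fetch_and(&f->bits, ~(1UL << nr));
}

/* Read bit nr without modifying it. */
static bool demo_test(struct demo_log_flags *f, unsigned int nr)
{
	assert(nr < 8 * sizeof(unsigned long));
	return (atomic_load(&f->bits) & (1UL << nr)) != 0;
}

/* Returns true only for the one caller that flipped the bit from 0 to 1. */
static bool demo_enter_once(struct demo_log_flags *f)
{
	return !demo_test_and_set(f, DEMO_LOGGING_CONFLICT_INODES);
}
```

A caller would run the guarded work only when `demo_enter_once()` returns true and call `demo_clear()` afterwards. Whether any of this is needed still depends on the context actually being shared, which the other review in this thread disputes.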
> return 0;
>
> ctx->logging_conflict_inodes = true;
> + spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
>
> /*
> * New conflicting inodes may be found and added to the list while we
> @@ -5869,7 +5874,9 @@ static int log_conflicting_inodes(struct btrfs_trans_handle *trans,
> break;
> }
>
> + spin_lock_irqsave(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> ctx->logging_conflict_inodes = false;
> + spin_unlock_irqrestore(&ctx->conflict_inodes_lock, logging_conflict_inodes_flags);
> if (ret)
> free_conflicting_inodes(ctx);
>
> diff --git a/fs/btrfs/tree-log.h b/fs/btrfs/tree-log.h
> index dc313e6bb2fa..0f862d0c80f2 100644
> --- a/fs/btrfs/tree-log.h
> +++ b/fs/btrfs/tree-log.h
> @@ -44,6 +44,7 @@ struct btrfs_log_ctx {
> struct list_head conflict_inodes;
> int num_conflict_inodes;
> bool logging_conflict_inodes;
> + spinlock_t logging_conflict_inodes_lock;
> /*
> * Used for fsyncs that need to copy items from the subvolume tree to
> * the log tree (full sync flag set or copy everything flag set) to