fs/libfs.c | 129 ++- include/linux/fs.h | 6 +- include/linux/maple_tree.h | 356 +++--- include/linux/mm_types.h | 3 +- lib/maple_tree.c | 1096 +++++++++++++------ lib/test_maple_tree.c | 218 ++-- mm/internal.h | 10 +- mm/shmem.c | 4 +- tools/include/linux/spinlock.h | 1 + tools/testing/radix-tree/linux/maple_tree.h | 2 +- tools/testing/radix-tree/maple.c | 390 ++++++- 11 files changed, 1564 insertions(+), 651 deletions(-)
From: Yu Kuai <yukuai3@huawei.com> Fix patch is patch 27, relied patches are from: - patches from set [1] to add helpers to maple_tree, the last patch to improve fork() performance is not backported; - patches from set [2] to change maple_tree, and follow up fixes; - patches from set [3] to convert offset_ctx from xarray to maple_tree; Please notice that I'm not an expert in this area, and I'm afraid to make manual changes. That's why patch 16 revert the commit that is different from mainline and will cause conflict backporting new patches. patch 28 pick the original mainline patch again. (And this is what we did to fix the CVE in downstream kernels). [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ Andrew Morton (1): lib/maple_tree.c: fix build error due to hotfix alteration Chuck Lever (5): libfs: Re-arrange locking in offset_iterate_dir() libfs: Define a minimum directory offset libfs: Add simple_offset_empty() maple_tree: Add mtree_alloc_cyclic() libfs: Convert simple directory offsets to use a Maple Tree Liam R. Howlett (12): maple_tree: remove unnecessary default labels from switch statements maple_tree: make mas_erase() more robust maple_tree: move debug check to __mas_set_range() maple_tree: add end of node tracking to the maple state maple_tree: use cached node end in mas_next() maple_tree: use cached node end in mas_destroy() maple_tree: clean up inlines for some functions maple_tree: separate ma_state node from status maple_tree: remove mas_searchable() maple_tree: use maple state end for write operations maple_tree: don't find node end in mtree_lookup_walk() maple_tree: mtree_range_walk() clean up Lorenzo Stoakes (1): maple_tree: correct tree corruption on spanning store Peng Zhang (7): maple_tree: add mt_free_one() and mt_attr() helpers maple_tree: introduce {mtree,mas}_lock_nested() maple_tree: introduce interfaces __mt_dup() and mtree_dup() maple_tree: skip other tests when BENCH is enabled maple_tree: preserve the tree attributes when destroying maple tree maple_tree: add test for mtree_dup() maple_tree: avoid checking other gaps after getting the largest gap Yu Kuai (1): Revert "maple_tree: correct tree corruption on spanning store" yangerkun (1): libfs: fix infinite directory reads for offset dir fs/libfs.c | 129 ++- include/linux/fs.h | 6 +- include/linux/maple_tree.h | 356 +++--- include/linux/mm_types.h | 3 +- lib/maple_tree.c | 1096 +++++++++++++------ lib/test_maple_tree.c | 218 ++-- mm/internal.h | 10 +- mm/shmem.c | 4 +- tools/include/linux/spinlock.h | 1 + tools/testing/radix-tree/linux/maple_tree.h | 2 +- tools/testing/radix-tree/maple.c | 390 ++++++- 11 files changed, 1564 insertions(+), 651 deletions(-) -- 2.39.2
NACK. Do this some other way that isn't a terrible mess. You've reverted my CRITICAL fix, then didn't cc- me so I'm grumpy. Even if you bizarrely brought it back later. Don't fail to cc- people you revert in future, please, especially in stable. It's not only discourteous it's also an actual security risk. Thanks. Also this commit log is ridiculous, you don't even explain WHAT ON EARTH YOU ARE DOING HERE. It's not just good enough to reference a CVE and expect us to go research this for you, especially one you've 'addressed' in this totally bizarre fashion. On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: > From: Yu Kuai <yukuai3@huawei.com> > > Fix patch is patch 27, relied patches are from: > > - patches from set [1] to add helpers to maple_tree, the last patch to > improve fork() performance is not backported; > - patches from set [2] to change maple_tree, and follow up fixes; > - patches from set [3] to convert offset_ctx from xarray to maple_tree; > > Please notice that I'm not an expert in this area, and I'm afraid to > make manual changes. That's why patch 16 revert the commit that is > different from mainline and will cause conflict backporting new patches. > patch 28 pick the original mainline patch again. This is... what? :/ You have to fix conflicts, that's part of what backporting involves. Yeah, rethink your whole approach, thanks. > > (And this is what we did to fix the CVE in downstream kernels). > > [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ > [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ > [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ > > Andrew Morton (1): > lib/maple_tree.c: fix build error due to hotfix alteration > > Chuck Lever (5): > libfs: Re-arrange locking in offset_iterate_dir() > libfs: Define a minimum directory offset > libfs: Add simple_offset_empty() > maple_tree: Add mtree_alloc_cyclic() > libfs: Convert simple directory offsets to use a Maple Tree > > Liam R. Howlett (12): > maple_tree: remove unnecessary default labels from switch statements > maple_tree: make mas_erase() more robust > maple_tree: move debug check to __mas_set_range() > maple_tree: add end of node tracking to the maple state > maple_tree: use cached node end in mas_next() > maple_tree: use cached node end in mas_destroy() > maple_tree: clean up inlines for some functions > maple_tree: separate ma_state node from status > maple_tree: remove mas_searchable() > maple_tree: use maple state end for write operations > maple_tree: don't find node end in mtree_lookup_walk() > maple_tree: mtree_range_walk() clean up > > Lorenzo Stoakes (1): > maple_tree: correct tree corruption on spanning store > > Peng Zhang (7): > maple_tree: add mt_free_one() and mt_attr() helpers > maple_tree: introduce {mtree,mas}_lock_nested() > maple_tree: introduce interfaces __mt_dup() and mtree_dup() > maple_tree: skip other tests when BENCH is enabled > maple_tree: preserve the tree attributes when destroying maple tree > maple_tree: add test for mtree_dup() > maple_tree: avoid checking other gaps after getting the largest gap > > Yu Kuai (1): > Revert "maple_tree: correct tree corruption on spanning store" > > yangerkun (1): > libfs: fix infinite directory reads for offset dir > > fs/libfs.c | 129 ++- > include/linux/fs.h | 6 +- > include/linux/maple_tree.h | 356 +++--- > include/linux/mm_types.h | 3 +- > lib/maple_tree.c | 1096 +++++++++++++------ > lib/test_maple_tree.c | 218 ++-- > mm/internal.h | 10 +- > mm/shmem.c | 4 +- > tools/include/linux/spinlock.h | 1 + > tools/testing/radix-tree/linux/maple_tree.h | 2 +- > tools/testing/radix-tree/maple.c | 390 ++++++- > 11 files changed, 1564 insertions(+), 651 deletions(-) > > -- > 2.39.2 >
Hi, 在 2024/11/06 22:43, Lorenzo Stoakes 写道: > NACK. > > Do this some other way that isn't a terrible mess. > > You've reverted my CRITICAL fix, then didn't cc- me so I'm grumpy. > > Even if you bizarrely brought it back later. > > Don't fail to cc- people you revert in future, please, especially in > stable. It's not only discourteous it's also an actual security risk. ok, that's my fault. > > Thanks. > > Also this commit log is ridiculous, you don't even explain WHAT ON EARTH > YOU ARE DOING HERE. It's not just good enough to reference a CVE and expect > us to go research this for you, especially one you've 'addressed' in this > totally bizarre fashion. > > On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >> From: Yu Kuai <yukuai3@huawei.com> >> >> Fix patch is patch 27, relied patches are from: >> >> - patches from set [1] to add helpers to maple_tree, the last patch to >> improve fork() performance is not backported; >> - patches from set [2] to change maple_tree, and follow up fixes; >> - patches from set [3] to convert offset_ctx from xarray to maple_tree; >> >> Please notice that I'm not an expert in this area, and I'm afraid to >> make manual changes. That's why patch 16 revert the commit that is >> different from mainline and will cause conflict backporting new patches. >> patch 28 pick the original mainline patch again. > > This is... what? :/ > > You have to fix conflicts, that's part of what backporting involves. So, that's the best I can do in this area. I agree that this is unacceptable now. So I'll just ignore this cve for v6.6, unless some expert in this area will try to fix conflicts for patch 27 in a better way. Thanks, Kuai > > Yeah, rethink your whole approach, thanks. > >> >> (And this is what we did to fix the CVE in downstream kernels). >> >> [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ >> [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ >> [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ >> >> Andrew Morton (1): >> lib/maple_tree.c: fix build error due to hotfix alteration >> >> Chuck Lever (5): >> libfs: Re-arrange locking in offset_iterate_dir() >> libfs: Define a minimum directory offset >> libfs: Add simple_offset_empty() >> maple_tree: Add mtree_alloc_cyclic() >> libfs: Convert simple directory offsets to use a Maple Tree >> >> Liam R. Howlett (12): >> maple_tree: remove unnecessary default labels from switch statements >> maple_tree: make mas_erase() more robust >> maple_tree: move debug check to __mas_set_range() >> maple_tree: add end of node tracking to the maple state >> maple_tree: use cached node end in mas_next() >> maple_tree: use cached node end in mas_destroy() >> maple_tree: clean up inlines for some functions >> maple_tree: separate ma_state node from status >> maple_tree: remove mas_searchable() >> maple_tree: use maple state end for write operations >> maple_tree: don't find node end in mtree_lookup_walk() >> maple_tree: mtree_range_walk() clean up >> >> Lorenzo Stoakes (1): >> maple_tree: correct tree corruption on spanning store >> >> Peng Zhang (7): >> maple_tree: add mt_free_one() and mt_attr() helpers >> maple_tree: introduce {mtree,mas}_lock_nested() >> maple_tree: introduce interfaces __mt_dup() and mtree_dup() >> maple_tree: skip other tests when BENCH is enabled >> maple_tree: preserve the tree attributes when destroying maple tree >> maple_tree: add test for mtree_dup() >> maple_tree: avoid checking other gaps after getting the largest gap >> >> Yu Kuai (1): >> Revert "maple_tree: correct tree corruption on spanning store" >> >> yangerkun (1): >> libfs: fix infinite directory reads for offset dir >> >> fs/libfs.c | 129 ++- >> include/linux/fs.h | 6 +- >> include/linux/maple_tree.h | 356 +++--- >> include/linux/mm_types.h | 3 +- >> lib/maple_tree.c | 1096 +++++++++++++------ >> lib/test_maple_tree.c | 218 ++-- >> mm/internal.h | 10 +- >> mm/shmem.c | 4 +- >> tools/include/linux/spinlock.h | 1 + >> tools/testing/radix-tree/linux/maple_tree.h | 2 +- >> tools/testing/radix-tree/maple.c | 390 ++++++- >> 11 files changed, 1564 insertions(+), 651 deletions(-) >> >> -- >> 2.39.2 >> > > . >
On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: > From: Yu Kuai <yukuai3@huawei.com> > > Fix patch is patch 27, relied patches are from: > > - patches from set [1] to add helpers to maple_tree, the last patch to > improve fork() performance is not backported; So things slowed down? > - patches from set [2] to change maple_tree, and follow up fixes; > - patches from set [3] to convert offset_ctx from xarray to maple_tree; > > Please notice that I'm not an expert in this area, and I'm afraid to > make manual changes. That's why patch 16 revert the commit that is > different from mainline and will cause conflict backporting new patches. > patch 28 pick the original mainline patch again. > > (And this is what we did to fix the CVE in downstream kernels). > > [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ > [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ > [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ This series looks rough. I want to have the maintainers of these files/subsystems to ack this before being able to take them. thanks, greg k-h
> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: > > On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >> From: Yu Kuai <yukuai3@huawei.com> >> >> Fix patch is patch 27, relied patches are from: I assume patch 27 is: libfs: fix infinite directory reads for offset dir https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ I don't think the Maple tree patches are a hard requirement for this fix. And note that libfs did not use Maple tree originally because I was told at that time that Maple tree was not yet mature. So, a better approach might be to fit the fix onto linux-6.6.y while sticking with xarray. This is the first I've heard of this CVE. It would help if the patch authors got some notification when these are filed. >> - patches from set [1] to add helpers to maple_tree, the last patch to >> improve fork() performance is not backported; > > So things slowed down? > >> - patches from set [2] to change maple_tree, and follow up fixes; >> - patches from set [3] to convert offset_ctx from xarray to maple_tree; >> >> Please notice that I'm not an expert in this area, and I'm afraid to >> make manual changes. That's why patch 16 revert the commit that is >> different from mainline and will cause conflict backporting new patches. >> patch 28 pick the original mainline patch again. >> >> (And this is what we did to fix the CVE in downstream kernels). >> >> [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ >> [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ >> [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ > > This series looks rough. I want to have the maintainers of these > files/subsystems to ack this before being able to take them. > > thanks, > > greg k-h -- Chuck Lever
Hi, 在 2024/11/06 23:19, Chuck Lever III 写道: > > >> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: >> >> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >>> From: Yu Kuai <yukuai3@huawei.com> >>> >>> Fix patch is patch 27, relied patches are from: > > I assume patch 27 is: > > libfs: fix infinite directory reads for offset dir > > https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ > > I don't think the Maple tree patches are a hard > requirement for this fix. And note that libfs did > not use Maple tree originally because I was told > at that time that Maple tree was not yet mature. > > So, a better approach might be to fit the fix > onto linux-6.6.y while sticking with xarray. The painful part is that using xarray is not acceptable, the offet is just 32 bit and if it overflows, readdir will read nothing. That's why maple_tree has to be used. Thanks, Kuai > > This is the first I've heard of this CVE. It > would help if the patch authors got some > notification when these are filed. > > >>> - patches from set [1] to add helpers to maple_tree, the last patch to >>> improve fork() performance is not backported; >> >> So things slowed down? >> >>> - patches from set [2] to change maple_tree, and follow up fixes; >>> - patches from set [3] to convert offset_ctx from xarray to maple_tree; >>> >>> Please notice that I'm not an expert in this area, and I'm afraid to >>> make manual changes. That's why patch 16 revert the commit that is >>> different from mainline and will cause conflict backporting new patches. >>> patch 28 pick the original mainline patch again. >>> >>> (And this is what we did to fix the CVE in downstream kernels). >>> >>> [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ >>> [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ >>> [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ >> >> This series looks rough. I want to have the maintainers of these >> files/subsystems to ack this before being able to take them. >> >> thanks, >> >> greg k-h > > -- > Chuck Lever > >
* Yu Kuai <yukuai1@huaweicloud.com> [241106 19:57]: > Hi, > > 在 2024/11/06 23:19, Chuck Lever III 写道: > > > > > > > On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: > > > > From: Yu Kuai <yukuai3@huawei.com> > > > > > > > > Fix patch is patch 27, relied patches are from: > > > > I assume patch 27 is: > > > > libfs: fix infinite directory reads for offset dir > > > > https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ > > > > I don't think the Maple tree patches are a hard > > requirement for this fix. And note that libfs did > > not use Maple tree originally because I was told > > at that time that Maple tree was not yet mature. > > > > So, a better approach might be to fit the fix > > onto linux-6.6.y while sticking with xarray. > > The painful part is that using xarray is not acceptable, the offet > is just 32 bit and if it overflows, readdir will read nothing. That's > why maple_tree has to be used. Why does the xarray cause it to overflow vs the maple tree? The maple tree conversion was for performance reasons, as far as I know [1]. Thanks, Liam [1]. https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ > > Thanks, > Kuai > > > > > This is the first I've heard of this CVE. It > > would help if the patch authors got some > > notification when these are filed. > > > > > > > > - patches from set [1] to add helpers to maple_tree, the last patch to > > > > improve fork() performance is not backported; > > > > > > So things slowed down? > > > > > > > - patches from set [2] to change maple_tree, and follow up fixes; > > > > - patches from set [3] to convert offset_ctx from xarray to maple_tree; > > > > > > > > Please notice that I'm not an expert in this area, and I'm afraid to > > > > make manual changes. That's why patch 16 revert the commit that is > > > > different from mainline and will cause conflict backporting new patches. > > > > patch 28 pick the original mainline patch again. > > > > > > > > (And this is what we did to fix the CVE in downstream kernels). > > > > > > > > [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ > > > > [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ > > > > [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ > > > > > > This series looks rough. I want to have the maintainers of these > > > files/subsystems to ack this before being able to take them. > > > > > > thanks, > > > > > > greg k-h > > > > -- > > Chuck Lever > > > > >
On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: > Hi, > > 在 2024/11/06 23:19, Chuck Lever III 写道: > > > > > > > On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: > > > > > > On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: > > > > From: Yu Kuai <yukuai3@huawei.com> > > > > > > > > Fix patch is patch 27, relied patches are from: > > > > I assume patch 27 is: > > > > libfs: fix infinite directory reads for offset dir > > > > https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ > > > > I don't think the Maple tree patches are a hard > > requirement for this fix. And note that libfs did > > not use Maple tree originally because I was told > > at that time that Maple tree was not yet mature. > > > > So, a better approach might be to fit the fix > > onto linux-6.6.y while sticking with xarray. > > The painful part is that using xarray is not acceptable, the offet > is just 32 bit and if it overflows, readdir will read nothing. That's > why maple_tree has to be used. A 32-bit range should be entirely adequate for this usage. - The offset allocator wraps when it reaches the maximum, it doesn't overflow unless there are actually billions of extant entries in the directory, which IMO is not likely. - The offset values are dense, so the directory can use all 2- or 4- billion in the 32-bit integer range before wrapping. - No-one complained about this limitation when offset_readdir() was first merged. The xarray was replaced for performance reasons, not because of the 32-bit range limit. It is always possible that I have misunderstood your concern! > Thanks, > Kuai > > > > > This is the first I've heard of this CVE. It > > would help if the patch authors got some > > notification when these are filed. > > > > > > > > - patches from set [1] to add helpers to maple_tree, the last patch to > > > > improve fork() performance is not backported; > > > > > > So things slowed down? > > > > > > > - patches from set [2] to change maple_tree, and follow up fixes; > > > > - patches from set [3] to convert offset_ctx from xarray to maple_tree; > > > > > > > > Please notice that I'm not an expert in this area, and I'm afraid to > > > > make manual changes. That's why patch 16 revert the commit that is > > > > different from mainline and will cause conflict backporting new patches. > > > > patch 28 pick the original mainline patch again. > > > > > > > > (And this is what we did to fix the CVE in downstream kernels). > > > > > > > > [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ > > > > [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ > > > > [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ > > > > > > This series looks rough. I want to have the maintainers of these > > > files/subsystems to ack this before being able to take them. > > > > > > thanks, > > > > > > greg k-h > > > > -- > > Chuck Lever > > > > > -- Chuck Lever
Hi, 在 2024/11/07 22:41, Chuck Lever 写道: > On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: >> Hi, >> >> 在 2024/11/06 23:19, Chuck Lever III 写道: >>> >>> >>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: >>>> >>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>> >>>>> Fix patch is patch 27, relied patches are from: >>> >>> I assume patch 27 is: >>> >>> libfs: fix infinite directory reads for offset dir >>> >>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ >>> >>> I don't think the Maple tree patches are a hard >>> requirement for this fix. And note that libfs did >>> not use Maple tree originally because I was told >>> at that time that Maple tree was not yet mature. >>> >>> So, a better approach might be to fit the fix >>> onto linux-6.6.y while sticking with xarray. >> >> The painful part is that using xarray is not acceptable, the offet >> is just 32 bit and if it overflows, readdir will read nothing. That's >> why maple_tree has to be used. > > A 32-bit range should be entirely adequate for this usage. > > - The offset allocator wraps when it reaches the maximum, it > doesn't overflow unless there are actually billions of extant > entries in the directory, which IMO is not likely. Yes, it's not likely, but it's possible, and not hard to trigger for test. And please notice that the offset will increase for each new file, and file can be removed, while offset stays the same. > > - The offset values are dense, so the directory can use all 2- or > 4- billion in the 32-bit integer range before wrapping. A simple math, if user create and remove 1 file in each seconds, it will cost about 130 years to overflow. And if user create and remove 1000 files in each second, it will cost about 1 month to overflow. maple tree use 64 bit value for the offset, which is impossible to overflow for the rest of our lifes. > > - No-one complained about this limitation when offset_readdir() was > first merged. The xarray was replaced for performance reasons, > not because of the 32-bit range limit. > > It is always possible that I have misunderstood your concern! The problem is that if the next_offset overflows to 0, then after patch 27, offset_dir_open() will record the 0, and later offset_readdir will return directly, while there can be many files. Thanks, Kuai
> On Nov 7, 2024, at 8:19 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2024/11/07 22:41, Chuck Lever 写道: >> On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: >>> Hi, >>> >>> 在 2024/11/06 23:19, Chuck Lever III 写道: >>>> >>>> >>>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: >>>>> >>>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >>>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>>> >>>>>> Fix patch is patch 27, relied patches are from: >>>> >>>> I assume patch 27 is: >>>> >>>> libfs: fix infinite directory reads for offset dir >>>> >>>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ >>>> >>>> I don't think the Maple tree patches are a hard >>>> requirement for this fix. And note that libfs did >>>> not use Maple tree originally because I was told >>>> at that time that Maple tree was not yet mature. >>>> >>>> So, a better approach might be to fit the fix >>>> onto linux-6.6.y while sticking with xarray. >>> >>> The painful part is that using xarray is not acceptable, the offet >>> is just 32 bit and if it overflows, readdir will read nothing. That's >>> why maple_tree has to be used. >> A 32-bit range should be entirely adequate for this usage. >> - The offset allocator wraps when it reaches the maximum, it >> doesn't overflow unless there are actually billions of extant >> entries in the directory, which IMO is not likely. > > Yes, it's not likely, but it's possible, and not hard to trigger for > test. I question whether such a test reflects any real-world workload. Besides, there are a number of other limits that will impact the ability to create that many entries in one directory. The number of inodes in one tmpfs instance is limited, for instance. > And please notice that the offset will increase for each new file, > and file can be removed, while offset stays the same. >> - The offset values are dense, so the directory can use all 2- or >> 4- billion in the 32-bit integer range before wrapping. > > A simple math, if user create and remove 1 file in each seconds, it will > cost about 130 years to overflow. And if user create and remove 1000 > files in each second, it will cost about 1 month to overflow. The question is what happens when there are no more offset values available. xa_alloc_cyclic should fail, and file creation is supposed to fail at that point. If it doesn't, that's a bug that is outside of the use of xarray or Maple. > maple tree use 64 bit value for the offset, which is impossible to > overflow for the rest of our lifes. >> - No-one complained about this limitation when offset_readdir() was >> first merged. The xarray was replaced for performance reasons, >> not because of the 32-bit range limit. >> It is always possible that I have misunderstood your concern! > > The problem is that if the next_offset overflows to 0, then after patch > 27, offset_dir_open() will record the 0, and later offset_readdir will > return directly, while there can be many files. That's a separate bug that has nothing to do with the maximum number of entries one directory can have. Again, you don't need Maple tree to address that. My understanding from Liam is that backporting Maple into v6.6 is just not practical to do. We must explore alternate ways to address these concerns. -- Chuck Lever
Hi, 在 2024/11/08 21:23, Chuck Lever III 写道: > > >> On Nov 7, 2024, at 8:19 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: >> >> Hi, >> >> 在 2024/11/07 22:41, Chuck Lever 写道: >>> On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: >>>> Hi, >>>> >>>> 在 2024/11/06 23:19, Chuck Lever III 写道: >>>>> >>>>> >>>>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: >>>>>> >>>>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >>>>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>>>> >>>>>>> Fix patch is patch 27, relied patches are from: >>>>> >>>>> I assume patch 27 is: >>>>> >>>>> libfs: fix infinite directory reads for offset dir >>>>> >>>>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ >>>>> >>>>> I don't think the Maple tree patches are a hard >>>>> requirement for this fix. And note that libfs did >>>>> not use Maple tree originally because I was told >>>>> at that time that Maple tree was not yet mature. >>>>> >>>>> So, a better approach might be to fit the fix >>>>> onto linux-6.6.y while sticking with xarray. >>>> >>>> The painful part is that using xarray is not acceptable, the offet >>>> is just 32 bit and if it overflows, readdir will read nothing. That's >>>> why maple_tree has to be used. >>> A 32-bit range should be entirely adequate for this usage. >>> - The offset allocator wraps when it reaches the maximum, it >>> doesn't overflow unless there are actually billions of extant >>> entries in the directory, which IMO is not likely. >> >> Yes, it's not likely, but it's possible, and not hard to trigger for >> test. > > I question whether such a test reflects any real-world > workload. > > Besides, there are a number of other limits that will impact > the ability to create that many entries in one directory. > The number of inodes in one tmpfs instance is limited, for > instance. > > >> And please notice that the offset will increase for each new file, >> and file can be removed, while offset stays the same. Did you see the above explanation? files can be removed, you don't have to store that much files to triggger the offset to overflow. >>> - The offset values are dense, so the directory can use all 2- or >>> 4- billion in the 32-bit integer range before wrapping. >> >> A simple math, if user create and remove 1 file in each seconds, it will >> cost about 130 years to overflow. And if user create and remove 1000 >> files in each second, it will cost about 1 month to overflow. > > The question is what happens when there are no more offset > values available. xa_alloc_cyclic should fail, and file > creation is supposed to fail at that point. If it doesn't, > that's a bug that is outside of the use of xarray or Maple. Can you show me the code that xa_alloc_cyclic should fail? At least according to the commets, it will return 1 if the allocation succeeded after wrapping. * Context: Any context. Takes and releases the xa_lock. May sleep if * the @gfp flags permit. * Return: 0 if the allocation succeeded without wrapping. 1 if the * allocation succeeded after wrapping, -ENOMEM if memory could not be * allocated or -EBUSY if there are no free entries in @limit. */ static inline int xa_alloc_cyclic(struct xarray *xa, u32 *id, void *entry, struct xa_limit limit, u32 *next, gfp_t gfp) > > >> maple tree use 64 bit value for the offset, which is impossible to >> overflow for the rest of our lifes. >>> - No-one complained about this limitation when offset_readdir() was >>> first merged. The xarray was replaced for performance reasons, >>> not because of the 32-bit range limit. >>> It is always possible that I have misunderstood your concern! >> >> The problem is that if the next_offset overflows to 0, then after patch >> 27, offset_dir_open() will record the 0, and later offset_readdir will >> return directly, while there can be many files. > > That's a separate bug that has nothing to do with the maximum > number of entries one directory can have. Again, you don't > need Maple tree to address that. > > My understanding from Liam is that backporting Maple into > v6.6 is just not practical to do. We must explore alternate > ways to address these concerns. Like I said, I'll just give up for this cve for v6.6. > > > -- > Chuck Lever > >
> On Nov 8, 2024, at 8:30 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: > > Hi, > > 在 2024/11/08 21:23, Chuck Lever III 写道: >>> On Nov 7, 2024, at 8:19 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: >>> >>> Hi, >>> >>> 在 2024/11/07 22:41, Chuck Lever 写道: >>>> On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: >>>>> Hi, >>>>> >>>>> 在 2024/11/06 23:19, Chuck Lever III 写道: >>>>>> >>>>>> >>>>>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: >>>>>>> >>>>>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >>>>>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>>>>> >>>>>>>> Fix patch is patch 27, relied patches are from: >>>>>> >>>>>> I assume patch 27 is: >>>>>> >>>>>> libfs: fix infinite directory reads for offset dir >>>>>> >>>>>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ >>>>>> >>>>>> I don't think the Maple tree patches are a hard >>>>>> requirement for this fix. And note that libfs did >>>>>> not use Maple tree originally because I was told >>>>>> at that time that Maple tree was not yet mature. >>>>>> >>>>>> So, a better approach might be to fit the fix >>>>>> onto linux-6.6.y while sticking with xarray. >>>>> >>>>> The painful part is that using xarray is not acceptable, the offet >>>>> is just 32 bit and if it overflows, readdir will read nothing. That's >>>>> why maple_tree has to be used. >>>> A 32-bit range should be entirely adequate for this usage. >>>> - The offset allocator wraps when it reaches the maximum, it >>>> doesn't overflow unless there are actually billions of extant >>>> entries in the directory, which IMO is not likely. >>> >>> Yes, it's not likely, but it's possible, and not hard to trigger for >>> test. >> I question whether such a test reflects any real-world >> workload. >> Besides, there are a number of other limits that will impact >> the ability to create that many entries in one directory. >> The number of inodes in one tmpfs instance is limited, for >> instance. >>> And please notice that the offset will increase for each new file, >>> and file can be removed, while offset stays the same. > > Did you see the above explanation? files can be removed, you don't have > to store that much files to trigger the offset to overflow. >>>> - The offset values are dense, so the directory can use all 2- or >>>> 4- billion in the 32-bit integer range before wrapping. >>> >>> A simple math, if user create and remove 1 file in each seconds, it will >>> cost about 130 years to overflow. And if user create and remove 1000 >>> files in each second, it will cost about 1 month to overflow. > The problem is that if the next_offset overflows to 0, then after patch > 27, offset_dir_open() will record the 0, and later offset_readdir will > return directly, while there can be many files. Let me revisit this for a moment. The xa_alloc_cyclic() call in simple_offset_add() has a range limit argument of 2 - U32_MAX. So I'm not clear how an overflow (or, more precisely, the reuse of an offset value) would result in a "0" offset being recorded. The range limit prevents the use of 0 and 1. A "0" offset value would be a bug, I agree, but I don't see how that can happen. >> The question is what happens when there are no more offset >> values available. xa_alloc_cyclic should fail, and file >> creation is supposed to fail at that point. If it doesn't, >> that's a bug that is outside of the use of xarray or Maple. > > Can you show me the code that xa_alloc_cyclic should fail? At least > according to the commets, it will return 1 if the allocation succeeded > after wrapping. > > * Context: Any context. Takes and releases the xa_lock. May sleep if > * the @gfp flags permit. > * Return: 0 if the allocation succeeded without wrapping. 1 if the > * allocation succeeded after wrapping, -ENOMEM if memory could not be > * allocated or -EBUSY if there are no free entries in @limit. > */ > static inline int xa_alloc_cyclic(struct xarray *xa, u32 *id, void *entry, > struct xa_limit limit, u32 *next, gfp_t gfp) I recall (dimly) that directory entry offset value re-use is acceptable and preferred, so I think ignoring a "1" return value from xa_alloc_cyclic() is OK. If there are no unused offset values available, it will return -EBUSY, and file creation will fail. Perhaps Christian or Al can chime in here on whether directory entry offset value re-use is indeed expected to be acceptable. Further, my understanding is that: https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ fixes a rename issue that results in an infinite loop, and that's the (only) issue that underlies CVE-2024-46701. You are suggesting that there are other overflow problems with the xarray-based simple_offset implementation. If I can confirm them, then I can get these fixed in v6.6. But so far, I'm not sure I completely understand these other failure modes. Are you suggesting that the above fix /introduces/ the 0 offset problem? -- Chuck Lever
Hi, 在 2024/11/10 0:58, Chuck Lever III 写道: > > >> On Nov 8, 2024, at 8:30 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: >> >> Hi, >> >> 在 2024/11/08 21:23, Chuck Lever III 写道: >>>> On Nov 7, 2024, at 8:19 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: >>>> >>>> Hi, >>>> >>>> 在 2024/11/07 22:41, Chuck Lever 写道: >>>>> On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: >>>>>> Hi, >>>>>> >>>>>> 在 2024/11/06 23:19, Chuck Lever III 写道: >>>>>>> >>>>>>> >>>>>>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: >>>>>>>> >>>>>>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >>>>>>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>>>>>> >>>>>>>>> Fix patch is patch 27, relied patches are from: >>>>>>> >>>>>>> I assume patch 27 is: >>>>>>> >>>>>>> libfs: fix infinite directory reads for offset dir >>>>>>> >>>>>>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ >>>>>>> >>>>>>> I don't think the Maple tree patches are a hard >>>>>>> requirement for this fix. And note that libfs did >>>>>>> not use Maple tree originally because I was told >>>>>>> at that time that Maple tree was not yet mature. >>>>>>> >>>>>>> So, a better approach might be to fit the fix >>>>>>> onto linux-6.6.y while sticking with xarray. >>>>>> >>>>>> The painful part is that using xarray is not acceptable, the offet >>>>>> is just 32 bit and if it overflows, readdir will read nothing. That's >>>>>> why maple_tree has to be used. >>>>> A 32-bit range should be entirely adequate for this usage. >>>>> - The offset allocator wraps when it reaches the maximum, it >>>>> doesn't overflow unless there are actually billions of extant >>>>> entries in the directory, which IMO is not likely. >>>> >>>> Yes, it's not likely, but it's possible, and not hard to trigger for >>>> test. >>> I question whether such a test reflects any real-world >>> workload. >>> Besides, there are a number of other limits that will impact >>> the ability to create that many entries in one directory. >>> The number of inodes in one tmpfs instance is limited, for >>> instance. >>>> And please notice that the offset will increase for each new file, >>>> and file can be removed, while offset stays the same. >> >> Did you see the above explanation? files can be removed, you don't have >> to store that much files to trigger the offset to overflow. >>>>> - The offset values are dense, so the directory can use all 2- or >>>>> 4- billion in the 32-bit integer range before wrapping. >>>> >>>> A simple math, if user create and remove 1 file in each seconds, it will >>>> cost about 130 years to overflow. And if user create and remove 1000 >>>> files in each second, it will cost about 1 month to overflow. > >> The problem is that if the next_offset overflows to 0, then after patch >> 27, offset_dir_open() will record the 0, and later offset_readdir will >> return directly, while there can be many files. > > > Let me revisit this for a moment. The xa_alloc_cyclic() call > in simple_offset_add() has a range limit argument of 2 - U32_MAX. > > So I'm not clear how an overflow (or, more precisely, the > reuse of an offset value) would result in a "0" offset being > recorded. The range limit prevents the use of 0 and 1. > > A "0" offset value would be a bug, I agree, but I don't see > how that can happen. > > >>> The question is what happens when there are no more offset >>> values available. xa_alloc_cyclic should fail, and file >>> creation is supposed to fail at that point. If it doesn't, >>> that's a bug that is outside of the use of xarray or Maple. >> >> Can you show me the code that xa_alloc_cyclic should fail? At least >> according to the commets, it will return 1 if the allocation succeeded >> after wrapping. >> >> * Context: Any context. Takes and releases the xa_lock. May sleep if >> * the @gfp flags permit. >> * Return: 0 if the allocation succeeded without wrapping. 1 if the >> * allocation succeeded after wrapping, -ENOMEM if memory could not be >> * allocated or -EBUSY if there are no free entries in @limit. >> */ >> static inline int xa_alloc_cyclic(struct xarray *xa, u32 *id, void *entry, >> struct xa_limit limit, u32 *next, gfp_t gfp) > > I recall (dimly) that directory entry offset value re-use > is acceptable and preferred, so I think ignoring a "1" > return value from xa_alloc_cyclic() is OK. If there are > no unused offset values available, it will return -EBUSY, > and file creation will fail. > > Perhaps Christian or Al can chime in here on whether > directory entry offset value re-use is indeed expected > to be acceptable. This can't be acceptable in this case, the reason is straightforward, it will mess readdir, and this is mucth more serious than the cve itself. Thanks, Kuai > > Further, my understanding is that: > > https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ > > fixes a rename issue that results in an infinite loop, > and that's the (only) issue that underlies CVE-2024-46701. > > You are suggesting that there are other overflow problems > with the xarray-based simple_offset implementation. If I > can confirm them, then I can get these fixed in v6.6. But > so far, I'm not sure I completely understand these other > failure modes. > > Are you suggesting that the above fix /introduces/ the > 0 offset problem? > > -- > Chuck Lever > >
* Chuck Lever III <chuck.lever@oracle.com> [241108 08:23]: > > > > On Nov 7, 2024, at 8:19 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: > > > > Hi, > > > > 在 2024/11/07 22:41, Chuck Lever 写道: > >> On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: > >>> Hi, > >>> > >>> 在 2024/11/06 23:19, Chuck Lever III 写道: > >>>> > >>>> > >>>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: > >>>>> > >>>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: > >>>>>> From: Yu Kuai <yukuai3@huawei.com> > >>>>>> > >>>>>> Fix patch is patch 27, relied patches are from: > >>>> > >>>> I assume patch 27 is: > >>>> > >>>> libfs: fix infinite directory reads for offset dir > >>>> > >>>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ > >>>> > >>>> I don't think the Maple tree patches are a hard > >>>> requirement for this fix. And note that libfs did > >>>> not use Maple tree originally because I was told > >>>> at that time that Maple tree was not yet mature. > >>>> > >>>> So, a better approach might be to fit the fix > >>>> onto linux-6.6.y while sticking with xarray. > >>> > >>> The painful part is that using xarray is not acceptable, the offet > >>> is just 32 bit and if it overflows, readdir will read nothing. That's > >>> why maple_tree has to be used. > >> A 32-bit range should be entirely adequate for this usage. > >> - The offset allocator wraps when it reaches the maximum, it > >> doesn't overflow unless there are actually billions of extant > >> entries in the directory, which IMO is not likely. > > > > Yes, it's not likely, but it's possible, and not hard to trigger for > > test. > > I question whether such a test reflects any real-world > workload. > > Besides, there are a number of other limits that will impact > the ability to create that many entries in one directory. > The number of inodes in one tmpfs instance is limited, for > instance. > > > > And please notice that the offset will increase for each new file, > > and file can be removed, while offset stays the same. > >> - The offset values are dense, so the directory can use all 2- or > >> 4- billion in the 32-bit integer range before wrapping. > > > > A simple math, if user create and remove 1 file in each seconds, it will > > cost about 130 years to overflow. And if user create and remove 1000 > > files in each second, it will cost about 1 month to overflow. > > The question is what happens when there are no more offset > values available. xa_alloc_cyclic should fail, and file > creation is supposed to fail at that point. If it doesn't, > that's a bug that is outside of the use of xarray or Maple. > > > > maple tree use 64 bit value for the offset, which is impossible to > > overflow for the rest of our lifes. > >> - No-one complained about this limitation when offset_readdir() was > >> first merged. The xarray was replaced for performance reasons, > >> not because of the 32-bit range limit. > >> It is always possible that I have misunderstood your concern! > > > > The problem is that if the next_offset overflows to 0, then after patch > > 27, offset_dir_open() will record the 0, and later offset_readdir will > > return directly, while there can be many files. > > That's a separate bug that has nothing to do with the maximum > number of entries one directory can have. Again, you don't > need Maple tree to address that. > > My understanding from Liam is that backporting Maple into > v6.6 is just not practical to do. We must explore alternate > ways to address these concerns. > The tree itself is in v6.6, but the evolution of the tree to fit the needs of this and other subsystems isn't something that would be well tested. This is really backporting features and that's not the point of stable. I think this is what Lorenzo was saying about changing your approach, we can't backport 28 patches to fix this when it isn't needed. Thanks, Liam
Hi, 在 2024/11/09 1:03, Liam R. Howlett 写道: > * Chuck Lever III <chuck.lever@oracle.com> [241108 08:23]: >> >> >>> On Nov 7, 2024, at 8:19 PM, Yu Kuai <yukuai1@huaweicloud.com> wrote: >>> >>> Hi, >>> >>> 在 2024/11/07 22:41, Chuck Lever 写道: >>>> On Thu, Nov 07, 2024 at 08:57:23AM +0800, Yu Kuai wrote: >>>>> Hi, >>>>> >>>>> 在 2024/11/06 23:19, Chuck Lever III 写道: >>>>>> >>>>>> >>>>>>> On Nov 6, 2024, at 1:16 AM, Greg KH <gregkh@linuxfoundation.org> wrote: >>>>>>> >>>>>>> On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: >>>>>>>> From: Yu Kuai <yukuai3@huawei.com> >>>>>>>> >>>>>>>> Fix patch is patch 27, relied patches are from: >>>>>> >>>>>> I assume patch 27 is: >>>>>> >>>>>> libfs: fix infinite directory reads for offset dir >>>>>> >>>>>> https://lore.kernel.org/stable/20241024132225.2271667-12-yukuai1@huaweicloud.com/ >>>>>> >>>>>> I don't think the Maple tree patches are a hard >>>>>> requirement for this fix. And note that libfs did >>>>>> not use Maple tree originally because I was told >>>>>> at that time that Maple tree was not yet mature. >>>>>> >>>>>> So, a better approach might be to fit the fix >>>>>> onto linux-6.6.y while sticking with xarray. >>>>> >>>>> The painful part is that using xarray is not acceptable, the offet >>>>> is just 32 bit and if it overflows, readdir will read nothing. That's >>>>> why maple_tree has to be used. >>>> A 32-bit range should be entirely adequate for this usage. >>>> - The offset allocator wraps when it reaches the maximum, it >>>> doesn't overflow unless there are actually billions of extant >>>> entries in the directory, which IMO is not likely. >>> >>> Yes, it's not likely, but it's possible, and not hard to trigger for >>> test. >> >> I question whether such a test reflects any real-world >> workload. >> >> Besides, there are a number of other limits that will impact >> the ability to create that many entries in one directory. >> The number of inodes in one tmpfs instance is limited, for >> instance. >> >> >>> And please notice that the offset will increase for each new file, >>> and file can be removed, while offset stays the same. >>>> - The offset values are dense, so the directory can use all 2- or >>>> 4- billion in the 32-bit integer range before wrapping. >>> >>> A simple math, if user create and remove 1 file in each seconds, it will >>> cost about 130 years to overflow. And if user create and remove 1000 >>> files in each second, it will cost about 1 month to overflow. >> >> The question is what happens when there are no more offset >> values available. xa_alloc_cyclic should fail, and file >> creation is supposed to fail at that point. If it doesn't, >> that's a bug that is outside of the use of xarray or Maple. >> >> >>> maple tree use 64 bit value for the offset, which is impossible to >>> overflow for the rest of our lifes. >>>> - No-one complained about this limitation when offset_readdir() was >>>> first merged. The xarray was replaced for performance reasons, >>>> not because of the 32-bit range limit. >>>> It is always possible that I have misunderstood your concern! >>> >>> The problem is that if the next_offset overflows to 0, then after patch >>> 27, offset_dir_open() will record the 0, and later offset_readdir will >>> return directly, while there can be many files. >> >> That's a separate bug that has nothing to do with the maximum >> number of entries one directory can have. Again, you don't >> need Maple tree to address that. >> >> My understanding from Liam is that backporting Maple into >> v6.6 is just not practical to do. We must explore alternate >> ways to address these concerns. >> > > The tree itself is in v6.6, but the evolution of the tree to fit the > needs of this and other subsystems isn't something that would be well > tested. This is really backporting features and that's not the point of > stable. Of course. > > I think this is what Lorenzo was saying about changing your approach, we > can't backport 28 patches to fix this when it isn't needed. I don't have other approach now, so I'll not follow on fixing this cve. I'll be great if someone has a beeter apporch. :) Thanks, Kuai > > Thanks, > Liam > > . >
On Wed, 2024-11-06 at 15:19 +0000, Chuck Lever III wrote: > This is the first I've heard of this CVE. It > would help if the patch authors got some > notification when these are filed. Greg did it; it came from the kernel CNA: https://www.cve.org/CVERecord?id=CVE-2024-46701 The way it seems to work is that this is simply a wrapper for the upstream commit: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64a7ce76fb901bf9f9c36cf5d681328fc0fd4b5a Which is what appears as the last stable reference. I assume someone investigated and added the vulnerable kernel details. I think the theory is that since you reviewed the original upstream patch, stable just takes care of the backports and CVE management of the existing fix through the normal stable process. James
* Greg KH <gregkh@linuxfoundation.org> [241106 01:16]: > On Thu, Oct 24, 2024 at 09:19:41PM +0800, Yu Kuai wrote: > > From: Yu Kuai <yukuai3@huawei.com> > > > > Fix patch is patch 27, relied patches are from: > > > > - patches from set [1] to add helpers to maple_tree, the last patch to > > improve fork() performance is not backported; > > So things slowed down? Fork got faster in modern kernels. The backport contains helpers as they are dependencies for later patches. > > > - patches from set [2] to change maple_tree, and follow up fixes; > > - patches from set [3] to convert offset_ctx from xarray to maple_tree; > > > > Please notice that I'm not an expert in this area, and I'm afraid to > > make manual changes. That's why patch 16 revert the commit that is > > different from mainline and will cause conflict backporting new patches. > > patch 28 pick the original mainline patch again. You reverted and forward ported a patch but didn't Cc the author of the patch you changed. That is probably one of the most important Cc's to have on this list. By the way, that fix is already in 6.6 > > > > (And this is what we did to fix the CVE in downstream kernels). > > > > [1] https://lore.kernel.org/all/20231027033845.90608-1-zhangpeng.00@bytedance.com/ > > [2] https://lore.kernel.org/all/20231101171629.3612299-2-Liam.Howlett@oracle.com/T/ > > [3] https://lore.kernel.org/all/170820083431.6328.16233178852085891453.stgit@91.116.238.104.host.secureserver.net/ > > This series looks rough. I want to have the maintainers of these > files/subsystems to ack this before being able to take them. The entire backporting of all of this to fix an issue is extreme, and although it will solve the issue, you end up running something very different than 6.6 for a single fix. Looking at the details of the cve, it seems very odd. This is an issue in libfs and the affected kernel is 6.6 to 6.10.7. It then goes into details of how the maple tree allows this - but 6.6 doesn't use the maple tree in libfs so either the patch needs to be backported to an older stable (6.6) or the CVE is wrong. Almost all of these patches are to backport using the maple tree in libfs and that should not be done. I don't know if the CVE is incorrectly labeled or if the patch wasn't backported far enough because I was not involved in the discussion of this CVE - which seems like an oversight if this is specifically caused by the maple tree? The patch in question is 64a7ce76fb90 ("libfs: fix infinite directory reads for offset dir"). I think we just need the one? To be clear: - Do not take this serioes - Someone in libfs land should respond stating if the fix above needs to be backported. Thanks, Liam
© 2016 - 2024 Red Hat, Inc.