[PATCH] fs: Prevent panic from NULL dereference in alloc_fs_context() during do_exit()

Yunseong Kim posted 1 patch 7 months, 2 weeks ago
fs/fs_context.c | 3 +++
1 file changed, 3 insertions(+)
[PATCH] fs: Prevent panic from NULL dereference in alloc_fs_context() during do_exit()
Posted by Yunseong Kim 7 months, 2 weeks ago
The function alloc_fs_context() assumes that current->nsproxy and its
net_ns field are valid. However, this assumption can be violated in
cases such as task teardown during do_exit(), where current->nsproxy can
be NULL or already cleared.

This issue was triggered during stress-ng's kernel-coverage.sh testing,
Since alloc_fs_context() can be invoked in various contexts — including
from asynchronous or teardown paths like do_exit() — it's difficult to
guarantee that its input arguments are always valid.

A follow-up patch will improve the granularity of this fix by moving the
check closer to the actual mount trigger(e.g., in efivarfs_pm_notify()).

Observed on Apple M2 (fedora 42 asahi remix) during stress-ng-dev:

[  137.769615] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
[  137.769691] Mem abort info:
[  137.769693]   ESR = 0x0000000096000007
[  137.769694]   EC = 0x25: DABT (current EL), IL = 32 bits
[  137.769695]   SET = 0, FnV = 0
[  137.769696]   EA = 0, S1PTW = 0
[  137.769697]   FSC = 0x07: level 3 translation fault
[  137.769698] Data abort info:
[  137.769699]   ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000
[  137.769700]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  137.769700]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  137.769702] user pgtable: 16k pages, 48-bit VAs, pgdp=0000000810df28b0
[  137.769703] [0000000000000028] pgd=08000008ace3c403, p4d=08000008ace3c403, pud=08000008f7658403, pmd=08000008f765c403, pte=0000000000000000
[  137.769743] Internal error: Oops: 0000000096000007 [#1] SMP
[  137.769745] Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device uas usb_storage nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr uhid bnep brcmfmac_wcc sunrpc binfmt_misc brcmfmac brcmutil cfg80211 hci_bcm4377 bluetooth mmc_core rfkill aop_als aop_las industrialio macsmc_hid snd_soc_aop apple_isp videobuf2_dma_sg ofpart videobuf2_memops videobuf2_v4l2 spi_nor mtd videodev snd_soc_cs42l84 snd_soc_tas2764 videobuf2_common snd_soc_apple_mca mc apple_soc_cpufreq snd_soc_macaudio snd_soc_core snd_compress ac97_bus leds_pwm joydev loop dm_multipath nfnetlink zram lz4hc_compress lz4_compress hid_apple nvmem_spmi_mfd tps6598x macsmc_hwmon macsmc_reboot dockchannel_hid macsmc_power rtc_macsmc gpio_macsmc simple_mfd_spmi polyval_ce polyval_generic ghash_ce sha3_ce appledrm dwc3 phy_apple_atc sha512_ce
[  137.769778]  apple_dcp typec sha512_arm64 aop apple_dockchannel ulpi macsmc_rtkit mux_core apple_wdt macsmc spmi_apple_controller drm_dma_helper nvmem_apple_efuses udc_core apple_rtkit_helper snd_pcm_dmaengine snd_pcm asahi spi_apple clk_apple_nco snd_timer snd i2c_pasemi_platform pwm_apple pinctrl_apple_gpio apple_admac soundcore i2c_pasemi_core apple_dart xhci_plat_hcd vfat fat nvme_apple apple_sart nvme_core nvme_auth scsi_dh_rdac scsi_dh_emc scsi_dh_alua fuse i2c_dev
[  137.769796] CPU: 6 UID: 0 PID: 3632 Comm: stress-ng-dev Kdump: loaded Tainted: G S      W          6.14.2-401.asahi.fc42.aarch64+16k-debug #1
[  137.769799] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN
[  137.769799] Hardware name: Apple MacBook Air (13-inch, M2, 2022) (DT)
[  137.769801] pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[  137.769802] pc : alloc_fs_context+0x98/0x2e8
[  137.769807] lr : alloc_fs_context+0x70/0x2e8
[  137.769808] sp : ffff80009b5479b0
[  137.769809] x29: ffff80009b5479b0 x28: ffff5a120e960000 x27: 0000000000000000
[  137.769811] x26: 0000000000000002 x25: ffffa55a71473000 x24: 0000000000000000
[  137.769812] x23: 0000000000000000 x22: ffffa55a7170ddc8 x21: 0000000000000000
[  137.769814] x20: 0000000000000000 x19: ffff5a1284390200 x18: ffffa55a73ade328
[  137.769815] x17: 0000000000000000 x16: 0000000000000000 x15: ffffa55a73951c70
[  137.769816] x14: 0000000000000002 x13: 000000000003bb2a x12: 0000000000000002
[  137.769818] x11: 000000000003bb28 x10: ffffa55a71473000 x9 : ffffa55a6e056b8c
[  137.769819] x8 : ffff5a120e960000 x7 : ffffa55a714727b0 x6 : 0000000000000006
[  137.769821] x5 : 0000000000000040 x4 : 0000000000000001 x3 : ffff5a17dd5bfa28
[  137.769822] x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000000000
[  137.769824] Call trace:
[  137.769825]  alloc_fs_context+0x98/0x2e8 (P)
[  137.769827]  fs_context_for_mount+0x28/0x40
[  137.769828]  vfs_kern_mount.part.0+0x28/0xe8
[  137.769830]  vfs_kern_mount+0x1c/0x38
[  137.769831]  efivarfs_pm_notify+0xf8/0x2f8
[  137.769834]  notifier_call_chain+0xb4/0x220
[  137.769836]  blocking_notifier_call_chain+0x4c/0x78
[  137.769837]  pm_notifier_call_chain+0x2c/0x40
[  137.769840]  snapshot_release+0x60/0xa0
[  137.769841]  __fput+0xf8/0x310
[  137.769843]  ____fput+0x1c/0x30
[  137.769845]  task_work_run+0x88/0x120
[  137.769846]  do_exit+0x19c/0x450
[  137.769848]  do_group_exit+0x38/0xc0
[  137.769849]  __arm64_sys_exit_group+0x20/0x28
[  137.769851]  invoke_syscall.constprop.0+0x64/0xe8
[  137.769853]  el0_svc_common.constprop.0+0xc0/0xe8
[  137.769854]  do_el0_svc+0x24/0x38
[  137.769855]  el0_svc+0x54/0x230
[  137.769857]  el0t_64_sync_handler+0x10c/0x138
[  137.769859]  el0t_64_sync+0x1bc/0x1c0
[  137.769861] Code: f821001f f9006e60 d5384100 f9463c00 (f9401417)
[  137.769862] ---[ end trace 0000000000000000 ]---
[  137.769864] Fixing recursive fault but reboot is needed!
[  137.769866] check_preemption_disabled: 35 callbacks suppressed
[  137.769867] BUG: using smp_processor_id() in preemptible [00000000] code: stress-ng-dev/3632
[  137.769869] caller is debug_smp_processor_id+0x20/0x38
[  137.769871] CPU: 6 UID: 0 PID: 3632 Comm: stress-ng-dev Kdump: loaded Tainted: G S    D W          6.14.2-401.asahi.fc42.aarch64+16k-debug #1
[  137.769872] Tainted: [S]=CPU_OUT_OF_SPEC, [D]=DIE, [W]=WARN
[  137.769873] Hardware name: Apple MacBook Air (13-inch, M2, 2022) (DT)
[  137.769873] Call trace:
[  137.769873]  show_stack+0x30/0x98 (C)
[  137.769874]  dump_stack_lvl+0xa8/0xe8
[  137.769876]  dump_stack+0x18/0x34
[  137.769876]  check_preemption_disabled+0x120/0x128
[  137.769878]  debug_smp_processor_id+0x20/0x38
[  137.769879]  __schedule+0x4c/0x718
[  137.769880]  do_task_dead+0x58/0x68
[  137.769882]  make_task_dead+0xe8/0x150
[  137.769883]  die+0x210/0x258
[  137.769884]  die_kernel_fault+0x1ac/0x1c8
[  137.769885]  __do_kernel_fault+0x1cc/0x1d8
[  137.769886]  do_page_fault+0x2b4/0x9f0
[  137.769888]  do_translation_fault+0x54/0xf0
[  137.769889]  do_mem_abort+0x48/0xa0
[  137.769890]  el1_abort+0x58/0xc8
[  137.769891]  el1h_64_sync_handler+0xf0/0x120
[  137.769892]  el1h_64_sync+0x84/0x88
[  137.769892]  alloc_fs_context+0x98/0x2e8 (P)
[  137.769893]  fs_context_for_mount+0x28/0x40
[  137.769894]  vfs_kern_mount.part.0+0x28/0xe8
[  137.769895]  vfs_kern_mount+0x1c/0x38
[  137.769896]  efivarfs_pm_notify+0xf8/0x2f8
[  137.769897]  notifier_call_chain+0xb4/0x220
[  137.769897]  blocking_notifier_call_chain+0x4c/0x78
[  137.769898]  pm_notifier_call_chain+0x2c/0x40
[  137.769899]  snapshot_release+0x60/0xa0
[  137.769900]  __fput+0xf8/0x310
[  137.769901]  ____fput+0x1c/0x30
[  137.769902]  task_work_run+0x88/0x120
[  137.769903]  do_exit+0x19c/0x450
[  137.769904]  do_group_exit+0x38/0xc0
[  137.769905]  __arm64_sys_exit_group+0x20/0x28
[  137.769906]  invoke_syscall.constprop.0+0x64/0xe8
[  137.769907]  el0_svc_common.constprop.0+0xc0/0xe8
[  137.769907]  do_el0_svc+0x24/0x38
[  137.769908]  el0_svc+0x54/0x230
[  137.769909]  el0t_64_sync_handler+0x10c/0x138
[  137.769910]  el0t_64_sync+0x1bc/0x1c0

Observed on Fedora 42 on Apple Virtualization during the same test:

[  473.893249] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
[  473.893253] Mem abort info:
[  473.893256]   ESR = 0x0000000096000004
[  473.893258]   EC = 0x25: DABT (current EL), IL = 32 bits
[  473.893262]   SET = 0, FnV = 0
[  473.893264]   EA = 0, S1PTW = 0
[  473.893267]   FSC = 0x04: level 0 translation fault
[  473.893270] Data abort info:
[  473.893272]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[  473.893275]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[  473.893278]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[  473.893282] user pgtable: 4k pages, 48-bit VAs, pgdp=000000027468e000
[  473.893285] [0000000000000028] pgd=0000000000000000, p4d=0000000000000000
[  473.893294] Internal error: Oops: 0000000096000004 [#1]  SMP
[  473.893298] Modules linked in: vfio_iommu_type1 vfio cuse vhost_net tun vhost vhost_iotlb tap uhid overlay isofs uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables qrtr sunrpc virtio_snd snd_seq snd_seq_device snd_pcm snd_timer snd virtio_net net_failover failover virtio_balloon soundcore vfat fat joydev loop nfnetlink vsock_loopback vmw_vsock_virtio_transport_common zram vmw_vsock_vmci_transport lz4hc_compress vmw_vmci lz4_compress vsock uas polyval_ce polyval_generic ghash_ce sha3_ce usb_storage sha512_ce virtio_gpu sha512_arm64 virtio_dma_buf apple_mfi_fastcharge fuse
[  473.893383] CPU: 2 UID: 0 PID: 4496 Comm: stress-ng-dev Kdump: loaded Not tainted 6.15.0-rc4+ #1 PREEMPT(voluntary)
[  473.893387] Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2075.101.2.0.0 03/12/2025
[  473.893390] pstate: 81400005 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[  473.893394] pc : alloc_fs_context+0xc4/0x460
[  473.893401] lr : alloc_fs_context+0xa0/0x460
[  473.893405] sp : ffff80008f01b7d0
[  473.893406] x29: ffff80008f01b7d0 x28: ffff800084754520 x27: 0000000000000006
[  473.893411] x26: 0000000000000004 x25: ffff0000cd714a28 x24: ffff0000ca976540
[  473.893416] x23: ffff000232890000 x22: ffff80008452aa80 x21: 0000000000000000
[  473.893420] x20: 0000000000000000 x19: ffff0001215ab200 x18: 0000000000000001
[  473.893425] x17: ffff00015083bb00 x16: 0f043b79c558efdb x15: 0000000000000000
[  473.893429] x14: 0000000000000063 x13: 657461747320656c x12: 6261697261762067
[  473.893434] x11: 0000000000000000 x10: 0000000000ff0100 x9 : 0000000000000000
[  473.893438] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000000
[  473.893442] x5 : fffffd7fbf054dd8 x4 : ffff00032d538da8 x3 : ffff00032d538da8
[  473.893447] x2 : ffff80008f01b440 x1 : ffff800082a4aeb8 x0 : 0000000000000000
[  473.893452] Call trace:
[  473.893453]  alloc_fs_context+0xc4/0x460 (P)
[  473.893458]  fs_context_for_mount+0x40/0x58
[  473.893462]  vfs_kern_mount+0x44/0x158
[  473.893468]  efivarfs_pm_notify+0x124/0x320
[  473.893472]  notifier_call_chain+0x11c/0x300
[  473.893476]  blocking_notifier_call_chain+0x60/0x98
[  473.893480]  pm_notifier_call_chain+0x38/0x50
[  473.893484]  snapshot_release+0x9c/0xc0
[  473.893489]  __fput+0x1c4/0x4f0
[  473.893494]  ____fput+0x2c/0x70
[  473.893497]  task_work_run+0x100/0x150
[  473.893501]  do_exit+0x2e4/0xfb8
[  473.893506]  do_group_exit+0xd8/0xe0
[  473.893511]  __arm64_sys_exit_group+0x24/0x30
[  473.893515]  invoke_syscall+0x90/0x180
[  473.893520]  el0_svc_common+0x140/0x178
[  473.893523]  do_el0_svc+0x38/0x50
[  473.893527]  el0_svc+0x58/0x158
[  473.893533]  el0t_64_sync_handler+0x78/0x108
[  473.893538]  el0t_64_sync+0x1bc/0x1c0
[  473.893543] Code: 140000ab 97ebb69c f9006e78 f9463ee8 (f9401519)
[  473.893545] ---[ end trace 0000000000000000 ]---
[  473.893600] Fixing recursive fault but reboot is needed!
[  473.893604] check_preemption_disabled: 40 callbacks suppressed

Tested-on: Apple M2 (fedora 42 asahi remix, 16k pages)
Tested-on: Fedora 42 on Apple Virtualization Generic Platform
Signed-off-by: Yunseong Kim <ysk@kzalloc.com>
---
 fs/fs_context.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/fs_context.c b/fs/fs_context.c
index 582d33e81117..529de43b8b5e 100644
--- a/fs/fs_context.c
+++ b/fs/fs_context.c
@@ -282,6 +282,9 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
 	struct fs_context *fc;
 	int ret = -ENOMEM;
 
+	if (!current->nsproxy || !current->nsproxy->net_ns)
+		return ERR_PTR(-EINVAL);
+
 	fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
 	if (!fc)
 		return ERR_PTR(-ENOMEM);
-- 
2.49.0
Re: [PATCH] fs: Prevent panic from NULL dereference in alloc_fs_context() during do_exit()
Posted by Al Viro 7 months, 1 week ago
On Tue, May 06, 2025 at 05:38:02AM +0900, Yunseong Kim wrote:
> The function alloc_fs_context() assumes that current->nsproxy and its
> net_ns field are valid. However, this assumption can be violated in
> cases such as task teardown during do_exit(), where current->nsproxy can
> be NULL or already cleared.
> 
> This issue was triggered during stress-ng's kernel-coverage.sh testing,
> Since alloc_fs_context() can be invoked in various contexts — including
> from asynchronous or teardown paths like do_exit() — it's difficult to
> guarantee that its input arguments are always valid.
> 
> A follow-up patch will improve the granularity of this fix by moving the
> check closer to the actual mount trigger(e.g., in efivarfs_pm_notify()).

UGH.

> diff --git a/fs/fs_context.c b/fs/fs_context.c
> index 582d33e81117..529de43b8b5e 100644
> --- a/fs/fs_context.c
> +++ b/fs/fs_context.c
> @@ -282,6 +282,9 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
>  	struct fs_context *fc;
>  	int ret = -ENOMEM;
>  
> +	if (!current->nsproxy || !current->nsproxy->net_ns)
> +		return ERR_PTR(-EINVAL);
> +
>  	fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
>  	if (!fc)
>  		return ERR_PTR(-ENOMEM);

That might paper over the oops, but I very much doubt that this will be
a correct fix...  Note that in efivarfs_pm_notify() we have other
fun issues when run from such context - have task_work_add() fail in
fput() and if delayed_fput() runs right afterwards and
        efivar_init(efivarfs_check_missing, sfi->sb, false);
in there might end up with UAF...
Re: [PATCH] fs: Prevent panic from NULL dereference in alloc_fs_context() during do_exit()
Posted by Christian Brauner 7 months, 1 week ago
On Mon, May 05, 2025 at 11:36:15PM +0100, Al Viro wrote:
> On Tue, May 06, 2025 at 05:38:02AM +0900, Yunseong Kim wrote:
> > The function alloc_fs_context() assumes that current->nsproxy and its
> > net_ns field are valid. However, this assumption can be violated in
> > cases such as task teardown during do_exit(), where current->nsproxy can
> > be NULL or already cleared.
> > 
> > This issue was triggered during stress-ng's kernel-coverage.sh testing,
> > Since alloc_fs_context() can be invoked in various contexts — including
> > from asynchronous or teardown paths like do_exit() — it's difficult to
> > guarantee that its input arguments are always valid.
> > 
> > A follow-up patch will improve the granularity of this fix by moving the
> > check closer to the actual mount trigger(e.g., in efivarfs_pm_notify()).
> 
> UGH.
> 
> > diff --git a/fs/fs_context.c b/fs/fs_context.c
> > index 582d33e81117..529de43b8b5e 100644
> > --- a/fs/fs_context.c
> > +++ b/fs/fs_context.c
> > @@ -282,6 +282,9 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
> >  	struct fs_context *fc;
> >  	int ret = -ENOMEM;
> >  
> > +	if (!current->nsproxy || !current->nsproxy->net_ns)
> > +		return ERR_PTR(-EINVAL);
> > +
> >  	fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
> >  	if (!fc)
> >  		return ERR_PTR(-ENOMEM);
> 
> That might paper over the oops, but I very much doubt that this will be
> a correct fix...  Note that in efivarfs_pm_notify() we have other
> fun issues when run from such context - have task_work_add() fail in
> fput() and if delayed_fput() runs right afterwards and
>         efivar_init(efivarfs_check_missing, sfi->sb, false);
> in there might end up with UAF...

We've already accepted a patch that removes the need for
vfs_kern_mount() from efivarfs completely.
Re: [PATCH] fs: Prevent panic from NULL dereference in alloc_fs_context() during do_exit()
Posted by Yunseong Kim 7 months, 1 week ago
Hi Christian,

On 5/6/25 5:45 오후, Christian Brauner wrote:
>>> diff --git a/fs/fs_context.c b/fs/fs_context.c
>>> index 582d33e81117..529de43b8b5e 100644
>>> --- a/fs/fs_context.c
>>> +++ b/fs/fs_context.c
>>> @@ -282,6 +282,9 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
>>>  	struct fs_context *fc;
>>>  	int ret = -ENOMEM;
>>>  
>>> +	if (!current->nsproxy || !current->nsproxy->net_ns)
>>> +		return ERR_PTR(-EINVAL);
>>> +
>>>  	fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
>>>  	if (!fc)
>>>  		return ERR_PTR(-ENOMEM);
>>
>> That might paper over the oops, but I very much doubt that this will be
>> a correct fix...  Note that in efivarfs_pm_notify() we have other
>> fun issues when run from such context - have task_work_add() fail in
>> fput() and if delayed_fput() runs right afterwards and
>>         efivar_init(efivarfs_check_missing, sfi->sb, false);
>> in there might end up with UAF...
> 
> We've already accepted a patch that removes the need for
> vfs_kern_mount() from efivarfs completely.

I’ll take a look at the patch you mentioned, check if the issue reproduces, and get back to you.

Link: https://lore.kernel.org/all/20250318194111.19419-4-James.Bottomley@HansenPartnership.com/

Thanks for checking it!

Best regards,
Yunseong Kim

Re: [PATCH] fs: Prevent panic from NULL dereference in alloc_fs_context() during do_exit()
Posted by Yunseong Kim 7 months, 1 week ago
Hi Alexander,

Thanks for the feedback!

On 5/6/25 7:36 오전, Al Viro wrote:
> On Tue, May 06, 2025 at 05:38:02AM +0900, Yunseong Kim wrote:
>> The function alloc_fs_context() assumes that current->nsproxy and its
>> net_ns field are valid. However, this assumption can be violated in
>> cases such as task teardown during do_exit(), where current->nsproxy can
>> be NULL or already cleared.
>>
>> This issue was triggered during stress-ng's kernel-coverage.sh testing,
>> Since alloc_fs_context() can be invoked in various contexts — including
>> from asynchronous or teardown paths like do_exit() — it's difficult to
>> guarantee that its input arguments are always valid.
>>
>> A follow-up patch will improve the granularity of this fix by moving the
>> check closer to the actual mount trigger(e.g., in efivarfs_pm_notify()).
> 
> UGH.
> 
>> diff --git a/fs/fs_context.c b/fs/fs_context.c
>> index 582d33e81117..529de43b8b5e 100644
>> --- a/fs/fs_context.c
>> +++ b/fs/fs_context.c
>> @@ -282,6 +282,9 @@ static struct fs_context *alloc_fs_context(struct file_system_type *fs_type,
>>  	struct fs_context *fc;
>>  	int ret = -ENOMEM;
>>  
>> +	if (!current->nsproxy || !current->nsproxy->net_ns)
>> +		return ERR_PTR(-EINVAL);
>> +
>>  	fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL_ACCOUNT);
>>  	if (!fc)
>>  		return ERR_PTR(-ENOMEM);
> 
> That might paper over the oops, but I very much doubt that this will be
> a correct fix...  Note that in efivarfs_pm_notify() we have other
> fun issues when run from such context - have task_work_add() fail in
> fput() and if delayed_fput() runs right afterwards and
>         efivar_init(efivarfs_check_missing, sfi->sb, false);
> in there might end up with UAF...

I see your point — simply returning early in alloc_fs_context() may just
paper over a deeper issue, and I agree that this might not be the right
long-term fix. I wasn’t aware of the potential UAF scenario involving
efivarfs_pm_notify() and delayed_fput().

I’ll take a closer look at the call paths involved here, especially
around efivarfs_pm_notify(), fput(), and delayed_fput() interactions
during do_exit().

Also, I’ll loop in the EFI mailing list so we can discuss this
further from the efivarfs side as well.

Thanks again,
Yunseong