[PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL

Nam Cao posted 1 patch 9 months, 1 week ago
arch/riscv/kernel/process.c | 3 +++
1 file changed, 3 insertions(+)
[PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Posted by Nam Cao 9 months, 1 week ago
When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
available, the kernel crashes:

Oops - illegal instruction [#1]
    [snip]
epc : set_tagged_addr_ctrl+0x112/0x15a
 ra : set_tagged_addr_ctrl+0x74/0x15a
epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
    [snip]
status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
    set_tagged_addr_ctrl+0x112/0x15a
    __riscv_sys_prctl+0x352/0x73c
    do_trap_ecall_u+0x17c/0x20c
    andle_exception+0x150/0x15c

Fix it by checking if Supm is available.

Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
---
 arch/riscv/kernel/process.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 7c244de77180..3db2c0c07acd 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -275,6 +275,9 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg)
 	unsigned long pmm;
 	u8 pmlen;
 
+	if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM))
+		return -EINVAL;
+
 	if (is_compat_thread(ti))
 		return -EINVAL;
 
-- 
2.39.5
Re: [PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Posted by Alexandre Ghiti 9 months, 1 week ago
Hi Nam,

On 04/05/2025 12:19, Nam Cao wrote:
> When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
> available, the kernel crashes:
>
> Oops - illegal instruction [#1]
>      [snip]
> epc : set_tagged_addr_ctrl+0x112/0x15a
>   ra : set_tagged_addr_ctrl+0x74/0x15a
> epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
>      [snip]
> status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
>      set_tagged_addr_ctrl+0x112/0x15a
>      __riscv_sys_prctl+0x352/0x73c
>      do_trap_ecall_u+0x17c/0x20c
>      andle_exception+0x150/0x15c


It seems like the csr write is triggering this illegal instruction, can 
you confirm it is? If so, I can't find in the specification that an 
implementation should do that when writing envcfg and I can't reproduce 
it on qemu. Where did you see this oops?

Thanks,

Alex


>
> Fix it by checking if Supm is available.
>
> Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking")
> Signed-off-by: Nam Cao <namcao@linutronix.de>
> Cc: stable@vger.kernel.org
> ---
>   arch/riscv/kernel/process.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> index 7c244de77180..3db2c0c07acd 100644
> --- a/arch/riscv/kernel/process.c
> +++ b/arch/riscv/kernel/process.c
> @@ -275,6 +275,9 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg)
>   	unsigned long pmm;
>   	u8 pmlen;
>   
> +	if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM))
> +		return -EINVAL;
> +
>   	if (is_compat_thread(ti))
>   		return -EINVAL;
>
Re: [PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Posted by Nam Cao 9 months, 1 week ago
Hi Alex,

On Mon, May 05, 2025 at 06:02:26PM +0200, Alexandre Ghiti wrote:
> On 04/05/2025 12:19, Nam Cao wrote:
> > When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
> > available, the kernel crashes:
> > 
> > Oops - illegal instruction [#1]
> >      [snip]
> > epc : set_tagged_addr_ctrl+0x112/0x15a
> >   ra : set_tagged_addr_ctrl+0x74/0x15a
> > epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
> >      [snip]
> > status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
> >      set_tagged_addr_ctrl+0x112/0x15a
> >      __riscv_sys_prctl+0x352/0x73c
> >      do_trap_ecall_u+0x17c/0x20c
> >      andle_exception+0x150/0x15c
> 
> 
> It seems like the csr write is triggering this illegal instruction, can you
> confirm it is?

Yes, it is the "csr_write(CSR_ENVCFG, envcfg);" in envcfg_update_bits().

> If so, I can't find in the specification that an implementation should do
> that when writing envcfg and I can't reproduce it on qemu. Where did you
> see this oops?

I can't find it in the spec either. I think it is up to the implementation.

I got this crash on the MangoPI board:
https://mangopi.org/mqpro

Best regards,
Nam
Re: [PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Posted by Alexandre Ghiti 9 months, 1 week ago
On 05/05/2025 18:07, Nam Cao wrote:
> Hi Alex,
>
> On Mon, May 05, 2025 at 06:02:26PM +0200, Alexandre Ghiti wrote:
>> On 04/05/2025 12:19, Nam Cao wrote:
>>> When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
>>> available, the kernel crashes:
>>>
>>> Oops - illegal instruction [#1]
>>>       [snip]
>>> epc : set_tagged_addr_ctrl+0x112/0x15a
>>>    ra : set_tagged_addr_ctrl+0x74/0x15a
>>> epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
>>>       [snip]
>>> status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
>>>       set_tagged_addr_ctrl+0x112/0x15a
>>>       __riscv_sys_prctl+0x352/0x73c
>>>       do_trap_ecall_u+0x17c/0x20c
>>>       andle_exception+0x150/0x15c
>>
>> It seems like the csr write is triggering this illegal instruction, can you
>> confirm it is?
> Yes, it is the "csr_write(CSR_ENVCFG, envcfg);" in envcfg_update_bits().
>
>> If so, I can't find in the specification that an implementation should do
>> that when writing envcfg and I can't reproduce it on qemu. Where did you
>> see this oops?
> I can't find it in the spec either. I think it is up to the implementation.


The reserved fields of senvcfg are WPRI and contrary to WLRL, it does 
not explicitly "permit" to raise an illegal instruction so I'd say it is 
not up to the implementation, I'll ask around.

Thanks,

Alex


>
> I got this crash on the MangoPI board:
> https://mangopi.org/mqpro
>
> Best regards,
> Nam
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Re: [PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Posted by Alexandre Ghiti 9 months, 1 week ago
On 05/05/2025 21:27, Alexandre Ghiti wrote:
> On 05/05/2025 18:07, Nam Cao wrote:
>> Hi Alex,
>>
>> On Mon, May 05, 2025 at 06:02:26PM +0200, Alexandre Ghiti wrote:
>>> On 04/05/2025 12:19, Nam Cao wrote:
>>>> When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
>>>> available, the kernel crashes:
>>>>
>>>> Oops - illegal instruction [#1]
>>>>       [snip]
>>>> epc : set_tagged_addr_ctrl+0x112/0x15a
>>>>    ra : set_tagged_addr_ctrl+0x74/0x15a
>>>> epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
>>>>       [snip]
>>>> status: 0000000200000120 badaddr: 0000000010a79073 cause: 
>>>> 0000000000000002
>>>>       set_tagged_addr_ctrl+0x112/0x15a
>>>>       __riscv_sys_prctl+0x352/0x73c
>>>>       do_trap_ecall_u+0x17c/0x20c
>>>>       andle_exception+0x150/0x15c
>>>
>>> It seems like the csr write is triggering this illegal instruction, 
>>> can you
>>> confirm it is?
>> Yes, it is the "csr_write(CSR_ENVCFG, envcfg);" in envcfg_update_bits().
>>
>>> If so, I can't find in the specification that an implementation 
>>> should do
>>> that when writing envcfg and I can't reproduce it on qemu. Where did 
>>> you
>>> see this oops?
>> I can't find it in the spec either. I think it is up to the 
>> implementation.
>
>
> The reserved fields of senvcfg are WPRI and contrary to WLRL, it does 
> not explicitly "permit" to raise an illegal instruction so I'd say it 
> is not up to the implementation, I'll ask around.


So I had confirmation that WPRI should not raise an illegal instruction 
so that's an issue with the platform. Your patch is not wrong but I'd 
rather have an explicit errata, what do you think?

Thanks,

Alex


>
> Thanks,
>
> Alex
>
>
>>
>> I got this crash on the MangoPI board:
>> https://mangopi.org/mqpro
>>
>> Best regards,
>> Nam
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
Re: [PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Posted by Samuel Holland 9 months, 1 week ago
On 2025-05-06 11:31 AM, Alexandre Ghiti wrote:
> On 05/05/2025 21:27, Alexandre Ghiti wrote:
>> On 05/05/2025 18:07, Nam Cao wrote:
>>> Hi Alex,
>>>
>>> On Mon, May 05, 2025 at 06:02:26PM +0200, Alexandre Ghiti wrote:
>>>> On 04/05/2025 12:19, Nam Cao wrote:
>>>>> When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not
>>>>> available, the kernel crashes:
>>>>>
>>>>> Oops - illegal instruction [#1]
>>>>>       [snip]
>>>>> epc : set_tagged_addr_ctrl+0x112/0x15a
>>>>>    ra : set_tagged_addr_ctrl+0x74/0x15a
>>>>> epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10
>>>>>       [snip]
>>>>> status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002
>>>>>       set_tagged_addr_ctrl+0x112/0x15a
>>>>>       __riscv_sys_prctl+0x352/0x73c
>>>>>       do_trap_ecall_u+0x17c/0x20c
>>>>>       andle_exception+0x150/0x15c
>>>>
>>>> It seems like the csr write is triggering this illegal instruction, can you
>>>> confirm it is?
>>> Yes, it is the "csr_write(CSR_ENVCFG, envcfg);" in envcfg_update_bits().
>>>
>>>> If so, I can't find in the specification that an implementation should do
>>>> that when writing envcfg and I can't reproduce it on qemu. Where did you
>>>> see this oops?
>>> I can't find it in the spec either. I think it is up to the implementation.
>>
>>
>> The reserved fields of senvcfg are WPRI and contrary to WLRL, it does not
>> explicitly "permit" to raise an illegal instruction so I'd say it is not up to
>> the implementation, I'll ask around.
> 
> 
> So I had confirmation that WPRI should not raise an illegal instruction so
> that's an issue with the platform. Your patch is not wrong but I'd rather have
> an explicit errata, what do you think?

There is no erratum here. Allwinner D1 / T-HEAD C906 implements Ss1p11, which
was before senvcfg was added to the privileged architecture, and does not
implement any of the extensions which would imply senvcfg's existence, so the
CSR is reserved. We could check for Xlinuxenvcfg to determine if the CSR access
will raise an exception, but that does not gain anything over checking for Supm
specifically. So the fix is correct.

Reviewed-by: Samuel Holland <samuel.holland@sifive.com>

That said, I wonder if set_tagged_addr_ctrl(task, 0) should succeed when Supm is
not implemented, matching get_tagged_addr_ctrl(). Without Supm, we know that
have_user_pmlen_7 and have_user_pmlen_16 will both be false, so pmlen == 0 is
the only case where we would call envcfg_update_bits(). And we know it would be
a no-op. So an alternative fix would be to return 0 below the pmlen checks:

diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 7c244de77180..536da9aa690e 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -309,6 +309,9 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned
long arg)
 	if (!(arg & PR_TAGGED_ADDR_ENABLE))
 		pmlen = PMLEN_0;

+	if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM))
+		return 0;
+
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;


But I don't know if this better matches what userspace would expect.

Regards,
Samuel

>>> I got this crash on the MangoPI board:
>>> https://mangopi.org/mqpro
>>>
>>> Best regards,
>>> Nam

Re: [PATCH] riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL
Posted by Nam Cao 9 months, 1 week ago
On Tue, May 06, 2025 at 05:29:57PM -0500, Samuel Holland wrote:
> That said, I wonder if set_tagged_addr_ctrl(task, 0) should succeed when Supm is
> not implemented, matching get_tagged_addr_ctrl(). Without Supm, we know that
> have_user_pmlen_7 and have_user_pmlen_16 will both be false, so pmlen == 0 is
> the only case where we would call envcfg_update_bits(). And we know it would be
> a no-op. So an alternative fix would be to return 0 below the pmlen checks:
> 
> diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
> index 7c244de77180..536da9aa690e 100644
> --- a/arch/riscv/kernel/process.c
> +++ b/arch/riscv/kernel/process.c
> @@ -309,6 +309,9 @@ long set_tagged_addr_ctrl(struct task_struct *task, unsigned
> long arg)
>  	if (!(arg & PR_TAGGED_ADDR_ENABLE))
>  		pmlen = PMLEN_0;
> 
> +	if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_SUPM))
> +		return 0;
> +
>  	if (mmap_write_lock_killable(mm))
>  		return -EINTR;
> 
> 
> But I don't know if this better matches what userspace would expect.

I'm not sure about this either. The man page says:

|If the  arguments  are  invalid,  the mode specified in arg2 is
|unrecognized, or if this feature is unsupported by the kernel or disabled
|via /proc/sys/abi/tagged_addr_disabled, the call fails with the error
|EINVAL.
|
|In particular, if prctl(PR_SET_TAGGED_ADDR_CTRL, 0, 0, 0, 0) fails with
|EINVAL, then all addresses passed to the kernel must be untagged.

So according to the man page, returning -EINVAL is the right thing.

But arm64 returns 0 in this case.

I would say let's follow the man page, and leave it as is.

Best regards,
Nam