arch/riscv/include/asm/uaccess.h | 205 +++++++++++++++++++++++-------- 1 file changed, 152 insertions(+), 53 deletions(-)
This series tries to optimize riscv uaccess by allowing the use of user_access_begin() and user_access_end() which permits grouping user accesses and avoiding the CSR write penalty for each access. The error path can also be optimised using asm goto which patches 3 and 4 achieve. This will speed up jumping to labels by avoiding the need of an intermediary error type variable within the uaccess macros I did read the discussion this series generated. It isn't clear to me which direction to take the patches, if any. V2: I've taken on this series as there isn't any response from Jisheng. No significant changes other than build fixes. - Fixes build breakage in patch 3 to do with not having used 'goto' keyword. - Fixes build breakage in patch 4 on 32bit not having delcared __ptr in the macro. V3: Significant commit message rewrites. - Corrected the justification for patch 2 - Better explained/justified patches 3 and 4 Minor code changes for legibility and more comments. Jisheng Zhang (4): riscv: implement user_access_begin() and families riscv: uaccess: use input constraints for ptr of __put_user() riscv: uaccess: use 'asm goto' for put_user() riscv: uaccess: use 'asm_goto_output' for get_user() arch/riscv/include/asm/uaccess.h | 205 +++++++++++++++++++++++-------- 1 file changed, 152 insertions(+), 53 deletions(-) -- 2.34.1
Hi Cyril, On 21/02/2025 01:09, Cyril Bur wrote: > This series tries to optimize riscv uaccess by allowing the use of > user_access_begin() and user_access_end() which permits grouping user accesses > and avoiding the CSR write penalty for each access. > > The error path can also be optimised using asm goto which patches 3 and 4 > achieve. This will speed up jumping to labels by avoiding the need of an > intermediary error type variable within the uaccess macros > > I did read the discussion this series generated. It isn't clear to me > which direction to take the patches, if any. > > V2: > I've taken on this series as there isn't any response from Jisheng. No > significant changes other than build fixes. > - Fixes build breakage in patch 3 to do with not having used 'goto' keyword. > - Fixes build breakage in patch 4 on 32bit not having delcared __ptr in the > macro. > > V3: > Significant commit message rewrites. > - Corrected the justification for patch 2 > - Better explained/justified patches 3 and 4 > Minor code changes for legibility and more comments. > > Jisheng Zhang (4): > riscv: implement user_access_begin() and families > riscv: uaccess: use input constraints for ptr of __put_user() > riscv: uaccess: use 'asm goto' for put_user() > riscv: uaccess: use 'asm_goto_output' for get_user() > > arch/riscv/include/asm/uaccess.h | 205 +++++++++++++++++++++++-------- > 1 file changed, 152 insertions(+), 53 deletions(-) > Following up on Ben's comment here https://lore.kernel.org/linux-riscv/b45aab1e-6d37-4027-9a15-4fa917d806b9@codethink.co.uk/ The problem that Ben mentions is caused by the use of *macros* which used to make the evaluation of the parameter inside the user-accessible section, and since this parameter could be a sleeping function, we could schedule another process with the SUM bit set, which could be cleared by this process, which would make the first process fault when trying to access user memory. I did not find any macro using unsafe_XXX() functions which could cause a problem right now, but I may have missed one and new could come up later, so we have multiple solutions here: - suppose that a macro using unsafe_get/put_user() and passing a sleeping function as argument won't happen and then do nothing - or save/restore CSR sstatus when switching processes - or simply check that SUM is not set when switching processes Let me know what you think. Thanks, Alex
On 14/03/2025 13:28, Alexandre Ghiti wrote:
> Hi Cyril,
>
> On 21/02/2025 01:09, Cyril Bur wrote:
>> This series tries to optimize riscv uaccess by allowing the use of
>> user_access_begin() and user_access_end() which permits grouping user
>> accesses
>> and avoiding the CSR write penalty for each access.
>>
>> The error path can also be optimised using asm goto which patches 3 and 4
>> achieve. This will speed up jumping to labels by avoiding the need of an
>> intermediary error type variable within the uaccess macros
>>
>> I did read the discussion this series generated. It isn't clear to me
>> which direction to take the patches, if any.
>>
>> V2:
>> I've taken on this series as there isn't any response from Jisheng. No
>> significant changes other than build fixes.
>> - Fixes build breakage in patch 3 to do with not having used 'goto'
>> keyword.
>> - Fixes build breakage in patch 4 on 32bit not having delcared __ptr
>> in the
>> macro.
>>
>> V3:
>> Significant commit message rewrites.
>> - Corrected the justification for patch 2
>> - Better explained/justified patches 3 and 4
>> Minor code changes for legibility and more comments.
>>
>> Jisheng Zhang (4):
>> riscv: implement user_access_begin() and families
>> riscv: uaccess: use input constraints for ptr of __put_user()
>> riscv: uaccess: use 'asm goto' for put_user()
>> riscv: uaccess: use 'asm_goto_output' for get_user()
>>
>> arch/riscv/include/asm/uaccess.h | 205 +++++++++++++++++++++++--------
>> 1 file changed, 152 insertions(+), 53 deletions(-)
>>
> Following up on Ben's comment here https://lore.kernel.org/linux-riscv/
> b45aab1e-6d37-4027-9a15-4fa917d806b9@codethink.co.uk/
>
> The problem that Ben mentions is caused by the use of *macros* which
> used to make the evaluation of the parameter inside the user-accessible
> section, and since this parameter could be a sleeping function, we could
> schedule another process with the SUM bit set, which could be cleared by
> this process, which would make the first process fault when trying to
> access user memory. I did not find any macro using unsafe_XXX()
> functions which could cause a problem right now, but I may have missed
> one and new could come up later, so we have multiple solutions here:
>
> - suppose that a macro using unsafe_get/put_user() and passing a
> sleeping function as argument won't happen and then do nothing
> - or save/restore CSR sstatus when switching processes
> - or simply check that SUM is not set when switching processes
>
> Let me know what you think.
I'm on the save the flag side, for these reasons:
#1 sleeping functions can happen more often when various checks
get enabled in the kernel (this was why the original fault
was found). Adding larger sections is just going to make
the fault pop up again at some point in the future.
#2 the save/restore is a small addition to the swap registers
#3 saving SUM over a regs swap is always going to make sure we
never see this gremlin turn up again
FYI, I think I may have posted our original test thread at some
point, but I could do so again.
>
> Thanks,
>
> Alex
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
--
Ben Dooks http://www.codethink.co.uk/
Senior Engineer Codethink - Providing Genius
https://www.codethink.co.uk/privacy.html
On 15/3/2025 12:49 am, Ben Dooks wrote: > On 14/03/2025 13:28, Alexandre Ghiti wrote: >> Hi Cyril, >> >> On 21/02/2025 01:09, Cyril Bur wrote: >>> This series tries to optimize riscv uaccess by allowing the use of >>> user_access_begin() and user_access_end() which permits grouping user >>> accesses >>> and avoiding the CSR write penalty for each access. >>> >>> The error path can also be optimised using asm goto which patches 3 >>> and 4 >>> achieve. This will speed up jumping to labels by avoiding the need of an >>> intermediary error type variable within the uaccess macros >>> >>> I did read the discussion this series generated. It isn't clear to me >>> which direction to take the patches, if any. >>> >>> V2: >>> I've taken on this series as there isn't any response from Jisheng. No >>> significant changes other than build fixes. >>> - Fixes build breakage in patch 3 to do with not having used 'goto' >>> keyword. >>> - Fixes build breakage in patch 4 on 32bit not having delcared __ptr >>> in the >>> macro. >>> >>> V3: >>> Significant commit message rewrites. >>> - Corrected the justification for patch 2 >>> - Better explained/justified patches 3 and 4 >>> Minor code changes for legibility and more comments. >>> >>> Jisheng Zhang (4): >>> riscv: implement user_access_begin() and families >>> riscv: uaccess: use input constraints for ptr of __put_user() >>> riscv: uaccess: use 'asm goto' for put_user() >>> riscv: uaccess: use 'asm_goto_output' for get_user() >>> >>> arch/riscv/include/asm/uaccess.h | 205 +++++++++++++++++++++++-------- >>> 1 file changed, 152 insertions(+), 53 deletions(-) >>> >> Following up on Ben's comment here https://lore.kernel.org/linux- >> riscv/ b45aab1e-6d37-4027-9a15-4fa917d806b9@codethink.co.uk/ >> >> The problem that Ben mentions is caused by the use of *macros* which >> used to make the evaluation of the parameter inside the user- >> accessible section, and since this parameter could be a sleeping >> function, we could schedule another process with the SUM bit set, >> which could be cleared by this process, which would make the first >> process fault when trying to access user memory. I did not find any >> macro using unsafe_XXX() functions which could cause a problem right >> now, but I may have missed one and new could come up later, so we have >> multiple solutions here: >> >> - suppose that a macro using unsafe_get/put_user() and passing a >> sleeping function as argument won't happen and then do nothing >> - or save/restore CSR sstatus when switching processes >> - or simply check that SUM is not set when switching processes >> >> Let me know what you think. > > I'm on the save the flag side, for these reasons: > > #1 sleeping functions can happen more often when various checks > get enabled in the kernel (this was why the original fault > was found). Adding larger sections is just going to make > the fault pop up again at some point in the future. > > #2 the save/restore is a small addition to the swap registers > > #3 saving SUM over a regs swap is always going to make sure we > never see this gremlin turn up again > > FYI, I think I may have posted our original test thread at some > point, but I could do so again. Yes, after Ben pointed out the issue I came to the conclusion we probably want Bens patch which saves the bit. Apologies if I didn't express this thought in email. I'm happy to take the patch and put it on the front of this series, although perhaps it makes more sense you to revive the patch since you're still around Ben? > >> >> Thanks, >> >> Alex >> >> >> _______________________________________________ >> linux-riscv mailing list >> linux-riscv@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-riscv >> > >
On 17/03/2025 23:52, Cyril Bur wrote: > > > On 15/3/2025 12:49 am, Ben Dooks wrote: >> On 14/03/2025 13:28, Alexandre Ghiti wrote: >>> Hi Cyril, >>> >>> On 21/02/2025 01:09, Cyril Bur wrote: >>>> This series tries to optimize riscv uaccess by allowing the use of >>>> user_access_begin() and user_access_end() which permits grouping >>>> user accesses >>>> and avoiding the CSR write penalty for each access. >>>> >>>> The error path can also be optimised using asm goto which patches 3 >>>> and 4 >>>> achieve. This will speed up jumping to labels by avoiding the need >>>> of an >>>> intermediary error type variable within the uaccess macros >>>> >>>> I did read the discussion this series generated. It isn't clear to me >>>> which direction to take the patches, if any. >>>> >>>> V2: >>>> I've taken on this series as there isn't any response from Jisheng. No >>>> significant changes other than build fixes. >>>> - Fixes build breakage in patch 3 to do with not having used 'goto' >>>> keyword. >>>> - Fixes build breakage in patch 4 on 32bit not having delcared __ptr >>>> in the >>>> macro. >>>> >>>> V3: >>>> Significant commit message rewrites. >>>> - Corrected the justification for patch 2 >>>> - Better explained/justified patches 3 and 4 >>>> Minor code changes for legibility and more comments. >>>> >>>> Jisheng Zhang (4): >>>> riscv: implement user_access_begin() and families >>>> riscv: uaccess: use input constraints for ptr of __put_user() >>>> riscv: uaccess: use 'asm goto' for put_user() >>>> riscv: uaccess: use 'asm_goto_output' for get_user() >>>> >>>> arch/riscv/include/asm/uaccess.h | 205 ++++++++++++++++++++++ >>>> +-------- >>>> 1 file changed, 152 insertions(+), 53 deletions(-) >>>> >>> Following up on Ben's comment here https://lore.kernel.org/linux- >>> riscv/ b45aab1e-6d37-4027-9a15-4fa917d806b9@codethink.co.uk/ >>> >>> The problem that Ben mentions is caused by the use of *macros* which >>> used to make the evaluation of the parameter inside the user- >>> accessible section, and since this parameter could be a sleeping >>> function, we could schedule another process with the SUM bit set, >>> which could be cleared by this process, which would make the first >>> process fault when trying to access user memory. I did not find any >>> macro using unsafe_XXX() functions which could cause a problem right >>> now, but I may have missed one and new could come up later, so we >>> have multiple solutions here: >>> >>> - suppose that a macro using unsafe_get/put_user() and passing a >>> sleeping function as argument won't happen and then do nothing >>> - or save/restore CSR sstatus when switching processes >>> - or simply check that SUM is not set when switching processes >>> >>> Let me know what you think. >> >> I'm on the save the flag side, for these reasons: >> >> #1 sleeping functions can happen more often when various checks >> get enabled in the kernel (this was why the original fault >> was found). Adding larger sections is just going to make >> the fault pop up again at some point in the future. >> >> #2 the save/restore is a small addition to the swap registers >> >> #3 saving SUM over a regs swap is always going to make sure we >> never see this gremlin turn up again >> >> FYI, I think I may have posted our original test thread at some >> point, but I could do so again. > > Yes, after Ben pointed out the issue I came to the conclusion we > probably want Bens patch which saves the bit. Apologies if I didn't > express this thought in email. > > I'm happy to take the patch and put it on the front of this series, > although perhaps it makes more sense you to revive the patch since > you're still around Ben? Yes, I'm currently very busy so happy for someone else to get this merged and tested. We could do with some better testing on whether we leak flags on task switches (possibly), but that's a side quest and not for now. -- Ben Dooks http://www.codethink.co.uk/ Senior Engineer Codethink - Providing Genius https://www.codethink.co.uk/privacy.html
Hi Cyril, On Fri, Feb 21, 2025 at 11:09 AM Cyril Bur <cyrilbur@tenstorrent.com> wrote: > This series tries to optimize riscv uaccess by allowing the use of > user_access_begin() and user_access_end() which permits grouping user accesses > and avoiding the CSR write penalty for each access. I tested this on the upcoming Tenstorrent Ascalon CPU using a simple microbenchmark (getdents64() on a directory with 10 files in it) and it was significantly faster (over 40%). This came from both a reduction in instruction count as well as a reduction in sstatus CSR writes. It would be great if we could get this merged considering x86, arm64 and POWER are exploiting this optimisation already. Thanks, Anton > The error path can also be optimised using asm goto which patches 3 and 4 > achieve. This will speed up jumping to labels by avoiding the need of an > intermediary error type variable within the uaccess macros > > I did read the discussion this series generated. It isn't clear to me > which direction to take the patches, if any. > > V2: > I've taken on this series as there isn't any response from Jisheng. No > significant changes other than build fixes. > - Fixes build breakage in patch 3 to do with not having used 'goto' keyword. > - Fixes build breakage in patch 4 on 32bit not having delcared __ptr in the > macro. > > V3: > Significant commit message rewrites. > - Corrected the justification for patch 2 > - Better explained/justified patches 3 and 4 > Minor code changes for legibility and more comments. > > Jisheng Zhang (4): > riscv: implement user_access_begin() and families > riscv: uaccess: use input constraints for ptr of __put_user() > riscv: uaccess: use 'asm goto' for put_user() > riscv: uaccess: use 'asm_goto_output' for get_user() > > arch/riscv/include/asm/uaccess.h | 205 +++++++++++++++++++++++-------- > 1 file changed, 152 insertions(+), 53 deletions(-) > > -- > 2.34.1 > >
© 2016 - 2025 Red Hat, Inc.